Confirmatory vs Exploratory Data Analysis
Confirmatory Analysis
- Inferential Statistics – Deductive Approach
- Heavy reliance on probability models
- Must accept untestable assumptions
- Look for definite answers to specific questions
- Emphasis on numerical calculations
- Hypotheses determined at outset
- Hypothesis tests and formal confidence interval estimation
- Advantages
- Provides precise information in the right circumstances
- Well-established theory and methods
- Disadvantages
- Misleading impression of precision in less-than-ideal circumstances
- Analysis driven by preconceived ideas
- Difficult to notice unexpected results
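As a minimal sketch of the confirmatory workflow listed above, the following Python example fixes a hypothesis at the outset, then computes a test statistic, a p-value, and a confidence interval. The function name `z_test_and_ci`, the null value 5.0, and the sample values are hypothetical, invented purely for illustration. The sketch uses a large-sample normal approximation rather than an exact t distribution, which is itself one of the probability-model assumptions the analysis has to accept.

```python
import math
from statistics import NormalDist, mean, stdev

def z_test_and_ci(sample, mu0, alpha=0.05):
    """Two-sided large-sample z test of H0: mean == mu0, plus a (1 - alpha) CI.

    Assumes independent observations and an approximately normal sample mean.
    """
    n = len(sample)
    xbar = mean(sample)
    se = stdev(sample) / math.sqrt(n)          # estimated standard error of the mean
    z = (xbar - mu0) / se                      # test statistic under H0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (xbar - z_crit * se, xbar + z_crit * se)
    return z, p_value, ci

# Hypothetical measurements, invented for illustration only.
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.4, 5.0, 5.7, 5.3, 5.5,
          5.9, 5.2, 5.1, 5.6, 5.4, 5.3, 5.8, 5.0, 5.5, 5.7]

# The hypothesis (H0: mean == 5.0) is stated before looking at the data.
z, p_value, ci = z_test_and_ci(sample, mu0=5.0)
print(f"z = {z:.2f}, p = {p_value:.4g}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

The output is a definite answer to one specific question. If the independence or normality assumptions do not hold, the interval can look more precise than it really is, which is the disadvantage noted in the list above.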
Exploratory Analysis
- Descriptive Statistics – Inductive Approach
- Look for flexible ways to examine data without preconceptions
- Attempt to evaluate validity of assumptions
- Heavy reliance on graphical displays
- Let data suggest questions
- Focus on indications and approximate error magnitudes
- Advantages
- Flexible ways to generate hypotheses
- More realistic statements of accuracy
- Does not require assumptions beyond what the data can support
- Promotes deeper understanding of processes
- Statistical learning
- Disadvantages
- Usually does not provide definitive answers
- Difficult to avoid optimistic bias produced by overfitting
- Requires judgement and artistry – can’t be cookbooked
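As a rough sketch of the exploratory style described above, the following Python example summarizes a sample without a hypothesis fixed in advance: a five-number summary, Tukey-style fences for flagging unexpected values, and a text stem-and-leaf display standing in for a graphical one. The function names, the fence multiplier of 1.5, and the sample values are assumptions made for illustration and do not come from the outline.

```python
from statistics import quantiles

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using inclusive quartiles."""
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    return min(data), q1, q2, q3, max(data)

def tukey_outliers(data, k=1.5):
    """Flag values more than k * IQR below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(data)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < low or x > high]

def stem_and_leaf(data):
    """Text stem-and-leaf rows for non-negative values with one decimal place."""
    rows = {}
    for x in sorted(data):
        stem, leaf = divmod(round(x * 10), 10)   # integer part, tenths digit
        rows.setdefault(stem, []).append(str(leaf))
    # Keep empty stems so that gaps in the data stay visible.
    stems = range(min(rows), max(rows) + 1)
    return [f"{s:>2} | {''.join(rows.get(s, []))}" for s in stems]

# Hypothetical measurements, invented for illustration only.
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.4, 5.0, 5.7, 5.3, 5.5, 8.9, 5.2]

print("five-number summary:", five_number_summary(sample))
print("flagged values:", tukey_outliers(sample))
print("\n".join(stem_and_leaf(sample)))
```

Here the display and the fences draw attention to the value 8.9, which suggests a question to follow up (a recording error, or a genuinely different process?) rather than giving a definitive answer.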