How to choose the right statistical test

Question

QA Hub Editorial · Accepted Answer

Short answer

Choosing the right statistical test depends on the measurement scale of the data, the number of groups, whether observations are paired or independent, the sample size, and whether distributional assumptions are met.

Steps

Identify the measurement scale of your variables: nominal, ordinal, interval, or ratio.
Determine the number of groups and whether comparisons are paired or independent.
Check assumptions such as normality and equal variance.
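Select a parametric test if assumptions hold; otherwise choose a non-parametric equivalent (for example, the Mann-Whitney U test in place of the independent t-test, or the Wilcoxon signed-rank test in place of the paired t-test).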
Verify that the test addresses the specific research question and effect of interest.
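
The steps above can be sketched as a small decision function. This is a minimal illustration rather than a complete rule set: the function name `suggest_test` and its parameters are hypothetical, it covers only comparisons of groups on a single outcome, and the paired nominal branch assumes a binary outcome.

```python
def suggest_test(scale, n_groups, paired, assumptions_met):
    """Suggest a conventional test for comparing groups on one outcome.

    scale: 'nominal', 'ordinal', 'interval', or 'ratio'
    n_groups: number of groups being compared (2 or more)
    paired: True if the same units are measured in every group
    assumptions_met: True if normality and equal variance look reasonable
    """
    if scale not in ("nominal", "ordinal", "interval", "ratio"):
        raise ValueError(f"unknown measurement scale: {scale!r}")
    if n_groups < 2:
        raise ValueError("need at least two groups to compare")

    if scale == "nominal":
        # Categorical outcomes: compare frequencies, not means.
        # The paired branch assumes a binary outcome.
        if paired:
            return "McNemar test" if n_groups == 2 else "Cochran's Q test"
        return "chi-square test of independence"

    # Parametric tests need interval or ratio data and met assumptions.
    parametric = scale in ("interval", "ratio") and assumptions_met

    if n_groups == 2:
        if paired:
            return "paired t-test" if parametric else "Wilcoxon signed-rank test"
        return "independent t-test" if parametric else "Mann-Whitney U test"
    if paired:
        return "repeated-measures ANOVA" if parametric else "Friedman test"
    return "one-way ANOVA" if parametric else "Kruskal-Wallis test"


# Two independent groups measured on an ordinal scale.
print(suggest_test("ordinal", 2, paired=False, assumptions_met=False))
# prints: Mann-Whitney U test
```

A function like this only narrows the candidates; the final step of checking the test against the research question still has to be done by the analyst.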

Tips

Use flowcharts or decision trees to guide test selection systematically.
When in doubt, perform both parametric and non-parametric tests and report both as a sensitivity check, rather than picking whichever gives the more favorable result.
Consider permutation tests for exact inference with complex designs.
Document your rationale for test selection in analysis reports.
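
To make the permutation-test tip concrete, here is a minimal sketch of an exact two-sample permutation test on the difference in means, using only the standard library. The function name `exact_permutation_pvalue` and the sample values are hypothetical. Enumerating every split is only feasible for small samples; larger designs usually rely on random resampling instead.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_pvalue(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)

    extreme = 0
    count = 0
    # Enumerate every way of assigning n_a of the pooled values to group A.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        # Small tolerance guards against floating-point ties.
        if diff >= observed - 1e-12:
            extreme += 1
        count += 1
    return extreme / count


# Hypothetical measurements, for illustration only.
control = [1, 2, 3, 4]
treated = [10, 11, 12, 13]
# Only 2 of the 70 possible splits are as extreme as the observed one.
print(round(exact_permutation_pvalue(control, treated), 4))  # prints 0.0286
```

The p-value is exact because it counts every possible relabeling, so it does not depend on a normality assumption.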

Common issues

Using t-tests for more than two groups without correcting for multiple comparisons.
Applying parametric tests to ordinal data with small samples.
Ignoring paired structures and treating repeated measures as independent.
Confusing one-tailed and two-tailed hypotheses, which leads to incorrect p-values.
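
The cost of ignoring a paired structure can be illustrated with a short sketch that computes both test statistics by hand. The helper names `paired_t` and `welch_t` and the scores are hypothetical; in practice a statistics library would normally be used and would also return p-values.

```python
from math import sqrt
from statistics import mean, stdev, variance


def paired_t(before, after):
    """t statistic for paired samples, computed on within-unit differences."""
    diffs = [a - b for a, b in zip(after, before)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def welch_t(x, y):
    """Welch's t statistic, which treats the two samples as independent."""
    se = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(y) - mean(x)) / se


# Hypothetical before/after scores for the same five subjects.
before = [50, 60, 70, 80, 90]
after = [52, 63, 72, 83, 92]

print("paired:", paired_t(before, after))
print("independent:", welch_t(before, after))
```

With these made-up scores every subject improves by 2 or 3 points, so the paired statistic is large (roughly 9.8), while the independent statistic is small (roughly 0.24) because the between-subject spread swamps the consistent within-subject change.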

Example

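import pandas as pd
import numpy as np

# Small example column with one missing value
df = pd.DataFrame({'sales': [100, 150, 200, np.nan]})
# Fill the missing value with the median of the observed values (150.0)
df['sales'] = df['sales'].fillna(df['sales'].median())
# Count, mean, standard deviation, min, quartiles, and max
print(df.describe())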

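This snippet creates a DataFrame, fills a missing value with the median, and prints summary statistics common in exploratory analysis. Inspecting these summaries is a useful first look at the scale and spread of a variable before choosing a test.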
