How to apply the central limit theorem

Question

QA Hub Editorial · Accepted Answer

Short answer

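The central limit theorem states that the sampling distribution of the mean of independent, identically distributed observations approaches a normal distribution as the sample size grows, whatever the shape of the population distribution, provided the population has finite variance.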
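
Stated a little more formally, the classical version of the theorem can be written as below. The notation is introduced here for illustration and is an assumption, not part of the original answer: X_1, ..., X_n are independent, identically distributed observations with mean \mu and finite variance \sigma^2.

```latex
\[
\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i,
\qquad
\frac{\sqrt{n}\,\bigl(\bar{X}_n - \mu\bigr)}{\sigma}
\;\xrightarrow{d}\; \mathcal{N}(0, 1)
\quad \text{as } n \to \infty .
\]
```

Read informally, this says that for large n the sample mean is approximately normal with mean \mu and variance \sigma^2 / n.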

Steps

Collect independent random samples of size n from the population.
Compute the mean for each sample.
Plot the distribution of these sample means.
Observe that as n increases, the distribution of the sample means looks increasingly normal and its variance shrinks in proportion to 1/n (it equals σ²/n, where σ² is the population variance).
Use this normality to construct confidence intervals and perform hypothesis tests on the mean.
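
The steps above can be sketched as a small simulation. This is a minimal illustration rather than a definitive recipe: the exponential population, the sample sizes, the seed, and the helper names `sample_means` and `skewness` are all assumptions chosen for the example. Instead of plotting, it prints the variance and skewness of the sample means, which should move toward 1/n and 0 respectively as n grows.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is reproducible


def sample_means(n, n_samples=20_000):
    """Draw n_samples independent samples of size n and return their means."""
    # Assumed population: exponential with mean 1 and variance 1 (strongly skewed).
    draws = rng.exponential(scale=1.0, size=(n_samples, n))
    return draws.mean(axis=1)


def skewness(x):
    """Sample skewness; close to 0 for a symmetric distribution such as the normal."""
    centered = x - x.mean()
    return (centered**3).mean() / (centered**2).mean() ** 1.5


results = {}
for n in (2, 10, 50, 200):
    means = sample_means(n)
    results[n] = (means.var(), skewness(means))
    # The population variance is 1, so the theoretical variance of the mean is 1/n.
    print(f"n={n:>3}  var={results[n][0]:.4f}  1/n={1 / n:.4f}  skew={results[n][1]:.3f}")
```

To see the shape directly, a histogram of `sample_means(n)` for each n can be drawn with any plotting library.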

Tips

The theorem applies to sums and means, not necessarily to other statistics.
Larger samples are needed for highly skewed or heavy-tailed distributions.
Independence of observations is crucial; violations can invalidate the result.
The theorem justifies using z-tests and t-tests for large samples even when the underlying data is non-normal.
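
One way to check the tip about skewed data is to measure how often a nominal 95% normal-approximation interval for the mean actually contains the true mean. The sketch below assumes a lognormal population and two arbitrary sample sizes; the function name `coverage` is hypothetical, and the exact numbers will vary with the seed.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed so the run is reproducible


def coverage(n, n_trials=5_000, z=1.96):
    """Fraction of nominal 95% z-intervals for the mean that contain the true mean."""
    # Assumed population: lognormal with mu=0, sigma=1, whose true mean is exp(0.5).
    true_mean = np.exp(0.5)
    x = rng.lognormal(mean=0.0, sigma=1.0, size=(n_trials, n))
    m = x.mean(axis=1)
    se = x.std(axis=1, ddof=1) / np.sqrt(n)  # estimated standard error of each mean
    hit = (m - z * se <= true_mean) & (true_mean <= m + z * se)
    return hit.mean()


cov_small = coverage(10)     # small sample from a skewed population
cov_large = coverage(1000)   # much larger sample from the same population
print(f"coverage at n=10:   {cov_small:.3f}")
print(f"coverage at n=1000: {cov_large:.3f}")
```

Under these assumptions the small-sample intervals should fall noticeably short of 95% coverage, while the large-sample intervals come much closer to it.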

Common issues

Applying the theorem to small samples from highly non-normal populations.
Violating independence through clustered or time-series sampling.
Confusing the distribution of sample means with the distribution of individual observations.
Ignoring the error in the normal approximation when n is only moderate; the approximation is typically least accurate in the tails.
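
The independence issue can be illustrated with autocorrelated data. The sketch below assumes an AR(1) time series with coefficient 0.7 (a choice made for this example, as is the helper name `ar1_series`) and compares the real variance of the sample mean with what the usual i.i.d. formula s²/n would report.

```python
import numpy as np

rng = np.random.default_rng(2)  # fixed seed so the run is reproducible


def ar1_series(n, phi, n_series):
    """Simulate n_series AR(1) paths of length n: x[t] = phi * x[t-1] + noise[t]."""
    noise = rng.normal(size=(n_series, n))
    x = np.empty_like(noise)
    # Start from the stationary distribution so every x[t] has the same variance.
    x[:, 0] = noise[:, 0] / np.sqrt(1 - phi**2)
    for t in range(1, n):
        x[:, t] = phi * x[:, t - 1] + noise[:, t]
    return x


n, phi = 200, 0.7
x = ar1_series(n, phi, n_series=5_000)

actual_var = x.mean(axis=1).var()               # observed spread of the sample mean
naive_var = (x.var(axis=1, ddof=1) / n).mean()  # what the i.i.d. formula s^2 / n reports
ratio = actual_var / naive_var
print(f"actual variance of the mean: {actual_var:.4f}")
print(f"naive s^2/n estimate:        {naive_var:.4f}")
print(f"ratio:                       {ratio:.1f}")
```

With positive autocorrelation the ratio is well above 1, so confidence intervals built from s²/n would be too narrow even though the sample is large.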

Example

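import numpy as np
import pandas as pd

# Small sales table with one missing value
df = pd.DataFrame({'sales': [100, 150, 200, np.nan]})
# Replace the missing value with the median of the observed values
df['sales'] = df['sales'].fillna(df['sales'].median())
# Summary statistics: count, mean, std, min, quartiles, max
print(df.describe())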

This snippet creates a DataFrame, handles a missing value with the median, and prints summary statistics common in exploratory analysis.
