Collection of notes for various classes I've taken.
The Central Limit Theorem (CLT) is one of the most fundamental results in statistics. It states that the sampling distribution of the sample mean ($\bar{X}$) is approximately normal, regardless of the shape of the original population’s distribution, as long as the population has a finite variance and the sample size ($n$) is sufficiently large.
In simple terms: If you take many random samples of the same size from any population (even a skewed one) and calculate the mean of each sample, the histogram of those sample means will be approximately bell-shaped (normal).
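To see this behavior concretely, here is a minimal simulation sketch that is not part of the original notes. It assumes an exponential population with rate 1 (right-skewed, with mean 1), a sample size of 40, and 5,000 repeated samples. The helper name `simulate_sample_means` and the fixed seed are illustrative choices.

```python
import random
import statistics


def simulate_sample_means(draw, n, num_samples, seed=0):
    """Return the means of `num_samples` random samples, each of size `n`.

    `draw` takes a random.Random instance and returns one observation
    from the population.
    """
    rng = random.Random(seed)
    return [
        statistics.fmean([draw(rng) for _ in range(n)])
        for _ in range(num_samples)
    ]


# Assumed population: exponential with rate 1 (skewed, population mean 1).
means = simulate_sample_means(
    lambda rng: rng.expovariate(1.0), n=40, num_samples=5000
)

center = statistics.fmean(means)   # average of the sample means
spread = statistics.stdev(means)   # how much the sample means vary

# A roughly normal distribution has about 68% of its values
# within one standard deviation of its center.
within_one = sum(abs(m - center) <= spread for m in means) / len(means)

print(f"mean of sample means: {center:.3f}")
print(f"sd of sample means:   {spread:.3f}")
print(f"share within one sd:  {within_one:.3f}")
```

Even though individual exponential draws are strongly skewed, the sample means should center near the population mean of 1, with roughly 68% of them within one standard deviation of that center, as a bell-shaped distribution would suggest.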
For the CLT to apply, the following conditions are generally required:
Let’s say we draw a sample of size $n$ from a population that has a mean of $\mu$ and a standard deviation of $\sigma$.
The Central Limit Theorem describes the resulting distribution of all possible sample means ($\bar{X}$):
The mean of the sampling distribution is equal to the original population mean.
\[\mu_{\bar{X}} = \mu\]
The standard deviation of the sampling distribution is called the Standard Error (SE). It is the population standard deviation divided by the square root of the sample size.
\[\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}\]
This formula shows that as the sample size $n$ increases, the standard error $\sigma_{\bar{X}}$ decreases. This means our sample means will be clustered more tightly around the true population mean $\mu$.
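The effect of $n$ on the standard error can be checked with a few lines of Python. This is only an illustrative sketch: the function name `standard_error` and the value $\sigma = 10$ are assumptions chosen for the example, not values from the notes.

```python
import math


def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return sigma / math.sqrt(n)


sigma = 10.0  # hypothetical population standard deviation
for n in (4, 16, 64, 256):
    print(f"n = {n:3d}  SE = {standard_error(sigma, n)}")
# n =   4  SE = 5.0
# n =  16  SE = 2.5
# n =  64  SE = 1.25
# n = 256  SE = 0.625
```

Because $n$ sits under a square root, quadrupling the sample size only halves the standard error.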
As $n$ gets large, the distribution of $\bar{X}$ approaches a normal distribution:
\[\bar{X} \sim N\left(\mu_{\bar{X}}, \sigma_{\bar{X}}^2\right) \quad \text{which is} \quad \bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)\]
Because the sampling distribution of $\bar{X}$ is approximately normal for large $n$, we can use a z-score to find probabilities. The z-score formula is modified to use the parameters of the sampling distribution (its mean and standard error).
Z-Score for a Sample Mean:
\[Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}\]
Interpretation: This z-score tells us how many standard errors a particular sample mean ($\bar{X}$) is away from the population mean ($\mu$).
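The z-score formula translates directly into code. The sketch below uses Python's standard-library `statistics.NormalDist` for the standard normal CDF. The function names and the numbers in the usage example ($\mu = 100$, $\sigma = 20$, $n = 25$, $\bar{X} = 104$) are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def z_for_sample_mean(x_bar, mu, sigma, n):
    """Number of standard errors that x_bar lies from mu."""
    return (x_bar - mu) / (sigma / math.sqrt(n))


def prob_mean_at_most(x_bar, mu, sigma, n):
    """Approximate P(sample mean <= x_bar) using the CLT normal approximation."""
    z = z_for_sample_mean(x_bar, mu, sigma, n)
    return NormalDist().cdf(z)  # standard normal CDF evaluated at z


# Hypothetical values: SE = 20 / sqrt(25) = 4, so z = (104 - 100) / 4 = 1.0
z = z_for_sample_mean(104, mu=100, sigma=20, n=25)
p = prob_mean_at_most(104, mu=100, sigma=20, n=25)
print(f"z = {z:.2f}, P(X-bar <= 104) = {p:.4f}")  # z = 1.00, P(X-bar <= 104) = 0.8413
```

The probability is only as good as the normal approximation, so it is most trustworthy when $n$ is large enough for the CLT to apply.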
Problem: The average weight of a certain species of apple is $\mu = 150$ grams, with a standard deviation of $\sigma = 15$ grams. The distribution of weights is unknown and may be skewed.
If we take a random sample of $n = 36$ apples:
1. Describe the sampling distribution of the sample mean ($\bar{X}$).
2. What is the probability that the mean weight of the sample ($\bar{X}$) is 153 grams or less?
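One way to check both parts numerically is sketched below, using only the values given in the problem. It models the sampling distribution as a normal distribution with mean $\mu$ and standard deviation $\sigma / \sqrt{n}$, which relies on $n = 36$ being large enough for the CLT approximation to hold. The variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 150.0, 15.0, 36   # values given in the problem

# Part 1: the sampling distribution of the sample mean is
# approximately normal with mean mu and standard error sigma / sqrt(n).
se = sigma / math.sqrt(n)        # 15 / 6 = 2.5 grams
sampling_dist = NormalDist(mu=mu, sigma=se)

# Part 2: P(X-bar <= 153) under that normal approximation.
z = (153.0 - mu) / se            # (153 - 150) / 2.5 = 1.2
p = sampling_dist.cdf(153.0)

print(f"SE = {se} g, z = {z:.1f}, P(X-bar <= 153) = {p:.4f}")
# SE = 2.5 g, z = 1.2, P(X-bar <= 153) = 0.8849
```

Under these assumptions, a sample mean of 153 grams sits 1.2 standard errors above the population mean, and about 88.5% of samples of 36 apples would have a mean weight at or below it.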