Nate's Notes

Collection of notes for various classes I've taken.

October 6

Normal Approximation to the Binomial Distribution

Calculating probabilities for a Binomial Random Variable ($X \sim B(n, p)$) can be computationally intensive when the number of trials ($n$) is very large.

The Normal Approximation allows us to use the Normal distribution (which is continuous) to estimate probabilities for the Binomial distribution (which is discrete) when $n$ is large.

The Condition (Rule of Thumb)

To ensure the approximation is accurate, the binomial distribution should be reasonably symmetric (not too skewed). We can use the normal distribution if both of the following conditions are met:

\(np \ge 5\) and \(n(1-p) \ge 5\)

This checks that there are at least 5 expected “successes” ($np$) and 5 expected “failures” ($n(1-p)$).
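As a quick illustration, the rule of thumb can be written as a small check. This is only a sketch: the function name `normal_approx_ok` and the `threshold` parameter are made up for this example, and the default of 5 simply mirrors the rule stated above.

```python
def normal_approx_ok(n: int, p: float, threshold: float = 5.0) -> bool:
    """Return True when both expected counts reach the rule-of-thumb threshold."""
    expected_successes = n * p        # np
    expected_failures = n * (1 - p)   # n(1 - p)
    return expected_successes >= threshold and expected_failures >= threshold


print(normal_approx_ok(400, 0.5))  # 200 expected successes and 200 expected failures
print(normal_approx_ok(20, 0.1))   # only about 2 expected successes
```

The second call fails the check because a binomial with so few expected successes is noticeably skewed.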

Finding the Normal Parameters

If the condition is met, we can approximate $X \sim B(n, p)$ with a Normal random variable $Y \sim N(\mu, \sigma^2)$, where:

\(\mu = np\)

\(\sigma^2 = np(1-p)\), so \(\sigma = \sqrt{np(1-p)}\)

Such that \(Y \sim N(np, np(1-p))\)
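A minimal sketch of the parameter calculation, using only the standard library. The helper name `normal_params` is an assumption for illustration and does not come from any particular library.

```python
import math


def normal_params(n: int, p: float) -> tuple[float, float, float]:
    """Return (mean, variance, standard deviation) of the approximating Normal."""
    mu = n * p                  # mean: np
    variance = n * p * (1 - p)  # variance: np(1 - p)
    return mu, variance, math.sqrt(variance)


mu, variance, sigma = normal_params(400, 0.5)
print(mu, variance, sigma)
```

Note that $N(\mu, \sigma^2)$ is written with the variance, while the z-score formula uses the standard deviation, so it helps to keep both around.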

The Continuity Correction

This is the most critical step. We are using a continuous distribution (Normal) to model a discrete distribution (Binomial).

The continuity correction bridges this gap by representing the discrete integer $k$ as a continuous interval from $k - 0.5$ to $k + 0.5$. We add or subtract 0.5 from the discrete value(s) to include the “full bar” of the binomial histogram.

Correction Rules

Let $X$ be the discrete Binomial variable and $Y$ be the continuous Normal variable.

| Discrete (Binomial) | Continuous (Normal) with Correction |
| --- | --- |
| $P(X = k)$ | $P(k - 0.5 \le Y \le k + 0.5)$ |
| $P(X \le k)$ | $P(Y \le k + 0.5)$ |
| $P(X < k)$ | $P(Y \le k - 0.5)$ (Same as $P(X \le k-1)$) |
| $P(X \ge k)$ | $P(Y \ge k - 0.5)$ |
| $P(X > k)$ | $P(Y \ge k + 0.5)$ (Same as $P(X \ge k+1)$) |
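The correction rules above can also be expressed as a lookup from the discrete relation to an interval for $Y$. This is a sketch only: the function name `corrected_bounds` and the choice of relation strings are assumptions made for this example, and unbounded sides are represented with infinity.

```python
import math


def corrected_bounds(relation: str, k: int) -> tuple[float, float]:
    """Map a discrete event on X to (lower, upper) bounds on the continuous Y."""
    rules = {
        "==": (k - 0.5, k + 0.5),    # P(X = k)
        "<=": (-math.inf, k + 0.5),  # P(X <= k)
        "<": (-math.inf, k - 0.5),   # P(X < k), same as P(X <= k - 1)
        ">=": (k - 0.5, math.inf),   # P(X >= k)
        ">": (k + 0.5, math.inf),    # P(X > k), same as P(X >= k + 1)
    }
    return rules[relation]


print(corrected_bounds("==", 210))
print(corrected_bounds("<", 210))
```

In every case the boundary moves by 0.5 so that the bars of the integers included in the event are covered in full and the excluded ones are left out.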

Mnemonic:

Summary of Steps

  1. Identify $n$ and $p$ from the binomial problem.
  2. Check Condition: Verify that $np \ge 5$ and $n(1-p) \ge 5$.
  3. Find Parameters: Calculate the mean $\mu = np$ and standard deviation $\sigma = \sqrt{np(1-p)}$.
  4. Apply Continuity Correction: Adjust your discrete value(s) $k$ to the continuous interval using the 0.5 correction.
  5. Calculate Z-Score(s): Use the adjusted value(s), $\mu$, and $\sigma$: \(Z = \frac{Y - \mu}{\sigma}\)
  6. Find Probability: Use a standard Z-table to find the probability associated with your calculated z-score(s).
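The six steps can be strung together in a short function. This is a sketch under a few assumptions: the names `phi` and `approx_binomial_prob` are invented for this example, the event is given as inclusive integer bounds on $X$ (with `None` meaning "no bound"), and the standard normal CDF is computed from `math.erf` in place of a printed Z-table.

```python
import math


def phi(z: float) -> float:
    """Standard normal CDF, P(Z <= z), computed via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def approx_binomial_prob(n, p, lower=None, upper=None):
    """Approximate P(lower <= X <= upper) for X ~ B(n, p); bounds are inclusive integers."""
    # Step 2: check the rule of thumb.
    if n * p < 5 or n * (1 - p) < 5:
        raise ValueError("np and n(1-p) should both be at least 5")
    # Step 3: parameters of the approximating Normal.
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    # Steps 4-6: continuity correction, z-scores, and CDF lookups.
    low_cdf = 0.0 if lower is None else phi((lower - 0.5 - mu) / sigma)
    high_cdf = 1.0 if upper is None else phi((upper + 0.5 - mu) / sigma)
    return high_cdf - low_cdf


print(approx_binomial_prob(400, 0.5, lower=210, upper=210))  # P(X = 210)
print(approx_binomial_prob(400, 0.5, upper=200))             # P(X <= 200)
```

Strict inequalities are handled by shifting the integer bound first, for example $P(X < k)$ is passed as `upper=k - 1`.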

Example

Problem: A fair coin is tossed 400 times. What is the probability of getting exactly 210 heads?

  1. Identify $n$ and $p$:
    • $n = 400$
    • $p = 0.5$ (fair coin)
  2. Check Condition:
    • $np = 400 \times 0.5 = 200$. This is $\ge 5$.
    • $n(1-p) = 400 \times (1 - 0.5) = 200$. This is $\ge 5$.
    • Condition is met. We can use the approximation.
  3. Find Parameters:
    • $\mu = np = 200$
    • $\sigma = \sqrt{400 \times 0.5 \times 0.5} = \sqrt{100} = 10$
    • So, $Y \sim N(200, 10^2)$.
  4. Apply Continuity Correction:
    • We want $P(X = 210)$.
    • Using the correction: $P(210 - 0.5 \le Y \le 210 + 0.5) = P(209.5 \le Y \le 210.5)$.
  5. Calculate Z-Scores:
    • For $Y = 209.5$: $Z_1 = \frac{209.5 - 200}{10} = \frac{9.5}{10} = 0.95$
    • For $Y = 210.5$: $Z_2 = \frac{210.5 - 200}{10} = \frac{10.5}{10} = 1.05$
  6. Find Probability:
    • We need $P(0.95 \le Z \le 1.05) = P(Z \le 1.05) - P(Z \le 0.95)$.
    • From Z-table: $0.8531 - 0.8289 = 0.0242$.
    • So $P(X = 210) \approx 0.0242$, or about a 2.4% chance.
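As a sanity check on the example, the approximation can be compared with the exact binomial probability. The sketch below uses only the standard library (`math.comb` for the binomial coefficient and `math.erf` for the normal CDF); the variable names are chosen for this illustration.

```python
import math

n, p, k = 400, 0.5, 210

# Exact binomial probability: C(n, k) * p^k * (1 - p)^(n - k)
exact = math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Normal approximation with continuity correction
mu = n * p
sigma = math.sqrt(n * p * (1 - p))


def phi(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


approx = phi((k + 0.5 - mu) / sigma) - phi((k - 0.5 - mu) / sigma)

print(f"exact  = {exact:.5f}")
print(f"approx = {approx:.5f}")
```

With $n = 400$ the two values should agree closely, which is what the rule of thumb leads us to expect.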