Sampling Distributions and the Central Limit Theorem
How sample statistics behave across many samples
Suppose you want to know the average height of all adults in your country. You cannot measure everyone, so you take a sample of 100 people and calculate their average height. Now here is a question that might not have occurred to you: if you took a different sample of 100 people, would you get exactly the same average?
Almost certainly not. Different samples give different results. This variability is not a flaw in your method. It is an unavoidable feature of sampling. But if your sample average keeps changing from sample to sample, how can you trust any single estimate?
This is where one of the most remarkable theorems in all of mathematics comes to the rescue: the Central Limit Theorem. It tells us exactly how sample averages behave across many samples, and it explains why statistical inference works at all. Understanding this theorem is like getting a backstage pass to how polling, medical research, quality control, and countless other fields draw reliable conclusions from limited data.
Core Concepts
Statistic vs. Parameter: A Quick Review
Before diving into sampling distributions, let us make sure we are clear on two terms that sound similar but mean very different things.
A parameter is a number that describes some characteristic of an entire population. Parameters are usually unknown because we rarely have access to the whole population. We use Greek letters for parameters:
- $\mu$ (mu) = population mean
- $\sigma$ (sigma) = population standard deviation
- $p$ = population proportion
A statistic is a number calculated from a sample. Statistics are known because we calculate them from our actual data. We use Roman letters or symbols with “hats” for statistics:
- $\bar{x}$ (x-bar) = sample mean
- $s$ = sample standard deviation
- $\hat{p}$ (p-hat) = sample proportion
The goal of statistical inference is to use statistics (what we know from our sample) to make conclusions about parameters (what we want to know about the population).
Sampling Variability: Different Samples Give Different Results
Imagine a population of 1,000 college students. Their average sleep per night is $\mu = 6.8$ hours with a standard deviation of $\sigma = 1.2$ hours. You do not know these population values. Instead, you randomly select 50 students and find their sample mean is $\bar{x} = 6.5$ hours.
If your classmate independently selects a different random sample of 50 students, she might get $\bar{x} = 7.1$ hours. Another sample might yield $\bar{x} = 6.7$ hours. This variation from sample to sample is called sampling variability.
Sampling variability is not a mistake. It is a natural consequence of the fact that any sample includes only some members of the population. The question is: how much variation should we expect? Can we predict the pattern of this variation?
The Sampling Distribution of a Statistic
Here is a thought experiment. Imagine you could take every possible sample of size $n$ from a population, calculate the statistic (say, the sample mean) for each sample, and then make a histogram of all those sample means.
This histogram represents the sampling distribution of the statistic. A sampling distribution shows:
- All possible values the statistic could take
- How likely each value is (how often it would occur across all possible samples)
The sampling distribution is a probability distribution for the statistic. It tells you how the statistic behaves across repeated sampling.
Important: The sampling distribution is not about the data in any single sample. It is about how the statistic varies if you were to take many, many samples.
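The thought experiment can be run as a simulation. The sketch below (an illustrative setup using Python's standard library, with a synthetic population modeled loosely on the sleep example above) draws many samples of size 50 and shows that the sample means cluster around the population mean:

```python
import random
import statistics

random.seed(42)

# Hypothetical population of 1,000 values (roughly: hours of sleep).
population = [random.gauss(6.8, 1.2) for _ in range(1000)]
mu = statistics.mean(population)

# Draw many samples of size n and record each sample mean.
n = 50
sample_means = [
    statistics.mean(random.sample(population, n)) for _ in range(3000)
]

# Individual sample means vary, but their average sits close to mu.
print(round(mu, 2))
print(round(statistics.mean(sample_means), 2))
```

A histogram of `sample_means` would be a picture of the sampling distribution; here we just confirm numerically that it is centered on the population mean.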
Sampling Distribution of the Sample Mean $\bar{x}$
The most important sampling distribution is that of the sample mean. Let us build intuition for what it looks like.
Suppose you take samples of size $n$ from a population with mean $\mu$ and standard deviation $\sigma$. The sampling distribution of $\bar{x}$ has these properties:
1. Center (Mean of the sampling distribution):
The mean of all possible sample means equals the population mean: $$\mu_{\bar{x}} = \mu$$
This is reassuring. It says that sample means are “centered” on the true population mean. Some samples will overestimate, some will underestimate, but on average, sample means hit the target. We say that $\bar{x}$ is an unbiased estimator of $\mu$.
2. Spread (Standard deviation of the sampling distribution):
The standard deviation of all possible sample means is: $$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
This quantity $\sigma_{\bar{x}}$ is called the standard error of the mean. It measures how much sample means typically vary from the population mean.
Notice the $\sqrt{n}$ in the denominator. As sample size increases, the standard error decreases. Larger samples give more consistent results. If you quadruple your sample size, you cut the standard error in half.
3. Shape:
If the population itself is Normally distributed, then the sampling distribution of $\bar{x}$ is also exactly Normal, regardless of sample size.
But what if the population is not Normal? This is where the Central Limit Theorem enters the picture.
The Central Limit Theorem (CLT)
The Central Limit Theorem is one of the most important results in statistics. It says:
For a random sample of size $n$ from any population with mean $\mu$ and finite standard deviation $\sigma$, the sampling distribution of $\bar{x}$ becomes approximately Normal as $n$ increases, regardless of the shape of the original population.
In other words:
- Start with any population distribution (skewed, uniform, bimodal, whatever)
- Take samples of size $n$ and calculate $\bar{x}$ for each
- When $n$ is large enough, these sample means follow an approximately Normal distribution
The approximation gets better as $n$ increases. How large is “large enough”? That depends on how non-Normal the original population is:
- If the population is already approximately Normal, even small samples work fine
- If the population is moderately skewed, $n \geq 30$ usually suffices
- If the population is heavily skewed with extreme outliers, you may need $n \geq 50$ or more
The Central Limit Theorem formula:
For large $n$, the sampling distribution of $\bar{x}$ is approximately: $$\bar{x} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$$
This means $\bar{x}$ follows a Normal distribution with mean $\mu$ and standard deviation $\frac{\sigma}{\sqrt{n}}$.
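The convergence toward Normality can be checked empirically. The sketch below (an illustrative simulation, not one of the text's worked examples) draws sample means from a heavily right-skewed exponential population and estimates their skewness, which shrinks toward zero as $n$ grows:

```python
import random
import statistics

random.seed(0)

# Heavily right-skewed population: exponential with mean 1.
def sample_mean(n):
    return statistics.mean(random.expovariate(1.0) for _ in range(n))

for n in (2, 30, 200):
    means = [sample_mean(n) for _ in range(2000)]
    # Skewness estimate (third standardized moment); near 0 means symmetric.
    m = statistics.mean(means)
    s = statistics.stdev(means)
    skew = statistics.mean(((x - m) / s) ** 3 for x in means)
    print(n, round(skew, 2))
```

For an exponential population the skewness of $\bar{x}$ falls off roughly like $2/\sqrt{n}$, so the printed values drop from noticeably skewed at $n = 2$ to nearly symmetric at $n = 200$.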
Why the CLT Is Remarkable
Stop and appreciate how extraordinary this theorem is. The original population could have any shape at all. It could be highly skewed. It could have multiple peaks. It could be nothing like a bell curve. Yet when you average together $n$ observations, those averages form a beautifully symmetric, bell-shaped Normal distribution.
This explains why the Normal distribution appears so often in practice. Many measurements are themselves averages or sums of smaller effects. Your height is influenced by many genes and environmental factors. The time to complete a complex task is the sum of many sub-task times. The Central Limit Theorem tells us why these composite measurements tend toward Normality.
Standard Error: Measuring Sampling Variability
The standard error of a statistic is the standard deviation of its sampling distribution. It measures how much the statistic typically varies from sample to sample.
For the sample mean: $$\text{Standard Error of } \bar{x} = \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
In practice, we usually do not know $\sigma$ (the population standard deviation), so we estimate it using the sample standard deviation $s$: $$\text{Estimated Standard Error of } \bar{x} = \frac{s}{\sqrt{n}}$$
Key insight about standard error and sample size:
| Sample Size $n$ | Standard Error = $\frac{\sigma}{\sqrt{n}}$ |
|---|---|
| 25 | $\frac{\sigma}{5} = 0.20\sigma$ |
| 100 | $\frac{\sigma}{10} = 0.10\sigma$ |
| 400 | $\frac{\sigma}{20} = 0.05\sigma$ |
| 2,500 | $\frac{\sigma}{50} = 0.02\sigma$ |
To cut the standard error in half, you must quadruple the sample size. This is why larger samples are more reliable, but the improvement shows diminishing returns.
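The table's pattern fits in a tiny helper (a trivial sketch; `sigma = 1.0` is arbitrary, since the ratios hold for any $\sigma$):

```python
import math

def standard_error(sigma, n):
    # Standard error of the sample mean: sigma / sqrt(n).
    return sigma / math.sqrt(n)

# Quadrupling n halves the standard error.
sigma = 1.0  # arbitrary choice for illustration
for n in (25, 100, 400, 2500):
    print(n, round(standard_error(sigma, n), 3))
```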
Why the CLT Matters for Inference
The Central Limit Theorem is the foundation of most statistical inference. Here is why it matters so much:
1. We can use Normal distribution techniques
Even when the population is not Normal, we can use Normal probabilities to analyze sample means (for large samples). This makes calculations tractable.
2. We can construct confidence intervals
A 95% confidence interval for $\mu$ is approximately: $$\bar{x} \pm 1.96 \times \frac{\sigma}{\sqrt{n}}$$
This formula relies on the fact that $\bar{x}$ is approximately Normal, which the CLT guarantees.
3. We can perform hypothesis tests
Tests about population means use the fact that $\bar{x}$ is approximately Normal to calculate p-values and make decisions.
4. It works regardless of population shape
This is crucial for real-world applications. We rarely know the exact shape of the population we are sampling from. The CLT tells us we do not need to know. With a large enough sample, the sampling distribution of $\bar{x}$ is approximately Normal anyway.
Sampling Distribution of a Proportion $\hat{p}$
The Central Limit Theorem also applies to sample proportions. If you are counting successes in a sample (like the proportion of voters who support a candidate), the sample proportion $\hat{p}$ also has a sampling distribution.
For a sample of size $n$ from a population with true proportion $p$:
Mean of the sampling distribution: $$\mu_{\hat{p}} = p$$
Standard error of the sampling distribution: $$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$$
Shape (by the CLT):
For large samples, $\hat{p}$ is approximately Normal: $$\hat{p} \sim N\left(p, \sqrt{\frac{p(1-p)}{n}}\right)$$
Conditions for using the Normal approximation:
The sample size must be large enough that:
- $np \geq 10$ (expect at least 10 successes)
- $n(1-p) \geq 10$ (expect at least 10 failures)
These conditions ensure the sampling distribution is reasonably Normal.
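Both formulas can be verified by simulation. The sketch below (hypothetical values $p = 0.60$, $n = 400$, chosen to match the polling example later in this section) checks the Normal-approximation conditions, then compares the simulated mean and spread of $\hat{p}$ to the theory:

```python
import math
import random
import statistics

random.seed(1)

p, n = 0.60, 400  # assumed true proportion and sample size

# Check the Normal-approximation conditions.
assert n * p >= 10 and n * (1 - p) >= 10

# Simulate many polls: each p-hat is the fraction of successes in n trials.
def p_hat():
    return sum(random.random() < p for _ in range(n)) / n

phats = [p_hat() for _ in range(4000)]

theory_se = math.sqrt(p * (1 - p) / n)
print(round(statistics.mean(phats), 3))   # close to p
print(round(statistics.stdev(phats), 4))  # close to theory_se
print(round(theory_se, 4))
```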
Notation and Terminology
| Term | Meaning | Example |
|---|---|---|
| Sampling distribution | Distribution of a statistic over many samples | Histogram of $\bar{x}$ from all possible samples |
| $\bar{x}$ | Sample mean | Average of data in one sample |
| $\mu_{\bar{x}}$ | Mean of sampling distribution of $\bar{x}$ | Equals $\mu$ (population mean) |
| $\sigma_{\bar{x}}$ | Standard deviation of $\bar{x}$ | $\frac{\sigma}{\sqrt{n}}$ |
| Standard error | Standard deviation of a statistic's sampling distribution | $\frac{\sigma}{\sqrt{n}}$, estimated by $\frac{s}{\sqrt{n}}$, for the sample mean |
| Central Limit Theorem | $\bar{x}$ is approximately Normal for large $n$ | Works regardless of population shape |
| $\hat{p}$ | Sample proportion | Fraction of successes in sample |
| $\sigma_{\hat{p}}$ | Standard deviation of $\hat{p}$ | $\sqrt{\frac{p(1-p)}{n}}$ |
Examples
For each scenario, identify the parameter and the statistic.
a) A researcher wants to know the average commute time for all workers in a city. She surveys 200 workers and finds their average commute is 28 minutes.
b) A quality control inspector wants to know what proportion of all items produced by a factory are defective. He tests 500 items and finds that 23 are defective.
c) A poll aims to determine the percentage of all registered voters who support a ballot measure. The poll of 1,200 voters finds that 54% support it.
Solution:
a) Parameter: $\mu$ = the true average commute time for all workers in the city (unknown)
Statistic: $\bar{x} = 28$ minutes = the average commute time in the sample of 200 workers
b) Parameter: $p$ = the true proportion of all factory items that are defective (unknown)
Statistic: $\hat{p} = \frac{23}{500} = 0.046$ = the proportion defective in the sample
c) Parameter: $p$ = the true proportion of all registered voters who support the measure (unknown)
Statistic: $\hat{p} = 0.54$ = the proportion who support the measure in the sample of 1,200
Notice the pattern: parameters describe the whole population (and are usually unknown), while statistics describe the sample (and are calculated from your data).
A population has a standard deviation of $\sigma = 20$. Calculate the standard error of the sample mean for each sample size.
a) $n = 25$ b) $n = 100$ c) $n = 400$
Solution:
The standard error of the mean is $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$.
a) For $n = 25$: $$\sigma_{\bar{x}} = \frac{20}{\sqrt{25}} = \frac{20}{5} = \boxed{4}$$
b) For $n = 100$: $$\sigma_{\bar{x}} = \frac{20}{\sqrt{100}} = \frac{20}{10} = \boxed{2}$$
c) For $n = 400$: $$\sigma_{\bar{x}} = \frac{20}{\sqrt{400}} = \frac{20}{20} = \boxed{1}$$
Observations:
- When we quadrupled the sample size from 25 to 100, the standard error was cut in half (from 4 to 2)
- When we quadrupled again from 100 to 400, the standard error was cut in half again (from 2 to 1)
- Larger samples produce more consistent estimates, but you need to quadruple the sample size to halve the standard error
The amount of soda in bottles filled by a machine is Normally distributed with mean $\mu = 12.0$ ounces and standard deviation $\sigma = 0.15$ ounces. A quality control inspector randomly selects 36 bottles.
a) Describe the sampling distribution of the sample mean $\bar{x}$. b) What is the probability that the sample mean is less than 11.95 ounces? c) What is the probability that the sample mean is between 11.96 and 12.04 ounces?
Solution:
a) Sampling distribution of $\bar{x}$:
Since the population is Normal, the sampling distribution of $\bar{x}$ is exactly Normal with:
Mean: $\mu_{\bar{x}} = \mu = 12.0$ ounces
Standard error: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{0.15}{\sqrt{36}} = \frac{0.15}{6} = 0.025$ ounces
Therefore: $\bar{x} \sim N(12.0, 0.025)$
b) Probability that $\bar{x} < 11.95$:
First, convert to a Z-score: $$z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{11.95 - 12.0}{0.025} = \frac{-0.05}{0.025} = -2.0$$
Using a Z-table: $P(Z < -2.0) = 0.0228$
$$P(\bar{x} < 11.95) = \boxed{0.0228 \text{ or } 2.28\%}$$
There is only about a 2.3% chance that a sample of 36 bottles would have an average less than 11.95 ounces, assuming the machine is working correctly.
c) Probability that $11.96 < \bar{x} < 12.04$:
Convert both values to Z-scores: $$z_1 = \frac{11.96 - 12.0}{0.025} = \frac{-0.04}{0.025} = -1.6$$ $$z_2 = \frac{12.04 - 12.0}{0.025} = \frac{0.04}{0.025} = 1.6$$
Using a Z-table:
- $P(Z < 1.6) = 0.9452$
- $P(Z < -1.6) = 0.0548$
$$P(11.96 < \bar{x} < 12.04) = 0.9452 - 0.0548 = \boxed{0.8904 \text{ or } 89.04\%}$$
About 89% of all samples of 36 bottles will have a mean between 11.96 and 12.04 ounces.
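These probabilities can be reproduced without a Z-table. The sketch below builds the standard Normal CDF from `math.erf` (an implementation choice of this illustration, not a method the text uses) and recomputes parts (b) and (c):

```python
import math

def phi(z):
    # Standard Normal CDF via the error function.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu, sigma, n = 12.0, 0.15, 36
se = sigma / math.sqrt(n)  # 0.025 ounces

# b) P(x-bar < 11.95)
p_b = phi((11.95 - mu) / se)

# c) P(11.96 < x-bar < 12.04)
p_c = phi((12.04 - mu) / se) - phi((11.96 - mu) / se)

print(round(p_b, 4))  # matches the Z-table answer, about 0.0228
print(round(p_c, 4))  # about 0.8904
```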
For each situation, determine whether the Central Limit Theorem allows us to assume the sampling distribution of $\bar{x}$ is approximately Normal. Explain your reasoning.
a) A population is heavily right-skewed. Sample size is $n = 10$.
b) A population is heavily right-skewed. Sample size is $n = 50$.
c) A population is approximately Normal. Sample size is $n = 8$.
d) A population has a uniform distribution (flat). Sample size is $n = 35$.
Solution:
a) Not safe to assume Normality
With a heavily skewed population, we need a larger sample size for the CLT to apply. A sample of $n = 10$ is too small. The sampling distribution will still reflect some of the skewness in the original population.
b) Yes, approximately Normal
With $n = 50$, the CLT applies even for heavily skewed populations. This sample size is large enough that the sampling distribution of $\bar{x}$ will be approximately Normal.
c) Yes, exactly Normal
When the population itself is Normal, the sampling distribution of $\bar{x}$ is exactly Normal for any sample size. The CLT is not even needed here. Even with $n = 8$, the sampling distribution is Normal.
d) Yes, approximately Normal
A uniform distribution is not Normal, but it is symmetric without extreme skewness. For such populations, $n \geq 30$ is typically sufficient. With $n = 35$, the sampling distribution of $\bar{x}$ will be approximately Normal.
Summary of guidelines:
- Population is Normal: Any sample size works
- Population is symmetric or only mildly skewed: $n \geq 15$ is usually sufficient
- Population is moderately skewed: $n \geq 30$ is usually sufficient
- Population is heavily skewed: $n \geq 50$ or more may be needed
A manufacturing company measures the breaking strength of cables. The population of breaking strengths has mean $\mu = 2000$ pounds and standard deviation $\sigma = 100$ pounds.
Two inspectors test cables:
- Inspector A tests samples of 25 cables
- Inspector B tests samples of 100 cables
a) Find the standard error of $\bar{x}$ for each inspector. b) For each inspector, find the probability that the sample mean differs from the population mean by more than 15 pounds. c) Explain why the company might prefer Inspector B’s larger samples despite the extra cost.
Solution:
a) Standard errors:
Inspector A ($n = 25$): $$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{100}{\sqrt{25}} = \frac{100}{5} = 20 \text{ pounds}$$
Inspector B ($n = 100$): $$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{100}{\sqrt{100}} = \frac{100}{10} = 10 \text{ pounds}$$
b) Probability that $|\bar{x} - \mu| > 15$:
This is asking for $P(\bar{x} < 1985 \text{ or } \bar{x} > 2015)$.
For Inspector A ($\sigma_{\bar{x}} = 20$):
Convert to Z-scores: $$z = \frac{15}{20} = 0.75$$
The probability of being more than 0.75 standard errors from the mean: $$P(|Z| > 0.75) = 2 \times P(Z > 0.75) = 2 \times (1 - 0.7734) = 2 \times 0.2266 = \boxed{0.4532}$$
There is about a 45% chance Inspector A’s sample mean is more than 15 pounds away from the true mean.
For Inspector B ($\sigma_{\bar{x}} = 10$):
Convert to Z-scores: $$z = \frac{15}{10} = 1.5$$
The probability of being more than 1.5 standard errors from the mean: $$P(|Z| > 1.5) = 2 \times P(Z > 1.5) = 2 \times (1 - 0.9332) = 2 \times 0.0668 = \boxed{0.1336}$$
There is only about a 13% chance Inspector B’s sample mean is more than 15 pounds away from the true mean.
c) Why larger samples are better:
Inspector B’s larger samples provide much more reliable estimates:
- Inspector A has a 45% chance of being off by more than 15 pounds
- Inspector B has only a 13% chance of being off by more than 15 pounds
For quality control decisions, this difference matters enormously. If the company uses sample means to decide whether a batch meets specifications, Inspector A’s smaller samples could lead to rejecting good batches or accepting bad ones almost half the time. Inspector B’s larger samples give much more confidence in the results.
The cost of testing more cables must be weighed against the cost of making wrong decisions based on unreliable estimates.
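The same two-tailed calculation generalizes to any sample size. The sketch below (an illustrative helper built on `math.erf`, not part of the text's method) reproduces both inspectors' probabilities; small differences in the last digit versus the Z-table answers come from table rounding:

```python
import math

def phi(z):
    # Standard Normal CDF via the error function.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

sigma, tolerance = 100, 15  # population SD and allowed deviation (pounds)

def p_off_by_more(n):
    # P(|x-bar - mu| > tolerance) for samples of size n.
    se = sigma / math.sqrt(n)
    return 2 * (1 - phi(tolerance / se))

print(round(p_off_by_more(25), 3))   # Inspector A: about 0.453
print(round(p_off_by_more(100), 3))  # Inspector B: about 0.134
```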
A political campaign believes that 60% of voters in a district support their candidate (so $p = 0.60$). They plan to conduct a poll.
a) If they survey $n = 400$ voters, describe the sampling distribution of $\hat{p}$. b) What is the probability that the sample proportion is less than 0.55? c) How large a sample would they need for the standard error to be at most 0.02?
Solution:
a) Sampling distribution of $\hat{p}$:
First, check conditions for Normal approximation:
- $np = 400 \times 0.60 = 240 \geq 10$ ✓
- $n(1-p) = 400 \times 0.40 = 160 \geq 10$ ✓
The conditions are satisfied, so $\hat{p}$ is approximately Normal.
Mean: $\mu_{\hat{p}} = p = 0.60$
Standard error: $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.60 \times 0.40}{400}} = \sqrt{\frac{0.24}{400}} = \sqrt{0.0006} = 0.0245$
Therefore: $\hat{p} \sim N(0.60, 0.0245)$ approximately
b) Probability that $\hat{p} < 0.55$:
Convert to a Z-score: $$z = \frac{\hat{p} - p}{\sigma_{\hat{p}}} = \frac{0.55 - 0.60}{0.0245} = \frac{-0.05}{0.0245} = -2.04$$
Using a Z-table: $P(Z < -2.04) \approx 0.0207$
$$P(\hat{p} < 0.55) = \boxed{0.0207 \text{ or about } 2.1\%}$$
If the true support is 60%, there is only about a 2% chance a poll of 400 voters would show less than 55% support.
c) Sample size for standard error $\leq 0.02$:
We need $\sigma_{\hat{p}} \leq 0.02$:
$$\sqrt{\frac{p(1-p)}{n}} \leq 0.02$$
$$\frac{0.60 \times 0.40}{n} \leq 0.0004$$
$$\frac{0.24}{n} \leq 0.0004$$
$$n \geq \frac{0.24}{0.0004} = 600$$
They would need to survey at least $\boxed{600}$ voters to achieve a standard error of 0.02 or less.
This smaller standard error means their estimate will typically be within about 4 percentage points of the true value (using the rough rule that estimates are usually within 2 standard errors of the truth).
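Part (c) generalizes to a small helper (a sketch; the `round` call simply guards against floating-point noise before rounding up to a whole number of voters):

```python
import math

p = 0.60  # assumed true proportion

def n_for_se(target_se):
    # Solve sqrt(p(1-p)/n) <= target_se for n, rounding up.
    return math.ceil(round(p * (1 - p) / target_se ** 2, 9))

print(n_for_se(0.02))  # 600, matching the worked answer
```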
Key Properties and Rules
Properties of the Sampling Distribution of $\bar{x}$
| Property | Formula | Notes |
|---|---|---|
| Mean | $\mu_{\bar{x}} = \mu$ | Sample means are centered on population mean |
| Standard deviation (Standard error) | $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$ | Decreases as sample size increases |
| Shape (population Normal) | Exactly Normal | For any sample size |
| Shape (population not Normal) | Approximately Normal | For large $n$ (CLT) |
The Central Limit Theorem Summary
Statement: For random samples of size $n$ from any population with mean $\mu$ and standard deviation $\sigma$, the sampling distribution of $\bar{x}$ approaches $N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$ as $n$ increases.
When is $n$ “large enough”?
| Population Shape | Minimum $n$ for CLT |
|---|---|
| Normal | Any $n$ (exact, not approximate) |
| Symmetric, no outliers | $n \geq 15$ |
| Moderately skewed | $n \geq 30$ |
| Heavily skewed | $n \geq 50$ or more |
Properties of the Sampling Distribution of $\hat{p}$
| Property | Formula | Conditions |
|---|---|---|
| Mean | $\mu_{\hat{p}} = p$ | Always |
| Standard error | $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$ | Always |
| Shape | Approximately Normal | When $np \geq 10$ and $n(1-p) \geq 10$ |
Standard Error Formulas
| Statistic | Standard Error | Estimated Standard Error |
|---|---|---|
| Sample mean $\bar{x}$ | $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$ | $SE = \frac{s}{\sqrt{n}}$ |
| Sample proportion $\hat{p}$ | $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$ | $SE = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$ |
Effect of Sample Size on Standard Error
To reduce standard error by a factor of $k$, you must multiply sample size by $k^2$.
| Goal | Required Change in $n$ |
|---|---|
| Cut SE in half | Quadruple $n$ |
| Cut SE to one-third | Multiply $n$ by 9 |
| Cut SE to one-tenth | Multiply $n$ by 100 |
Real-World Applications
Why Larger Samples Are More Reliable
The Central Limit Theorem explains quantitatively why statisticians prefer larger samples. With a larger $n$, the standard error $\frac{\sigma}{\sqrt{n}}$ is smaller, meaning sample means cluster more tightly around the true population mean.
Consider a manufacturer testing light bulb lifetimes. If the true average lifetime is 1,000 hours with a standard deviation of 100 hours:
- With $n = 25$: Standard error = 20 hours, so sample means typically range from about 960 to 1040 hours
- With $n = 100$: Standard error = 10 hours, so sample means typically range from about 980 to 1020 hours
- With $n = 400$: Standard error = 5 hours, so sample means typically range from about 990 to 1010 hours
The larger sample gives you a much sharper estimate of the true average.
Political Polling and Margin of Error
When you see a poll result like “54% of voters support the measure, margin of error plus or minus 3 percentage points,” you are seeing the Central Limit Theorem in action.
The margin of error is approximately $2 \times \sigma_{\hat{p}} \approx 2 \times \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$.
For a poll of 1,000 voters where $\hat{p} = 0.50$: $$\text{Margin of error} \approx 2 \times \sqrt{\frac{0.50 \times 0.50}{1000}} = 2 \times 0.0158 \approx 0.032 \text{ or } 3.2\%$$
This explains why most national polls survey about 1,000-1,500 people. This gives a margin of error around 3%, which is precise enough to be useful while being affordable to conduct. Surveying 10,000 people would reduce the margin of error to about 1%, but the extra cost usually is not worth it.
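The margin-of-error arithmetic is easy to package (a sketch using the rough two-standard-error rule described above), and running it for a few poll sizes shows the diminishing returns:

```python
import math

def margin_of_error(p_hat, n):
    # Rough 95% margin: 2 standard errors of p-hat.
    return 2 * math.sqrt(p_hat * (1 - p_hat) / n)

for n in (1000, 1500, 10000):
    print(n, round(margin_of_error(0.50, n), 3))
```

Going from 1,000 to 10,000 respondents shrinks the margin from about 3.2% to 1%, a tenfold cost increase for a roughly threefold gain in precision.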
Quality Control Sampling
Manufacturing companies cannot test every item they produce. Instead, they use sampling. A quality control manager might test 50 items per shift and compare the sample mean to the target specification.
Thanks to the CLT, the manager knows:
- If the process is on target, sample means will follow a predictable Normal distribution
- About 95% of sample means should fall within 2 standard errors of the target
- A sample mean outside this range suggests the process has drifted
This is the foundation of statistical process control. Control charts, which are used in factories worldwide, are direct applications of sampling distribution theory.
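The 2-standard-error rule can be expressed as a minimal in-control check (an illustrative sketch, not an actual process-control library; the light-bulb numbers below are borrowed from the earlier example):

```python
import math

def in_control(xbar, target, sigma, n, k=2.0):
    # Flag a sample mean more than k standard errors from the target.
    se = sigma / math.sqrt(n)
    return abs(xbar - target) <= k * se

# Example: target 1,000-hour bulbs, sigma = 100, samples of 50.
# 2 SE is about 28.3 hours here.
print(in_control(1005, 1000, 100, 50))  # within 2 SE -> True
print(in_control(1040, 1000, 100, 50))  # outside 2 SE -> False
```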
Scientific Reproducibility
Scientists often worry about whether their findings will replicate. The sampling distribution concept helps explain this.
If you conduct an experiment with $n = 30$ subjects and find a certain effect, another researcher repeating your study (with a different sample of 30 subjects) will likely get a somewhat different result due to sampling variability. The smaller the sample, the more variability, and the less likely results are to replicate closely.
This is one reason scientific journals increasingly require larger sample sizes. Larger samples have smaller standard errors, meaning findings are more likely to replicate across studies.
Self-Test Problems
Problem 1: The average height of all students at a university is $\mu = 68$ inches with standard deviation $\sigma = 4$ inches. For random samples of 64 students, find: a) The mean of the sampling distribution of $\bar{x}$ b) The standard error of $\bar{x}$
Solution:
a) Mean of the sampling distribution: $$\mu_{\bar{x}} = \mu = \boxed{68 \text{ inches}}$$
The mean of the sampling distribution equals the population mean.
b) Standard error: $$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{4}{\sqrt{64}} = \frac{4}{8} = \boxed{0.5 \text{ inches}}$$
Sample means from samples of 64 students will typically vary by about 0.5 inches from the true mean of 68 inches.
Problem 2: A company’s customer service calls have an average length of $\mu = 8$ minutes with $\sigma = 3$ minutes. The distribution of call lengths is right-skewed. a) For samples of $n = 9$ calls, can we assume $\bar{x}$ is approximately Normal? Why or why not? b) For samples of $n = 100$ calls, can we assume $\bar{x}$ is approximately Normal? Why or why not?
Solution:
a) For $n = 9$:
No, we should not assume $\bar{x}$ is approximately Normal. The population is right-skewed, and a sample size of 9 is too small for the Central Limit Theorem to overcome the skewness. We would need a larger sample.
b) For $n = 100$:
Yes, we can assume $\bar{x}$ is approximately Normal. With $n = 100$, the Central Limit Theorem applies even for skewed populations. The sampling distribution of $\bar{x}$ will be approximately Normal regardless of the original population’s shape.
Problem 3: The time to complete a standardized test is Normally distributed with mean $\mu = 45$ minutes and standard deviation $\sigma = 8$ minutes. For a random sample of 16 test-takers: a) What is the probability that the sample mean completion time is less than 42 minutes? b) What is the probability that the sample mean is between 43 and 47 minutes?
Solution:
First, find the standard error: $$\sigma_{\bar{x}} = \frac{8}{\sqrt{16}} = \frac{8}{4} = 2 \text{ minutes}$$
Since the population is Normal, $\bar{x} \sim N(45, 2)$.
a) Probability that $\bar{x} < 42$:
$$z = \frac{42 - 45}{2} = \frac{-3}{2} = -1.5$$
$P(Z < -1.5) = 0.0668$
$$\boxed{0.0668 \text{ or about } 6.7\%}$$
b) Probability that $43 < \bar{x} < 47$:
$$z_1 = \frac{43 - 45}{2} = -1.0$$ $$z_2 = \frac{47 - 45}{2} = 1.0$$
$P(-1.0 < Z < 1.0) = P(Z < 1.0) - P(Z < -1.0) = 0.8413 - 0.1587 = 0.6826$
$$\boxed{0.6826 \text{ or about } 68.3\%}$$
Problem 4: A factory produces bolts with a target diameter of 10 mm. The actual diameters have $\sigma = 0.2$ mm. How large a sample must be tested so that the standard error of $\bar{x}$ is at most 0.02 mm?
Solution:
We need $\sigma_{\bar{x}} \leq 0.02$:
$$\frac{\sigma}{\sqrt{n}} \leq 0.02$$
$$\frac{0.2}{\sqrt{n}} \leq 0.02$$
$$\sqrt{n} \geq \frac{0.2}{0.02} = 10$$
$$n \geq 100$$
The factory must test at least $\boxed{100}$ bolts.
Problem 5: An election poll surveys 900 likely voters and finds that 52% support Candidate A ($\hat{p} = 0.52$). Assume the true proportion in the population is $p = 0.50$. a) Find the standard error of $\hat{p}$. b) What is the probability of getting a sample proportion of 0.52 or higher if the true proportion is 0.50?
Solution:
a) Standard error of $\hat{p}$:
$$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.50 \times 0.50}{900}} = \sqrt{\frac{0.25}{900}} = \sqrt{0.000278} = \boxed{0.0167}$$
b) Probability that $\hat{p} \geq 0.52$:
First, check conditions: $np = 450 \geq 10$ and $n(1-p) = 450 \geq 10$ ✓
Convert to Z-score: $$z = \frac{0.52 - 0.50}{0.0167} = \frac{0.02}{0.0167} = 1.20$$
$P(Z \geq 1.20) = 1 - P(Z < 1.20) = 1 - 0.8849 = 0.1151$
$$\boxed{0.1151 \text{ or about } 11.5\%}$$
Even if the race is truly tied at 50-50, there is about an 11.5% chance a poll of 900 voters would show 52% or more for one candidate. This illustrates why a 52% result is not strong evidence that a candidate is actually ahead.
Problem 6: Two researchers study the same population (mean $\mu = 100$, standard deviation $\sigma = 15$). Researcher A uses samples of size 36, and Researcher B uses samples of size 144. a) Calculate the standard error for each researcher. b) Researcher A gets $\bar{x} = 104$. How many standard errors is this from $\mu$? c) Researcher B also gets $\bar{x} = 104$. How many standard errors is this from $\mu$? d) Which result provides stronger evidence that something unusual is happening? Why?
Solution:
a) Standard errors:
Researcher A: $\sigma_{\bar{x}} = \frac{15}{\sqrt{36}} = \frac{15}{6} = \boxed{2.5}$
Researcher B: $\sigma_{\bar{x}} = \frac{15}{\sqrt{144}} = \frac{15}{12} = \boxed{1.25}$
b) Researcher A’s Z-score: $$z = \frac{104 - 100}{2.5} = \frac{4}{2.5} = \boxed{1.6 \text{ standard errors}}$$
c) Researcher B’s Z-score: $$z = \frac{104 - 100}{1.25} = \frac{4}{1.25} = \boxed{3.2 \text{ standard errors}}$$
d) Interpretation:
Researcher B’s result provides stronger evidence that something unusual is happening.
Both researchers got the same sample mean (104), but this result has very different meanings:
- For Researcher A, 104 is 1.6 standard errors above the mean. About 11% of samples would give a result this extreme or more. Not particularly unusual.
- For Researcher B, 104 is 3.2 standard errors above the mean. Only about 0.14% of samples would give a result this extreme or more. Very unusual if $\mu$ really is 100.
Researcher B’s larger sample size gives a smaller standard error, making the same deviation from $\mu$ much more statistically significant. This is why larger samples provide stronger evidence.
Summary
- A parameter describes a population (usually unknown); a statistic describes a sample (calculated from data). Statistics are used to estimate parameters.
- Sampling variability means different samples give different statistics. This is expected, not an error.
- A sampling distribution shows all possible values a statistic can take and their probabilities across all possible samples of the same size.
- The sampling distribution of $\bar{x}$ has mean $\mu_{\bar{x}} = \mu$ and standard error $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$.
- The Central Limit Theorem states that for large samples, the sampling distribution of $\bar{x}$ is approximately Normal, regardless of the population's shape.
- Standard error measures how much a statistic varies from sample to sample. For the sample mean, $SE = \frac{\sigma}{\sqrt{n}}$ (or $\frac{s}{\sqrt{n}}$ when estimated from data).
- Larger samples have smaller standard errors, meaning more reliable estimates. To cut the standard error in half, you must quadruple the sample size.
- The CLT is the foundation of statistical inference. It allows us to use Normal distribution methods for confidence intervals and hypothesis tests, even when the original population is not Normal.
- The sampling distribution of a proportion $\hat{p}$ has mean $p$ and standard error $\sqrt{\frac{p(1-p)}{n}}$. For large samples (when $np \geq 10$ and $n(1-p) \geq 10$), it is approximately Normal.
- Understanding sampling distributions explains why polls have margins of error, why larger studies are more trustworthy, and why science values replication.