Introduction to Confidence Intervals
Estimate population values with a margin of error
You have probably heard phrases like “the poll shows 52% support, with a margin of error of plus or minus 3 percentage points.” Or perhaps you have seen a scientific study report that “the average improvement was 15 points, with a 95% confidence interval of 12 to 18 points.” These statements are everywhere in news reports, medical research, and business analysis.
But what do they actually mean? And why do statisticians report ranges instead of just giving you a single number?
The answer lies in an unavoidable truth about statistics: when you study a sample instead of an entire population, you never know the exact population value. Your sample gives you an estimate, but that estimate comes with uncertainty. A confidence interval is how statisticians honestly communicate that uncertainty. Instead of pretending to know something precisely, they give you a range of plausible values along with a measure of how confident you should be.
Understanding confidence intervals is not just academic. It helps you read poll results critically, understand medical research, and make better decisions based on data. It is also one of the most commonly misunderstood concepts in statistics, so let us make sure you understand it correctly.
Core Concepts
Point Estimates vs. Interval Estimates
When you want to know something about a population, like the average height of all adults or the proportion of voters supporting a candidate, you typically cannot measure everyone. Instead, you take a sample and calculate a statistic.
A point estimate is a single number calculated from sample data that serves as your “best guess” for the population parameter. For example:
- The sample mean $\bar{x}$ is a point estimate for the population mean $\mu$
- The sample proportion $\hat{p}$ is a point estimate for the population proportion $p$
Point estimates are useful, but they have a significant limitation: they give you no sense of how accurate they might be. A sample mean of 72 tells you nothing about whether the true population mean might be 71, 73, or 80.
An interval estimate addresses this problem by providing a range of values that is likely to contain the true parameter. Instead of saying “the average is 72,” you say “the average is somewhere between 70 and 74.” This range acknowledges the uncertainty inherent in sampling.
A confidence interval is a specific type of interval estimate that comes with a stated confidence level.
What a Confidence Level Really Means
The confidence level (often 95%, but sometimes 90% or 99%) tells you something important, but not what most people think.
Here is what the confidence level actually means: if you were to repeat your sampling procedure many, many times, and construct a confidence interval each time using the same method, approximately that percentage of your intervals would contain the true population parameter.
For a 95% confidence level:
- Take a sample, construct an interval. This interval either contains the true value or it does not.
- Take another sample, construct another interval.
- Repeat this thousands of times.
- About 95% of those intervals would contain the true parameter. About 5% would miss it.
The confidence level describes the long-run reliability of the method, not the probability that any particular interval is correct. Once you have computed a specific interval, the true parameter either is or is not in that interval. You just do not know which.
Think of it like a sharpshooter who hits the target 95% of the time. Before any particular shot, you can say there is a 95% chance of hitting. But after the shot is fired, the bullet either hit or it did not. The 95% describes the shooter’s overall accuracy, not the status of any single shot.
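The long-run idea can be checked with a small simulation. The sketch below is only an illustration: the population values (`mu`, `sigma`), the sample size, and the number of trials are made-up assumptions, and it uses the known-$\sigma$ interval for a mean, $\bar{x} \pm z^* \cdot \sigma/\sqrt{n}$. It draws many samples, builds a 95% interval from each, and counts how often the interval contains the true mean.

```python
import random
from statistics import NormalDist, fmean

def coverage_rate(mu=50.0, sigma=10.0, n=25, level=0.95, trials=10_000, seed=1):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)                       # fixed seed for repeatability
    z_star = NormalDist().inv_cdf((1 + level) / 2)  # about 1.96 for 95%
    margin = z_star * sigma / n ** 0.5              # identical for every sample (sigma known)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = fmean(sample)
        # Each individual interval either contains mu or it does not.
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials

print(f"Share of intervals containing mu: {coverage_rate():.3f}")
```

The printed share should land close to 0.95, even though any single interval in the loop is simply a hit or a miss.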
The Structure of a Confidence Interval
Every confidence interval in this lesson has the same two-part form:
$$\text{Point Estimate} \pm \text{Margin of Error}$$
or equivalently:
$$(\text{Point Estimate} - \text{Margin of Error}, \text{Point Estimate} + \text{Margin of Error})$$
The point estimate is your best single guess (like the sample mean $\bar{x}$).
The margin of error is the “buffer” on either side that accounts for sampling variability. A larger margin of error means more uncertainty.
The margin of error itself has a structure:
$$\text{Margin of Error} = \text{Critical Value} \times \text{Standard Error}$$
- The critical value (often denoted $z^*$) comes from the confidence level. Higher confidence requires a larger critical value, which produces a wider interval.
- The standard error measures how much your statistic varies from sample to sample. It depends on sample size and population variability.
Confidence Interval for a Mean (When $\sigma$ Is Known)
Let us start with the simplest case: estimating a population mean $\mu$ when the population standard deviation $\sigma$ is known. (This situation is rare in practice, but it illustrates the key ideas clearly.)
If you have a random sample of size $n$ from a population with known standard deviation $\sigma$, and either the population is Normal or $n$ is large (so the Central Limit Theorem applies), then a confidence interval for $\mu$ is:
$$\bar{x} \pm z^* \cdot \frac{\sigma}{\sqrt{n}}$$
where:
- $\bar{x}$ is the sample mean
- $z^*$ is the critical value from the standard Normal distribution
- $\frac{\sigma}{\sqrt{n}}$ is the standard error of the mean
Common critical values:
| Confidence Level | $z^*$ |
|---|---|
| 90% | 1.645 |
| 95% | 1.96 |
| 99% | 2.576 |
The 95% confidence level uses $z^* = 1.96$ because in a standard Normal distribution, 95% of values fall between $-1.96$ and $+1.96$.
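As a rough sketch, the critical values in the table and the interval formula can be reproduced with Python's standard library. The function names `z_critical` and `mean_z_interval` are illustrative choices, and the numbers in the usage line (a sample mean of 72 with an assumed $\sigma = 8$ and $n = 64$) are made up for demonstration.

```python
from statistics import NormalDist

def z_critical(level):
    """Two-sided critical value z* for a confidence level such as 0.95."""
    # The middle `level` of N(0, 1) leaves (1 - level) / 2 in each tail.
    return NormalDist().inv_cdf((1 + level) / 2)

def mean_z_interval(x_bar, sigma, n, level=0.95):
    """Interval x_bar +/- z* * sigma / sqrt(n) for a mean with known sigma."""
    margin = z_critical(level) * sigma / n ** 0.5
    return x_bar - margin, x_bar + margin

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z* = {z_critical(level):.3f}")

low, high = mean_z_interval(x_bar=72, sigma=8, n=64)  # illustrative numbers
print(f"95% interval: ({low:.2f}, {high:.2f})")
```

Because `z_critical` uses the unrounded value (about 1.95996 rather than 1.96), results can differ from hand calculations in the last decimal place.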
Interpreting Confidence Intervals Correctly
Correct interpretation is crucial. Here is how to properly state what a 95% confidence interval means:
Correct interpretations:
- “We are 95% confident that the true population mean lies in this interval.”
- “This interval was constructed using a method that captures the true parameter 95% of the time.”
- “If we repeated this study many times, 95% of the resulting intervals would contain the true mean.”
What “95% confident” means: It refers to our confidence in the method used to construct the interval, not a probability statement about where $\mu$ lies. The method works 95% of the time in the long run.
Common Misinterpretations
Many people misunderstand confidence intervals. Here are statements that sound reasonable but are actually incorrect:
Incorrect: “There is a 95% probability that $\mu$ is in this interval.”
Why it is wrong: The parameter $\mu$ is a fixed (though unknown) number. It either is in the interval or it is not. Probability applies to random events, and once the interval is calculated, nothing is random anymore.
Incorrect: “95% of the data falls within this interval.”
Why it is wrong: The confidence interval is about the population mean, not about individual data values. A confidence interval for the mean is typically much narrower than the range of individual values.
Incorrect: “If we took another sample, there is a 95% chance the new sample mean would fall in this interval.”
Why it is wrong: The interval estimates where the population mean is, not where future sample means will be. (A prediction interval would address that question.)
Incorrect: “This interval contains 95% of all possible sample means.”
Why it is wrong: The interval is centered on one particular sample mean. Different samples would produce different intervals.
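The third misinterpretation can be tested numerically. In the sketch below (population values, sample size, and trial count are all assumptions chosen for illustration), each trial builds a 95% interval from one sample and then checks whether the mean of a second, independent sample falls inside it. Because the difference between two independent sample means has standard deviation $\sqrt{2} \cdot \sigma/\sqrt{n}$, the theoretical capture rate is $P(|Z| < 1.96/\sqrt{2})$, or about 83%, not 95%.

```python
import random
from statistics import NormalDist, fmean

def new_mean_capture_rate(mu=100.0, sigma=15.0, n=30, trials=10_000, seed=2):
    """How often a second sample's mean lands inside the first sample's 95% interval."""
    rng = random.Random(seed)
    margin = NormalDist().inv_cdf(0.975) * sigma / n ** 0.5
    captured = 0
    for _ in range(trials):
        first_mean = fmean(rng.gauss(mu, sigma) for _ in range(n))
        second_mean = fmean(rng.gauss(mu, sigma) for _ in range(n))
        # The first interval is first_mean +/- margin.
        if abs(second_mean - first_mean) <= margin:
            captured += 1
    return captured / trials

print(f"Second sample mean inside first interval: {new_mean_capture_rate():.3f}")
```

The gap from 95% arises because both the interval's center and the new sample mean vary from sample to sample.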
Factors Affecting the Margin of Error
Understanding what affects the width of a confidence interval helps you design better studies and interpret results more thoughtfully.
1. Confidence Level: Higher confidence levels produce wider intervals. A 99% confidence interval is wider than a 95% interval, which is wider than a 90% interval. The more confident you want to be, the wider the net you must cast.
Think of it this way: if you want to be 100% confident of catching the true value, you would need an infinitely wide interval. More realistically, as confidence goes up, the interval gets wider.
2. Sample Size ($n$): Larger samples produce narrower intervals. This makes intuitive sense: more data gives you more information, leading to more precise estimates.
Mathematically, the standard error has $\sqrt{n}$ in the denominator: $$\text{Standard Error} = \frac{\sigma}{\sqrt{n}}$$
To cut the margin of error in half, you must quadruple the sample size.
3. Population Variability ($\sigma$): Populations with more variability produce wider intervals. If individual values are spread out, it is harder to pin down the mean precisely.
The standard error has $\sigma$ in the numerator, so more variability means a larger standard error and thus a wider interval.
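A quick numerical check of these three effects, using made-up baseline values ($\sigma = 12$, $n = 100$, 95% confidence) purely for illustration:

```python
def margin_of_error(z_star, sigma, n):
    """Margin of error z* * sigma / sqrt(n) for a mean with known sigma."""
    return z_star * sigma / n ** 0.5

base = margin_of_error(1.96, sigma=12, n=100)
print("baseline:          ", round(base, 3))
print("n quadrupled:      ", round(margin_of_error(1.96, 12, 400), 3))   # half the baseline
print("sigma doubled:     ", round(margin_of_error(1.96, 24, 100), 3))   # twice the baseline
print("99% instead of 95%:", round(margin_of_error(2.576, 12, 100), 3))  # larger z*, wider
```

Only the sample size enters through a square root, which is why precision gained by collecting more data shows diminishing returns.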
Sample Size Determination
Often you want to determine how large a sample you need to achieve a desired margin of error. You can work backwards from the margin of error formula.
For estimating a mean with known $\sigma$:
$$\text{Margin of Error} = z^* \cdot \frac{\sigma}{\sqrt{n}}$$
Solving for $n$:
$$n = \left(\frac{z^* \cdot \sigma}{\text{Margin of Error}}\right)^2$$
This formula tells you the sample size needed to achieve your target precision at your chosen confidence level. Always round up to ensure you meet your goal.
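The calculation can be sketched as a small helper. The name `sample_size_for_mean` and the numbers in the usage line are illustrative assumptions. The `round(..., 9)` step is a defensive choice for this sketch, not part of the formula: it keeps floating-point noise from pushing an exact answer up by one.

```python
import math

def sample_size_for_mean(sigma, margin, z_star=1.96):
    """Smallest n with z* * sigma / sqrt(n) <= margin (sigma assumed known)."""
    raw = (z_star * sigma / margin) ** 2
    # Strip floating-point noise before rounding up to a whole number.
    return math.ceil(round(raw, 9))

n = sample_size_for_mean(sigma=15, margin=2)  # illustrative numbers
achieved = 1.96 * 15 / math.sqrt(n)
print(f"n = {n}, achieved margin of error = {achieved:.3f}")
```

Plugging the returned $n$ back into the margin of error formula, as the last lines do, is a useful check that the target is actually met.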
Confidence Interval for a Proportion
Proportions appear constantly in real-world applications: the proportion of voters supporting a candidate, the proportion of patients responding to treatment, the proportion of products that are defective.
For a sample proportion $\hat{p}$ from a sample of size $n$, a confidence interval for the population proportion $p$ is:
$$\hat{p} \pm z^* \cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
where $\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$ is the standard error of the proportion.
Conditions for this interval to be valid:
- The sample should be a random sample from the population
- The sample size should be large enough: $n\hat{p} \geq 10$ and $n(1-\hat{p}) \geq 10$
These conditions ensure that the sampling distribution of $\hat{p}$ is approximately Normal, which justifies using the $z^*$ critical value.
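A minimal sketch of this interval, including the success-failure check, might look like the following. The function name and the counts in the usage line are illustrative assumptions. Note that $n\hat{p}$ and $n(1-\hat{p})$ are simply the observed numbers of successes and failures.

```python
from statistics import NormalDist

def proportion_z_interval(successes, n, level=0.95):
    """Interval p_hat +/- z* * sqrt(p_hat * (1 - p_hat) / n), plus a conditions flag."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf((1 + level) / 2)
    margin = z_star * (p_hat * (1 - p_hat) / n) ** 0.5
    # Success-failure condition: at least 10 successes and 10 failures.
    conditions_ok = successes >= 10 and (n - successes) >= 10
    return p_hat - margin, p_hat + margin, conditions_ok

low, high, ok = proportion_z_interval(successes=130, n=400)  # illustrative counts
print(f"({low:.3f}, {high:.3f}), conditions met: {ok}")
```

Returning the flag rather than raising an error leaves the decision to the caller. The check cannot verify that the sample was random, which still has to be judged from how the data were collected.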
Sample size for estimating a proportion:
If you want a specific margin of error when estimating a proportion:
$$n = \left(\frac{z^*}{E}\right)^2 \cdot \hat{p}(1-\hat{p})$$
where $E$ is the desired margin of error. If you do not have a preliminary estimate for $\hat{p}$, use $\hat{p} = 0.5$, which gives the most conservative (largest) sample size.
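The effect of the preliminary estimate can be explored with a short sketch. The helper name `sample_size_for_proportion` and the 3-point target in the loop are illustrative assumptions. As a defensive choice, the result is rounded to 9 decimals before `math.ceil` so that floating-point noise cannot inflate an exact answer.

```python
import math

def sample_size_for_proportion(margin, p_guess=0.5, z_star=1.96):
    """Smallest n with z* * sqrt(p(1 - p) / n) <= margin, using p_guess for p."""
    raw = (z_star / margin) ** 2 * p_guess * (1 - p_guess)
    # Strip floating-point noise before rounding up to a whole number.
    return math.ceil(round(raw, 9))

for p_guess in (0.1, 0.3, 0.5):
    n = sample_size_for_proportion(margin=0.03, p_guess=p_guess)
    print(f"p_guess = {p_guess}: n = {n}")
```

The required sample size grows as the guess moves toward 0.5, which is why 0.5 is the safe default when nothing is known in advance.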
Notation and Terminology
| Term | Meaning | Example |
|---|---|---|
| Point estimate | Single value estimate of a parameter | $\bar{x}$ estimates $\mu$ |
| Confidence interval | Range of plausible values for a parameter | $(a, b)$ |
| Confidence level | Long-run success rate of the method | 95% |
| Margin of error | Half-width of the interval | $\pm 3\%$ |
| Critical value ($z^*$) | Z-score corresponding to confidence level | $z^* = 1.96$ for 95% |
| Standard error | Standard deviation of the sampling distribution | $\frac{\sigma}{\sqrt{n}}$ for $\bar{x}$ |
| $\bar{x}$ | Sample mean | Point estimate for $\mu$ |
| $\hat{p}$ | Sample proportion | Point estimate for $p$ |
Examples
A manufacturer claims that their light bulbs last an average of $\mu$ hours. You test a random sample of 64 bulbs and find the sample mean lifetime is $\bar{x} = 1,050$ hours. Previous studies have established that the population standard deviation is $\sigma = 80$ hours.
Construct a 95% confidence interval for the true mean lifetime.
Solution:
We have:
- $\bar{x} = 1,050$ hours
- $\sigma = 80$ hours
- $n = 64$
- For 95% confidence: $z^* = 1.96$
Step 1: Calculate the standard error. $$SE = \frac{\sigma}{\sqrt{n}} = \frac{80}{\sqrt{64}} = \frac{80}{8} = 10 \text{ hours}$$
Step 2: Calculate the margin of error. $$ME = z^* \times SE = 1.96 \times 10 = 19.6 \text{ hours}$$
Step 3: Construct the interval. $$\bar{x} \pm ME = 1,050 \pm 19.6$$
The 95% confidence interval is $\boxed{(1030.4, 1069.6)}$ hours.
Interpretation: We are 95% confident that the true mean lifetime of all light bulbs from this manufacturer is between 1,030.4 and 1,069.6 hours.
A medical study reports: “The mean reduction in blood pressure was 12 mmHg, with a 95% confidence interval of (9, 15) mmHg.”
Which of the following statements are correct interpretations?
a) There is a 95% probability that the true mean reduction is between 9 and 15 mmHg.
b) We are 95% confident that the true mean blood pressure reduction for the population lies between 9 and 15 mmHg.
c) 95% of patients experienced a blood pressure reduction between 9 and 15 mmHg.
d) If this study were repeated many times, 95% of the resulting confidence intervals would contain the true mean reduction.
Solution:
a) Incorrect. The true mean is a fixed value, not a random variable. It either is or is not in the interval. We cannot assign a probability to a fixed value being in a particular range.
b) Correct. This properly expresses confidence in the interval-construction method rather than making a probability statement about the parameter.
c) Incorrect. The confidence interval is about the mean reduction, not about individual patient outcomes. Individual responses vary much more widely than this interval suggests.
d) Correct. This accurately describes what the confidence level means: it is a statement about the long-run performance of the method.
The correct interpretations are (b) and (d).
A university administrator wants to estimate the average amount of student loan debt among graduates. Based on national data, the standard deviation of student loan debt is approximately $\sigma = \$8,000$.
How many graduates must be surveyed to estimate the mean debt with a margin of error of no more than $\$500$ at a 95% confidence level?
Solution:
We need to find $n$ such that the margin of error is at most $\$500$.
Given:
- $\sigma = 8,000$
- Desired margin of error $E = 500$
- $z^* = 1.96$ for 95% confidence
Using the formula: $$n = \left(\frac{z^* \cdot \sigma}{E}\right)^2$$
Substituting: $$n = \left(\frac{1.96 \times 8,000}{500}\right)^2 = \left(\frac{15,680}{500}\right)^2 = (31.36)^2 \approx 983.45$$
Since we cannot survey a fraction of a person, we round up.
The administrator needs to survey at least $\boxed{984}$ graduates.
Checking our answer: With $n = 984$: $$ME = 1.96 \times \frac{8,000}{\sqrt{984}} = 1.96 \times \frac{8,000}{31.37} = 1.96 \times 255.0 = \$499.80$$
This is just under $\$500$. With $n = 983$ the margin of error would be about $\$500.11$, slightly over the target, so 984 is the minimum sample size needed.
A political poll surveys 1,200 registered voters and finds that 648 of them (54%) support a ballot measure. Construct a 95% confidence interval for the true proportion of all registered voters who support the measure.
Solution:
We have:
- $n = 1,200$
- $\hat{p} = \frac{648}{1,200} = 0.54$
- $z^* = 1.96$ for 95% confidence
Step 1: Check conditions.
- $n\hat{p} = 1,200 \times 0.54 = 648 \geq 10$ ✓
- $n(1-\hat{p}) = 1,200 \times 0.46 = 552 \geq 10$ ✓
The conditions are satisfied, so we can use the Normal approximation.
Step 2: Calculate the standard error. $$SE = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = \sqrt{\frac{0.54 \times 0.46}{1,200}} = \sqrt{\frac{0.2484}{1,200}} = \sqrt{0.000207} = 0.0144$$
Step 3: Calculate the margin of error. $$ME = z^* \times SE = 1.96 \times 0.0144 = 0.0282$$
Step 4: Construct the interval. $$\hat{p} \pm ME = 0.54 \pm 0.0282$$
The 95% confidence interval is $(0.512, 0.568)$, or equivalently $\boxed{(51.2\%, 56.8\%)}$.
Interpretation: We are 95% confident that the true proportion of all registered voters who support the ballot measure is between 51.2% and 56.8%.
Note: Since the entire interval is above 50%, we have evidence that more than half of voters support the measure. However, the race could still be relatively close (support might be as low as 51.2%).
A researcher measures the resting heart rates of 100 athletes and finds $\bar{x} = 58$ beats per minute with a known population standard deviation of $\sigma = 6$ beats per minute.
Construct 90%, 95%, and 99% confidence intervals for the true mean resting heart rate, and compare them.
Solution:
We have:
- $\bar{x} = 58$ bpm
- $\sigma = 6$ bpm
- $n = 100$
The standard error is the same for all three intervals: $$SE = \frac{\sigma}{\sqrt{n}} = \frac{6}{\sqrt{100}} = \frac{6}{10} = 0.6 \text{ bpm}$$
90% Confidence Interval ($z^* = 1.645$): $$ME = 1.645 \times 0.6 = 0.987$$ $$58 \pm 0.987 = \boxed{(57.01, 58.99)}$$
95% Confidence Interval ($z^* = 1.96$): $$ME = 1.96 \times 0.6 = 1.176$$ $$58 \pm 1.176 = \boxed{(56.82, 59.18)}$$
99% Confidence Interval ($z^* = 2.576$): $$ME = 2.576 \times 0.6 = 1.546$$ $$58 \pm 1.546 = \boxed{(56.45, 59.55)}$$
Comparison:
| Confidence Level | Critical Value | Margin of Error | Interval Width |
|---|---|---|---|
| 90% | 1.645 | 0.99 bpm | 1.97 bpm |
| 95% | 1.96 | 1.18 bpm | 2.35 bpm |
| 99% | 2.576 | 1.55 bpm | 3.09 bpm |
Key observations:
- All three intervals are centered at the same point (58 bpm) because they use the same sample data.
- Higher confidence levels produce wider intervals. The 99% interval is about 57% wider than the 90% interval ($2.576 / 1.645 \approx 1.57$).
- There is a trade-off between confidence and precision. The 99% interval gives you more confidence that you have captured the true mean, but at the cost of a less precise estimate.
- The choice of confidence level depends on the consequences of being wrong. Medical and safety applications often use 99% confidence. General research commonly uses 95%. Preliminary or exploratory analysis might use 90%.
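The comparison can also be reproduced in a few lines. This sketch reuses the heart-rate numbers from the example; the helper name `interval_for` is an illustrative choice. It computes critical values from the standard Normal distribution rather than the rounded table entries, so the last decimal place can occasionally differ from hand calculations.

```python
from statistics import NormalDist

x_bar, sigma, n = 58, 6, 100          # values from the heart-rate example
standard_error = sigma / n ** 0.5     # 0.6 bpm, shared by all three intervals

def interval_for(level):
    """z-interval for the mean at the given confidence level."""
    z_star = NormalDist().inv_cdf((1 + level) / 2)
    margin = z_star * standard_error
    return x_bar - margin, x_bar + margin

for level in (0.90, 0.95, 0.99):
    low, high = interval_for(level)
    print(f"{level:.0%}: ({low:.2f}, {high:.2f}), width {high - low:.2f} bpm")
```

Only the critical value changes between iterations, which makes the confidence-versus-precision trade-off easy to see.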
A marketing team wants to estimate the proportion of customers who would be interested in a new product. They want a margin of error of at most 2 percentage points with 95% confidence. They have no preliminary estimate of the proportion.
a) What sample size is needed?
b) If a pilot study found that about 30% were interested, how would that change the required sample size?
Solution:
a) Without a preliminary estimate:
When we have no prior information about $p$, we use $\hat{p} = 0.5$ because this maximizes $\hat{p}(1-\hat{p}) = 0.25$ and thus gives the largest (most conservative) sample size.
Given:
- Desired margin of error: $E = 0.02$
- $z^* = 1.96$ for 95% confidence
- $\hat{p} = 0.5$ (conservative estimate)
Using the formula: $$n = \left(\frac{z^*}{E}\right)^2 \times \hat{p}(1-\hat{p})$$
$$n = \left(\frac{1.96}{0.02}\right)^2 \times 0.5 \times 0.5 = (98)^2 \times 0.25 = 9,604 \times 0.25 = 2,401$$
The team needs at least $\boxed{2,401}$ respondents.
b) With pilot study estimate of 30%:
Now we use $\hat{p} = 0.30$:
$$n = \left(\frac{1.96}{0.02}\right)^2 \times 0.30 \times 0.70 = 9,604 \times 0.21 = 2,016.84$$
Rounding up, the team would need $\boxed{2,017}$ respondents.
Comparison:
The pilot study reduces the required sample size from 2,401 to 2,017, a savings of 384 respondents (about 16% fewer). This is because when $p$ is farther from 0.5, the sampling variability $\sqrt{p(1-p)/n}$ is smaller.
Using $\hat{p} = 0.5$ when you have no prior information is conservative. It guarantees your margin of error will be no larger than desired, regardless of what the true proportion turns out to be.
Key Properties and Rules
Confidence Interval Formulas
For a population mean (when $\sigma$ is known): $$\bar{x} \pm z^* \cdot \frac{\sigma}{\sqrt{n}}$$
For a population proportion: $$\hat{p} \pm z^* \cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
Critical Values for Common Confidence Levels
| Confidence Level | $z^*$ | How to Find |
|---|---|---|
| 90% | 1.645 | Middle 90% of $N(0,1)$ |
| 95% | 1.96 | Middle 95% of $N(0,1)$ |
| 99% | 2.576 | Middle 99% of $N(0,1)$ |
The critical value $z^*$ is the z-score such that the area between $-z^*$ and $z^*$ equals the confidence level.
Sample Size Formulas
For estimating a mean: $$n = \left(\frac{z^* \cdot \sigma}{E}\right)^2$$
For estimating a proportion: $$n = \left(\frac{z^*}{E}\right)^2 \cdot \hat{p}(1-\hat{p})$$
If no prior estimate of $p$ is available, use $\hat{p} = 0.5$ for the most conservative estimate.
Always round up to the next whole number.
Relationships to Remember
| Factor | Change | Effect on Interval Width |
|---|---|---|
| Confidence level | Increase | Wider interval |
| Sample size $n$ | Increase | Narrower interval |
| Population variability $\sigma$ | Increase | Wider interval |
To halve the margin of error:
- Keep confidence level and $\sigma$ the same
- Quadruple the sample size (multiply $n$ by 4)
Conditions for Validity
For the Z-interval for a mean:
- Data comes from a random sample
- Either the population is Normal, or the sample is large enough for the Central Limit Theorem to apply ($n \geq 30$ is a common rule of thumb)
- The population standard deviation $\sigma$ is known
For the Z-interval for a proportion:
- Data comes from a random sample
- $n\hat{p} \geq 10$ and $n(1-\hat{p}) \geq 10$ (success-failure condition)
Real-World Applications
Poll Margins of Error
Every reputable political poll reports a margin of error. When you see “52% support the candidate, with a margin of error of plus or minus 3 percentage points,” you are looking at a confidence interval in disguise.
The margin of error comes from the formula for the standard error of a proportion. Most national polls survey about 1,000 to 1,500 people, which typically produces a margin of error around 3 percentage points at the 95% confidence level.
Understanding margins of error helps you interpret polls more intelligently:
- A candidate at 52% with a 3-point margin of error could plausibly be anywhere from 49% to 55%
- If two candidates are within each other’s margins of error, the race is statistically too close to call
- Polls with larger sample sizes have smaller margins of error and are more precise
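To see where the familiar 3-point figure comes from, the sketch below evaluates the margin of error formula for a proportion at several sample sizes. It assumes a simple random sample and the worst case $\hat{p} = 0.5$; real polls often use weighting and more complex designs, which this illustration ignores. The function name `poll_margin` is hypothetical.

```python
def poll_margin(n, p_hat=0.5, z_star=1.96):
    """95% margin of error, in percentage points, for a poll of n respondents."""
    return 100 * z_star * (p_hat * (1 - p_hat) / n) ** 0.5

for n in (500, 1000, 1500, 2500):
    print(f"n = {n:>4}: about +/- {poll_margin(n):.1f} points")
```

Going from 1,000 to 2,500 respondents only trims the margin by roughly one point, which helps explain why many polls stop near the lower figure.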
Scientific Studies and Medical Research
Medical research routinely reports confidence intervals. When a study says “patients who took the new medication showed a 15-point improvement in symptoms (95% CI: 12 to 18),” researchers are acknowledging uncertainty in their estimate.
Confidence intervals in medical research help you assess:
- Effect size: The point estimate (15 points) tells you the best guess for how much the treatment helps
- Precision: A narrow interval (12 to 18) suggests a fairly precise estimate; a wide interval would suggest more uncertainty
- Statistical significance: If a 95% confidence interval for a treatment effect does not include zero, the effect is statistically significant at the corresponding 5% level (and likewise for other confidence levels)
The move toward reporting confidence intervals rather than just p-values represents a shift toward more transparent and informative statistical communication in science.
Quality Assurance and Manufacturing
Manufacturers use confidence intervals to assess product quality without testing every single item. A quality control engineer might test a sample of 100 components and construct a confidence interval for the true proportion that meets specifications.
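For example, suppose 96 out of 100 tested components pass inspection. Because there are only 4 failures, the success-failure condition ($n(1-\hat{p}) \geq 10$) is not met, so the simple formula from this lesson is unreliable here. A method designed for this situation, the Wilson score interval, gives a 95% confidence interval of approximately (90.2%, 98.4%). This tells the manufacturer that while the point estimate is 96%, the true rate could plausibly be as low as 90% or as high as 98%.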
These intervals inform decisions about:
- Whether a production batch meets quality standards
- Whether process improvements have had a real effect
- How much inspection is needed to achieve adequate precision
Economic Forecasts and Business Analytics
Economic indicators like unemployment rates, inflation measures, and GDP growth are typically estimates based on samples, and they come with confidence intervals (though these are not always prominently reported).
Business analysts use confidence intervals to:
- Estimate average customer lifetime value with a stated level of uncertainty
- Project revenue ranges rather than single-point forecasts
- Quantify the precision of market research findings
Reporting ranges rather than single numbers leads to more honest and useful forecasts. Saying “we expect revenue between $\$4.2$ million and $\$4.8$ million with 95% confidence” is more informative than simply stating “we expect $\$4.5$ million.”
Self-Test Problems
Problem 1: A sample of 81 college students has a mean GPA of $\bar{x} = 3.12$. Assume the population standard deviation is $\sigma = 0.45$. Construct a 95% confidence interval for the true mean GPA of all students at this college.
Answer:
Given: $n = 81$, $\bar{x} = 3.12$, $\sigma = 0.45$, $z^* = 1.96$
Standard error: $$SE = \frac{\sigma}{\sqrt{n}} = \frac{0.45}{\sqrt{81}} = \frac{0.45}{9} = 0.05$$
Margin of error: $$ME = 1.96 \times 0.05 = 0.098$$
Confidence interval: $$3.12 \pm 0.098 = \boxed{(3.02, 3.22)}$$
We are 95% confident that the true mean GPA of all students at this college is between 3.02 and 3.22.
Problem 2: A 95% confidence interval for the mean weight of a certain breed of dog is (62, 78) pounds. Which of the following statements are correct?
a) 95% of dogs of this breed weigh between 62 and 78 pounds.
b) We are 95% confident that the true mean weight is between 62 and 78 pounds.
c) There is a 95% probability that the true mean weight is between 62 and 78 pounds.
d) If we repeated this study 100 times, about 95 of the resulting intervals would contain the true mean.
Answer:
a) Incorrect. The confidence interval is about the mean weight, not individual dog weights. Individual dogs vary much more widely.
b) Correct. This is the proper interpretation of a confidence interval.
c) Incorrect. The true mean is a fixed value. Once the interval is calculated, the mean either is or is not in the interval. We cannot assign a probability.
d) Correct. This accurately describes what the 95% confidence level means as a long-run property of the method.
Correct answers: (b) and (d)
Problem 3: A researcher wants to estimate the mean daily water consumption of adults in a city. She knows from previous studies that $\sigma = 0.8$ liters. How large a sample does she need to estimate the mean with a margin of error of 0.1 liters at a 99% confidence level?
Answer:
Given: $\sigma = 0.8$ liters, $E = 0.1$ liters, $z^* = 2.576$ for 99% confidence
Using the sample size formula: $$n = \left(\frac{z^* \cdot \sigma}{E}\right)^2 = \left(\frac{2.576 \times 0.8}{0.1}\right)^2 = \left(\frac{2.0608}{0.1}\right)^2 = (20.608)^2 \approx 424.69$$
Rounding up: $\boxed{n = 425}$
The researcher needs to sample at least 425 adults.
Problem 4: In a survey of 500 smartphone users, 320 said they check their phone within 5 minutes of waking up. Construct a 90% confidence interval for the true proportion of smartphone users with this habit.
Answer:
Given: $n = 500$, $\hat{p} = \frac{320}{500} = 0.64$, $z^* = 1.645$ for 90% confidence
Check conditions:
- $n\hat{p} = 500 \times 0.64 = 320 \geq 10$ ✓
- $n(1-\hat{p}) = 500 \times 0.36 = 180 \geq 10$ ✓
Standard error: $$SE = \sqrt{\frac{0.64 \times 0.36}{500}} = \sqrt{\frac{0.2304}{500}} = \sqrt{0.000461} = 0.0215$$
Margin of error: $$ME = 1.645 \times 0.0215 = 0.0354$$
Confidence interval: $$0.64 \pm 0.0354 = \boxed{(0.605, 0.675)}$$ or $(60.5\%, 67.5\%)$
We are 90% confident that between 60.5% and 67.5% of all smartphone users check their phone within 5 minutes of waking up.
Problem 5: A 95% confidence interval for a population mean is (45.2, 52.8). Using only the endpoints of the interval, determine the sample mean and the margin of error used to construct it.
Answer:
The confidence interval is centered at the sample mean, so: $$\bar{x} = \frac{45.2 + 52.8}{2} = \frac{98}{2} = \boxed{49}$$
The margin of error is half the width of the interval: $$ME = \frac{52.8 - 45.2}{2} = \frac{7.6}{2} = \boxed{3.8}$$
The interval was constructed as $49 \pm 3.8$.
Problem 6: A market researcher wants to estimate the proportion of adults who would try a new product. She wants a margin of error of 4 percentage points with 95% confidence.
a) What sample size is needed if she has no preliminary estimate?
b) A pilot study suggested 25% would try the product. How does this change the required sample size?
Answer:
a) Without preliminary estimate (use $\hat{p} = 0.5$):
$$n = \left(\frac{z^*}{E}\right)^2 \times \hat{p}(1-\hat{p}) = \left(\frac{1.96}{0.04}\right)^2 \times 0.5 \times 0.5$$ $$n = (49)^2 \times 0.25 = 2401 \times 0.25 = 600.25$$
Rounding up: $\boxed{n = 601}$
b) With $\hat{p} = 0.25$:
$$n = \left(\frac{1.96}{0.04}\right)^2 \times 0.25 \times 0.75 = 2401 \times 0.1875 = 450.19$$
Rounding up: $\boxed{n = 451}$
Using the pilot study estimate reduces the required sample size by 150 respondents (about 25% fewer), saving time and money.
Problem 7: Two researchers study the same population. Researcher A uses a 90% confidence level, while Researcher B uses a 99% confidence level. Both take samples of the same size and compute confidence intervals for the population mean.
a) Whose interval is wider?
b) Whose method will capture the true mean more often in the long run?
c) If both researchers need to make the same decision based on their results, whose estimate is more precise?
Answer:
a) Researcher B’s interval is wider. Higher confidence levels require larger critical values (2.576 vs 1.645), producing wider intervals.
b) Researcher B’s method will capture the true mean more often. The 99% confidence method captures the true parameter 99% of the time in the long run, compared to 90% for Researcher A’s method.
c) Researcher A’s estimate is more precise (narrower interval). However, this precision comes at the cost of lower confidence.
The choice between higher confidence (wider interval) and higher precision (narrower interval) depends on the consequences of being wrong. For critical decisions where missing the true value would be costly, higher confidence is preferred. For exploratory analysis where precision matters more, lower confidence may be acceptable.
Summary
- A point estimate is a single number (like $\bar{x}$ or $\hat{p}$) used to estimate a population parameter. An interval estimate provides a range of plausible values.
- A confidence interval has the form: point estimate $\pm$ margin of error, where margin of error = critical value $\times$ standard error.
- The confidence level (e.g., 95%) describes the long-run reliability of the method: if you repeated the sampling process many times, that percentage of the resulting intervals would contain the true parameter.
- A 95% confidence interval does NOT mean there is a 95% probability the parameter is in the interval. The parameter is fixed; it either is or is not in the interval.
- For a mean with known $\sigma$: the interval is $\bar{x} \pm z^* \cdot \frac{\sigma}{\sqrt{n}}$.
- For a proportion: the interval is $\hat{p} \pm z^* \cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$.
- Common critical values: $z^* = 1.645$ for 90%, $z^* = 1.96$ for 95%, $z^* = 2.576$ for 99%.
- Margin of error decreases when sample size increases or confidence level decreases. It increases when population variability increases or confidence level increases.
- To halve the margin of error while keeping everything else constant, you must quadruple the sample size.
- Use sample size formulas to plan studies that achieve a desired margin of error. Always round up to ensure you meet your precision goal.