Discrete Probability Distributions
Random variables and their probability patterns
You already know how to think about probability for a single event - the chance of rolling a 6, or drawing an ace, or getting heads. But in real life, we often care about more than just “did it happen or not.” We care about quantities: How many heads will I get in 10 flips? How many customers will visit my store today? How much money will I win (or lose) if I play this game?
This is where random variables come in. A random variable attaches a number to each outcome of a random process, letting us ask questions like “What can I expect on average?” and “How much variation should I expect?” These tools help casinos set odds, insurance companies set premiums, and businesses make decisions under uncertainty.
The good news is that random variables are just a more organized way to think about probability. Instead of asking “What is the probability of each outcome?” you ask “What numbers could result from this random process, and how likely is each?” Once you have that information in a table or graph, you can calculate powerful summaries like the expected value - the long-run average of what to expect.
Core Concepts
What Is a Random Variable?
A random variable is a variable whose value is determined by the outcome of a random process. We typically use capital letters like $X$, $Y$, or $Z$ to represent random variables.
Here is the key idea: a random variable assigns a numerical value to each outcome in a sample space. Instead of just labeling outcomes as “heads” or “tails,” you assign numbers to them. For example:
- Flip a coin three times. Let $X$ = the number of heads. The possible values of $X$ are 0, 1, 2, or 3.
- Roll two dice. Let $Y$ = the sum of the two numbers. The possible values of $Y$ range from 2 to 12.
- Select a random student. Let $Z$ = their height in inches. The value of $Z$ could be any positive number (within a reasonable range).
The word “random” in “random variable” refers to the fact that the value is determined by chance. Before you flip those three coins, you do not know what value $X$ will take. After you flip them, $X$ takes on a specific value like 2 (if you got two heads).
Discrete vs. Continuous Random Variables
Random variables come in two types, and the distinction matters because we handle them differently.
A discrete random variable can only take on countable values - typically whole numbers or a finite list of possibilities. You can list out every possible value, even if the list is long.
Examples of discrete random variables:
- Number of heads in 10 coin flips (0, 1, 2, …, 10)
- Number of customers who enter a store in an hour (0, 1, 2, 3, …)
- Number of defective items in a batch (0, 1, 2, …, up to batch size)
- Score on a multiple choice test (0 through maximum possible)
A continuous random variable can take on any value within some interval - not just whole numbers, but any decimal. There are infinitely many possible values, and you cannot list them all.
Examples of continuous random variables:
- Height of a randomly selected person (could be 67.3 inches, 67.31 inches, etc.)
- Time until a light bulb burns out (could be any positive number of hours)
- Temperature at noon tomorrow (could be any value on the thermometer)
In this lesson, we focus on discrete random variables. Because a discrete variable takes countable values, we can list each possibility along with its probability, which gives us a complete picture called a probability distribution.
Probability Distributions
A probability distribution tells you all the possible values a random variable can take and how likely each value is. For a discrete random variable, you can display this as a table, a formula, or a graph.
Think of a probability distribution as the complete story of a random variable. Once you know the distribution, you know everything there is to know about what values to expect and with what chances.
Example: Flip a fair coin twice. Let $X$ = the number of heads.
The possible outcomes are: HH, HT, TH, TT (each with probability $\frac{1}{4}$).
- $X = 0$ happens with outcome TT. Probability: $\frac{1}{4}$
- $X = 1$ happens with outcomes HT or TH. Probability: $\frac{2}{4} = \frac{1}{2}$
- $X = 2$ happens with outcome HH. Probability: $\frac{1}{4}$
The probability distribution table:
| $x$ | $P(X = x)$ |
|---|---|
| 0 | 0.25 |
| 1 | 0.50 |
| 2 | 0.25 |
This table is the probability distribution. It tells you everything about the random variable $X$.
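The table can be generated by brute-force enumeration. Here is a short Python sketch (variable names are just illustrative) that lists the four equally likely outcomes and tallies the probability of each head count:

```python
from itertools import product
from collections import Counter

# Enumerate the four equally likely outcomes of two fair coin flips
outcomes = list(product("HT", repeat=2))   # ('H','H'), ('H','T'), ('T','H'), ('T','T')

# Count heads in each outcome; each outcome has probability 1/4
counts = Counter(t.count("H") for t in outcomes)
dist = {x: c / len(outcomes) for x, c in sorted(counts.items())}

print(dist)  # {0: 0.25, 1: 0.5, 2: 0.25}
```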
Probability Distribution Graphs
You can visualize a discrete probability distribution using a probability histogram or a bar graph. Each possible value of the random variable goes on the horizontal axis, and the height of the bar represents the probability.
For the “number of heads in 2 flips” example above:
- There would be bars at $x = 0$, $x = 1$, and $x = 2$
- The bar at $x = 1$ would be twice as tall as the bars at $x = 0$ and $x = 2$
- The shape would be symmetric, with the most probability in the middle
These graphs give you an immediate visual sense of what values are most likely, how spread out the distribution is, and whether it is symmetric or skewed.
Requirements for a Valid Probability Distribution
Not every table of numbers is a valid probability distribution. A valid probability distribution for a discrete random variable must satisfy two rules:
Rule 1: Every probability must be between 0 and 1.

$$0 \leq P(X = x) \leq 1 \text{ for all values } x$$
No outcome can have a negative probability or a probability greater than 1. Something cannot be “less than impossible” or “more than certain.”
Rule 2: The probabilities must sum to 1.

$$\sum_{\text{all } x} P(X = x) = 1$$
Something has to happen. When you add up the probabilities of all possible values, you must get exactly 1 (or 100%). This rule ensures you have accounted for every possibility.
If someone hands you a table and claims it is a probability distribution, check these two conditions. If either fails, something is wrong.
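These two checks are easy to automate. A minimal Python sketch (the function name is our own) that tests both rules, allowing a small tolerance for floating-point rounding:

```python
def is_valid_distribution(probs, tol=1e-9):
    """Check both rules for a discrete probability distribution."""
    rule1 = all(0 <= p <= 1 for p in probs)   # every probability in [0, 1]
    rule2 = abs(sum(probs) - 1) < tol         # probabilities sum to 1
    return rule1 and rule2

print(is_valid_distribution([0.25, 0.50, 0.25]))     # True
print(is_valid_distribution([0.4, 0.3, -0.1, 0.4]))  # False: contains a negative probability
```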
Expected Value: The Long-Run Average
The expected value of a random variable, written $E(X)$ or $\mu$ (the Greek letter “mu”), is the long-run average value you would expect if you repeated the random process many, many times.
Here is the formula for expected value:
$$E(X) = \mu = \sum_{\text{all } x} x \cdot P(X = x)$$
In words: multiply each possible value by its probability, then add up all these products.
Why this formula? Think about it: values that are more likely should contribute more to the average. If you roll a die many times, you will get each number roughly equally often, so the average should be around the middle value. But if a weighted die comes up 6 half the time, the average should be pulled toward 6. The formula captures this by weighting each value by how often it occurs.
Example: Find the expected value for the number of heads in 2 coin flips.
Using our probability distribution: $$E(X) = 0 \cdot (0.25) + 1 \cdot (0.50) + 2 \cdot (0.25)$$ $$E(X) = 0 + 0.50 + 0.50 = 1$$
On average, you expect 1 head when flipping a coin twice. This makes intuitive sense - each flip has a 50% chance of heads, so two flips should average one head.
Important: The expected value does not have to be a value the random variable can actually take. If you roll one die, $E(X) = 3.5$, even though you cannot roll a 3.5. The expected value is an average, not a prediction for any single trial.
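As a quick sketch of the formula in code (using a fair die, where the 3.5 result illustrates the point above; the function name is our own):

```python
def expected_value(dist):
    # dist maps each value x to P(X = x)
    return sum(x * p for x, p in dist.items())

# A fair six-sided die: each face 1 through 6 with probability 1/6
die = {face: 1 / 6 for face in range(1, 7)}
print(expected_value(die))  # ≈ 3.5, even though no face shows 3.5
```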
Interpreting Expected Value
Expected value has several important interpretations:
1. Long-run average: If you repeat the random process thousands of times and average the results, you will get very close to $E(X)$. This is guaranteed by the Law of Large Numbers.
2. Fair value: In gambling and finance, expected value tells you what an investment or bet is “worth” in the long run. A game with $E(X) = 0$ is fair - neither player has an advantage. A game with negative expected value means you lose money on average.
3. Center of the distribution: The expected value is like a “balance point” or center of gravity for the probability distribution. If you made a physical model of the distribution and tried to balance it on your finger, the balance point would be at $E(X)$.
Variance and Standard Deviation
Expected value tells you where the center of a distribution is, but it does not tell you how spread out the values are. Two random variables can have the same expected value but very different amounts of variability.
The variance of a random variable, written $\text{Var}(X)$ or $\sigma^2$ (sigma squared), measures how spread out the values are around the mean.
$$\sigma^2 = \text{Var}(X) = \sum_{\text{all } x} (x - \mu)^2 \cdot P(X = x)$$
In words: for each possible value, find how far it is from the mean, square that distance, multiply by the probability, then add everything up.
There is also a computational formula that is often easier to use:
$$\sigma^2 = E(X^2) - [E(X)]^2$$
This says: variance equals the expected value of $X^2$ minus the square of the expected value of $X$.
The standard deviation is the square root of the variance:
$$\sigma = \sqrt{\sigma^2}$$
Standard deviation is often more interpretable because it is in the same units as the original random variable. If $X$ is measured in dollars, $\sigma$ is also in dollars, while $\sigma^2$ would be in “dollars squared.”
A small standard deviation means values tend to be close to the mean. A large standard deviation means values are more spread out.
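The two variance formulas can be checked against each other numerically. A small Python sketch, using the two-coin-flip distribution from earlier (function and variable names are illustrative):

```python
def variance_two_ways(dist):
    mu = sum(x * p for x, p in dist.items())
    # Definition: probability-weighted squared deviations from the mean
    by_def = sum((x - mu) ** 2 * p for x, p in dist.items())
    # Computational shortcut: E(X^2) - [E(X)]^2
    shortcut = sum(x ** 2 * p for x, p in dist.items()) - mu ** 2
    return by_def, shortcut

dist = {0: 0.25, 1: 0.50, 2: 0.25}   # heads in two fair flips
v1, v2 = variance_two_ways(dist)
print(v1, v2, v2 ** 0.5)  # both formulas give 0.5; standard deviation ≈ 0.707
```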
Linear Transformations of Random Variables
Sometimes you need to transform a random variable - for example, converting temperatures from Celsius to Fahrenheit, or adding a fixed bonus to random earnings. These are linear transformations of the form $Y = aX + b$, where $a$ and $b$ are constants.
The rules for how expected value and standard deviation change are remarkably simple:
For $Y = aX + b$:
$$E(Y) = E(aX + b) = aE(X) + b$$ $$\text{Var}(Y) = a^2 \text{Var}(X)$$ $$\sigma_Y = |a| \sigma_X$$
In words:
- Expected value: Multiplying $X$ by $a$ multiplies the mean by $a$. Adding $b$ adds $b$ to the mean.
- Variance: Multiplying $X$ by $a$ multiplies the variance by $a^2$. Adding a constant does not change the variance at all.
- Standard deviation: Multiplying $X$ by $a$ multiplies the standard deviation by $|a|$. Adding a constant does not change it.
Why does adding a constant not affect spread? Think about it: if you add 10 to everyone’s test score, every score shifts up by 10, but the differences between scores stay the same. The data is not more or less spread out; it is just shifted.
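These rules can be verified directly: transform every value of $X$, keep its probability, and recompute. A sketch with arbitrarily chosen constants $a = 3$, $b = 10$, again using the two-flip distribution:

```python
def ev(d):
    return sum(x * p for x, p in d.items())

def var(d):
    return sum(x ** 2 * p for x, p in d.items()) - ev(d) ** 2

# X = number of heads in two fair flips; Y = 3X + 10
dist_x = {0: 0.25, 1: 0.50, 2: 0.25}
a, b = 3, 10
dist_y = {a * x + b: p for x, p in dist_x.items()}  # transform values, keep probabilities

print(ev(dist_y), a * ev(dist_x) + b)      # 13.0 13.0 — the means agree
print(var(dist_y), a ** 2 * var(dist_x))   # 4.5 4.5 — the variances agree
```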
Notation and Terminology
| Term | Meaning | Example |
|---|---|---|
| Random variable | Variable whose value is determined by chance | $X$ = number of heads in 3 flips |
| Discrete random variable | Takes countable values | 0, 1, 2, 3 |
| Continuous random variable | Takes any value in an interval | Height, time, temperature |
| Probability distribution | All possible values and their probabilities | Table, formula, or graph |
| $P(X = x)$ | Probability that $X$ equals $x$ | $P(X = 2) = 0.375$ |
| Expected value $E(X)$ or $\mu$ | Long-run average | $E(X) = \sum x \cdot P(x)$ |
| Variance $\sigma^2$ or $\text{Var}(X)$ | Measure of spread (squared units) | $\sigma^2 = \sum (x - \mu)^2 \cdot P(x)$ |
| Standard deviation $\sigma$ | Measure of spread (original units) | $\sigma = \sqrt{\sigma^2}$ |
| Linear transformation | $Y = aX + b$ | Converting units, adding bonuses |
Examples
A spinner has four equal sections labeled 1, 2, 3, and 4. You spin it twice and record the sum of the two numbers. Create the probability distribution for the sum.
Solution:
Step 1: List all possible outcomes. Each spin can land on 1, 2, 3, or 4. With two spins, there are $4 \times 4 = 16$ equally likely outcomes.
Step 2: Find each possible sum and count how many ways it can occur.
| Sum | Ways to get it | Count |
|---|---|---|
| 2 | (1,1) | 1 |
| 3 | (1,2), (2,1) | 2 |
| 4 | (1,3), (2,2), (3,1) | 3 |
| 5 | (1,4), (2,3), (3,2), (4,1) | 4 |
| 6 | (2,4), (3,3), (4,2) | 3 |
| 7 | (3,4), (4,3) | 2 |
| 8 | (4,4) | 1 |
Step 3: Create the probability distribution. Divide each count by 16 (the total number of outcomes).
| $x$ | $P(X = x)$ |
|---|---|
| 2 | 1/16 = 0.0625 |
| 3 | 2/16 = 0.125 |
| 4 | 3/16 = 0.1875 |
| 5 | 4/16 = 0.25 |
| 6 | 3/16 = 0.1875 |
| 7 | 2/16 = 0.125 |
| 8 | 1/16 = 0.0625 |
Check: The probabilities sum to $0.0625 + 0.125 + 0.1875 + 0.25 + 0.1875 + 0.125 + 0.0625 = 1$ ✓
Notice the distribution is symmetric around 5, which is the most likely sum.
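The enumeration in Steps 1 through 3 can be reproduced in a few lines of Python (using exact fractions to avoid rounding):

```python
from itertools import product
from collections import Counter
from fractions import Fraction

# All 16 equally likely (spin1, spin2) pairs; record each sum
counts = Counter(a + b for a, b in product(range(1, 5), repeat=2))
dist = {s: Fraction(c, 16) for s, c in sorted(counts.items())}

print(dist[5])             # 1/4 — the most likely sum
print(sum(dist.values()))  # 1 — the probabilities sum to exactly 1
```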
Determine whether each table represents a valid probability distribution.
Table A:
| $x$ | $P(X = x)$ |
|---|---|
| 1 | 0.3 |
| 2 | 0.5 |
| 3 | 0.2 |
Table B:
| $x$ | $P(X = x)$ |
|---|---|
| 0 | 0.4 |
| 1 | 0.3 |
| 2 | -0.1 |
| 3 | 0.4 |
Table C:
| $x$ | $P(X = x)$ |
|---|---|
| 10 | 0.15 |
| 20 | 0.25 |
| 30 | 0.35 |
| 40 | 0.20 |
Solution:
Table A:
- Check 1: All probabilities between 0 and 1? Yes. (0.3, 0.5, and 0.2 are all valid.)
- Check 2: Do they sum to 1? $0.3 + 0.5 + 0.2 = 1.0$ ✓
Table A is a valid probability distribution.
Table B:
- Check 1: All probabilities between 0 and 1? No! The probability for $x = 2$ is $-0.1$, which is negative.
Table B is NOT a valid probability distribution. A probability cannot be negative.
Table C:
- Check 1: All probabilities between 0 and 1? Yes. (0.15, 0.25, 0.35, 0.20 are all valid.)
- Check 2: Do they sum to 1? $0.15 + 0.25 + 0.35 + 0.20 = 0.95$
Table C is NOT a valid probability distribution. The probabilities sum to 0.95, not 1. Some outcome is missing, or the probabilities are incorrect.
A carnival game costs $2 to play. You roll a die, and your payout depends on the result:
- Roll 1 or 2: Win $0
- Roll 3 or 4: Win $2
- Roll 5: Win $4
- Roll 6: Win $10
What is the expected value of your net gain (or loss) per game?
Solution:
Step 1: Define the random variable. Let $X$ = net gain (payout minus cost to play).
Step 2: Create the probability distribution for net gain. Each die face has probability $\frac{1}{6}$.
| Outcome | Payout | Net Gain ($X$) | Probability |
|---|---|---|---|
| 1 or 2 | $0 | $0 - $2 = -$2 | 2/6 |
| 3 or 4 | $2 | $2 - $2 = $0 | 2/6 |
| 5 | $4 | $4 - $2 = $2 | 1/6 |
| 6 | $10 | $10 - $2 = $8 | 1/6 |
Simplified probability distribution:
| $x$ | $P(X = x)$ |
|---|---|
| -$2 | 2/6 = 1/3 |
| $0 | 2/6 = 1/3 |
| $2 | 1/6 |
| $8 | 1/6 |
Step 3: Calculate expected value.
$$E(X) = \sum x \cdot P(X = x)$$ $$E(X) = (-2)\left(\frac{1}{3}\right) + (0)\left(\frac{1}{3}\right) + (2)\left(\frac{1}{6}\right) + (8)\left(\frac{1}{6}\right)$$ $$E(X) = -\frac{2}{3} + 0 + \frac{2}{6} + \frac{8}{6}$$ $$E(X) = -\frac{4}{6} + \frac{2}{6} + \frac{8}{6} = \frac{6}{6} = 1$$
The expected value is $1.
Interpretation: On average, you win $1 per game in the long run. This is actually a generous carnival game! If you played 100 times, you would expect to come out ahead by about $100 on average. (Most carnival games have negative expected value for the player.)
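The calculation can be double-checked with exact fractions in Python (variable names are illustrative):

```python
from fractions import Fraction

cost = 2
payout = {1: 0, 2: 0, 3: 2, 4: 2, 5: 4, 6: 10}   # payout by die face

# Net gain per game = payout minus the $2 cost; each face has probability 1/6
ev = sum(Fraction(1, 6) * (payout[face] - cost) for face in payout)
print(ev)  # 1 — an average gain of $1 per game
```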
The number of text messages a teenager receives in an hour follows this probability distribution:
| $x$ (messages) | $P(X = x)$ |
|---|---|
| 0 | 0.10 |
| 1 | 0.20 |
| 2 | 0.35 |
| 3 | 0.25 |
| 4 | 0.10 |
Find the expected value, variance, and standard deviation of the number of messages.
Solution:
Step 1: Calculate the expected value $E(X)$.
$$E(X) = \sum x \cdot P(X = x)$$ $$E(X) = 0(0.10) + 1(0.20) + 2(0.35) + 3(0.25) + 4(0.10)$$ $$E(X) = 0 + 0.20 + 0.70 + 0.75 + 0.40 = 2.05$$
The expected number of messages per hour is 2.05.
Step 2: Calculate $E(X^2)$.
$$E(X^2) = \sum x^2 \cdot P(X = x)$$ $$E(X^2) = 0^2(0.10) + 1^2(0.20) + 2^2(0.35) + 3^2(0.25) + 4^2(0.10)$$ $$E(X^2) = 0 + 0.20 + 1.40 + 2.25 + 1.60 = 5.45$$
Step 3: Calculate the variance using $\sigma^2 = E(X^2) - [E(X)]^2$.
$$\sigma^2 = 5.45 - (2.05)^2 = 5.45 - 4.2025 = 1.2475$$
Step 4: Calculate the standard deviation.
$$\sigma = \sqrt{1.2475} \approx 1.117$$
Summary:
- Expected value: $\mu = 2.05$ messages
- Variance: $\sigma^2 \approx 1.25$ messages$^2$
- Standard deviation: $\sigma \approx 1.12$ messages
Interpretation: On average, this teenager receives about 2 messages per hour. The standard deviation of about 1.12 messages tells us that the typical variation from the average is roughly 1 message. Most hours will have somewhere between 1 and 3 messages (within one standard deviation of the mean).
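Steps 1 through 4 condensed into a short Python sketch:

```python
dist = {0: 0.10, 1: 0.20, 2: 0.35, 3: 0.25, 4: 0.10}

mu = sum(x * p for x, p in dist.items())        # E(X)
ex2 = sum(x ** 2 * p for x, p in dist.items())  # E(X^2)
var = ex2 - mu ** 2                             # computational formula
sd = var ** 0.5

print(round(mu, 2), round(var, 4), round(sd, 3))  # 2.05 1.2475 1.117
```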
A company pays its salespeople a base salary plus commission. The number of units sold per week by a salesperson is a random variable $X$ with:
- Expected value: $E(X) = 20$ units
- Standard deviation: $\sigma_X = 5$ units
The weekly pay is calculated as: Pay = $400 + $50 per unit sold.
a) Express the weekly pay as a linear transformation of $X$.

b) Find the expected weekly pay.

c) Find the standard deviation of weekly pay.

d) If the company adds a $100 holiday bonus, how does this affect the expected pay and standard deviation?
Solution:
a) Express pay as a linear transformation.
Let $Y$ = weekly pay in dollars.
$$Y = 400 + 50X$$
This is a linear transformation with $a = 50$ and $b = 400$.
b) Find the expected weekly pay.
Using the rule $E(aX + b) = aE(X) + b$:
$$E(Y) = E(400 + 50X) = 400 + 50 \cdot E(X)$$ $$E(Y) = 400 + 50(20) = 400 + 1000 = 1400$$
The expected weekly pay is $1,400.
c) Find the standard deviation of weekly pay.
Using the rule $\sigma_Y = |a| \sigma_X$:
$$\sigma_Y = |50| \cdot \sigma_X = 50(5) = 250$$
The standard deviation of weekly pay is $250.
Note: Adding the $400 base salary does not affect the standard deviation. The variability in pay comes entirely from the variability in units sold.
d) Effect of adding a $100 holiday bonus.
Now the pay would be $Y' = 500 + 50X$ (we added 100 to the constant).
For expected value: $$E(Y') = 500 + 50(20) = 500 + 1000 = 1500$$
The expected pay increases by $100 to $1,500.
For standard deviation: $$\sigma_{Y'} = |50| \cdot 5 = 250$$
The standard deviation remains $250.
Adding a constant shifts all values equally, so it increases the mean but does not change how spread out the values are. Every salesperson gets $100 more, but the variation between salespeople stays the same.
Key Properties and Rules
Requirements for a Probability Distribution
For a discrete probability distribution to be valid:
- All probabilities must be between 0 and 1: $0 \leq P(X = x) \leq 1$ for all values $x$
- Probabilities must sum to 1: $\sum P(X = x) = 1$
Expected Value Formulas
Definition: $$E(X) = \mu = \sum_{\text{all } x} x \cdot P(X = x)$$
Properties:
- $E(c) = c$ for any constant $c$
- $E(aX) = aE(X)$
- $E(X + c) = E(X) + c$
- $E(aX + b) = aE(X) + b$
Variance and Standard Deviation Formulas
Definition: $$\sigma^2 = \text{Var}(X) = \sum_{\text{all } x} (x - \mu)^2 \cdot P(X = x)$$
Computational formula: $$\sigma^2 = E(X^2) - [E(X)]^2$$
Standard deviation: $$\sigma = \sqrt{\sigma^2}$$
Linear Transformation Rules
For $Y = aX + b$:
| Measure | Formula | Key Insight |
|---|---|---|
| Expected value | $E(Y) = aE(X) + b$ | Both $a$ and $b$ affect the mean |
| Variance | $\text{Var}(Y) = a^2\text{Var}(X)$ | Only $a$ affects variance; constants add nothing |
| Standard deviation | $\sigma_Y = |a| \sigma_X$ | Adding constants does not change spread |
Real-World Applications
Expected Winnings in Games
Casinos, lotteries, and carnival games can all be analyzed using expected value. The expected value tells you how much you will win (or lose) on average per play.
Example: A lottery ticket costs $2 and offers these prizes:
- 1 in 1,000,000 chance to win $1,000,000
- 1 in 10,000 chance to win $1,000
- 1 in 100 chance to win $10
Expected value of net gain: $$E(X) = (1,000,000 - 2)\left(\frac{1}{1,000,000}\right) + (1,000 - 2)\left(\frac{1}{10,000}\right) + (10 - 2)\left(\frac{1}{100}\right) + (-2)\left(\frac{989,899}{1,000,000}\right)$$
Working this out gives $E(X) = -\$0.80$: you lose 80 cents per ticket on average. This is how lotteries make a profit. Understanding expected value helps you make informed decisions about whether to play.
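Carrying out that arithmetic in Python with exact fractions (the losing probability is computed as the complement of the winning probabilities):

```python
from fractions import Fraction

cost = 2
prizes = {1_000_000: Fraction(1, 1_000_000),
          1_000: Fraction(1, 10_000),
          10: Fraction(1, 100)}
p_lose = 1 - sum(prizes.values())   # probability of winning nothing

ev = sum((w - cost) * p for w, p in prizes.items()) + (-cost) * p_lose
print(float(ev))  # -0.8 — an expected loss of 80 cents per ticket
```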
Insurance Premiums
Insurance companies use expected value to set premiums. Suppose data shows that in a given year:
- 1 in 500 policyholders will have a $50,000 claim
- 1 in 100 will have a $5,000 claim
- 1 in 20 will have a $500 claim
The expected claim per policyholder: $$E(\text{claim}) = 50,000\left(\frac{1}{500}\right) + 5,000\left(\frac{1}{100}\right) + 500\left(\frac{1}{20}\right) = 100 + 50 + 25 = \$175$$
The insurance company would need to charge at least $175 per policyholder just to break even on claims. They charge more to cover administrative costs and earn a profit.
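The same calculation as a short sketch with exact fractions:

```python
from fractions import Fraction

# Claim amounts and their probabilities from the data above
claims = {50_000: Fraction(1, 500), 5_000: Fraction(1, 100), 500: Fraction(1, 20)}
expected_claim = sum(amount * p for amount, p in claims.items())
print(expected_claim)  # 175 — the break-even premium per policyholder
```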
Investment Returns
Financial analysts model investment returns as random variables. The expected return tells you what to anticipate on average, while the standard deviation measures risk - how much the actual return might vary from the expected return.
An investment with high expected return and high standard deviation is riskier than one with moderate expected return and low standard deviation. Understanding this tradeoff between expected return and risk (standard deviation) is fundamental to portfolio theory.
Business Decision Analysis
Businesses use expected value to evaluate decisions under uncertainty.
Example: A company is deciding whether to launch a new product.
- 30% chance of high demand: $500,000 profit
- 50% chance of moderate demand: $100,000 profit
- 20% chance of low demand: $200,000 loss
Expected profit: $$E(\text{profit}) = 500,000(0.30) + 100,000(0.50) + (-200,000)(0.20)$$ $$E(\text{profit}) = 150,000 + 50,000 - 40,000 = \$160,000$$
With a positive expected value of $160,000, the launch looks favorable on average. The company might also consider the standard deviation to understand the risk involved - how much could they lose in the worst case?
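And the expected-profit computation as a sketch:

```python
# Demand scenarios: (probability, profit in dollars)
scenarios = [(0.30, 500_000), (0.50, 100_000), (0.20, -200_000)]
expected_profit = sum(p * profit for p, profit in scenarios)
print(round(expected_profit))  # 160000
```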
Self-Test Problems
Problem 1: A random variable $X$ has the following probability distribution:
| $x$ | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| $P(X = x)$ | 0.1 | 0.3 | $k$ | 0.2 |
Find the value of $k$ that makes this a valid probability distribution.
Show Answer
For a valid probability distribution, all probabilities must sum to 1.
$$0.1 + 0.3 + k + 0.2 = 1$$ $$0.6 + k = 1$$ $$k = 0.4$$
Check: $0.1 + 0.3 + 0.4 + 0.2 = 1$ ✓
Problem 2: You play a game where you flip a coin three times. You win $5 if you get exactly 2 heads, and you win $10 if you get 3 heads. Otherwise, you win nothing. What is the expected value of your winnings?
Show Answer
First, find the probabilities. With 3 flips, there are $2^3 = 8$ equally likely outcomes.
- P(0 heads) = 1/8 (TTT)
- P(1 head) = 3/8 (HTT, THT, TTH)
- P(2 heads) = 3/8 (HHT, HTH, THH)
- P(3 heads) = 1/8 (HHH)
Now create the distribution for winnings:
| Winnings ($X$) | $P(X)$ |
|---|---|
| $0 | 4/8 = 1/2 (0 or 1 heads) |
| $5 | 3/8 (2 heads) |
| $10 | 1/8 (3 heads) |
Expected value: $$E(X) = 0\left(\frac{1}{2}\right) + 5\left(\frac{3}{8}\right) + 10\left(\frac{1}{8}\right)$$ $$E(X) = 0 + \frac{15}{8} + \frac{10}{8} = \frac{25}{8} = \$3.125$$
On average, you win $3.125 per game.
Problem 3: A random variable $X$ has $E(X) = 10$ and $\sigma_X = 3$. Find the expected value and standard deviation of $Y = 2X - 5$.
Show Answer
Using the linear transformation rules with $a = 2$ and $b = -5$:
Expected value: $$E(Y) = aE(X) + b = 2(10) + (-5) = 20 - 5 = 15$$
Standard deviation: $$\sigma_Y = |a| \sigma_X = |2| \cdot 3 = 6$$
Note that the constant $-5$ does not affect the standard deviation.
Problem 4: The number of cars arriving at a drive-through window during a 5-minute period is a random variable with this distribution:
| Cars ($x$) | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| $P(X = x)$ | 0.05 | 0.20 | 0.35 | 0.30 | 0.10 |
a) Find the expected number of cars.

b) Find the variance and standard deviation.
Show Answer
a) Expected value: $$E(X) = 0(0.05) + 1(0.20) + 2(0.35) + 3(0.30) + 4(0.10)$$ $$E(X) = 0 + 0.20 + 0.70 + 0.90 + 0.40 = 2.20 \text{ cars}$$
b) Variance and standard deviation:
First, find $E(X^2)$: $$E(X^2) = 0^2(0.05) + 1^2(0.20) + 2^2(0.35) + 3^2(0.30) + 4^2(0.10)$$ $$E(X^2) = 0 + 0.20 + 1.40 + 2.70 + 1.60 = 5.90$$
Variance: $$\sigma^2 = E(X^2) - [E(X)]^2 = 5.90 - (2.20)^2 = 5.90 - 4.84 = 1.06$$
Standard deviation: $$\sigma = \sqrt{1.06} \approx 1.03 \text{ cars}$$
Problem 5: An insurance policy costs $500 per year. If a claim is filed (probability 0.02), the insurance pays out $20,000. From the insurance company’s perspective, what is the expected profit per policy?
Show Answer
From the insurance company’s perspective:
- They receive $500 in premiums
- With probability 0.02, they pay out $20,000
- With probability 0.98, they pay nothing
Let $X$ = company’s profit per policy.
| Outcome | Profit | Probability |
|---|---|---|
| No claim | $500 | 0.98 |
| Claim | $500 - $20,000 = -$19,500 | 0.02 |
Expected profit: $$E(X) = 500(0.98) + (-19,500)(0.02)$$ $$E(X) = 490 - 390 = \$100$$
The insurance company expects to earn $100 profit per policy on average. This is how insurance companies stay in business - they collect more in premiums than they expect to pay out in claims.
Problem 6: Determine whether each of the following is a valid probability distribution. If not, explain why.
a)
| $x$ | 1 | 2 | 3 |
|---|---|---|---|
| $P(X = x)$ | 0.5 | 0.3 | 0.3 |
b)
| $x$ | 0 | 5 | 10 |
|---|---|---|---|
| $P(X = x)$ | 0.4 | 0.35 | 0.25 |
Show Answer
a) Not valid.
Check the sum: $0.5 + 0.3 + 0.3 = 1.1$
The probabilities sum to more than 1, which violates the requirement that probabilities sum to exactly 1.
b) Valid.
- All probabilities are between 0 and 1: ✓
- Sum: $0.4 + 0.35 + 0.25 = 1.0$ ✓
Both requirements are satisfied, so this is a valid probability distribution.
Summary
- A random variable assigns numerical values to outcomes of a random process. Discrete random variables take countable values (like 0, 1, 2, …), while continuous random variables can take any value in an interval.
- A probability distribution lists all possible values of a random variable along with their probabilities. It can be displayed as a table, graph, or formula.
- For a valid probability distribution: all probabilities must be between 0 and 1, and the probabilities must sum to exactly 1.
- The expected value $E(X)$ or $\mu$ is the long-run average value. Calculate it by multiplying each value by its probability and summing: $E(X) = \sum x \cdot P(X = x)$.
- Expected value represents the “fair value” or “center” of a distribution. It does not have to be a value the random variable can actually take.
- Variance $\sigma^2$ measures how spread out values are around the mean. Use the formula $\sigma^2 = E(X^2) - [E(X)]^2$.
- Standard deviation $\sigma$ is the square root of variance and is measured in the same units as the random variable.
- For a linear transformation $Y = aX + b$:
  - Expected value: $E(Y) = aE(X) + b$
  - Standard deviation: $\sigma_Y = |a|\sigma_X$ (adding a constant does not change spread)
- Expected value has important applications in gambling (expected winnings), insurance (setting premiums), finance (investment returns), and business (decision analysis under uncertainty).