Hugh R.
asked 06/11/25
There are n balls marked 1, 2, ..., n in a container. Draw one ball, record its number, then put it back. Repeat this r times. If X is the number of different balls drawn, what are the expectation and variance of X?
Raymond B. answered 06/29/25
Math, microeconomics or criminal justice
for a binomial distribution:
E(x) = np = n(1/n) = 1
Var(x) = npq = 1(1-1/n) = 1-1/n = (n-1)/n
We can compute both the expected value and the variance using indicator variables and linearity of expectation. Let X_i be 1 if the ith ball is drawn at some point and 0 if the ith ball is never drawn. Then X is the sum X_1+X_2+...+X_n.
By linearity of expectation, since X = X_1+X_2+...+X_n, E[X] = E[X_1]+...+E[X_n].
Each X_i has the same probability of being 1, so E[X] = n E[X_1] = n P(ball 1 is drawn) = n (1 - P(ball 1 is never drawn in r draws)). Each draw misses ball 1 with probability 1 - 1/n and the r draws are independent, so E[X] = n (1 - (1-1/n)^r).
For example, if n=10 and r=5, E[X] = 40951/10000 = 4.0951.
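As a quick check of the formula above, here is a minimal Python sketch that evaluates n(1-(1-1/n)^r) exactly using fractions. The function name expected_distinct is only an illustrative choice, not part of the original answer.

```python
from fractions import Fraction

def expected_distinct(n, r):
    """E[X] = n * (1 - (1 - 1/n)**r), computed exactly."""
    miss = (1 - Fraction(1, n)) ** r   # P(a given ball is never drawn in r draws)
    return n * (1 - miss)

print(expected_distinct(10, 5))   # 40951/10000
print(expected_distinct(5, 3))    # 61/25
```

The second call matches the value 61/25 for n = 5 and r = 3 quoted in the comments on this question.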
One formula for the variance is Var[X] = E[X^2] -E[X]^2. We have calculated E[X] so we need to calculate E[X^2]. We can substitute X=X_1+...+X_n and expand the square into a sum of n diagonal terms like X_1^2 and n^2-n off-diagonal terms like X_1 X_2. By symmetry, all of the diagonal terms have the same expected value and all of the off-diagonal terms have the same expected value.
E[X^2] = E[(X_1+...+X_n)^2] = n E[X_1^2] + (n^2-n) E[X_1 X_2].
Since X_1 only takes the values 0 and 1, X_1^2 = X_1, so E[X_1^2] = E[X_1] which we computed above.
E[X_1 X_2] = P(both balls 1 and 2 are drawn at least once).
We can compute that probability using inclusion-exclusion:
P(both balls 1 and 2 are drawn) = 1 - P(ball 1 is missed) - P(ball 2 is missed) + P(both are missed)
= 1 - (1 - 1/n)^r - (1 - 1/n)^r + (1 - 2/n)^r.
Since 1 - 2/n is slightly less than (1 - 1/n)^2, we get (1 - 2/n)^r < (1 - 1/n)^(2r), which means the events of missing ball 1 and missing ball 2 are not independent. The indicators have a small negative correlation, since missing ball 1 makes it harder to miss ball 2.
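For small n and r, the inclusion-exclusion result can be checked by listing every possible sequence of draws. The sketch below does that; the function names are hypothetical and the brute-force count is only practical when n^r is small.

```python
from fractions import Fraction
from itertools import product

def brute_force_both_drawn(n, r):
    """P(balls 1 and 2 both appear), by enumerating all n**r equally likely sequences."""
    hits = sum(1 for seq in product(range(1, n + 1), repeat=r)
               if 1 in seq and 2 in seq)
    return Fraction(hits, n ** r)

def inclusion_exclusion_both_drawn(n, r):
    """1 - 2 (1 - 1/n)^r + (1 - 2/n)^r, computed exactly."""
    miss_one = (1 - Fraction(1, n)) ** r    # P(a given ball is missed)
    miss_both = (1 - Fraction(2, n)) ** r   # P(two given balls are both missed)
    return 1 - 2 * miss_one + miss_both

n, r = 5, 3
print(brute_force_both_drawn(n, r))           # 24/125
print(inclusion_exclusion_both_drawn(n, r))   # 24/125

# Negative dependence: P(both missed) is smaller than P(miss 1) * P(miss 2).
print((1 - Fraction(2, n)) ** r < ((1 - Fraction(1, n)) ** r) ** 2)   # True
```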
E[X^2] = n (1- (1-1/n)^r) + (n^2-n) (1 - 2(1-1/n)^r + (1-2/n)^r).
Var[X] = n (1- (1-1/n)^r) + (n^2-n) (1 - 2(1-1/n)^r + (1-2/n)^r) - n^2 (1-(1-1/n)^r)^2.
For example, if n=10 and r=5, Var[X] = 0.52825599 exactly.
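The variance formula above can be evaluated exactly in the same way. This is a sketch with made-up names (variance_distinct, a, b), where a = (1-1/n)^r and b = (1-2/n)^r.

```python
from fractions import Fraction

def variance_distinct(n, r):
    """Var[X] = E[X^2] - E[X]^2 for the number of distinct balls in r draws."""
    a = (1 - Fraction(1, n)) ** r   # P(a given ball is never drawn)
    b = (1 - Fraction(2, n)) ** r   # P(two given balls are both never drawn)
    mean = n * (1 - a)
    # n diagonal terms E[X_1^2] plus n^2 - n off-diagonal terms E[X_1 X_2]
    second_moment = n * (1 - a) + (n * n - n) * (1 - 2 * a + b)
    return second_moment - mean ** 2

print(variance_distinct(10, 5))   # 52825599/100000000
```

With a single draw (r = 1) the function returns 0, as it should, because X is always 1 in that case.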
This calculation can be slightly streamlined by computing the small negative covariance Cov[X_i,X_j] and using a general formula for the variance of a sum, Var[X_1+...+X_n] = Sum_i Var[X_i] + 2 Sum_i<j Cov[X_i,X_j], but the method I used above requires slightly less background.
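Here is a sketch of that covariance route, for comparison. With a = (1-1/n)^r and b = (1-2/n)^r, each indicator has Var[X_i] = a(1-a), and Cov[X_i,X_j] = E[X_i X_j] - E[X_i]E[X_j] simplifies to b - a^2. The names below are illustrative only.

```python
from fractions import Fraction

def variance_via_covariance(n, r):
    """Var[X] = Sum_i Var[X_i] + 2 Sum_{i<j} Cov[X_i, X_j], computed exactly."""
    a = (1 - Fraction(1, n)) ** r   # P(a given ball is never drawn)
    b = (1 - Fraction(2, n)) ** r   # P(two given balls are both never drawn)
    var_i = a * (1 - a)             # variance of a 0/1 indicator with P(X_i = 1) = 1 - a
    cov_ij = b - a * a              # (1 - 2a + b) - (1 - a)^2
    pairs = n * (n - 1) // 2        # number of pairs with i < j
    return n * var_i + 2 * pairs * cov_ij

print(variance_via_covariance(10, 5))   # 52825599/100000000
```

This agrees with the value 0.52825599 found above, and cov_ij is negative whenever n >= 2 and r >= 1.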
Douglas Z.
06/19/25
Douglas Z.
06/23/25
Hi Hugh,
E(x) = np
Var(x) = np(1-p)
where E(x) = expected value
Var(x) = variance
Probability of any numbered item being drawn is 1/n, where n is the number of items in the container.
So:
E(x) = n(1/n) = 1
Var(x) = n(1/n)(1-(1/n))
Var(x) = 1 - 1/n
I hope this helps.
Hugh R.
It seems that you did not consider that there is a possibility that some balls are never drawn, especially when r < n. For instance, if n = 5 and r = 3, E(X) = 61/25.
06/13/25
Huaizhong R.
06/13/25
Huaizhong R.
06/13/25
Joshua L.
06/13/25
Huaizhong R.
06/14/25
Hugh R.
Your formula for E(X) when n = r = 3 would produce E(X) = 8/9 < 1, which is absurd. As long as r > 0, we always have E(X) greater than or equal to 1.
06/18/25