forked from lgreski/datasciencectacontent
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathstatinf-varOfBinomialDistribution.Rmd
31 lines (16 loc) · 1.64 KB
/
statinf-varOfBinomialDistribution.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
# Variance of Binomial Distribution
In the August run of *Statistical Inference*, a student asked a question about why the variance for a toss of a coin with a probability of heads (1) equal to *p* was *p(1 - p)* instead of *n p(1 - p)*, referencing the Wikipedia article for the [Binomial Distribution](http://bit.ly/2vU528j).
In the lecture, Professor Caffo uses the [moment statistics](http://bit.ly/2iTE6zW) formula for the second central moment to calculate the variance, assuming a single coin flip.
$p = 0.5$
$Var(X) =E[X^2] - E[X]^2 = p - p^2$
$p - p^2 = p(1 - p)$.
$p(1 - p) = 1 * .5(1 - .5) = .25$.
In the case where we have multiple coin flips, each flip is independent, and all flips have the same probability of a TRUE / 1 result. In this situation, we have a random variable *X* with parameters *n* and *p* that represents the sum of *n* independent variables *Z*, each of which can take the value 0 or 1. Therefore, the variance of the total number of flips is equal to <i>n * p(1 - p)</i>.
In the case of a single coin flip for a binomial distribution, the binomial distribution variance is:
$Var(p) = n * p(1-p) = 1 * .5(1 - .5) = .25$.
If we flip a fair coin 8 times, the variance of the binomial distribution of flips is:
$8 * p(1 - p) = 8 * .5 * .5 = 2$.
The conclusion is that the variance of a binomial distribution of *n* flips increases as the number of flips increases because we are measuring the mean of the counts of 1 over the total number of flips.
This makes sense intuitively because if *n = 10*, the mean can vary between 0 and 10 with *E[X] = 5*, but if *n = 100* the mean can vary between 0 and 100 with *E[X] = 50*.
regards,
Len