Expectation and variance of a binary random variable

If you start dealing with generalized linear models (GLMs), you will come across sentences like “Obviously the variance of the binary dependent variable is $\mu(1-\mu)$.” For everybody who does not find it all that obvious, the following derivation may help in understanding the mathematical reasoning behind GLMs, especially Logit and Probit models.

Assume a binary random variable $X$:

$X= \begin{cases}0 \; \text{with probability} \; P(X=0) \\1 \; \text{with probability} \; P(X=1)\end{cases}$

The relation $P(X=0)=1-P(X=1)$ holds since the probabilities (of a discrete random variable) must sum to 1.

The variance of a random variable is defined as

$Var(X)=E \left \{[X-E(X)]^2 \right \}=E[X^2]-E[X]^2$.
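In case the second equality is not familiar: it follows from expanding the square and using the linearity of expectation, keeping in mind that $E(X)$ is a constant:

$\begin{array}{l l} E \left \{[X-E(X)]^2 \right \} &= E \left [X^2 - 2XE(X) + E(X)^2 \right ] \\ &= E[X^2] - 2E(X)E(X) + E(X)^2 \\ &= E[X^2]-E[X]^2 \end{array}$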

The expected value of our binary random variable is

$E[X]=(1-P(X=1)) \cdot 0 + P(X=1) \cdot 1 = P(X=1)$.

$E[X]$ therefore has the nice interpretation of being the probability of $X$ taking on the value 1.

With that information we can derive the variance of a binary random variable:

$\begin{array}{l l} Var(X) &= E[X^2]-E[X]^2 \\ &= E[X]-E[X]^2 \quad (*)\\ &=P(X=1)-P(X=1)^2 \\ &=P(X=1)(1-P(X=1)) \end{array}$

$(*)$ holds because $X$ can only take on the values zero and one, and since $1^2=1$ and $0^2=0$, it follows that $X^2=X$.
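As a quick numerical sanity check (a sketch using NumPy, not part of the derivation itself), we can simulate a large number of draws of a binary random variable with $P(X=1)=p$ and compare the sample mean and variance against $p$ and $p(1-p)$:

```python
import numpy as np

# Simulate a binary random variable X with P(X = 1) = p
rng = np.random.default_rng(seed=42)
p = 0.3
x = rng.binomial(n=1, p=p, size=1_000_000)

# Sample mean should be close to E[X] = p = 0.3
print(x.mean())

# Sample variance should be close to Var(X) = p * (1 - p) = 0.21
print(x.var())
```

With a million draws both estimates land very close to their theoretical values, illustrating the result above.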