# Expectation and variance of a binary random variable

If you start dealing with generalized linear models (GLMs), you will come across sentences like "Obviously the variance of the binary dependent variable is $\mu(1-\mu)$." Well, for everyone who does not find it that obvious, the following derivation may help in understanding the mathematical reasoning behind GLMs, especially Logit and Probit models.

Assume a binary random variable $X$: $X= \begin{cases}0 \; \text{with probability} \; P(X=0) \\1 \; \text{with probability} \; P(X=1)\end{cases}$

The relation $P(X=0)=1-P(X=1)$ holds since the probabilities (of a discrete random variable) must sum to 1.

The variance of a random variable is defined as $Var(X)=E \left \{[X-E(X)]^2 \right \}=E[X^2]-E[X]^2$.

The expected value of our binary random variable is $E[X]=(1-P(X=1)) \cdot 0 + P(X=1) \cdot 1 = P(X=1)$. $E[X]$ therefore has the nice interpretation of being the probability of $X$ taking on the value 1.
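This is easy to verify by simulation. The short sketch below (illustrative only; the value $P(X=1)=0.3$ and the sample size are arbitrary choices) draws many realizations of a binary random variable and checks that the sample mean approaches $P(X=1)$:

```python
import random

p = 0.3          # assumed P(X = 1); any value in [0, 1] works
n = 100_000      # number of simulated draws
random.seed(42)  # fixed seed for reproducibility

# Draw n realizations of X: 1 with probability p, else 0
samples = [1 if random.random() < p else 0 for _ in range(n)]

# The sample mean estimates E[X], which should be close to p
mean = sum(samples) / n
print(mean)
```

The printed value should land very close to 0.3, illustrating $E[X]=P(X=1)$.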

With that information we can derive the variance of a binary random variable: $\begin{array}{l l} Var(X) &= E[X^2]-E[X]^2 \\ &= E[X]-E[X]^2 \quad (*)\\ &=P(X=1)-P(X=1)^2 \\ &=P(X=1)(1-P(X=1)) \end{array}$ $(*)$ holds because $X$ can only take on the values 0 and 1, and since $1^2=1$ and $0^2=0$, we have $X^2=X$ and hence $E[X^2]=E[X]$.
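The same simulation idea confirms the variance formula. The sketch below (again with the arbitrary choice $P(X=1)=0.3$) computes $E[X^2]-E[X]^2$ from simulated draws and compares it with $P(X=1)(1-P(X=1)) = 0.3 \cdot 0.7 = 0.21$:

```python
import random

p = 0.3
n = 100_000
random.seed(0)

xs = [1 if random.random() < p else 0 for _ in range(n)]

# Sample estimates of E[X] and E[X^2]
mean = sum(xs) / n
mean_sq = sum(x * x for x in xs) / n  # identical to mean: 0^2 = 0 and 1^2 = 1

# Var(X) = E[X^2] - E[X]^2, which should be close to p * (1 - p)
var = mean_sq - mean ** 2
print(var, p * (1 - p))
```

Note that `mean_sq` equals `mean` exactly, which is the step marked $(*)$ in the derivation above.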
