If you start dealing with generalized linear models (GLMs), you will come across sentences like “Obviously the variance of the binary dependent variable is $\pi(1-\pi)$.” For everybody who does not find this too obvious, the following derivation may help in understanding the mathematical reasoning behind GLMs, especially logit and probit models.
Assume a binary random variable $X$ with
$$P(X = 1) = \pi, \qquad P(X = 0) = 1 - \pi.$$
The relation $P(X = 0) = 1 - P(X = 1)$ holds since the probabilities of a discrete random variable must sum to 1.
The variance of a random variable $X$ is defined as
$$\mathrm{Var}(X) = E\left[(X - E[X])^2\right] = E[X^2] - (E[X])^2.$$
The expected value of our binary random variable is
$$E[X] = 1 \cdot P(X = 1) + 0 \cdot P(X = 0) = P(X = 1) = \pi.$$
The expected value $E[X] = \pi$ therefore has the nice interpretation of being the probability of $X$ taking on the value 1.
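As a quick check of the expectation above, here is a minimal Python sketch that computes $E[X]$ directly from the two-point probability mass function. The function names `bernoulli_pmf` and `expectation` and the example value 3/10 are illustrative assumptions, not part of the original derivation; exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction


def bernoulli_pmf(pi):
    """Return the pmf of a binary variable as {value: probability} (hypothetical helper)."""
    pi = Fraction(pi)
    if not 0 <= pi <= 1:
        raise ValueError("pi must lie in [0, 1]")
    return {1: pi, 0: 1 - pi}


def expectation(pmf):
    """E[X] = sum over all outcomes x of x * P(X = x)."""
    return sum(x * prob for x, prob in pmf.items())


pmf = bernoulli_pmf(Fraction(3, 10))  # assumed example value for pi
print(sum(pmf.values()))              # the two probabilities sum to 1
print(expectation(pmf))               # the expected value equals pi
```

The outcome 0 contributes nothing to the sum, which is why the mean collapses to the probability of observing a 1.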
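With that information we can derive the variance of a binary random variable:
$$\mathrm{Var}(X) = E[X^2] - (E[X])^2 = E[X] - (E[X])^2 = \pi - \pi^2 = \pi(1 - \pi).$$
The step $E[X^2] = E[X]$ holds because $X$ can only take on the values zero or one, and it holds that
$$0^2 = 0$$
and
$$1^2 = 1,$$
so $X^2 = X$ for every possible outcome.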
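The result can also be checked numerically. The sketch below computes the variance from its definition over the two outcomes and compares it with the closed form $\pi(1-\pi)$, then repeats the comparison on simulated draws. The helper names, the seed, the sample size, and the test values of $\pi$ are all assumptions chosen for illustration.

```python
import random
from fractions import Fraction


def variance_from_definition(pi):
    """Var(X) = E[(X - E[X])^2], computed exactly over the outcomes 0 and 1."""
    pi = Fraction(pi)
    pmf = {1: pi, 0: 1 - pi}
    mean = sum(x * prob for x, prob in pmf.items())
    return sum((x - mean) ** 2 * prob for x, prob in pmf.items())


def simulated_variance(pi, n=100_000, seed=0):
    """Population-style variance of n simulated binary draws with P(X = 1) = pi."""
    rng = random.Random(seed)
    draws = [1 if rng.random() < pi else 0 for _ in range(n)]
    mean = sum(draws) / n
    return sum((x - mean) ** 2 for x in draws) / n


# The exact variance matches pi * (1 - pi) for each assumed test value.
for pi in (Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)):
    assert variance_from_definition(pi) == pi * (1 - pi)

print(variance_from_definition(Fraction(3, 10)))  # exact value of 0.3 * 0.7
print(simulated_variance(0.3))                    # should be close to 0.21
```

The simulated value only approximates $\pi(1-\pi)$, and the gap shrinks as the number of draws grows.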