
Derivation notes

For a series of Bernoulli observations, $Y_i$, the variance of the observations is given by

$$Var(Y) = \frac{\sum \left( Y_i - \mu_Y \right)^2}{n-1}$$

where $\mu_Y$ is the mean of the observations.

However, if the data are structured as binomial observations, $P_j = \frac{k_j}{n_j}$ ($n_j$ trials, $k_j$ successes, and $r_j$ failures), we need to weight the observations by the number of trials, $n_j$, to recover the underlying Bernoulli process.

First, to determine the mean of the underlying Bernoulli process:

$$\mu_Y = \frac{\sum_i Y_i}{n} = \frac{\sum_j k_j}{n}$$

i.e. the sum of observed successes divided by the total number of trials, $n = \sum_j n_j$.
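As a quick numerical check, this mean can be computed directly from the grouped counts. The counts below are hypothetical, chosen only to illustrate the formula:

```python
# Hypothetical binomial observations: n_j trials and k_j successes per group.
n_trials = [10, 20, 15]
k_success = [3, 8, 6]

n = sum(n_trials)            # total number of Bernoulli trials
mu_Y = sum(k_success) / n    # mean of the underlying Bernoulli process

print(mu_Y)
```

This is identical to pooling every underlying 0/1 observation and taking its ordinary mean.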

Then, we can sub-divide the sum of squares into the sum of squares of the successes and the sum of squares of the failures:

$$Var(Y) = \frac{\sum \left( Y_i - \mu_Y \right)^2}{n-1} = \frac{\sum_j \left[ k_j (1-\mu_Y)^2 + r_j (0-\mu_Y)^2 \right]}{n-1} = \frac{\sum_j \left[ k_j (1-\mu_Y)^2 + r_j \mu_Y^2 \right]}{n-1}$$
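The grouped sum of squares above should agree exactly with the variance computed over the expanded Bernoulli observations. A small sketch with hypothetical counts confirms this:

```python
# Hypothetical grouped data: n_j trials, k_j successes, r_j failures.
n_trials = [10, 20, 15]
k_success = [3, 8, 6]
r_fail = [nj - kj for nj, kj in zip(n_trials, k_success)]

n = sum(n_trials)
mu_Y = sum(k_success) / n

# Grouped form: each group contributes k_j copies of (1 - mu_Y)^2
# and r_j copies of (0 - mu_Y)^2.
ss_grouped = sum(
    k * (1 - mu_Y) ** 2 + r * mu_Y ** 2
    for k, r in zip(k_success, r_fail)
)
var_grouped = ss_grouped / (n - 1)

# Expanded form: k_j ones and r_j zeros per group, summed directly.
ys = [1] * sum(k_success) + [0] * sum(r_fail)
var_expanded = sum((y - mu_Y) ** 2 for y in ys) / (n - 1)

print(var_grouped, var_expanded)
```

The two values coincide because every success contributes the same squared deviation $(1-\mu_Y)^2$ and every failure the same $\mu_Y^2$, so the sum over individual observations collapses to the weighted group sums.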

We follow a similar process to enumerate the variance of the fitted values, $\hat{y}$. The mean of the fitted values is:

$$\mu_{\hat{y}} = \frac{\sum_j n_j \hat{y}_j}{n}$$

and the variance is:

$$Var(\hat{Y}) = \frac{\sum_j n_j (\hat{y}_j - \mu_{\hat{y}})^2}{n-1}$$
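Because each group shares a single fitted value $\hat{y}_j$, the trial-weighted mean and variance can be sketched directly from the grouped quantities. The fitted probabilities below are hypothetical, for illustration only:

```python
# Hypothetical groups: n_j trials each, with fitted probability y_hat_j.
n_trials = [10, 20, 15]
y_hat = [0.32, 0.41, 0.38]

n = sum(n_trials)

# Trial-weighted mean of the fitted values.
mu_hat = sum(nj * yj for nj, yj in zip(n_trials, y_hat)) / n

# Trial-weighted variance: each group contributes n_j copies of
# its squared deviation from the weighted mean.
var_hat = sum(
    nj * (yj - mu_hat) ** 2
    for nj, yj in zip(n_trials, y_hat)
) / (n - 1)

print(mu_hat, var_hat)
```

Weighting by $n_j$ is equivalent to repeating each $\hat{y}_j$ once per underlying Bernoulli trial, mirroring the treatment of the observations above.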