Proof that the Mean of Bernoulli Samples is an Unbiased Estimator for the Bernoulli Parameter
Consider $m$ samples $x^{(1)}, \ldots, x^{(m)}$ drawn independently and identically from a Bernoulli distribution with parameter $\theta$, and the estimator $\hat{\theta}_m = \frac{1}{m}\sum_{i=1}^{m} x^{(i)}$. Then

$$\text{bias}\left(\hat{\theta}_m\right) = \mathbb{E}\left[\hat{\theta}_m\right] - \theta \\
= \mathbb{E}\left[\frac{1}{m} \sum_{i=1}^{m}x^{(i)}\right] - \theta \\
= \frac{1}{m} \sum_{i=1}^{m}\mathbb{E}\left[x^{(i)}\right] - \theta \\
= \frac{1}{m} \sum_{i=1}^{m}\sum_{x^{(i)} = 0}^{1}\left( x^{(i)}\theta^{x^{(i)}}(1-\theta)^{(1-x^{(i)})}\right) - \theta \\
= \frac{1}{m} \sum_{i=1}^{m}(\theta) - \theta \\
= \theta - \theta = 0$$
Reasons that the steps above work: The first line is the definition of bias. The second plugs in our estimator. The third holds because the expected value of a sum is the sum of the expected values (linearity of expectation). The fourth applies the definition of the expected value of a discrete random variable: each value times the probability of that value, which for a Bernoulli random variable is $\theta$ if $x = 1$ and $1 - \theta$ if $x = 0$, summarized as $P(x;\theta) = \theta^x(1-\theta)^{1-x}$. The fifth follows from evaluating the inner sum: the $x^{(i)} = 0$ term contributes $0$ and the $x^{(i)} = 1$ term contributes $\theta$, and because the samples are identically distributed, each of the $m$ terms equals $\theta$ (we also assumed the samples are independent, but that isn't necessary: the same derivation holds for dependent random variables, since linearity of expectation does not require independence). Finally, the sixth line recognizes that the sum of $m$ copies of $\theta$ is $m\theta$, and multiplying by $\frac{1}{m}$ gives $\theta$.
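As a sanity check on the algebra, here is a minimal simulation sketch (assuming NumPy; the values $\theta = 0.3$, $m = 50$, and the trial count are arbitrary choices for the demo). It approximates $\mathbb{E}\left[\hat{\theta}_m\right]$ by averaging the sample mean over many independent batches of Bernoulli draws, so the estimated bias should come out near zero.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

theta = 0.3          # true Bernoulli parameter (arbitrary demo value)
m = 50               # samples per estimate
n_trials = 100_000   # independent estimates used to approximate the expectation

# Each row is one batch of m Bernoulli(theta) samples; the row mean is
# one realization of the estimator theta_hat_m.
samples = rng.binomial(n=1, p=theta, size=(n_trials, m))
theta_hat = samples.mean(axis=1)

# bias = E[theta_hat] - theta; averaging over many trials approximates E[.]
estimated_bias = theta_hat.mean() - theta
print(f"estimated bias: {estimated_bias:+.6f}")  # should be close to 0
```

Any residual bias the script prints is Monte Carlo noise, which shrinks as n_trials grows, consistent with the exact bias of zero derived above.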
Updated 2021-05-23
Tags
Data Science