---
title: "Fisher information"
chunk: 3/8
source: "https://en.wikipedia.org/wiki/Fisher_information"
category: "reference"
tags: "science, encyclopedia"
date_saved: "2026-05-05T09:50:15.726073+00:00"
instance: "kb-cron"
---

In other words, the precision to which we can estimate θ is fundamentally limited by the Fisher information of the likelihood function.

Alternatively, the same conclusion can be obtained directly from the Cauchy–Schwarz inequality for random variables, $|\operatorname{Cov}(A,B)|^{2}\leq \operatorname{Var}(A)\operatorname{Var}(B)$, applied to the random variables $\hat{\theta}(X)$ and $\partial_{\theta}\log f(X;\theta)$, and observing that for unbiased estimators we have

$$\operatorname{Cov}[\hat{\theta}(X),\partial_{\theta}\log f(X;\theta)]=\int \hat{\theta}(x)\,\partial_{\theta}f(x;\theta)\,dx=\partial_{\theta}\operatorname{E}[\hat{\theta}]=1.$$

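The last step of this argument can be written out as a short derivation. It is a sketch that assumes the usual regularity conditions, under which the score $\partial_{\theta}\log f(X;\theta)$ has zero mean, so that its variance equals the Fisher information $\mathcal{I}(\theta)$:

$$
\begin{aligned}
1=\left|\operatorname{Cov}[\hat{\theta}(X),\partial_{\theta}\log f(X;\theta)]\right|^{2}
&\leq \operatorname{Var}[\hat{\theta}(X)]\,\operatorname{Var}[\partial_{\theta}\log f(X;\theta)]\\
&=\operatorname{Var}[\hat{\theta}(X)]\,\mathcal{I}(\theta),
\end{aligned}
$$

and therefore $\operatorname{Var}[\hat{\theta}(X)]\geq 1/\mathcal{I}(\theta)$.
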
== Examples ==

=== Single-parameter Bernoulli experiment ===

A Bernoulli trial is a random variable with two possible outcomes, 0 and 1, with 1 having a probability of θ. The outcome can be thought of as determined by the toss of a biased coin, with the probability of heads (1) being θ and the probability of tails (0) being 1 − θ.

Let X be a single Bernoulli trial, that is, one sample from this distribution. The Fisher information contained in X is:

$$
\begin{aligned}
\mathcal{I}(\theta)&=-\operatorname{E}\left[\left.\frac{\partial^{2}}{\partial\theta^{2}}\log\left(\theta^{X}(1-\theta)^{1-X}\right)\,\right|\,\theta\right]\\[5pt]
&=-\operatorname{E}\left[\left.\frac{\partial^{2}}{\partial\theta^{2}}\left(X\log\theta+(1-X)\log(1-\theta)\right)\,\right|\,\theta\right]\\[5pt]
&=\operatorname{E}\left[\left.\frac{X}{\theta^{2}}+\frac{1-X}{(1-\theta)^{2}}\,\right|\,\theta\right]\\[5pt]
&=\frac{\theta}{\theta^{2}}+\frac{1-\theta}{(1-\theta)^{2}}\\[5pt]
&=\frac{1}{\theta(1-\theta)}.
\end{aligned}
$$

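As a numerical sanity check, the following minimal Python sketch evaluates the Fisher information of one Bernoulli trial in two ways: as the negative expected second derivative of the log-likelihood (the form used in the derivation) and as the expected squared score (an equivalent form under the usual regularity conditions). The function name `bernoulli_fisher_information` is hypothetical and chosen only for illustration.

```python
import math


def bernoulli_fisher_information(theta: float) -> float:
    """Fisher information of a single Bernoulli(theta) trial.

    Computed as -E[d^2/dtheta^2 log f(X; theta)] by summing over x in {0, 1},
    and cross-checked against E[(d/dtheta log f(X; theta))^2].
    """
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")

    neg_expected_hessian = 0.0
    expected_squared_score = 0.0
    for x in (0, 1):
        prob = theta if x == 1 else 1.0 - theta
        # log f = x*log(theta) + (1 - x)*log(1 - theta)
        score = x / theta - (1 - x) / (1.0 - theta)
        second_derivative = -x / theta**2 - (1 - x) / (1.0 - theta) ** 2
        neg_expected_hessian += prob * (-second_derivative)
        expected_squared_score += prob * score**2

    # Both definitions should agree for this model.
    assert math.isclose(neg_expected_hessian, expected_squared_score)
    return neg_expected_hessian


for theta in (0.1, 0.5, 0.9):
    closed_form = 1.0 / (theta * (1.0 - theta))
    print(theta, bernoulli_fisher_information(theta), closed_form)
```

The information is smallest at θ = 0.5 and grows without bound as θ approaches 0 or 1, which matches the closed form 1/(θ(1 − θ)).
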
Because Fisher information is additive, the Fisher information contained in n independent Bernoulli trials is therefore

$$\mathcal{I}(\theta)=\frac{n}{\theta(1-\theta)}.$$

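The additivity step can be made explicit. The sketch below assumes the n trials are independent and share the same θ; the symbols $X_1,\dots,X_n$ and $\mathcal{I}_n$ are introduced here only for illustration. Because the joint log-likelihood is a sum over trials, the second derivative and the expectation act on each term separately:

$$
\begin{aligned}
\mathcal{I}_n(\theta)&=-\operatorname{E}\left[\left.\frac{\partial^{2}}{\partial\theta^{2}}\sum_{j=1}^{n}\log\left(\theta^{X_j}(1-\theta)^{1-X_j}\right)\,\right|\,\theta\right]\\[5pt]
&=\sum_{j=1}^{n}\frac{1}{\theta(1-\theta)}=\frac{n}{\theta(1-\theta)}.
\end{aligned}
$$
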
If $x_i$ is one of the $2^{n}$ possible outcomes of n independent Bernoulli trials and $x_{ij}$ is the result of the j th trial in the i th outcome, then the probability of $x_i$ is given by

$$p(x_i,\theta)=\prod_{j=1}^{n}\theta^{x_{ij}}(1-\theta)^{1-x_{ij}}.$$

The sample mean of the i th outcome is $\mu_i=(1/n)\sum_{j=1}^{n}x_{ij}$. The expected value of the sample mean (over the sampling distribution) is

$$E(\mu)=\sum_{x_i}\mu_i\,p(x_i,\theta)=\theta,$$

where the sum is over all $2^{n}$ possible outcomes. The expected value of the square of the sample mean is

$$E(\mu^{2})=\sum_{x_i}\mu_i^{2}\,p(x_i,\theta)=\frac{(1+(n-1)\theta)\theta}{n},$$

so the variance of the sample mean is

$$E(\mu^{2})-E(\mu)^{2}=\frac{\theta(1-\theta)}{n}.$$

The Fisher information is thus the reciprocal of the variance of the mean number of successes in n Bernoulli trials. This is generally true. In this case, the Cramér–Rao bound is an equality.
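
The sums over all $2^{n}$ outcomes can be checked by brute force for small n. The following Python sketch enumerates every outcome sequence, weights it by $\theta^{k}(1-\theta)^{n-k}$ (with k the number of successes), and compares the resulting moments of the sample mean with the closed forms above. The function name `sample_mean_moments` and the values of `n` and `theta` are arbitrary choices for illustration.

```python
import itertools
import math


def sample_mean_moments(n: int, theta: float) -> tuple[float, float]:
    """Return (E[mu], E[mu^2]) by summing over all 2**n outcome sequences."""
    first_moment = 0.0
    second_moment = 0.0
    for outcome in itertools.product((0, 1), repeat=n):
        successes = sum(outcome)
        # Probability of this exact sequence of n independent trials.
        prob = theta**successes * (1.0 - theta) ** (n - successes)
        mu = successes / n
        first_moment += mu * prob
        second_moment += mu**2 * prob
    return first_moment, second_moment


n, theta = 6, 0.3
mean, mean_sq = sample_mean_moments(n, theta)
variance = mean_sq - mean**2
fisher_n = n / (theta * (1.0 - theta))

assert math.isclose(mean, theta)
assert math.isclose(mean_sq, (1 + (n - 1) * theta) * theta / n)
# The variance of the sample mean equals 1 / I(theta) for n trials.
assert math.isclose(variance, 1.0 / fisher_n)
print(variance, 1.0 / fisher_n)
```

The enumeration costs $2^{n}$ steps, so it is only practical as a check for small n; the closed forms hold for any n.
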