| title | chunk | source | category | tags | date_saved | instance |
| --- | --- | --- | --- | --- | --- | --- |
| Fisher information | 5/8 | https://en.wikipedia.org/wiki/Fisher_information | reference | science, encyclopedia | 2026-05-05T09:50:15.726073+00:00 | kb-cron |

=== Multivariate normal distribution ===

The FIM for an N-variate multivariate normal distribution, $X \sim N\left(\mu(\theta), \Sigma(\theta)\right)$, has a special form. Let the K-dimensional vector of parameters be $\theta = \begin{bmatrix}\theta_1 & \dots & \theta_K\end{bmatrix}^{\textsf{T}}$ and the vector of normal random variables be $X = \begin{bmatrix}X_1 & \dots & X_N\end{bmatrix}^{\textsf{T}}$. Assume that the mean values of these random variables are $\mu(\theta) = \begin{bmatrix}\mu_1(\theta) & \dots & \mu_N(\theta)\end{bmatrix}^{\textsf{T}}$, and let $\Sigma(\theta)$ be the covariance matrix. Then, for $1 \leq m, n \leq K$, the (m, n) entry of the FIM is:

$$\mathcal{I}_{m,n} = \frac{\partial \mu^{\textsf{T}}}{\partial \theta_m} \Sigma^{-1} \frac{\partial \mu}{\partial \theta_n} + \frac{1}{2} \operatorname{tr}\left(\Sigma^{-1} \frac{\partial \Sigma}{\partial \theta_m} \Sigma^{-1} \frac{\partial \Sigma}{\partial \theta_n}\right),$$

where $(\cdot)^{\textsf{T}}$ denotes the transpose of a vector, $\operatorname{tr}(\cdot)$ denotes the trace of a square matrix, and:

$$\begin{aligned}
\frac{\partial \mu}{\partial \theta_m} &= \begin{bmatrix} \dfrac{\partial \mu_1}{\partial \theta_m} & \dfrac{\partial \mu_2}{\partial \theta_m} & \cdots & \dfrac{\partial \mu_N}{\partial \theta_m} \end{bmatrix}^{\textsf{T}}; \\[8pt]
\frac{\partial \Sigma}{\partial \theta_m} &= \begin{bmatrix}
\dfrac{\partial \Sigma_{1,1}}{\partial \theta_m} & \dfrac{\partial \Sigma_{1,2}}{\partial \theta_m} & \cdots & \dfrac{\partial \Sigma_{1,N}}{\partial \theta_m} \\[5pt]
\dfrac{\partial \Sigma_{2,1}}{\partial \theta_m} & \dfrac{\partial \Sigma_{2,2}}{\partial \theta_m} & \cdots & \dfrac{\partial \Sigma_{2,N}}{\partial \theta_m} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial \Sigma_{N,1}}{\partial \theta_m} & \dfrac{\partial \Sigma_{N,2}}{\partial \theta_m} & \cdots & \dfrac{\partial \Sigma_{N,N}}{\partial \theta_m}
\end{bmatrix}.
\end{aligned}$$
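The entry-wise formula above can be evaluated numerically. The following is a minimal sketch, not a reference implementation: it assumes NumPy, and the names `gaussian_fim`, `mu_fn`, `sigma_fn` and the step size `eps` are hypothetical. The derivatives of the mean and covariance are approximated by central finite differences, which is an illustrative choice rather than part of the formula.

```python
import numpy as np


def gaussian_fim(mu_fn, sigma_fn, theta, eps=1e-6):
    """Fisher information matrix of X ~ N(mu(theta), Sigma(theta)).

    mu_fn and sigma_fn are hypothetical user-supplied callables returning
    the N-vector mean and the N x N covariance matrix. Their partial
    derivatives are approximated by central finite differences (step eps).
    """
    theta = np.asarray(theta, dtype=float)
    K = theta.size
    sigma_inv = np.linalg.inv(sigma_fn(theta))

    # d_mu[m] approximates d mu / d theta_m, d_sigma[m] approximates d Sigma / d theta_m.
    d_mu, d_sigma = [], []
    for m in range(K):
        step = np.zeros(K)
        step[m] = eps
        d_mu.append((mu_fn(theta + step) - mu_fn(theta - step)) / (2 * eps))
        d_sigma.append((sigma_fn(theta + step) - sigma_fn(theta - step)) / (2 * eps))

    fim = np.empty((K, K))
    for m in range(K):
        for n in range(K):
            mean_term = d_mu[m] @ sigma_inv @ d_mu[n]
            cov_term = 0.5 * np.trace(sigma_inv @ d_sigma[m] @ sigma_inv @ d_sigma[n])
            fim[m, n] = mean_term + cov_term
    return fim


# Illustrative model: N independent components sharing one mean and one
# variance, with theta = (mean, variance).
N = 3


def mu_fn(t):
    return np.full(N, t[0])


def sigma_fn(t):
    return t[1] * np.eye(N)


fim = gaussian_fim(mu_fn, sigma_fn, [1.0, 2.0])
print(fim)
```

For this model the mean term contributes N/σ² to the (1, 1) entry and the trace term contributes N/(2σ⁴) to the (2, 2) entry, so with N = 3 and σ² = 2 the result should be close to diag(1.5, 0.375).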

Note that a special, but very common, case is the one where $\Sigma(\theta) = \Sigma$, a constant. Then

$$\mathcal{I}_{m,n} = \frac{\partial \mu^{\textsf{T}}}{\partial \theta_m} \Sigma^{-1} \frac{\partial \mu}{\partial \theta_n}.$$

In this case the Fisher information matrix may be identified with the coefficient matrix of the normal equations of least squares estimation theory. Another special case occurs when the mean and covariance depend on two different vector parameters, say, β and θ. This is especially popular in the analysis of spatial data, which often uses a linear model with correlated residuals. In this case,

$$\mathcal{I}(\beta, \theta) = \operatorname{diag}\left(\mathcal{I}(\beta), \mathcal{I}(\theta)\right)$$

where

$$\begin{aligned}
\mathcal{I}(\beta)_{m,n} &= \frac{\partial \mu^{\textsf{T}}}{\partial \beta_m} \Sigma^{-1} \frac{\partial \mu}{\partial \beta_n}, \\[5pt]
\mathcal{I}(\theta)_{m,n} &= \frac{1}{2} \operatorname{tr}\left(\Sigma^{-1} \frac{\partial \Sigma}{\partial \theta_m} \Sigma^{-1} \frac{\partial \Sigma}{\partial \theta_n}\right)
\end{aligned}$$
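The block-diagonal structure can be checked on a small example. The sketch below assumes NumPy and a hypothetical one-dimensional "spatial" setup: a linear mean μ = Aβ with design matrix `A`, and an exponential covariance Σ(θ) = θ₁ exp(−D/θ₂), where `D` holds pairwise distances. The site coordinates, parameter values and the covariance family are illustrative assumptions, not taken from the text.

```python
import numpy as np


def fim_entry(d_mu_m, d_mu_n, d_sig_m, d_sig_n, sigma_inv):
    """One entry of the Gaussian FIM: mean term plus half-trace covariance term."""
    mean_term = d_mu_m @ sigma_inv @ d_mu_n
    cov_term = 0.5 * np.trace(sigma_inv @ d_sig_m @ sigma_inv @ d_sig_n)
    return mean_term + cov_term


# Hypothetical setup: four sites on a line, intercept-plus-trend mean.
coords = np.array([0.0, 1.0, 2.5, 4.0])
D = np.abs(coords[:, None] - coords[None, :])  # pairwise distances
A = np.column_stack([np.ones(4), coords])      # design matrix, mu = A @ beta

# Covariance parameters theta = (sill, corr_range); values are arbitrary.
sill, corr_range = 2.0, 1.5
R = np.exp(-D / corr_range)
Sigma = sill * R
Sigma_inv = np.linalg.inv(Sigma)

# Derivatives with respect to the stacked parameter (beta_1, beta_2, theta_1, theta_2).
# The mean does not depend on theta and the covariance does not depend on beta.
zero_vec = np.zeros(4)
zero_mat = np.zeros((4, 4))
d_mu = [A[:, 0], A[:, 1], zero_vec, zero_vec]
d_sigma = [zero_mat, zero_mat, R, sill * R * D / corr_range**2]

P = 4
fim = np.array([
    [fim_entry(d_mu[m], d_mu[n], d_sigma[m], d_sigma[n], Sigma_inv) for n in range(P)]
    for m in range(P)
])

print(np.round(fim, 4))
```

The off-diagonal 2 × 2 blocks vanish because every cross term multiplies a zero derivative, and the β block equals `A.T @ Sigma_inv @ A`, the coefficient matrix of the generalized least squares normal equations for this design.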

== Properties ==

=== Chain rule ===

Similar to the entropy or mutual information, the Fisher information also possesses a chain rule decomposition. In particular, if X and Y are jointly distributed random variables, it follows that:

$$\mathcal{I}_{X,Y}(\theta) = \mathcal{I}_X(\theta) + \mathcal{I}_{Y\mid X}(\theta),$$

where $\mathcal{I}_{Y\mid X}(\theta) = \operatorname{E}_X\left[\mathcal{I}_{Y\mid X=x}(\theta)\right]$ and $\mathcal{I}_{Y\mid X=x}(\theta)$ is the Fisher information of Y relative to $\theta$ calculated with respect to the conditional density of Y given a specific value X = x.

As a special case, if the two random variables are independent, the information yielded by the two random variables is the sum of the information from each random variable separately:

$$\mathcal{I}_{X,Y}(\theta) = \mathcal{I}_X(\theta) + \mathcal{I}_Y(\theta).$$

Consequently, the information in a random sample of n independent and identically distributed observations is n times the information in a sample of size 1.
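The chain rule decomposition can be verified exactly on a small discrete model. The sketch below uses a hypothetical pair of dependent binary variables: X ~ Bernoulli(θ), and Y given X = x is Bernoulli(θ) when x = 1 and Bernoulli(θ/2) when x = 0. The model and the function names are assumptions chosen for illustration. The joint information is computed by summing (∂p/∂θ)²/p over the four outcomes, with the derivative approximated by a central finite difference.

```python
def bernoulli_info(q, dq):
    """Fisher information of Bernoulli(q(theta)), given q and dq/dtheta."""
    return dq * dq / (q * (1.0 - q))


def joint_pmf(x, y, theta):
    """p(x, y; theta) for the hypothetical dependent pair described above."""
    px = theta if x == 1 else 1.0 - theta
    q = theta if x == 1 else theta / 2.0  # success probability of Y given X = x
    py = q if y == 1 else 1.0 - q
    return px * py


def joint_info(theta, eps=1e-6):
    """Fisher information of (X, Y): sum over outcomes of (dp/dtheta)^2 / p."""
    total = 0.0
    for x in (0, 1):
        for y in (0, 1):
            p = joint_pmf(x, y, theta)
            dp = (joint_pmf(x, y, theta + eps) - joint_pmf(x, y, theta - eps)) / (2 * eps)
            total += dp * dp / p
    return total


theta = 0.3

# Information in X alone.
info_x = bernoulli_info(theta, 1.0)

# Conditional information: average over X of the information in Y given X = x.
info_y_given_x = (
    theta * bernoulli_info(theta, 1.0)
    + (1.0 - theta) * bernoulli_info(theta / 2.0, 0.5)
)

lhs = joint_info(theta)
rhs = info_x + info_y_given_x
print(lhs, rhs)
```

The two printed values should agree up to floating-point error. Replacing the conditional law of Y with one that does not depend on x reduces the conditional term to the marginal information of Y, which is the additive special case for independent variables.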

=== f-divergence ===