kb/Asymptotic_equipartition_property-0.md at 79d4443fb67a3691f7468a0736da27af6d15ae76

turtle89431 42b33219b4 Scrape wikipedia-science: 16984 new, 4210 updated, 21749 total (kb-cron)

2026-05-05 07:42:35 -07:00


title	chunk	source	category	tags	date_saved	instance
Asymptotic equipartition property	1/3	https://en.wikipedia.org/wiki/Asymptotic_equipartition_property	reference	science, encyclopedia	2026-05-05T14:39:53.761478+00:00	kb-cron

In information theory, the asymptotic equipartition property (AEP) is a general property of the output samples of a stochastic source. It is fundamental to the concept of the typical set used in theories of data compression.

Roughly speaking, the theorem states that although a random process may produce many different series of results, the one actually produced is most probably from a loosely defined set of outcomes that all have approximately the same chance of being the one actually realized. (This is a consequence of the law of large numbers and ergodic theory.) Although there are individual outcomes that have a higher probability than any outcome in this set, the vast number of outcomes in the set almost guarantees that the outcome will come from the set. One way of intuitively understanding the property is through Cramér's large deviation theorem, which states that the probability of a large deviation from the mean decays exponentially with the number of samples. Such results are studied in large deviations theory; intuitively, it is the large deviations that would violate equipartition, but these are unlikely.

In the field of pseudorandom number generation, a candidate generator of undetermined quality whose output sequence lies too far outside the typical set by some statistical criteria is rejected as insufficiently random. Thus, although the typical set is loosely defined, practical notions arise concerning sufficient typicality.
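The claim that a set of roughly equiprobable outcomes carries almost all of the probability, even though it excludes the single most probable outcome, can be checked exactly for a small case. The sketch below assumes an i.i.d. Bernoulli source and uses the common "weakly typical" criterion (sample log-probability rate within `eps` of the entropy); the function name `bernoulli_typical_set` and the parameters `p`, `n`, and `eps` are illustrative choices, not part of the article.

```python
import math


def bernoulli_typical_set(p, n, eps):
    """Exact entropy, probability mass, and size of the weakly typical set
    for n i.i.d. Bernoulli(p) bits (assumes 0 < p < 1)."""
    h = -p * math.log2(p) - (1 - p) * math.log2(1 - p)  # bits per symbol
    mass = 0.0
    size = 0
    for k in range(n + 1):  # k = number of ones in the sequence
        # Every sequence with exactly k ones has the same log-probability.
        log2_prob = k * math.log2(p) + (n - k) * math.log2(1 - p)
        if abs(-log2_prob / n - h) <= eps:
            count = math.comb(n, k)  # number of such sequences
            size += count
            # Work in the log domain to avoid overflow and underflow.
            mass += 2.0 ** (math.log2(count) + log2_prob)
    return h, mass, size


if __name__ == "__main__":
    p, n, eps = 0.3, 1000, 0.1
    h, mass, size = bernoulli_typical_set(p, n, eps)
    print("entropy (bits/symbol):", h)
    print("probability of the typical set:", mass)
    print("log2 of the fraction of all sequences that are typical:",
          math.log2(size) - n)
    # The all-zeros sequence is the single most probable outcome for p < 0.5,
    # yet its log-probability rate is too far from h to be typical.
    print("all-zeros sequence is typical:",
          abs(-math.log2(1 - p) - h) <= eps)
```

Grouping sequences by their number of ones keeps the computation exact and cheap, since the loop runs over `n + 1` classes instead of all `2**n` sequences.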

== Definition ==

Given a discrete-time stationary ergodic stochastic process $X$ on the probability space $(\Omega, B, p)$, the asymptotic equipartition property is an assertion that, almost surely,

$$-\frac{1}{n}\log p(X_1, X_2, \dots, X_n) \to H(X) \quad \text{as} \quad n \to \infty$$

where $H(X)$, or simply $H$, denotes the entropy rate of $X$, which must exist for all discrete-time stationary processes, including the ergodic ones. The asymptotic equipartition property is proved for finite-valued (i.e. $|\Omega| < \infty$) stationary ergodic stochastic processes in the Shannon–McMillan–Breiman theorem using ergodic theory, and for any i.i.d. source directly using the law of large numbers, in both the discrete-valued case (where $H$ is simply the entropy of a symbol) and the continuous-valued case (where $H$ is the differential entropy instead). The definition of the asymptotic equipartition property can also be extended to certain classes of continuous-time stochastic processes for which a typical set exists for a long enough observation time. The convergence is proven to be almost sure in all cases.

== Discrete-time i.i.d. sources ==

Given that $X$ is an i.i.d. source taking values in the alphabet $\mathcal{X}$, its time series $X_1, \ldots, X_n$ is i.i.d. with entropy $H(X)$. The weak law of large numbers gives the asymptotic equipartition property with convergence in probability,

$$\lim_{n\to\infty}\Pr\left[\left|-\frac{1}{n}\log p(X_1, X_2, \ldots, X_n) - H(X)\right| > \varepsilon\right] = 0 \qquad \forall \varepsilon > 0,$$

since the entropy is equal to the expectation of $-\frac{1}{n}\log p(X_1, X_2, \ldots, X_n)$.
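The step behind this statement can be written out as a short derivation. It assumes a discrete alphabet and, for brevity, uses $p$ for both the joint and the marginal probability mass function; that notational shortcut is an assumption made here.

```latex
\begin{align*}
-\frac{1}{n}\log p(X_1, X_2, \ldots, X_n)
  &= -\frac{1}{n}\log \prod_{i=1}^{n} p(X_i)
   = \frac{1}{n}\sum_{i=1}^{n} \bigl(-\log p(X_i)\bigr), \\
\mathbb{E}\bigl[-\log p(X_i)\bigr]
  &= -\sum_{x \in \mathcal{X}} p(x)\log p(x) = H(X).
\end{align*}
```

The first line uses independence to factor the joint probability, so the quantity of interest is a sample mean of i.i.d. terms whose common expectation is $H(X)$, which is exactly the setting of the law of large numbers.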

The strong law of large numbers asserts the stronger almost sure convergence,

$$\Pr\left[\lim_{n\to\infty} -\frac{1}{n}\log p(X_1, X_2, \ldots, X_n) = H(X)\right] = 1.$$

Convergence also holds in the sense of $L^1$, that is, in mean:

$$\lim_{n\to\infty}\mathbb{E}\left[\left|-\frac{1}{n}\log p(X_1, X_2, \ldots, X_n) - H(X)\right|\right] = 0.$$
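The convergence described in this section can be observed numerically. The following is a minimal sketch, assuming a finite alphabet whose symbols are indexed `0..len(probs)-1` and logarithms taken to base 2; the helper names `entropy` and `sample_entropy_rate` are hypothetical and chosen only for this illustration.

```python
import math
import random


def entropy(probs):
    """Shannon entropy in bits of a finite probability vector."""
    return -sum(q * math.log2(q) for q in probs if q > 0)


def sample_entropy_rate(probs, n, rng):
    """Return -(1/n) log2 p(X_1, ..., X_n) for one i.i.d. sample path."""
    symbols = range(len(probs))
    path = rng.choices(symbols, weights=probs, k=n)
    # Independence turns the joint log-probability into a sum of
    # per-symbol log-probabilities.
    return -sum(math.log2(probs[x]) for x in path) / n


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    probs = [0.5, 0.25, 0.125, 0.125]
    print("entropy:", entropy(probs))
    for n in (10, 100, 1_000, 10_000, 100_000):
        print(n, sample_entropy_rate(probs, n, rng))
```

Each printed value is a single realization, so short paths can sit noticeably away from the entropy, while longer paths should cluster around it.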

== Discrete-time finite-valued stationary ergodic sources ==

Consider a finite-valued sample space $\Omega$, i.e. $|\Omega| < \infty$, for the discrete-time stationary ergodic process $X := \{X_n\}$ defined on the probability space $(\Omega, B, p)$. The Shannon–McMillan–Breiman theorem, due to Claude Shannon, Brockway McMillan, and Leo Breiman, states that we have convergence in the sense of $L^1$. Chung Kai-lai generalized this to the case where $X$ may take values in a countably infinite set, provided that the entropy rate is still finite.
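A source with memory gives a concrete picture of the stationary ergodic case. The sketch below assumes a two-state Markov chain with transition probabilities `a` (from state 0 to 1) and `b` (from state 1 to 0), both strictly between 0 and 1, started from its stationary distribution; for such a chain the entropy rate is the stationary average of the per-state transition entropies. All function names are hypothetical and serve only this illustration.

```python
import math
import random


def stationary_two_state(a, b):
    """Stationary distribution of the chain with P(0->1)=a and P(1->0)=b."""
    return (b / (a + b), a / (a + b))


def binary_entropy(q):
    """Entropy in bits of a Bernoulli(q) variable, for 0 < q < 1."""
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)


def entropy_rate(a, b):
    """Entropy rate in bits per symbol of the stationary two-state chain."""
    pi0, pi1 = stationary_two_state(a, b)
    return pi0 * binary_entropy(a) + pi1 * binary_entropy(b)


def sample_log_likelihood_rate(a, b, n, rng):
    """Return -(1/n) log2 p(X_1, ..., X_n) along one simulated path."""
    pi = stationary_two_state(a, b)
    trans = ((1 - a, a), (b, 1 - b))  # trans[i][j] = P(next = j | current = i)
    state = 0 if rng.random() < pi[0] else 1
    log2_p = math.log2(pi[state])  # probability of the initial symbol
    for _ in range(n - 1):
        nxt = 0 if rng.random() < trans[state][0] else 1
        log2_p += math.log2(trans[state][nxt])  # chain rule for the joint law
        state = nxt
    return -log2_p / n


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so the run is repeatable
    a, b = 0.1, 0.4
    print("entropy rate:", entropy_rate(a, b))
    for n in (100, 10_000, 100_000):
        print(n, sample_log_likelihood_rate(a, b, n, rng))
```

Here the symbols are dependent, so the sample rate is compared against the entropy rate of the process, not against the entropy of a single symbol.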

== Non-stationary discrete-time source producing independent symbols ==

The assumptions of stationarity, ergodicity, and identical distribution of the random variables are not essential for the asymptotic equipartition property to hold. Indeed, as is quite clear intuitively, the asymptotic equipartition property requires only some form of the law of large numbers to hold, which is fairly general. However, the expression needs to be suitably generalized, and the conditions need to be formulated precisely.

Consider a source that produces independent symbols, possibly with different output statistics at each instant, for which the statistics of the process are known completely, that is, the marginal distribution of the process seen at each time instant is known. The joint distribution is just the product of the marginals. Then, under the condition (which can be relaxed) that $\mathrm{Var}[\log p(X_i)] < M$ for all $i$, for some $M > 0$, the following holds (AEP):
