| title | chunk | source | category | tags | date_saved | instance |
|---|---|---|---|---|---|---|
| Data processing inequality | 1/1 | https://en.wikipedia.org/wiki/Data_processing_inequality | reference | science, encyclopedia | 2026-05-05T11:32:29.558623+00:00 | kb-cron |
The data processing inequality is an information-theoretic concept stating that the information content of a signal cannot be increased via a local physical operation. This can be expressed concisely as 'post-processing cannot increase information'.
== Statement ==
Let three random variables form the Markov chain $X \rightarrow Y \rightarrow Z$, implying that the conditional distribution of $Z$ depends only on $Y$ and that $Z$ is conditionally independent of $X$ given $Y$. Specifically, we have such a Markov chain if the joint probability mass function can be written as

$$p(x,y,z) = p(x)\,p(y|x)\,p(z|y) = p(y)\,p(x|y)\,p(z|y)$$
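As a small illustration of this factorization, the sketch below builds a joint pmf from hypothetical values of p(x), p(y|x) and p(z|y), then checks numerically that p(y) p(x|y) p(z|y) reproduces the same numbers. The alphabets, the probabilities, and the function names `chain_joint` and `second_factorization_gap` are assumptions made for this example only.

```python
# Hypothetical binary example: every number here is an illustrative assumption.
P_X = {0: 0.3, 1: 0.7}                       # p(x)
P_Y_GIVEN_X = {0: {0: 0.9, 1: 0.1},          # p(y|x), indexed as [x][y]
               1: {0: 0.2, 1: 0.8}}
P_Z_GIVEN_Y = {0: {0: 0.6, 1: 0.4},          # p(z|y), indexed as [y][z]
               1: {0: 0.25, 1: 0.75}}


def chain_joint(p_x, p_y_given_x, p_z_given_y):
    """Return p(x, y, z) = p(x) p(y|x) p(z|y) as a dict keyed by (x, y, z)."""
    joint = {}
    for x, px in p_x.items():
        for y, pyx in p_y_given_x[x].items():
            for z, pzy in p_z_given_y[y].items():
                joint[(x, y, z)] = px * pyx * pzy
    return joint


def second_factorization_gap(joint):
    """Largest deviation of p(x, y, z) from p(y) p(x|y) p(z|y)."""
    p_y, p_xy, p_yz = {}, {}, {}
    for (x, y, z), p in joint.items():
        p_y[y] = p_y.get(y, 0.0) + p
        p_xy[(x, y)] = p_xy.get((x, y), 0.0) + p
        p_yz[(y, z)] = p_yz.get((y, z), 0.0) + p
    return max(
        abs(p - p_y[y] * (p_xy[(x, y)] / p_y[y]) * (p_yz[(y, z)] / p_y[y]))
        for (x, y, z), p in joint.items()
    )


joint = chain_joint(P_X, P_Y_GIVEN_X, P_Z_GIVEN_Y)
gap = second_factorization_gap(joint)
print("total probability:", sum(joint.values()))
print("max gap between the two factorizations:", gap)
```

The gap should be zero up to floating-point rounding, since both products describe the same joint distribution whenever p(y) > 0.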
In this setting, no processing of $Y$, deterministic or random, can increase the information that $Y$ contains about $X$. Using the mutual information, this can be written as:

$$I(X;Y) \geqslant I(X;Z),$$
with the equality $I(X;Y) = I(X;Z)$ if and only if $I(X;Y \mid Z) = 0$. That is, $Z$ and $Y$ contain the same information about $X$, and $X \rightarrow Z \rightarrow Y$ also forms a Markov chain.
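The inequality and its equality condition can be seen on a toy example. The sketch below assumes a hypothetical binary source X, a three-symbol observation Y, and two deterministic post-processing maps Z = f(Y): an invertible relabeling and a lossy map that merges two symbols. All probabilities and names (`analyse`, `relabel`, `merge`) are made up for illustration.

```python
from math import log2


def mutual_information(p_ab):
    """I(A;B) in bits from a joint pmf given as {(a, b): probability}."""
    p_a, p_b = {}, {}
    for (a, b), p in p_ab.items():
        p_a[a] = p_a.get(a, 0.0) + p
        p_b[b] = p_b.get(b, 0.0) + p
    return sum(p * log2(p / (p_a[a] * p_b[b]))
               for (a, b), p in p_ab.items() if p > 0)


def conditional_mi_xy_given_z(p_xyz):
    """I(X;Y|Z) in bits from a joint pmf given as {(x, y, z): probability}."""
    p_z, p_xz, p_yz = {}, {}, {}
    for (x, y, z), p in p_xyz.items():
        p_z[z] = p_z.get(z, 0.0) + p
        p_xz[(x, z)] = p_xz.get((x, z), 0.0) + p
        p_yz[(y, z)] = p_yz.get((y, z), 0.0) + p
    return sum(p * log2(p * p_z[z] / (p_xz[(x, z)] * p_yz[(y, z)]))
               for (x, y, z), p in p_xyz.items() if p > 0)


# Hypothetical source and channel (assumed values).
P_X = {0: 0.5, 1: 0.5}
P_Y_GIVEN_X = {0: {"a": 0.7, "b": 0.2, "c": 0.1},
               1: {"a": 0.1, "b": 0.2, "c": 0.7}}


def analyse(f):
    """Return (I(X;Y), I(X;Z), I(X;Y|Z)) for the post-processing Z = f(Y)."""
    p_xy, p_xz, p_xyz = {}, {}, {}
    for x, px in P_X.items():
        for y, pyx in P_Y_GIVEN_X[x].items():
            p, z = px * pyx, f(y)
            p_xy[(x, y)] = p
            p_xz[(x, z)] = p_xz.get((x, z), 0.0) + p
            p_xyz[(x, y, z)] = p
    return (mutual_information(p_xy), mutual_information(p_xz),
            conditional_mi_xy_given_z(p_xyz))


relabel = {"a": 2, "b": 1, "c": 0}.get   # invertible map: nothing is lost
merge = {"a": 0, "b": 0, "c": 1}.get     # lossy map: "a" and "b" collapse

i_xy, i_xz_relabel, cmi_relabel = analyse(relabel)
_, i_xz_merge, cmi_merge = analyse(merge)
print("relabel:", i_xy, i_xz_relabel, cmi_relabel)
print("merge:  ", i_xy, i_xz_merge, cmi_merge)
```

Under these assumptions the relabeling keeps I(X;Z) equal to I(X;Y) with I(X;Y|Z) = 0, while the merging map loses exactly I(X;Y|Z) bits.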
== Proof ==
One can apply the chain rule for mutual information to obtain two different decompositions of $I(X;Y,Z)$:

$$I(X;Z) + I(X;Y \mid Z) = I(X;Y,Z) = I(X;Y) + I(X;Z \mid Y)$$
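The two decompositions can be checked numerically. The following sketch assumes a hypothetical chain with binary X, ternary Y and binary Z, and computes each term from entropies via I(A;B|C) = H(A,C) + H(B,C) - H(A,B,C) - H(C). The helper names `entropy` and `cond_mi` and all probabilities are illustrative assumptions.

```python
from math import log2


def entropy(joint, axes):
    """Entropy in bits of the marginal of `joint` on the coordinate positions `axes`."""
    marg = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in axes)
        marg[key] = marg.get(key, 0.0) + p
    return -sum(p * log2(p) for p in marg.values() if p > 0)


def cond_mi(joint, a, b, c=()):
    """I(A;B|C) in bits; a, b, c are tuples of coordinate positions (c may be empty)."""
    return (entropy(joint, a + c) + entropy(joint, b + c)
            - entropy(joint, a + b + c) - entropy(joint, c))


# Hypothetical Markov chain X -> Y -> Z (assumed values).
p_x = {0: 0.4, 1: 0.6}
p_y_given_x = {0: {0: 0.7, 1: 0.2, 2: 0.1},
               1: {0: 0.1, 1: 0.3, 2: 0.6}}
p_z_given_y = {0: {0: 0.9, 1: 0.1},
               1: {0: 0.5, 1: 0.5},
               2: {0: 0.2, 1: 0.8}}
joint = {(x, y, z): p_x[x] * p_y_given_x[x][y] * p_z_given_y[y][z]
         for x in p_x for y in p_y_given_x[x] for z in p_z_given_y[y]}

X, Y, Z = (0,), (1,), (2,)                 # coordinate positions in each outcome
lhs = cond_mi(joint, X, Z) + cond_mi(joint, X, Y, Z)   # I(X;Z) + I(X;Y|Z)
mid = cond_mi(joint, X, Y + Z)                         # I(X;Y,Z)
rhs = cond_mi(joint, X, Y) + cond_mi(joint, X, Z, Y)   # I(X;Y) + I(X;Z|Y)
i_xz_given_y = cond_mi(joint, X, Z, Y)
print(lhs, mid, rhs, i_xz_given_y)
```

All three expressions should agree up to rounding, and because the joint pmf was built as a Markov chain, the term I(X;Z|Y) should come out as zero.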
By the relationship $X \rightarrow Y \rightarrow Z$, we know that $X$ and $Z$ are conditionally independent given $Y$, which means that the conditional mutual information vanishes: $I(X;Z \mid Y) = 0$. Substituting this into the decomposition gives $I(X;Y) = I(X;Z) + I(X;Y \mid Z)$, and the data processing inequality then follows from the non-negativity of conditional mutual information, $I(X;Y \mid Z) \geq 0$.
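As an informal sanity check of the result (not a substitute for the proof), one can draw random Markov chains and confirm that I(X;Y) - I(X;Z) never comes out negative beyond rounding error. The alphabet sizes, the seed, the number of trials, and the helper names `random_pmf` and `dpi_gap` are arbitrary choices for this sketch.

```python
import random
from math import log2


def random_pmf(n, rng):
    """Random probability vector of length n (positive weights normalised to sum to 1)."""
    weights = [rng.random() + 1e-9 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


def mutual_information(p_ab):
    """I(A;B) in bits from a joint pmf stored as a list of rows, p_ab[a][b]."""
    p_a = [sum(row) for row in p_ab]
    p_b = [sum(col) for col in zip(*p_ab)]
    return sum(p * log2(p / (p_a[a] * p_b[b]))
               for a, row in enumerate(p_ab)
               for b, p in enumerate(row) if p > 0)


def dpi_gap(nx, ny, nz, rng):
    """Draw a random chain X -> Y -> Z and return I(X;Y) - I(X;Z)."""
    p_x = random_pmf(nx, rng)
    y_given_x = [random_pmf(ny, rng) for _ in range(nx)]   # rows are p(y|x)
    z_given_y = [random_pmf(nz, rng) for _ in range(ny)]   # rows are p(z|y)
    p_xy = [[p_x[x] * y_given_x[x][y] for y in range(ny)] for x in range(nx)]
    # p(x, z) = sum over y of p(x, y) p(z|y), which uses the Markov property.
    p_xz = [[sum(p_xy[x][y] * z_given_y[y][z] for y in range(ny))
             for z in range(nz)] for x in range(nx)]
    return mutual_information(p_xy) - mutual_information(p_xz)


rng = random.Random(0)   # fixed seed so the run is reproducible
gaps = [dpi_gap(3, 4, 3, rng) for _ in range(1000)]
smallest_gap = min(gaps)
print("smallest I(X;Y) - I(X;Z) over all trials:", smallest_gap)
```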
== See also ==
Garbage in, garbage out
== External links ==
http://www.scholarpedia.org/article/Mutual_information