---
title: "Data processing inequality"
chunk: 1/1
source: "https://en.wikipedia.org/wiki/Data_processing_inequality"
category: "reference"
tags: "science, encyclopedia"
date_saved: "2026-05-05T11:32:29.558623+00:00"
instance: "kb-cron"
---
The data processing inequality is an information-theoretic concept stating that the information content of a signal cannot be increased by a local physical operation. This can be expressed concisely as 'post-processing cannot increase information'.
== Statement ==
Let three random variables form the Markov chain $X \rightarrow Y \rightarrow Z$, meaning that the conditional distribution of $Z$ depends only on $Y$, so that $Z$ is conditionally independent of $X$ given $Y$. Specifically, we have such a Markov chain if the joint probability mass function can be written as

$$p(x,y,z)=p(x)\,p(y|x)\,p(z|y)=p(y)\,p(x|y)\,p(z|y)$$
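As a minimal sketch of the factorization above, the following Python snippet builds a joint pmf from $p(x)\,p(y|x)\,p(z|y)$ and checks numerically that $p(y)\,p(x|y)\,p(z|y)$ gives the same table. The binary alphabets and all probability values are hypothetical and chosen only for illustration; they do not come from the article.

```python
from itertools import product

# Hypothetical example distributions (assumed for illustration): binary X, Y, Z.
p_x = {0: 0.3, 1: 0.7}
p_y_given_x = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}    # indexed [x][y]
p_z_given_y = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.25, 1: 0.75}}  # indexed [y][z]

# First factorization: p(x, y, z) = p(x) p(y|x) p(z|y).
joint = {
    (x, y, z): p_x[x] * p_y_given_x[x][y] * p_z_given_y[y][z]
    for x, y, z in product((0, 1), repeat=3)
}

# Marginal p(y) and reverse conditional p(x|y), both derived from the joint.
p_y = {y: sum(joint[(x, y, z)] for x in (0, 1) for z in (0, 1)) for y in (0, 1)}
p_x_given_y = {
    y: {x: sum(joint[(x, y, z)] for z in (0, 1)) / p_y[y] for x in (0, 1)}
    for y in (0, 1)
}

# Second factorization: p(x, y, z) = p(y) p(x|y) p(z|y).
joint_alt = {
    (x, y, z): p_y[y] * p_x_given_y[y][x] * p_z_given_y[y][z]
    for x, y, z in product((0, 1), repeat=3)
}

# Largest entrywise difference between the two tables, and total probability.
max_gap = max(abs(joint[k] - joint_alt[k]) for k in joint)
total = sum(joint.values())
print(f"total probability = {total:.6f}, max difference = {max_gap:.2e}")
```

The two tables should agree up to floating-point rounding, which reflects that the second form is just Bayes' rule applied to the pair $(X, Y)$.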
In this setting, no processing of $Y$, deterministic or random, can increase the information that $Y$ contains about $X$. Using the mutual information, this can be written as:

$$I(X;Y)\geqslant I(X;Z),$$

with the equality $I(X;Y)=I(X;Z)$ if and only if $I(X;Y\mid Z)=0$. That is, $Z$ and $Y$ contain the same information about $X$, and $X\rightarrow Z\rightarrow Y$ also forms a Markov chain.
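The inequality can be checked numerically on a small example. The sketch below assumes a uniform binary $X$ passed through two cascaded binary symmetric channels (flip probabilities 0.1 and 0.2, both hypothetical values) and compares $I(X;Y)$ with $I(X;Z)$. The helper names `pair_marginal`, `mutual_information`, and `bsc` are introduced here for illustration only.

```python
from itertools import product
from math import log2


def pair_marginal(joint, i, j):
    """Sum a joint pmf over (x, y, z) down to the pair of coordinates (i, j)."""
    out = {}
    for key, p in joint.items():
        pair = (key[i], key[j])
        out[pair] = out.get(pair, 0.0) + p
    return out


def mutual_information(pairs):
    """Mutual information in bits of a pmf given as {(a, b): probability}."""
    p_a, p_b = {}, {}
    for (a, b), p in pairs.items():
        p_a[a] = p_a.get(a, 0.0) + p
        p_b[b] = p_b.get(b, 0.0) + p
    return sum(
        p * log2(p / (p_a[a] * p_b[b])) for (a, b), p in pairs.items() if p > 0
    )


def bsc(eps):
    """Binary symmetric channel, indexed [input][output], flip probability eps."""
    return {0: {0: 1 - eps, 1: eps}, 1: {0: eps, 1: 1 - eps}}


# Assumed setup: uniform X, then X -> Y and Y -> Z through noisy channels.
p_x = {0: 0.5, 1: 0.5}
x_to_y = bsc(0.1)
y_to_z = bsc(0.2)

# Joint pmf built from the Markov factorization p(x) p(y|x) p(z|y).
joint = {
    (x, y, z): p_x[x] * x_to_y[x][y] * y_to_z[y][z]
    for x, y, z in product((0, 1), repeat=3)
}

i_xy = mutual_information(pair_marginal(joint, 0, 1))
i_xz = mutual_information(pair_marginal(joint, 0, 2))
print(f"I(X;Y) = {i_xy:.4f} bits, I(X;Z) = {i_xz:.4f} bits")
```

Because the second channel only adds noise to $Y$, the printed value of $I(X;Z)$ should not exceed that of $I(X;Y)$.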
== Proof ==
One can apply the chain rule for mutual information to obtain two different decompositions of $I(X;Y,Z)$:

$$I(X;Z)+I(X;Y\mid Z)=I(X;Y,Z)=I(X;Y)+I(X;Z\mid Y)$$
By the relationship $X\rightarrow Y\rightarrow Z$, we know that $X$ and $Z$ are conditionally independent given $Y$, which means that the conditional mutual information vanishes: $I(X;Z\mid Y)=0$. Substituting this into the decomposition gives $I(X;Y)=I(X;Z)+I(X;Y\mid Z)$, and the data processing inequality then follows from the non-negativity of conditional mutual information, $I(X;Y\mid Z)\geq 0$.
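The steps of the proof can also be traced numerically. The following sketch, under assumed binary distributions that are not from the article, computes both sides of the chain-rule decomposition and confirms that $I(X;Z\mid Y)$ is zero up to rounding for a Markov chain. The helpers `marg`, `mi`, and `cmi` are hypothetical names defined here.

```python
from itertools import product
from math import log2


def marg(joint, axes):
    """Marginalize a joint pmf keyed by (x, y, z) onto the given axes."""
    out = {}
    for key, p in joint.items():
        sub = tuple(key[i] for i in axes)
        out[sub] = out.get(sub, 0.0) + p
    return out


def mi(joint, a, b):
    """Mutual information I(A;B) in bits between coordinates a and b."""
    p_ab, p_a, p_b = marg(joint, (a, b)), marg(joint, (a,)), marg(joint, (b,))
    return sum(
        p * log2(p / (p_a[(u,)] * p_b[(v,)]))
        for (u, v), p in p_ab.items() if p > 0
    )


def cmi(joint, a, b, c):
    """Conditional mutual information I(A;B|C) in bits."""
    p_abc = marg(joint, (a, b, c))
    p_ac, p_bc, p_c = marg(joint, (a, c)), marg(joint, (b, c)), marg(joint, (c,))
    return sum(
        p * log2(p * p_c[(w,)] / (p_ac[(u, w)] * p_bc[(v, w)]))
        for (u, v, w), p in p_abc.items() if p > 0
    )


# Assumed example distributions forming a Markov chain X -> Y -> Z.
p_x = {0: 0.3, 1: 0.7}
p_y_given_x = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}    # indexed [x][y]
p_z_given_y = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.25, 1: 0.75}}  # indexed [y][z]
joint = {
    (x, y, z): p_x[x] * p_y_given_x[x][y] * p_z_given_y[y][z]
    for x, y, z in product((0, 1), repeat=3)
}

X, Y, Z = 0, 1, 2                             # coordinate positions in the key
lhs = mi(joint, X, Z) + cmi(joint, X, Y, Z)   # I(X;Z) + I(X;Y|Z)
rhs = mi(joint, X, Y) + cmi(joint, X, Z, Y)   # I(X;Y) + I(X;Z|Y)
i_xz_given_y = cmi(joint, X, Z, Y)            # vanishes for a Markov chain
i_xy_given_z = cmi(joint, X, Y, Z)            # non-negative slack in the bound

print(f"lhs = {lhs:.6f}, rhs = {rhs:.6f}")
print(f"I(X;Z|Y) = {i_xz_given_y:.2e}, I(X;Y|Z) = {i_xy_given_z:.6f}")
```

The gap between $I(X;Y)$ and $I(X;Z)$ equals $I(X;Y\mid Z)$ here, which is the quantity whose non-negativity yields the inequality.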
== See also ==
Garbage in, garbage out
== External links ==
http://www.scholarpedia.org/article/Mutual_information