---
title: "Conditional variance"
chunk: 1/2
source: "https://en.wikipedia.org/wiki/Conditional_variance"
category: "reference"
tags: "science, encyclopedia"
date_saved: "2026-05-05T12:22:02.721176+00:00"
instance: "kb-cron"
---
In probability theory and statistics, a conditional variance is the variance of a random variable given the value(s) of one or more other variables.
Particularly in econometrics, the conditional variance is also known as the scedastic function or skedastic function. Conditional variances are important parts of autoregressive conditional heteroskedasticity (ARCH) models.
== Definition ==
The conditional variance of a random variable Y given another random variable X is
$$\operatorname{Var}(Y\mid X)=\operatorname{E}\Big(\big(Y-\operatorname{E}(Y\mid X)\big)^{2}\;\Big|\;X\Big).$$
The conditional variance tells us how much variance is left if we use $\operatorname{E}(Y\mid X)$ to "predict" Y.
Here, as usual, $\operatorname{E}(Y\mid X)$ stands for the conditional expectation of Y given X, which, we may recall, is itself a random variable (a function of X, determined up to probability one).
As a result, $\operatorname{Var}(Y\mid X)$ itself is a random variable (and is a function of X).
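
To make concrete the point that the conditional variance is a function of X, here is a minimal simulation sketch in Python with NumPy. The model (X uniform on {0, 1, 2} and Y = 2X + (1 + X)ε with standard normal ε) and the helper name `conditional_variance_by_group` are assumptions chosen for illustration, not part of the definition above. Under this assumed model the conditional variance given X = x is (1 + x)², so the grouped sample variances should land near 1, 4 and 9.

```python
import numpy as np


def conditional_variance_by_group(x, y):
    """Estimate Var(Y | X = x) for each distinct value of x by grouping samples."""
    return {int(v): float(np.var(y[x == v])) for v in np.unique(x)}


# Hypothetical model chosen for illustration: X uniform on {0, 1, 2} and
# Y = 2*X + (1 + X) * eps with eps ~ N(0, 1), so Var(Y | X = x) = (1 + x)**2.
rng = np.random.default_rng(0)
n = 200_000
x = rng.integers(0, 3, size=n)
y = 2 * x + (1 + x) * rng.standard_normal(n)

estimates = conditional_variance_by_group(x, y)
for value, est in estimates.items():
    print(f"x={value}: estimated {est:.3f}, model value {(1 + value) ** 2}")
```

Grouping by the observed value of X only works here because X takes finitely many values; for a continuous X some form of smoothing or a parametric model would be needed instead.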
== Explanation, relation to least-squares ==
Recall that variance is the expected squared deviation between a random variable (say, Y) and its expected value.
The expected value can be thought of as a reasonable prediction of the outcomes of the random experiment (in particular, the expected value is the best constant prediction when predictions are assessed by expected squared prediction error). Thus, one interpretation of variance is that it gives the smallest possible expected squared prediction error. If we have the knowledge of another random variable (X) that we can use to predict Y, we can potentially use this knowledge to reduce the expected squared error. As it turns out, the best prediction of Y given X is the conditional expectation. In particular, for any
$f:\mathbb{R}\to\mathbb{R}$ measurable,
$$\begin{aligned}
\operatorname{E}[(Y-f(X))^{2}]&=\operatorname{E}[(Y-\operatorname{E}(Y|X)\,\,+\,\,\operatorname{E}(Y|X)-f(X))^{2}]\\
&=\operatorname{E}[\operatorname{E}\{(Y-\operatorname{E}(Y|X)\,\,+\,\,\operatorname{E}(Y|X)-f(X))^{2}|X\}]\\
&=\operatorname{E}[\operatorname{Var}(Y|X)]+\operatorname{E}[(\operatorname{E}(Y|X)-f(X))^{2}]\,.
\end{aligned}$$
By selecting $f(X)=\operatorname{E}(Y|X)$, the second, nonnegative term becomes zero, showing the claim.
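Here, the second equality used the law of total expectation; the third follows by expanding the square and noting that the cross term vanishes, because $\operatorname{E}(Y|X)-f(X)$ is a function of X and $\operatorname{E}(Y-\operatorname{E}(Y|X)\mid X)=0$.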
We also see that the expected conditional variance of Y given X shows up as the irreducible error of predicting Y given only the knowledge of X.
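
The decomposition of the expected squared error can be checked exactly on a small finite example. The sketch below is only an illustration: the joint probability table `p`, the supports `xs` and `ys`, the competing predictor `f_values`, and the helper `mse` are all hypothetical choices, not taken from the text.

```python
import numpy as np

# Hypothetical joint pmf p[i, j] = P(X = xs[i], Y = ys[j]); numbers are illustrative.
xs = np.array([0.0, 1.0, 2.0])
ys = np.array([-1.0, 0.0, 3.0])
p = np.array([
    [0.10, 0.20, 0.05],
    [0.05, 0.10, 0.15],
    [0.20, 0.05, 0.10],
])

p_x = p.sum(axis=1)                      # marginal P(X = x)
cond = p / p_x[:, None]                  # P(Y = y | X = x), one row per x
cond_mean = cond @ ys                    # E(Y | X = x)
cond_var = cond @ ys**2 - cond_mean**2   # Var(Y | X = x)


def mse(f_values):
    """E[(Y - f(X))^2], where f_values[i] = f(xs[i])."""
    return float(np.sum(p * (ys[None, :] - f_values[:, None]) ** 2))


f_values = 0.5 * xs + 0.3                # an arbitrary competing predictor f
lhs = mse(f_values)
rhs = float(p_x @ cond_var + p_x @ (cond_mean - f_values) ** 2)
print(lhs, rhs)                          # the two sides agree up to rounding
print(mse(cond_mean), float(p_x @ cond_var))  # f = E(Y|X) attains E[Var(Y|X)]
```

Any other choice of `f_values` can be substituted; the left-hand side should never fall below the value attained by the conditional mean.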
== Special cases, variations ==
=== Conditioning on discrete random variables ===
When X takes on countably many values $S=\{x_{1},x_{2},\dots\}$ with positive probability, i.e., it is a discrete random variable, we can introduce $\operatorname{Var}(Y|X=x)$, the conditional variance of Y given that X=x for any x from S as follows:
$$\operatorname{Var}(Y|X=x)=\operatorname{E}((Y-\operatorname{E}(Y\mid X=x))^{2}\mid X=x)=\operatorname{E}(Y^{2}|X=x)-\operatorname{E}(Y|X=x)^{2},$$
where, recall, $\operatorname{E}(Z\mid X=x)$ is the conditional expectation of Z given that X=x, which is well-defined for $x\in S$.
An alternative notation for $\operatorname{Var}(Y|X=x)$ is $\operatorname{Var}_{Y\mid X}(Y|x)$.
Note that here $\operatorname{Var}(Y|X=x)$ defines a constant for each possible value of x; in particular, $\operatorname{Var}(Y|X=x)$ is not a random variable.
The connection of this definition to $\operatorname{Var}(Y|X)$ is as follows:
Let S be as above and define the function $v:S\to\mathbb{R}$ as $v(x)=\operatorname{Var}(Y|X=x)$. Then, $v(X)=\operatorname{Var}(Y|X)$ almost surely.
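
The distinction between the constant $v(x)$ and the random variable $v(X)$ can be illustrated with exact arithmetic. The following is a sketch under an assumed toy setup (Y a fair six-sided die roll, X the indicator that the roll is at least 5); the setup and the names `outcomes`, `table` and `v_of_X` are hypothetical, while `v` mirrors the function defined above.

```python
from fractions import Fraction

# Hypothetical example: Y is a fair six-sided die roll and X = 1 if Y >= 5, else 0.
# Each entry is (x, y, probability of that outcome).
outcomes = [(1 if y >= 5 else 0, y, Fraction(1, 6)) for y in range(1, 7)]


def v(x):
    """Var(Y | X = x) via E(Y^2 | X = x) - E(Y | X = x)^2, for x with P(X = x) > 0."""
    p_x = sum(p for xi, _, p in outcomes if xi == x)
    mean = sum(y * p for xi, y, p in outcomes if xi == x) / p_x
    second_moment = sum(y * y * p for xi, y, p in outcomes if xi == x) / p_x
    return second_moment - mean**2


# v(x) is a constant for each x in S = {0, 1}.
table = {x: v(x) for x in (0, 1)}

# v(X) is a random variable: evaluate v at the value X takes for each outcome.
v_of_X = [(y, table[x]) for x, y, _ in outcomes]
print(table)
print(v_of_X)
```

In this toy setup `table` holds one number per element of S, whereas `v_of_X` assigns a value to every outcome of the experiment, which is what makes $v(X)$ a random variable.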