| title | chunk | source | category | tags | date_saved | instance |
|---|---|---|---|---|---|---|
| Directional component analysis | 1/2 | https://en.wikipedia.org/wiki/Directional_component_analysis | reference | science, encyclopedia | 2026-05-05T09:54:01.826255+00:00 | kb-cron |
Directional component analysis (DCA) is a statistical method used in climate science for identifying representative patterns of variability in space-time data sets such as historical climate observations, weather prediction ensembles or climate ensembles. The first DCA pattern is a pattern of weather or climate variability that is both likely to occur (measured using likelihood) and has a large impact (for a specified linear impact function, and given certain mathematical conditions: see below). The first DCA pattern contrasts with the first PCA pattern, which is likely to occur, but may not have a large impact, and with a pattern derived from the gradient of the impact function, which has a large impact, but may not be likely to occur.
DCA differs from other pattern identification methods used in climate research, such as EOFs, rotated EOFs and extended EOFs, in that it takes into account an external vector, the gradient of the impact.
DCA provides a way to reduce large ensembles from weather forecasts or climate models to just two patterns. The first pattern is the ensemble mean, and the second pattern is the DCA pattern, which represents variability around the ensemble mean in a way that takes impact into account. DCA contrasts with other methods that have been proposed for the reduction of ensembles in that it takes impact into account in addition to the structure of the ensemble.
== Overview ==
=== Inputs ===
DCA is calculated from two inputs:
* a multivariate dataset of weather or climate data, such as historical climate observations, or a weather or climate ensemble
* a linear impact function
The linear impact function defines a level of impact for every spatial pattern in the weather or climate data as a weighted sum of the values at different locations in the spatial pattern. An example is the mean value across the spatial pattern. The linear impact function can be generated as the linear (first-order) term in the multivariate Taylor series of a non-linear impact function.
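As a minimal sketch of the second input, the snippet below writes a linear impact function as a weighted sum and uses the spatial mean as the example. The helper names (`linear_impact`, `linearise`), the sample values, and the sum-of-squares impact are assumptions made for illustration; the finite-difference gradient is one possible way to obtain the first-order Taylor weights, not a method given in the source.

```python
import numpy as np

def linear_impact(r, x):
    """Impact of spatial pattern x as a weighted sum with weights r."""
    return float(np.dot(r, x))

# Hypothetical pattern at 4 locations (illustrative values only).
x = np.array([1.0, 2.0, 3.0, 6.0])

# Mean value across the pattern: equal weights 1/n at every location.
r_mean = np.full(x.size, 1.0 / x.size)
mean_impact = linear_impact(r_mean, x)

def linearise(f, x0, eps=1e-6):
    """Central finite-difference gradient of a non-linear impact f at x0.

    The gradient plays the role of the weight vector r in the
    first-order Taylor term of f around x0.
    """
    grad = np.zeros(x0.size)
    for i in range(x0.size):
        step = np.zeros(x0.size)
        step[i] = eps
        grad[i] = (f(x0 + step) - f(x0 - step)) / (2.0 * eps)
    return grad

# Hypothetical non-linear impact: sum of squares, whose gradient is 2 * x0.
r_lin = linearise(lambda v: float(np.sum(v ** 2)), x)
```

The weights `r_lin` depend on the expansion point `x0`, so a linearised impact function is only a local approximation of the non-linear one.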
=== Formula ===
Consider a space-time data set $X$, containing individual spatial pattern vectors $x$, where the individual patterns are each considered as single samples from a multivariate normal distribution with mean zero and covariance matrix $C$. We define a linear impact function of a spatial pattern as $r^{t}x$, where $r$ is a vector of spatial weights. The first DCA pattern is given in terms of the covariance matrix $C$ and the weights $r$ by the proportional expression $x \propto Cr$.
The pattern can then be normalized to any length as required.
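The formula can be sketched in a few lines of NumPy. The function names, the 3-location covariance matrix, and the synthetic samples below are assumptions chosen for illustration; estimating the covariance with the sample covariance is one simple choice and may need regularisation when there are more locations than samples.

```python
import numpy as np

def dca_pattern(C, r, length=1.0):
    """Return the first DCA pattern C r, rescaled to the given Euclidean length."""
    pattern = np.asarray(C, dtype=float) @ np.asarray(r, dtype=float)
    return length * pattern / np.linalg.norm(pattern)

def dca_pattern_from_data(X, r, length=1.0):
    """Estimate C from samples X (rows = samples, columns = locations)."""
    C = np.cov(X, rowvar=False)  # np.cov removes each column's sample mean
    return dca_pattern(C, r, length)

# Hypothetical covariance and equal weights (impact = spatial mean).
C_true = np.array([[2.0, 0.8, 0.2],
                   [0.8, 1.0, 0.3],
                   [0.2, 0.3, 0.5]])
r = np.full(3, 1.0 / 3.0)

exact = dca_pattern(C_true, r)

# Synthetic samples from the assumed distribution, for illustration only.
rng = np.random.default_rng(0)
X = rng.multivariate_normal(np.zeros(3), C_true, size=5000)
estimated = dca_pattern_from_data(X, r)
```

With enough samples, `estimated` should point in nearly the same direction as `exact`.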
=== Properties ===
If the weather or climate data is elliptically distributed (e.g., is distributed as a multivariate normal distribution or a multivariate t-distribution) then the first DCA pattern (DCA1) is defined as the spatial pattern with the following mathematical properties:
* DCA1 maximises probability density for a given value of impact
* DCA1 maximises impact for a given value of probability density
* DCA1 maximises the product of impact and probability density
* DCA1 is the conditional expectation, conditional on exceeding a certain level of impact
* DCA1 is the impact-weighted ensemble mean
Any modification of DCA1 will lead to a pattern that is either less extreme, or has a lower probability density.
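The first property can be checked with a short Lagrange-multiplier argument for the mean-zero multivariate normal case, assuming $C$ is invertible. The impact level $a$ and the multiplier $\lambda$ are symbols introduced here for illustration. Maximising the log-density, which equals $-\tfrac{1}{2}x^{t}C^{-1}x$ up to a constant, subject to $r^{t}x = a$ gives:

```latex
\begin{aligned}
\mathcal{L}(x,\lambda) &= -\tfrac{1}{2}\,x^{t}C^{-1}x + \lambda\,(r^{t}x - a) \\
\frac{\partial \mathcal{L}}{\partial x} &= -C^{-1}x + \lambda r = 0
  \;\Rightarrow\; x = \lambda\,Cr \\
r^{t}x = a &\;\Rightarrow\; \lambda = \frac{a}{r^{t}Cr} \\
x &= \frac{a}{r^{t}Cr}\,Cr \;\propto\; Cr
\end{aligned}
```

The same direction should result for other elliptical distributions whose density decreases as $x^{t}C^{-1}x$ grows, since the constrained problem is then unchanged.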
=== Rainfall Example ===
For instance, in a rainfall anomaly dataset, using an impact metric defined as the total rainfall anomaly, the first DCA pattern is the spatial pattern that has the highest probability density for a given total rainfall anomaly. If the given total rainfall anomaly is chosen to have a large value, then this pattern combines being extreme in terms of the metric (i.e., representing large amounts of total rainfall) with being likely in terms of the pattern, and so is well suited as a representative extreme pattern.
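The rainfall case can be illustrated with a small synthetic setup. Everything numeric below is assumed rather than taken from any dataset: five hypothetical gauges, made-up standard deviations, and a correlation that decays with gauge separation. The sketch scales the DCA pattern to a chosen total anomaly and compares its squared Mahalanobis distance with that of a uniform pattern carrying the same total; under a multivariate normal model, a smaller distance means a higher probability density.

```python
import numpy as np

# Hypothetical setup: 5 rain gauges in a line, anomalies in mm.
n = 5
std = np.array([10.0, 20.0, 30.0, 20.0, 10.0])   # assumed standard deviations
sep = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
corr = 0.6 ** sep                                 # assumed correlation decay
C = corr * np.outer(std, std)                     # covariance matrix

r = np.ones(n)            # impact metric: total rainfall anomaly
target_total = 100.0      # chosen level of impact (mm)

# DCA pattern scaled so that its total anomaly equals target_total.
dca = target_total * (C @ r) / (r @ C @ r)

def mahalanobis_sq(x, C):
    """Squared Mahalanobis distance x^t C^{-1} x (smaller = more likely)."""
    return float(x @ np.linalg.solve(C, x))

# Alternative pattern with the same total, spread evenly over the gauges.
uniform = np.full(n, target_total / n)

d_dca = mahalanobis_sq(dca, C)
d_uniform = mahalanobis_sq(uniform, C)
```

In this setup `d_dca` is smaller than `d_uniform`: both patterns have the same total rainfall anomaly, but the DCA pattern places more of it at the gauges with larger variability.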
=== Comparison with PCA ===
The main differences between principal component analysis (PCA) and DCA are:
* PCA is a function of just the covariance matrix, and the first PCA pattern is defined so as to maximise explained variance
* DCA is a function of the covariance matrix and a vector direction (the gradient of the impact function), and the first DCA pattern is defined so as to maximise probability density for a given value of the impact metric
As a result, for unit vector spatial patterns:
* The first PCA spatial pattern always corresponds to a higher explained variance, but has a lower value of the impact metric (e.g., the total rainfall anomaly), except in degenerate cases
* The first DCA spatial pattern always corresponds to a higher value of the impact metric, but has a lower value of the explained variance, except in degenerate cases
The degenerate cases occur when the PCA and DCA patterns are equal.
Also, given the first PCA pattern, the DCA pattern can be scaled so that:
* The scaled DCA pattern has the same probability density as the first PCA pattern, but higher impact, or
* The scaled DCA pattern has the same impact as the first PCA pattern, but higher probability density.
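These comparisons can be checked numerically on a small case. The 2x2 covariance matrix and the weight vector below are hypothetical, and the squared Mahalanobis distance stands in for probability density under an assumed mean-zero multivariate normal model (smaller distance, higher density).

```python
import numpy as np

C = np.array([[4.0, 1.0],
              [1.0, 2.0]])     # hypothetical covariance matrix
r = np.array([0.2, 1.0])       # hypothetical impact weights

# First PCA pattern: unit eigenvector with the largest eigenvalue.
eigvals, eigvecs = np.linalg.eigh(C)   # eigenvalues in ascending order
pca1 = eigvecs[:, -1]
if r @ pca1 < 0:
    pca1 = -pca1               # sign is arbitrary; pick positive impact

# First DCA pattern as a unit vector.
dca1 = C @ r / np.linalg.norm(C @ r)

def variance(u):
    """Variance explained along the unit pattern u."""
    return float(u @ C @ u)

def impact(x):
    """Linear impact r^t x."""
    return float(r @ x)

def mahal_sq(x):
    """Squared Mahalanobis distance; smaller means higher normal density."""
    return float(x @ np.linalg.solve(C, x))

# DCA rescaled to the same density as PCA1, and to the same impact as PCA1.
dca_same_density = dca1 * np.sqrt(mahal_sq(pca1) / mahal_sq(dca1))
dca_same_impact = dca1 * impact(pca1) / impact(dca1)
```

For these inputs, `pca1` has the larger `variance`, `dca1` has the larger `impact`, `dca_same_density` has a larger impact than `pca1`, and `dca_same_impact` has a smaller `mahal_sq` than `pca1`.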
== Two Dimensional Example ==
Figure 1 gives an example, which can be understood as follows: