kb/data/en.wikipedia.org/wiki/Design_effect-11.md

4.8 KiB

title chunk source category tags date_saved instance
Design effect 12/12 https://en.wikipedia.org/wiki/Design_effect reference science, encyclopedia 2026-05-05T09:49:56.844427+00:00 kb-cron

When planning a future data collection, $Deff$
may be used to evaluate the sampling efficiency, for example to check whether some sampling design decision causes "too much" increase in variance, or whether some alternative (economically feasible) design is more efficient. This also influences the sample size (overall, per stratum, per cluster, etc.). When planning the sample size, work may be done to correct the design effect so as to separate the interviewer effect (measurement error) from the effects of the sampling design on the sampling variance.

As a diagnostic tool, $Deff$
may help in evaluating potential problems with a post-hoc weighting analysis (e.g., from non-response adjustments). For example, if the $Deff$
value is especially high, it might indicate an issue with the sampling or weighting scheme. It can also assist when manipulating the weights (e.g., weight trimming): the design effect can be used to evaluate the influence of the manipulation on the effective sample size. It can likewise help identify glaring issues with the data or its analysis (ranging from mistakes to the presence of outliers).

Although some literature suggests that $Deff > 1.5$
is likely to require some attention, there is no universal rule of thumb for which design effect value is "too high". Practical considerations of $Deff$
values are often context-dependent.

Considering the design effect is unnecessary when the source population is close to IID (independent and identically distributed), or when the data were drawn as a simple random sample. It is also less useful when the sample size is relatively small (at least partially, for practical reasons).

While Kish originally hoped to have the design effect be as agnostic as possible to the underlying distribution of the data, sampling probabilities, their correlations, and the statistics of interest, follow-up research has shown that these do influence the design effect. Hence, these properties should be carefully considered when deciding which $Deff$
calculation to use, and how to use it.

The design effect is rarely applied when constructing confidence intervals. Ideally, one would be able to determine, for an estimator of a particular parameter, both the variance under simple random sampling (SRS) with replacement and the design effect (which accounts for all elements of the sampling design that change the variance). In such scenarios, the SRS variance and the design effect could be multiplied to compute the variance of the estimator under the specific design, and this value could then be used to form confidence intervals. However, in real-world applications, it is uncommon to estimate both values simultaneously. As a result, other methods are favored. For instance, Taylor linearization is used to construct confidence intervals based on the variance of the weighted mean. More broadly, the bootstrap method (typically implemented through replicate weights) is applied for a range of weighted statistics.
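The idealized calculation described above, in which the SRS variance is multiplied by the design effect, can be sketched in Python. This is a minimal illustration rather than a recommended procedure: the function name `design_based_ci`, the normal-approximation critical value `z = 1.96`, and the example numbers are all assumptions introduced here, and both the SRS variance and the design effect are taken as known.

```python
import math


def design_based_ci(estimate, srs_variance, deff, z=1.96):
    """Normal-approximation interval using Var_design = Var_SRS * Deff.

    Hypothetical helper: assumes srs_variance and deff are both known.
    """
    if srs_variance < 0 or deff <= 0:
        raise ValueError("srs_variance must be >= 0 and deff must be > 0")
    design_variance = srs_variance * deff  # inflate the SRS variance by Deff
    half_width = z * math.sqrt(design_variance)
    return estimate - half_width, estimate + half_width


# Illustrative numbers (assumed): a proportion of 0.4 from n = 1000 units.
p, n = 0.4, 1000
srs_var = p * (1 - p) / n  # variance of a proportion under SRS with replacement
low, high = design_based_ci(p, srs_var, deff=1.5)
print(f"95% interval with Deff = 1.5: ({low:.4f}, {high:.4f})")
```

With a design effect above 1 the interval is wider than the SRS interval by a factor of the square root of the design effect, which is why an unaccounted-for design effect leads to overly narrow intervals.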

== Software implementations ==

Kish's design effect is implemented in various statistical software packages:

* Python: design_effect from the balance package.
* R: surveysummary from the survey package. It is also implemented in other R packages (e.g., pewmethods and samplesize4surveys).
* SAS: using Proc Surveymeans.
* Stata: using the estat post-estimation command after the svy: mean command.
* SUDAAN.
* WESVAR: calculates Kish's design effect with replacement (SRSWR), i.e., $Deft$.
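To illustrate the kind of quantity these packages report, the following is a minimal Python sketch of Kish's design effect for unequal weights, assuming the usual weights-only formula $Deff = n \sum w_i^2 / (\sum w_i)^2$ and the effective sample size $n / Deff$. The function names and the example weights are hypothetical and are not the API of any package listed above. The sketch also shows the diagnostic use mentioned earlier: comparing the design effect before and after weight trimming.

```python
def kish_design_effect(weights):
    """Kish's design effect for unequal weights: n * sum(w^2) / (sum(w))^2."""
    n = len(weights)
    total = sum(weights)
    if n == 0 or total <= 0:
        raise ValueError("weights must be non-empty with a positive sum")
    sum_of_squares = sum(w * w for w in weights)
    return n * sum_of_squares / total ** 2


def effective_sample_size(weights):
    """Effective sample size implied by Kish's design effect: n / Deff."""
    return len(weights) / kish_design_effect(weights)


def trim_weights(weights, upper):
    """Cap each weight at `upper` (a simple, illustrative trimming rule)."""
    return [min(w, upper) for w in weights]


# Illustrative weights (assumed): one unit carries a much larger weight.
weights = [1.0, 1.0, 1.0, 1.0, 6.0]
trimmed = trim_weights(weights, upper=3.0)

print("Deff before trimming:", kish_design_effect(weights))
print("Deff after trimming: ", kish_design_effect(trimmed))
print("Effective n before:  ", effective_sample_size(weights))
print("Effective n after:   ", effective_sample_size(trimmed))
```

Equal weights give a design effect of exactly 1, so any value above 1 reflects variability in the weights. Trimming the large weight lowers the design effect and raises the effective sample size, at the cost of potential bias that this sketch does not measure.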

== See also ==

* Frequency Weighting (spectral analysis)
* Inverse probability weighting
* Propensity score matching

== References ==

This article was submitted to WikiJournal of Science for external academic peer review in 2023 (reviewer reports). The updated content was reintegrated into the Wikipedia page under a CC-BY-SA-3.0 license (2024). The version of record as reviewed is: Tal Galili; et al. (5 May 2024). "Design effect" (PDF). WikiJournal of Science. 7 (1): 4. doi:10.15347/WJS/2024.004. ISSN 2470-6345. Wikidata Q116768211.