kb/data/en.wikipedia.org/wiki/Replication_crisis-9.md

| title | chunk | source | category | tags | date_saved | instance |
| --- | --- | --- | --- | --- | --- | --- |
| Replication crisis | 10/15 | https://en.wikipedia.org/wiki/Replication_crisis | reference | science, encyclopedia | 2026-05-05T03:45:08.741659+00:00 | kb-cron |

==== Statistical heterogeneity ====
As also reported by Stanley and colleagues, a further reason studies might fail to replicate is high heterogeneity of the to-be-replicated effects. Heterogeneity (variance in research findings due to multiple true effect sizes rather than one) is measured by the I-squared statistic, which quantifies unexplained variation in effect sizes across studies. This variation can be due to differences in experimental methods, populations, cohorts, and statistical methods between replication studies. Heterogeneity poses a challenge to studies attempting to replicate previously found effect sizes. When heterogeneity is high, subsequent replications have a high probability of finding an effect size radically different from that of the original study.

Importantly, significant levels of heterogeneity are also found in direct/exact replications of a study. Stanley and colleagues discuss this while reporting a study by quantitative behavioral scientist Richard Klein and colleagues, where the authors attempted to replicate 15 psychological effects across 36 different sites in Europe and the U.S. In the study, Klein and colleagues found significant amounts of heterogeneity in 8 out of 16 effects (I-squared = 23% to 91%). While the replication sites intentionally differed on a variety of characteristics, such differences could account for very little of the heterogeneity. According to Stanley and colleagues, this suggested that heterogeneity could have been a genuine characteristic of the phenomena being investigated. For instance, phenomena might be influenced by so-called "hidden moderators": relevant factors that were previously not understood to be important in the production of a certain effect.

In their analysis of 200 meta-analyses of psychological effects, Stanley and colleagues found a median heterogeneity of I-squared = 74%. According to the authors, this level of heterogeneity can be considered "huge". It is three times larger than the random sampling variance of effect sizes measured in their study. If considered alongside sampling error, heterogeneity yields a standard deviation from one study to the next even larger than the median effect size of the 200 meta-analyses they investigated. The authors conclude that if replication is defined by a subsequent study finding a sufficiently similar effect size to the original, replication success is not likely even if replications have very large sample sizes. This holds even if replications are direct or exact, since heterogeneity nonetheless remains relatively high in these cases.
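As a rough illustration of how the I-squared statistic mentioned above is commonly obtained, the following sketch computes Cochran's Q from inverse-variance weights and then derives I-squared as the share of Q in excess of its degrees of freedom. The function name `i_squared` and the effect sizes and variances in the example are hypothetical; they are not taken from Stanley and colleagues or from Klein and colleagues.

```python
def i_squared(effects, variances):
    """Return (Q, I2) from per-study effect sizes and sampling variances.

    Uses the common definition I2 = max(0, (Q - df) / Q), where Q is
    Cochran's Q computed with inverse-variance weights. I2 is returned
    as a fraction between 0 and 1.
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")
    weights = [1.0 / v for v in variances]
    # Inverse-variance weighted mean effect across studies.
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled effect.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, i2


# Hypothetical data: three studies with equal precision but widely
# differing effect sizes (illustrative values only).
q, i2 = i_squared([0.1, 0.5, 0.9], [0.01, 0.01, 0.01])
print(f"Q = {q:.2f}, I-squared = {i2:.1%}")
```

With these made-up inputs most of the observed variation exceeds what sampling error alone would produce, which is the situation the text describes as high heterogeneity.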

==== Others ====
Within economics, the replication crisis may also be exacerbated because econometric results are fragile: using different but plausible estimation procedures or data preprocessing techniques can lead to conflicting results.

=== Context sensitivity ===
New York University professor Jay Van Bavel and colleagues argue that a further reason findings are difficult to replicate is the sensitivity to context of certain psychological effects. On this view, failures to replicate might be explained by contextual differences between the original experiment and the replication, often called "hidden moderators". Van Bavel and colleagues tested the influence of context sensitivity by reanalyzing the data of the widely cited Reproducibility Project carried out by the Open Science Collaboration. They re-coded effects according to their sensitivity to contextual factors and then tested the relationship between context sensitivity and replication success in various regression models. Context sensitivity was found to correlate negatively with replication success: higher ratings of context sensitivity were associated with lower probabilities of replicating an effect. This correlation remained significant even when adjusting for other factors considered important for reproducing results (e.g., effect size and sample size of the original, statistical power of the replication, methodological similarity between original and replication). In light of the results, the authors concluded that attempting a replication at a different time, in a different place, or with a different sample can significantly alter an experiment's results. Context sensitivity thus may be a reason certain effects fail to replicate in psychology.

=== Bayesian explanation ===
In the framework of Bayesian probability, by Bayes' theorem, rejecting the null hypothesis at the 5% significance level does not mean that the posterior probability of the alternative hypothesis is 95%, and the posterior probability is also different from the probability of replication. Consider a simplified case where there are only two hypotheses. Let the prior probability of the null hypothesis be $\Pr(H_0)$, and that of the alternative be $\Pr(H_1) = 1 - \Pr(H_0)$. For a given statistical study, let its false positive rate (significance level) be $\Pr(\text{find } H_1 \mid H_0)$, and its true positive rate (power) be $\Pr(\text{find } H_1 \mid H_1)$. For illustrative purposes, let the significance level be 0.05 and the power be 0.45 (underpowered). Now, by Bayes' theorem, conditional on the statistical study finding $H_1$ to be true, the posterior probability of $H_1$ actually being true is not $1 - \Pr(\text{find } H_1 \mid H_0) = 0.95$, but

$$\Pr(H_1 \mid \text{find } H_1) = \frac{\Pr(\text{find } H_1 \mid H_1)\,\Pr(H_1)}{\Pr(\text{find } H_1 \mid H_0)\,\Pr(H_0) + \Pr(\text{find } H_1 \mid H_1)\,\Pr(H_1)}$$
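
The posterior formula above can be evaluated numerically to see how strongly the result depends on the prior. The sketch below uses the significance level (0.05) and power (0.45) from the text; the function name `posterior_h1` and the prior values in the loop are assumptions chosen purely for illustration.

```python
def posterior_h1(prior_h1, alpha=0.05, power=0.45):
    """Posterior probability that H1 is true given the study 'finds' H1.

    alpha is Pr(find H1 | H0), the significance level; power is
    Pr(find H1 | H1). Both defaults are the illustrative values from
    the text. prior_h1 is Pr(H1), so Pr(H0) = 1 - prior_h1.
    """
    if not 0.0 <= prior_h1 <= 1.0:
        raise ValueError("prior_h1 must lie in [0, 1]")
    prior_h0 = 1.0 - prior_h1
    true_positive = power * prior_h1    # Pr(find H1 | H1) * Pr(H1)
    false_positive = alpha * prior_h0   # Pr(find H1 | H0) * Pr(H0)
    evidence = true_positive + false_positive
    if evidence == 0.0:
        raise ValueError("a positive finding has zero probability")
    return true_positive / evidence


# Hypothetical priors for H1, not taken from the source.
for prior in (0.5, 0.1):
    print(f"Pr(H1) = {prior:.2f} -> Pr(H1 | find H1) = {posterior_h1(prior):.2f}")
```

Under these assumptions, an even prior gives a posterior of about 0.90, while a prior of 0.1 gives about 0.50, in both cases below the 0.95 that a naive reading of the significance level would suggest.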