--- title: "Replication crisis" chunk: 2/15 source: "https://en.wikipedia.org/wiki/Replication_crisis" category: "reference" tags: "science, encyclopedia" date_saved: "2026-05-05T03:45:08.741659+00:00" instance: "kb-cron" --- A null hypothesis test is a decision procedure which takes in some data, and outputs either H 0 {\displaystyle H_{0}} or H 1 {\displaystyle H_{1}} . If it outputs H 1 {\displaystyle H_{1}} , it is usually stated as "there is a statistically significant effect" or "the null hypothesis is rejected". Often, the statistical test is a (one-sided) threshold test, which is structured as follows: Gather data D {\displaystyle D} . Compute a test statistic t [ D ] {\displaystyle t[D]} for the data. Compare the test statistic against a critical value/threshold t threshold {\displaystyle t_{\text{threshold}}} . If t [ D ] > t threshold {\displaystyle t[D]>t_{\text{threshold}}} , then output H 1 {\displaystyle H_{1}} , else, output H 0 {\displaystyle H_{0}} . A two-sided threshold test is similar, but with two thresholds, such that it outputs H 1 {\displaystyle H_{1}} if either t [ D ] < t threshold − {\displaystyle t[D] t threshold + {\displaystyle t[D]>t_{\text{threshold}}^{+}} There are 4 possible outcomes of a null hypothesis test: false negative, true negative, false positive, true positive. A false negative means that H 0 {\displaystyle H_{0}} is true, but the test outcome is H 1 {\displaystyle H_{1}} ; a true negative means that H 0 {\displaystyle H_{0}} is true, and the test outcome is H 0 {\displaystyle H_{0}} , etc. Significance level, false positive rate, or the alpha level, is the probability of finding the alternative to be true when the null hypothesis is true: ( significance ) := α := P r ( find H 1 | H 0 ) {\displaystyle ({\text{significance}}):=\alpha :=Pr({\text{find }}H_{1}|H_{0})} For example, when the test is a one-sided threshold test, then α = P r D ∼ H 0 ( t [ D ] > t threshold ) {\displaystyle \alpha =Pr_{D\sim H_{0}}(t[D]>t_{\text{threshold}})} where D ∼ H 0 {\displaystyle D\sim H_{0}} means "the data is sampled from H 0 {\displaystyle H_{0}} ". Statistical power, true positive rate, is the probability of finding the alternative to be true when the alternative hypothesis is true: ( power ) := 1 − β := P r ( find H 1 | H 1 ) {\displaystyle ({\text{power}}):=1-\beta :=Pr({\text{find }}H_{1}|H_{1})} where β {\displaystyle \beta } is also called the false negative rate. For example, when the test is a one-sided threshold test, then 1 − β = P r D ∼ H 1 ( t [ D ] > t threshold ) {\displaystyle 1-\beta =Pr_{D\sim H_{1}}(t[D]>t_{\text{threshold}})} . Given a statistical test and a data set D {\displaystyle D} , the corresponding p-value is the probability that the test statistic is at least as extreme, conditional on H 0 {\displaystyle H_{0}} . For example, for a one-sided threshold test, p [ D ] = P r D ′ ∼ H 0 ( t [ D ′ ] > t [ D ] ) {\displaystyle p[D]=Pr_{D'\sim H_{0}}(t[D']>t[D])} If the null hypothesis is true, then the p-value is distributed uniformly on [ 0 , 1 ] {\displaystyle [0,1]} . Otherwise, it is typically peaked at p = 0.0 {\displaystyle p=0.0} and roughly exponential, though the precise shape of the p-value distribution depends on what the alternative hypothesis is. Because the p-values are distributed uniformly on [ 0 , 1 ] {\displaystyle [0,1]} under the null hypothesis, researchers can set any significance level α {\displaystyle \alpha } by computing the p-value, then output H 1 {\displaystyle H_{1}} if p [ D ] < α {\displaystyle p[D]<\alpha } . This is usually stated as "the null hypothesis is rejected at significance level α {\displaystyle \alpha } ", or " H 1 ( p < α ) {\displaystyle H_{1}\;(p<\alpha )} ", such as "smoking is correlated with cancer (p < 0.001)". === History === The replication crisis dates to a number of events in the early 2010s. Felipe Romero identified four precursors to the crisis: