| title | chunk | source | category | tags | date_saved | instance |
|---|---|---|---|---|---|---|
| Replication crisis | 6/15 | https://en.wikipedia.org/wiki/Replication_crisis | reference | science, encyclopedia | 2026-05-05T03:45:08.741659+00:00 | kb-cron |
Academic "publish or perish" culture exacerbates publication bias. Intense pressure to publish in recognized journals, driven by hypercompetitive environments and bibliometric career evaluations, incentivizes researchers to prioritize publishable results over validity. According to Fanelli, this pushes scientists to employ a number of strategies aimed at making results "publishable". In the context of publication bias, this can mean adopting behaviors aimed at making results positive or statistically significant, often at the expense of their validity. According to Center for Open Science founder Brian Nosek and his colleagues, "publish or perish" culture created a situation whereby the goals and values of individual scientists (e.g., publishability) are not aligned with the general goals of science (e.g., pursuing scientific truth). This is detrimental to the validity of published findings.
Philosopher Brian D. Earp and psychologist Jim A. C. Everett argue that, although replication is in the best interests of academics and researchers as a group, features of academic psychological culture discourage replication by individual researchers. They argue that performing replications can be time-consuming and can take resources away from projects that reflect the researcher's original thinking. Replications are also harder to publish, largely because they are unoriginal, and even when they can be published they are unlikely to be viewed as major contributions to the field. Replications "bring less recognition and reward, including grant money, to their authors".
In his 1971 book Scientific Knowledge and Its Social Problems, philosopher and historian of science Jerome R. Ravetz predicted that science, in its progression from "little" science composed of isolated communities of researchers to "big" science or "techno-science", would suffer major problems in its internal system of quality control. He recognized that the incentive structure for modern scientists could become dysfunctional, creating perverse incentives to publish any findings, however dubious. According to Ravetz, quality in science is maintained only when there is a community of scholars, linked by a set of shared norms and standards, who are willing and able to hold each other accountable.
==== Standards of reporting ====
Certain publishing practices also make it difficult to conduct replications and to monitor the severity of the reproducibility crisis, because articles often lack descriptions detailed enough for other scholars to reproduce the study. The Reproducibility Project: Cancer Biology showed that of 193 experiments from 53 top papers about cancer published between 2010 and 2012, only 50 experiments from 23 papers had authors who provided enough information for researchers to redo the studies, sometimes with modifications. None of the 193 experiments examined had its experimental protocol fully described, and replicating 70% of the experiments required asking for key reagents. The aforementioned study of empirical findings in the Strategic Management Journal found that 70% of 88 articles could not be replicated due to a lack of sufficient information for data or procedures. In water resources and management, most of 1,987 articles published in 2017 were not replicable because of a lack of available information shared online. In studies of event-related potentials, only two-thirds of the information needed to replicate a study was reported in a sample of 150 studies, highlighting substantial gaps in reporting.
==== Procedural bias ====
By the Duhem-Quine thesis, scientific results are interpreted by both a substantive theory and a theory of instruments. For example, astronomical observations depend both on the theory of astronomical objects and the theory of telescopes. A large amount of non-replicable research might accumulate if there is a bias of the following kind: faced with a null result, a scientist prefers to treat the data as saying the instrument is insufficient; faced with a non-null result, a scientist prefers to accept the instrument as good, and treat the data as saying something about the substantive theory.
==== Cultural evolution ====
Smaldino and McElreath proposed a simple model for the cultural evolution of scientific practice. Each lab randomly decides to produce novel research or replication research, at different fixed levels of false positive rate, true positive rate, replication rate, and productivity (its "traits"). A lab might use more "effort", making the ROC curve more convex but decreasing productivity. A lab accumulates a score over its lifetime that increases with publications and decreases when another lab fails to replicate its results. At regular intervals, a random lab "dies" and another "reproduces" a child lab with traits similar to those of its parent. Labs with higher scores are more likely to reproduce. Under certain parameter settings, the population of labs converges to maximum productivity even at the price of very high false positive rates.
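The selection dynamic described above can be illustrated with a toy simulation. This is a loose sketch, not Smaldino and McElreath's actual model: it collapses each lab's traits into a single hypothetical `effort` value, omits the choice between novel and replication research, and uses made-up constants (base rate of true hypotheses, statistical power, chance that a false finding fails replication). The function name and all parameters are assumptions introduced for illustration.

```python
import random


def simulate_labs(n_labs=50, generations=1000, replication_penalty=1.0,
                  base_rate=0.1, seed=0):
    """Toy birth-death model; returns the mean effort after each generation."""
    rng = random.Random(seed)
    efforts = [rng.uniform(0.1, 1.0) for _ in range(n_labs)]
    history = [sum(efforts) / n_labs]

    for _generation in range(generations):
        scores = []
        for effort in efforts:
            # Assumption: low effort means more studies and more false positives.
            n_studies = 1 + int(10 * (1.0 - effort))
            false_positive_rate = 0.05 + 0.45 * (1.0 - effort)
            power = 0.8
            score = 0.0
            for _study in range(n_studies):
                hypothesis_true = rng.random() < base_rate
                p_positive = power if hypothesis_true else false_positive_rate
                if rng.random() < p_positive:
                    score += 1.0  # a positive result gets published
                    # Assumption: 10% of false findings later fail replication.
                    if not hypothesis_true and rng.random() < 0.1:
                        score -= replication_penalty
            scores.append(score)

        # One random lab "dies"; a parent is drawn with probability
        # proportional to its (non-negative) score.
        dead = rng.randrange(n_labs)
        weights = [max(s, 0.0) + 1e-9 for s in scores]
        parent = rng.choices(range(n_labs), weights=weights, k=1)[0]
        child_effort = efforts[parent] + rng.gauss(0.0, 0.02)  # small mutation
        efforts[dead] = min(1.0, max(0.05, child_effort))
        history.append(sum(efforts) / n_labs)

    return history


if __name__ == "__main__":
    trajectory = simulate_labs()
    print(f"mean effort: start={trajectory[0]:.2f}, end={trajectory[-1]:.2f}")
```

With these invented parameters, low-effort labs tend to earn higher scores, so one would expect mean effort to drift downward when the replication penalty is small. The exact trajectory depends on the seed and on the assumed constants.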
=== Questionable research practices ===
Questionable research practices are behaviors that exploit researcher degrees of freedom (researcher DF)—choices in study design, data analysis, or reporting—to inflate false positive rates and undermine reproducibility. Examples of questionable research practices include data dredging, selective reporting of only statistically significant findings, HARKing (hypothesizing after results are known), PARKing (pre-registering after results are known), and conducting inappropriate power analyses.
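A back-of-the-envelope calculation shows why practices such as data dredging inflate false positive rates. Assume a researcher runs several independent tests on effects that are all truly null, each at significance level `alpha`: the chance of at least one "significant" result is then 1 - (1 - alpha)^k for k tests. The sketch below rests on those assumptions (independence, every null true); the function name is hypothetical.

```python
def chance_of_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one significant result among n_tests
    independent tests when every null hypothesis is true."""
    if n_tests < 0 or not 0.0 <= alpha <= 1.0:
        raise ValueError("n_tests must be >= 0 and alpha must lie in [0, 1]")
    return 1.0 - (1.0 - alpha) ** n_tests


for k in (1, 5, 20):
    print(k, round(chance_of_false_positive(k), 3))
# prints:
# 1 0.05
# 5 0.226
# 20 0.642
```

Under these assumptions, trying twenty outcome measures and reporting whichever one crosses the threshold would produce a spurious finding roughly two times in three.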
==== Genesis ====
Researchers' degrees of freedom occur at many stages: hypothesis formulation, design of experiments, data collection and analysis, and reporting of research. Analyses of identical datasets by different teams, even absent incentives for significant findings, often yield divergent results in disciplines such as psychology, linguistics, and ecology. This is because research design and data analysis entail numerous decisions that are not sufficiently constrained by a field's best practices and statistical methodologies. As a result, researcher DF can lead to situations where some failed replication attempts use a different, yet plausible, research design or statistical analysis; such studies do not necessarily undermine previous findings. Multiverse analysis, a method that makes inferences based on all plausible data-processing pipelines, provides a solution to the problem of analytical flexibility. Sensitivity analysis explores modelling specifications to create a comprehensive view of how different analytical choices influence outcomes. Collaborative approaches can be used to compensate for questionable research practices. In multianalyst approaches, different analysts conduct different analyses to address questions. This collaborative validation fosters intellectual honesty and exposes questionable research practices, leading to more reliable and robust scientific conclusions.
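The idea behind multiverse analysis can be sketched in a few lines: enumerate every combination of defensible analytic choices and report the estimate from each one instead of a single preferred pipeline. The example below is a minimal illustration on synthetic data; the three choice dimensions (outlier rule, transformation, summary statistic), the data generator, and all function names are assumptions made for this sketch, not a prescribed procedure.

```python
import itertools
import math
import random
import statistics


def make_groups(n=100, seed=0):
    """Synthetic, strictly positive outcomes for two hypothetical groups."""
    rng = random.Random(seed)
    control = [rng.lognormvariate(0.0, 0.5) for _ in range(n)]
    treated = [rng.lognormvariate(0.1, 0.5) for _ in range(n)]
    return treated, control


def trim(values, n_sd):
    """Drop points farther than n_sd standard deviations from the mean."""
    if n_sd is None:
        return list(values)
    center = statistics.mean(values)
    spread = statistics.stdev(values)
    return [v for v in values if abs(v - center) <= n_sd * spread]


# Each dictionary is one "fork" in the analysis (illustrative choices only).
OUTLIER_RULES = {"none": None, "2sd": 2.0, "3sd": 3.0}
TRANSFORMS = {"raw": lambda v: v, "log": math.log}
SUMMARIES = {"mean": statistics.mean, "median": statistics.median}


def multiverse(treated, control):
    """Return the group difference under every combination of choices."""
    results = []
    for (rule, n_sd), (t_name, transform), (s_name, summarize) in itertools.product(
            OUTLIER_RULES.items(), TRANSFORMS.items(), SUMMARIES.items()):
        t = [transform(v) for v in trim(treated, n_sd)]
        c = [transform(v) for v in trim(control, n_sd)]
        results.append({
            "outliers": rule,
            "transform": t_name,
            "summary": s_name,
            "difference": summarize(t) - summarize(c),
        })
    return results


if __name__ == "__main__":
    treated, control = make_groups()
    for row in multiverse(treated, control):
        print(row)
```

Here 3 × 2 × 2 = 12 pipelines are evaluated. Reporting the whole set makes visible how much a conclusion depends on choices that a single-pipeline report would leave implicit.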