kb/data/en.wikipedia.org/wiki/Data_dredging-2.md

5.9 KiB

title chunk source category tags date_saved instance
Data dredging 3/3 https://en.wikipedia.org/wiki/Data_dredging reference science, encyclopedia 2026-05-05T07:02:25.227717+00:00 kb-cron

== Appearance in media ==
One example is the chocolate weight loss hoax study conducted by journalist John Bohannon, who explained publicly in a Gizmodo article that the study was deliberately conducted fraudulently as a social experiment. The study was widely reported by media outlets around 2015, and many people believed its claim that eating a chocolate bar every day would cause them to lose weight, against their better judgement. The study was attributed to the Institute of Diet and Health, an organization that Bohannon said existed only as a website. According to Bohannon, measuring 18 different variables was crucial to pushing the p-value below 0.05: the more outcomes that are tested, the greater the chance that at least one of them crosses the threshold by chance alone.
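The effect of testing 18 variables can be made concrete with a short calculation. The sketch below assumes that the 18 tests are independent and that no real effect exists; this is a simplification, since measurements taken on the same subjects are usually correlated. The function name is illustrative and does not come from the study.

```python
def prob_at_least_one_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of `num_tests` independent tests of a true
    null hypothesis yields p < alpha (assumption: tests are independent)."""
    return 1.0 - (1.0 - alpha) ** num_tests


# Compare a single test with several tests on the same study.
for k in (1, 5, 18):
    print(f"{k:2d} tests: {prob_at_least_one_false_positive(k):.3f}")
```

Under these assumptions, 18 tests give roughly a 60% chance of at least one nominally significant result, compared with 5% for a single test.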

== Remedies ==
While looking for patterns in data is legitimate, repeatedly applying a statistical test of significance or hypothesis test to the same data until a pattern emerges is prone to abuse. One way to construct hypotheses while avoiding data dredging is to conduct randomized out-of-sample tests. The researcher collects a data set, then randomly partitions it into two subsets, A and B. Only one subset (say, subset A) is examined for creating hypotheses. Once a hypothesis is formulated, it must be tested on subset B, which was not used to construct the hypothesis. Only where B also supports the hypothesis is it reasonable to believe that the hypothesis might be valid. (This is a simple type of cross-validation and is often termed training-test or split-half validation.)

Another remedy for data dredging is to record the number of all significance tests conducted during the study and divide one's criterion for significance (alpha) by this number; this is the Bonferroni correction. However, this correction is very conservative. A family-wise alpha of 0.05, divided in this way by 1,000 to account for 1,000 significance tests, yields a very stringent per-hypothesis alpha of 0.00005. Methods particularly useful in analysis of variance, and in constructing simultaneous confidence bands for regressions involving basis functions, are Scheffé's method and, if the researcher has in mind only pairwise comparisons, the Tukey method.

To avoid the extreme conservativeness of the Bonferroni correction, more sophisticated selective inference methods are available. The most common is Benjamini and Hochberg's false discovery rate controlling procedure, a less conservative approach that has become a popular way to control for multiple hypothesis tests.

When none of these approaches is practical, one can make a clear distinction between data analyses that are confirmatory and analyses that are exploratory.
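The difference between the Bonferroni correction and the Benjamini and Hochberg procedure can be illustrated with a minimal sketch using only the Python standard library. The function names and the ten p-values are hypothetical and chosen purely for illustration. The Benjamini and Hochberg step shown here is the standard step-up rule, whose guarantee is usually stated for independent tests.

```python
from typing import List


def bonferroni_threshold(alpha: float, num_tests: int) -> float:
    """Per-hypothesis significance level after the Bonferroni correction."""
    return alpha / num_tests


def bonferroni_reject(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Reject a hypothesis only if its p-value is at most alpha / m."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


def benjamini_hochberg_reject(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Step-up rule: find the largest rank k with p_(k) <= (k / m) * alpha
    and reject the k hypotheses with the smallest p-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            largest_k = rank
    reject = [False] * m
    for idx in order[:largest_k]:
        reject[idx] = True
    return reject


# Hypothetical p-values from ten tests (illustrative only).
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]

print("Bonferroni threshold for 1,000 tests:", bonferroni_threshold(0.05, 1000))
print("Bonferroni rejections:", sum(bonferroni_reject(p_values)))
print("Benjamini-Hochberg rejections:", sum(benjamini_hochberg_reject(p_values)))
```

For these illustrative values, the Bonferroni correction rejects one hypothesis and the Benjamini and Hochberg procedure rejects two, which reflects the less conservative behaviour described above.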
Statistical inference is appropriate only for confirmatory analyses.

Ultimately, the statistical significance of a test and the statistical confidence of a finding are joint properties of the data and the method used to examine the data. Thus, if someone says that a certain event has a probability of 20% ± 2%, 19 times out of 20, this means that if the probability of the event is estimated by the same method used to obtain the 20% estimate, the result is between 18% and 22% with probability 0.95. No claim of statistical significance can be made by only looking at the data, without due regard to the method used to assess them.

Academic journals are increasingly shifting to the registered report format, which aims to counteract serious problems such as data dredging and HARKing (hypothesizing after the results are known) that have made theory-testing research unreliable. For example, Nature Human Behaviour has adopted the registered report format, as it "shift[s] the emphasis from the results of research to the questions that guide the research and the methods used to answer them". The European Journal of Personality defines this format as follows: "In a registered report, authors create a study proposal that includes theoretical and empirical background, research questions/hypotheses, and pilot data (if available). Upon submission, this proposal will then be reviewed prior to data collection, and if accepted, the paper resulting from this peer-reviewed procedure will be published, regardless of the study outcomes."

Methods and results can also be made publicly available, as in the open science approach, making it yet more difficult for data dredging to take place.
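The split-sample remedy described earlier in this section (forming a hypothesis on subset A and testing it on subset B) can be sketched as a small simulation. Everything here is an assumption made for illustration: the data are pure Gaussian noise, the "hypothesis" is simply the predictor most correlated with the outcome on subset A, and all function names are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def split_half(n_rows, rng):
    """Randomly partition the row indices into two disjoint subsets A and B."""
    indices = list(range(n_rows))
    rng.shuffle(indices)
    half = n_rows // 2
    return indices[:half], indices[half:]


def dredge_then_validate(n_rows=200, n_predictors=50, seed=0):
    """Pick the 'best' noise predictor on subset A, then re-test it on subset B."""
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_rows)]
    predictors = [[rng.gauss(0, 1) for _ in range(n_rows)]
                  for _ in range(n_predictors)]
    subset_a, subset_b = split_half(n_rows, rng)

    def r_on(rows, j):
        return pearson_r([predictors[j][i] for i in rows],
                         [outcome[i] for i in rows])

    # Exploratory step: search subset A only for the strongest-looking pattern.
    best = max(range(n_predictors), key=lambda j: abs(r_on(subset_a, j)))
    # Confirmatory step: evaluate that single hypothesis on the untouched subset B.
    return best, r_on(subset_a, best), r_on(subset_b, best), subset_a, subset_b


best, r_a, r_b, subset_a, subset_b = dredge_then_validate()
print(f"selected predictor {best}: r on A = {r_a:+.3f}, r on B = {r_b:+.3f}")
```

Because every predictor in this sketch is noise, the correlation found on A is inflated by the search across 50 candidates, whereas the correlation on B is an honest estimate and will typically lie much closer to zero.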

== Further reading ==
* Elliott, Graham; Kudrin, Nikolay; Wüthrich, Kaspar (2025). "The Power of Tests for Detecting p-Hacking". The Review of Economics and Statistics.
* Ioannidis, John P.A. (August 30, 2005). "Why Most Published Research Findings Are False". PLOS Medicine. 2 (8) e124. San Francisco: Public Library of Science. doi:10.1371/journal.pmed.0020124. ISSN 1549-1277. PMC 1182327. PMID 16060722.
* Head, Megan L.; Holman, Luke; Lanfear, Rob; Kahn, Andrew T.; Jennions, Michael D. (13 March 2015). "The Extent and Consequences of P-Hacking in Science". PLOS Biology. 13 (3) e1002106. doi:10.1371/journal.pbio.1002106. PMC 4359000. PMID 25768323.
* Insel, Thomas (November 14, 2014). "P-Hacking". NIMH Director's Blog. Archived from the original on December 15, 2016.
* Smith, Gary (2016). Standard Deviations: Flawed Assumptions, Tortured Data, and Other Ways to Lie with Statistics. Gerald Duckworth & Co. ISBN 978-0-7156-4974-9.
* Stefan, Angelika M.; Schönbrodt, Felix D. (February 1, 2023). "Big little lies: a compendium and simulation of p-hacking strategies". Royal Society Open Science. 10 (2) 220346. Bibcode:2023RSOS...1020346S. doi:10.1098/rsos.220346. PMC 9905987. PMID 36778954.

== External links ==
* A bibliography on data-snooping bias
* Spurious Correlations, a gallery of examples of implausible correlations
* StatQuest: P-value pitfalls and power calculations on YouTube
* Video explaining p-hacking by "Neuroskeptic", a blogger at Discover Magazine
* Step Away From Stepwise, an article in the Journal of Big Data criticizing stepwise regression