6.6 KiB
| title | chunk | source | category | tags | date_saved | instance |
|---|---|---|---|---|---|---|
| Data sharing | 1/2 | https://en.wikipedia.org/wiki/Data_sharing | reference | science, encyclopedia | 2026-05-05T04:28:02.473355+00:00 | kb-cron |
Data sharing denotes the dissemination of research datasets to enable access and use by other investigators. Policies governing this practice are increasingly instituted by funding agencies, academic institutions, and scholarly journals, reflecting the consensus that transparency and openness constitute foundational principles of the scientific method. A number of funding agencies and science journals require authors of peer-reviewed papers to share any supplemental information (raw data, statistical methods or source code) necessary to understand, develop or reproduce published research. A great deal of scientific research is not subject to data sharing requirements, and many of these policies have liberal exceptions. In the absence of any binding requirement, data sharing is at the discretion of the scientists themselves. In addition, in certain situations governments and institutions prohibit or severely limit data sharing to protect proprietary interests, national security, and subject/patient/victim confidentiality. Data sharing may also be restricted to protect institutions and scientists from use of data for political purposes. Data and methods may be requested from an author years after publication. In order to encourage data sharing and prevent the loss or corruption of data, a number of funding agencies and journals established policies on data archiving. Access to publicly archived data is a recent development in the history of science made possible by technological advances in communications and information technology. To take full advantage of modern rapid communication may require consensual agreement on the criteria underlying mutual recognition of respective contributions. Models recognized for the timely sharing of data for more effective response to emergent infectious disease threats include the data sharing mechanism introduced by the GISAID Initiative. Despite policies on data sharing and archiving, data withholding still happens. Authors may fail to archive data or they only archive a portion of the data. Failure to archive data alone is not data withholding. When a researcher requests additional information, an author sometimes refuses to provide it. When authors withhold data like this, they run the risk of losing the trust of the science community. A 2022 study identified about 3500 research papers which contained statements that the data was available, but upon request and further seeking the data, found that it was unavailable for 94% of papers. Data sharing may also indicate the sharing of personal information on a social media platform.
== U.S. government policies ==
=== Federal law === On August 9, 2007, President Bush signed the America COMPETES Act (or the "America Creating Opportunities to Meaningfully Promote Excellence in Technology, Education, and Science Act") requiring civilian federal agencies to provide guidelines, policies and procedures, to facilitate and optimize the open exchange of data and research between agencies, the public and policymakers. See Section 1009.
=== NIH data sharing policy === 'The National Institutes of Health (NIH) Grants Policy Statement defines "data" as "recorded information, regardless of the form or medium on which it may be recorded, and includes writings, films, sound recordings, pictorial reproductions, drawings, designs, or other graphic representations, procedural manuals, forms, diagrams, work flow charts, equipment descriptions, data files, data processing or computer programs (software), statistical records, and other research data."' The NIH Final Statement of Sharing of Research Data says:
'NIH reaffirms its support for the concept of data sharing. We believe that data sharing is essential for expedited translation of research results into knowledge, products, and procedures to improve human health. The NIH endorses the sharing of final research data to serve these and other important scientific goals. The NIH expects and supports the timely release and sharing of final research data from NIH-supported studies for use by other researchers. 'NIH recognizes that the investigators who collect the data have a legitimate interest in benefiting from their investment of time and effort. We have therefore revised our definition of "the timely release and sharing" to be no later than the acceptance for publication of the main findings from the final data set. NIH continues to expect that the initial investigators may benefit from first and continuing use but not from prolonged exclusive use.'
=== NSF Policy from Grant General Conditions === 36. Sharing of Findings, Data, and Other Research Products a. NSF …expects investigators to share with other researchers, at no more than incremental cost and within a reasonable time, the data, samples, physical collections and other supporting materials created or gathered in the course of the work. It also encourages awardees to share software and inventions or otherwise act to make the innovations they embody widely useful and usable.
b. Adjustments and, where essential, exceptions may be allowed to safeguard the rights of individuals and subjects, the validity of results, or the integrity of collections or to accommodate legitimate interests of investigators.
== Office of Research Integrity == Allegations of misconduct in medical research carry severe consequences. The United States Department of Health and Human Services established an office to oversee investigations of allegations of misconduct, including data withholding. The website defines the mission:
"The Office of Research Integrity (ORI) promotes integrity in biomedical and behavioral research supported by the U.S. Public Health Service (PHS) at about 4,000 institutions worldwide. ORI monitors institutional investigations of research misconduct and facilitates the responsible conduct of research (RCR) through educational, preventive, and regulatory activities."
== Ideals in data sharing == Some research organizations feel particularly strongly about data sharing. Stanford University's WaveLab has a philosophy about reproducible research and disclosing all algorithms and source code necessary to reproduce the research. In a paper titled "WaveLab and Reproducible Research," the authors describe some of the problems they encountered in trying to reproduce their own research after a period of time. In many cases, it was so difficult they gave up the effort. These experiences are what convinced them of the importance of disclosing source code. The philosophy is described: