| title | chunk | source | category | tags | date_saved | instance |
|---|---|---|---|---|---|---|
| Open scientific data | 7/11 | https://en.wikipedia.org/wiki/Open_scientific_data | reference | science, encyclopedia | 2026-05-05T03:49:42.862927+00:00 | kb-cron |
== Legal status ==
The opening of scientific data has raised a variety of legal issues regarding ownership rights, copyright, privacy and ethics. While it is commonly considered that researchers "own the data they collect in the course of their research", this "view is incorrect": the creation of a dataset potentially involves the rights of numerous additional actors, such as institutions (research agencies, funders, public bodies), associated data producers, and the private citizens whose personal data are included. The legal situation of digital data has consequently been described as a "bundle of rights", because the "legal category of 'property' (...) is not a suitable model for dealing with the complexity of data governance problems".
=== Copyright ===
Copyright was the primary focus of the legal literature on open scientific data until the 2010s. The legality of data sharing was identified early on as a crucial issue. In contrast with the sharing of scientific publications, the main impediment was not copyright but uncertainty: "the concept of 'data' [was] a new concept, created in the computer age, while copyright law emerged at the time of printed publications." In theory, copyright and author rights provisions do not apply to simple collections of facts and figures. In practice, the notion of data is much more expansive and can include protected content or creative arrangements of non-copyrightable content. The status of data in international conventions on intellectual property is ambiguous. According to Article 2 of the Berne Convention, "every production in the literary, scientific and artistic domain" is protected. Yet research data is often not an original creation entirely produced by one or several authors, but rather a "collection of facts, typically collated using automated or semiautomated instruments or scientific equipment." Consequently, there is no universal convention on data copyright, and debates over "the extent to which copyright applies" are still prevalent, with different outcomes depending on the jurisdiction or the specifics of the dataset. This lack of harmonization stems logically from the novelty of "research data" as a key concept of scientific research: as noted above, the concept emerged in the computer age, whereas copyright law dates from the era of printed publications. In the United States, the European Union and several other jurisdictions, copyright laws have acknowledged a distinction between the data itself (which can be an unprotected "fact") and the compilation of the data (which can be a creative arrangement).
This principle largely predates the contemporary policy debate over scientific data, as the earliest court cases ruling in favor of compilation rights date back to the 19th century. In the United States, compilation rights were defined in the Copyright Act of 1976 with an explicit mention of datasets: "a work formed by the collection and assembling of pre-existing materials or of data" (§ 101). In its 1991 decision Feist Publications, Inc. v. Rural Telephone Service Co., the Supreme Court clarified the extent and limitations of database copyrights: the "assembling" must be demonstrably original, and the "raw facts" contained in the compilation remain unprotected. Even in jurisdictions where the application of copyright to data outputs remains unsettled and partly theoretical, it has nevertheless created significant legal uncertainty. The frontier between a set of raw facts and an original compilation is not clearly delineated. Although scientific organizations are usually well aware of copyright laws, the complexity of data rights creates unprecedented challenges. After 2010, national and supra-national jurisdictions partly changed their stance regarding the copyright protection of research data. As sharing is encouraged, scientific data has also been acknowledged as an informal public good: "policymakers, funders, and academic institutions are working to increase awareness that, while the publications and knowledge derived from research data pertain to the authors, research data needs to be considered a public good so that its potential social and scientific value can be realised".
=== Database rights ===
The European Union provides one of the strongest intellectual property frameworks for data, with a double layer of rights: copyright for original compilations (similar to the United States) and sui generis database rights. Criteria for the originality of compilations have been harmonized across the member states by the 1996 Database Directive and by several major cases decided by the European Court of Justice, such as Infopaq International A/S v Danske Dagblades Forening or Football Dataco Ltd et al. v Yahoo! UK Ltd. Overall, it has been acknowledged that significant effort in the making of the dataset is not sufficient to claim compilation rights, as the structure must allow the author to "express his creativity in an original manner". The Database Directive also introduced a novel framework of protection for datasets: the sui generis rights, which are conferred on any dataset that required a "substantial investment". While they last 15 years, sui generis rights have the potential to become permanent, as they can be renewed with every update of the dataset. Because of their broad scope in duration and protection, sui generis rights were initially not widely acknowledged in European jurisprudence, which set a high bar for their enforcement. This cautious approach was reversed in the 2010s, as the 2013 decision Innoweb BV v Wegener ICT Media BV and Wegener Mediaventions strengthened the position of database owners and condemned the reuse of non-protected data in web search engines. The consolidation and expansion of database rights remain a controversial topic in European regulation, as they are partly at odds with the commitment of the European Union to a data-driven economy and open science. While a few exceptions exist for scientific and pedagogic uses, they are limited in scope (no rights for further reutilization) and they have not been activated in all member states.