kb/data/en.wikipedia.org/wiki/Open_energy_system_databases-0.md

6.7 KiB

title chunk source category tags date_saved instance
Open energy system databases 1/5 https://en.wikipedia.org/wiki/Open_energy_system_databases reference science, encyclopedia 2026-05-05T06:32:17.894236+00:00 kb-cron

Open energy system database projects employ open data methods to collect, clean, and republish energy-related datasets for open use. The resulting information is then available, given a suitable open license, for statistical analysis and for building numerical energy system models, including open energy system models. Permissive licenses like Creative Commons CC0 and CC BY are preferred, but some projects will house data made public under market transparency regulations and carrying unqualified copyright.

The databases themselves may furnish information on national power plant fleets, renewable generation assets, transmission networks, time series for electricity loads, dispatch, spot prices, and cross-border trades, as well as weather information and similar data. They may also offer other energy statistics, including fossil fuel imports and exports, gas, oil, and coal prices, emissions certificate prices, and information on energy efficiency costs and benefits.

Much of the data is sourced from official or semi-official agencies, including national statistics offices, transmission system operators, and electricity market operators. Data is also crowdsourced using public wikis and public upload facilities. Projects usually also maintain a strict record of the provenance and version histories of the datasets they hold. Some projects, as part of their mandate, also try to persuade primary data providers to release their data under more liberal licensing conditions.

Two drivers favor the establishment of such databases. The first is a wish to reduce the duplication of effort that accompanies each new analytical project as it assembles and processes the data that it needs from primary sources. The second is an increasing desire to make public policy energy models more transparent, in order to improve their acceptance by policymakers and the public. Better transparency requires the use of open information that third parties can access and scrutinize, in addition to the release of the source code for the models in question.

== General considerations ==

=== Background ===

In the mid-1990s, energy models used structured text files for data interchange, but efforts were being made to migrate to relational database management systems for data processing. These early efforts, however, remained local to a project and did not involve online publishing or open data principles. The first energy information portal to go live was OpenEI in late 2009, followed by reegle in 2011. A 2012 paper marks the first scientific publication to advocate the crowdsourcing of energy data. The 2012 PhD thesis by Chris Davis also discusses the crowdsourcing of energy data in some depth. A 2016 thesis surveyed the spatial (GIS) information requirements for energy planning and found that most types of data, with the exception of energy expenditure data, are available but nonetheless remain scattered and poorly coordinated. In terms of open data, a 2017 paper concludes that energy research has lagged behind other fields, most notably physics, biotechnology, and medicine. The paper also lists the benefits of open data and open models and discusses the reasons that many projects nonetheless remain closed. A one-page opinion piece from 2017 advances the case for using open energy data and modeling to build public trust in policy analysis. The article also argues that scientific journals have a responsibility to require that data and code be submitted alongside text for peer review.

=== Database design ===

Data models are central to the design and organization of databases. Open energy database projects generally try to develop and adhere to well-resolved data models, using de facto and published standards where applicable. Some projects attempt to coordinate their data models in order to harmonize their data and improve its utility. Defining and maintaining suitable metadata is also a key issue. The life-cycle management of data includes, but is not limited to, the use of version control to track the provenance of incoming and cleansed data. Some sites allow users to comment on and rate individual datasets.
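
The combination of metadata and provenance tracking described here can be illustrated with a minimal sketch. The schema below is purely hypothetical and does not reflect the design of any particular project: the class names, field names, stage labels, and sample data are all assumptions made for illustration. Each stored version is fingerprinted with a SHA-256 hash and linked to the version it was derived from, so that a cleansed dataset can be traced back to the incoming data.

```python
from dataclasses import dataclass, field
from hashlib import sha256
from typing import List, Optional


@dataclass(frozen=True)
class DatasetVersion:
    """One immutable snapshot of a dataset (hypothetical schema)."""
    content_sha256: str          # fingerprint of the stored bytes
    stage: str                   # assumed labels: "incoming" or "cleansed"
    derived_from: Optional[str]  # hash of the preceding version, if any
    note: str                    # free-text description of the change


@dataclass
class DatasetRecord:
    """Metadata plus a linear version history (hypothetical schema)."""
    name: str
    source: str
    license: str
    versions: List[DatasetVersion] = field(default_factory=list)

    def add_version(self, content: bytes, stage: str, note: str) -> DatasetVersion:
        # Link the new snapshot to the most recent one, if there is one.
        parent = self.versions[-1].content_sha256 if self.versions else None
        version = DatasetVersion(sha256(content).hexdigest(), stage, parent, note)
        self.versions.append(version)
        return version

    def provenance(self) -> List[str]:
        """Return the stages from the newest version back to the oldest."""
        return [v.stage for v in reversed(self.versions)]


# Illustrative usage with made-up data.
record = DatasetRecord(
    name="example_load_series",
    source="hypothetical operator download",
    license="CC-BY-4.0",
)
raw = b"timestamp,load_mw\n2020-01-01T00:00,41000\n2020-01-01T01:00,\n"
clean = b"timestamp,load_mw\n2020-01-01T00:00,41000\n"
v1 = record.add_version(raw, "incoming", "as downloaded")
v2 = record.add_version(clean, "cleansed", "dropped rows with missing load")
```

In practice a project might delegate this bookkeeping to a version control system or a database rather than hand-rolled classes; the sketch only shows which pieces of information need to be kept together.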

=== Dataset copyright and database rights ===

Issues surrounding copyright remain at the forefront with regard to open energy data. As noted, most energy datasets are collated and published by official or semi-official sources. But many of the publicly available energy datasets carry no license, which limits their reuse in numerical and statistical models, open or otherwise. Copyright-protected material cannot lawfully be circulated, nor can it be modified and republished. Measures to enforce market transparency have not helped much because the associated information is again not licensed to enable modification and republication. Transparency measures include the 2013 European energy market transparency regulation 543/2013. Indeed, 543/2013 "is only an obligation to publish, not an obligation to license". Notwithstanding, 543/2013 does enable downloaded data to be computer processed with legal certainty. Energy databases with hardware located within the European Union are protected under a general database law, irrespective of the legal status of the information they hold. Database rights not waived by public sector providers significantly restrict the amount of data a user can lawfully access. A December 2017 submission by energy researchers in Germany and elsewhere highlighted a number of concerns over the re-use of public sector information within the European Union. The submission drew heavily on a recent legal opinion covering electricity data.

=== Energy statistics ===

National and international energy statistics are published regularly by governments and international agencies, such as the IEA. In 2016, the United Nations issued guidelines for energy statistics. While the definitions and sectoral breakdowns are useful when defining models, the information provided is rarely in sufficient detail to enable its use in high-resolution energy system models.

=== Published standards ===

There are few published standards covering the collection and structuring of high-resolution energy system data. The IEC Common Information Model (CIM) defines data exchange protocols for low and high voltage electricity networks.

=== Non-open data ===

Although this page is about genuinely open data, some important databases remain closed. Data collected by the International Energy Agency (IEA) is widely quoted in policy studies but nonetheless remains paywalled. Researchers at Oxford University have called for this situation to change.