---
title: "Biological data"
chunk: 1/2
source: "https://en.wikipedia.org/wiki/Biological_data"
category: "reference"
tags: "science, encyclopedia"
date_saved: "2026-05-05T14:01:38.379909+00:00"
instance: "kb-cron"
---

Biological data refers to a compound or information derived from living organisms and their products. A medicinal compound made from living organisms, such as a serum or a vaccine, could be characterized as biological data. Biological data is highly complex compared with other forms of data. It takes many forms, including text, sequence data, protein structures, genomic data, amino acids, and links, among others.

== Biological data and bioinformatics ==

Biological data is closely tied to bioinformatics, a recent discipline that addresses the need to analyze and interpret vast amounts of genomic data. In the past few decades, leaps in genomic research have produced massive amounts of biological data. As a result, bioinformatics emerged as the convergence of genomics, biotechnology, and information technology, with a focus on biological data.

Biological data has also been difficult to define, as bioinformatics is a wide-ranging field. Further, the question of what constitutes a living organism has been contentious, as "alive" is a nebulous term that spans molecular evolution, biological modeling, biophysics, and systems biology. Over the past decade, bioinformatics and the analysis of biological data have thrived as a result of advances in the technology required to manage and interpret data. The field continues to grow as society has become more focused on the acquisition, transfer, and exploitation of bioinformatics and biological data.

== Types of biological data ==

Biological data can be extracted for use in the domains of omics, bio-imaging, and medical imaging. Life scientists value biological data because it provides molecular detail about living organisms.
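
Sequence data, one of the forms of biological data listed in the introduction, is commonly handled as plain strings over a small alphabet. The following is a minimal Python sketch, not a standard tool, that guesses whether a raw string looks like DNA, RNA, or a protein (amino acid) sequence. The function name `guess_sequence_type` and the alphabets are illustrative assumptions: only the four unambiguous nucleotide letters and the 20 standard one-letter amino acid codes are considered, and ambiguity codes are ignored.

```python
# Illustrative alphabets (assumption: no ambiguity codes such as N or X).
DNA_ALPHABET = set("ACGT")
RNA_ALPHABET = set("ACGU")
PROTEIN_ALPHABET = set("ACDEFGHIKLMNPQRSTVWY")  # 20 standard amino acids


def guess_sequence_type(seq):
    """Guess the kind of raw sequence from the letters it contains.

    The check is heuristic: a short string such as "ACG" is valid in all
    three alphabets, so the most restrictive match (DNA) is reported first.
    """
    letters = set(seq.upper())
    if not letters:
        return "empty"
    if letters <= DNA_ALPHABET:
        return "DNA"
    if letters <= RNA_ALPHABET:
        return "RNA"
    if letters <= PROTEIN_ALPHABET:
        return "protein"
    return "unknown"


if __name__ == "__main__":
    for example in ["ATGCGT", "AUGCGU", "MKVLAT", "ATGX"]:
        print(example, guess_sequence_type(example))
```

Because the alphabets overlap, a real pipeline would normally rely on file format metadata rather than on letter content alone; the sketch only illustrates that all three kinds of raw sequence data are strings over a fixed alphabet.
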
Tools for DNA sequencing, gene expression (GE), bio-imaging, neuro-imaging, and brain-machine interfaces all make use of biological data and model biological systems with high dimensionality.

Moreover, raw biological sequence data usually refers to DNA, RNA, and amino acids.

Biological data can also be described as data on biological entities. For instance, sequences, graphs, geometric information, scalar and vector fields, patterns, constraints, images, and spatial information may all be characterized as biological data, as they describe features of biological beings. In many instances, biological data are associated with several of these categories. For example, as described in the National Research Council report Catalyzing Inquiry at the Interface of Computing and Biology, a protein structure may be associated with a one-dimensional sequence, a two-dimensional image, a three-dimensional structure, and so on.

=== Biomedical databases ===

The term biomedical databases has often been used to refer to databases of Electronic Health Records (EHRs), genomic data in decentralized federated database systems, and biological data, including genomic data, collected from large-scale clinical studies.

== Bio-hacking and privacy threats ==

=== Bio-hacking ===

Bio-computing attacks have become more common, as recent studies have shown that common tools may allow an attacker to synthesize biological information that can be used to hijack information from DNA analyses. The threat of biohacking has become more apparent as DNA analysis becomes more common in fields such as forensic science, clinical research, and genomics. Biohacking can be carried out by synthesizing malicious DNA and inserting it into biological samples.
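
Because such an attack works by adding synthesized DNA to a sample, one conceivable safeguard is to compare sequenced reads against a trusted reference and flag material that does not match. The Python sketch below illustrates that idea with a simple k-mer comparison. It is a toy under stated assumptions, not the method used in the studies mentioned in this article: the function names, the k-mer length `k`, and the `threshold` are all hypothetical, and real screening would need to tolerate sequencing errors and natural variation.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings (k-mers) of seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def unknown_fraction(read, reference, k=8):
    """Fraction of the read's distinct k-mers that never occur in the reference."""
    read_kmers = kmers(read, k)
    if not read_kmers:
        # Read shorter than k: nothing to compare.
        return 0.0
    ref_kmers = kmers(reference, k)
    return len(read_kmers - ref_kmers) / len(read_kmers)


def flag_suspicious(reads, reference, k=8, threshold=0.5):
    """Return the reads whose share of unknown k-mers exceeds the threshold.

    The threshold value is an arbitrary illustrative choice.
    """
    return [r for r in reads if unknown_fraction(r, reference, k) > threshold]


if __name__ == "__main__":
    reference = "ACGTTGCAAGCTTAGGCTAACGTTAGC"   # made-up trusted sequence
    clean_read = reference[3:20]                # taken from the reference
    foreign_read = "TTTTTTTTTTTTTTTT"           # made-up inserted material
    print(flag_suspicious([clean_read, foreign_read], reference))
```

In this toy run only the made-up foreign read is flagged, since none of its k-mers appear in the reference, while the read copied from the reference shares all of them.
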
Researchers have described scenarios that demonstrate the threat of biohacking, such as an attacker reaching a biological sample by hiding malicious DNA on common surfaces, such as lab coats, benches, or rubber gloves, which would then contaminate the genetic data.

However, the threat of biohacking may be mitigated using techniques similar to those used to prevent conventional injection attacks. Clinicians and researchers may mitigate a bio-hack by extracting genetic information from biological samples and comparing the samples to identify unknown material. Studies have shown that comparing genetic information with biological samples to identify bio-hacking code has been up to 95% effective in detecting malicious DNA inserts in bio-hacking attacks.

=== Genetic samples as personal data ===

Privacy concerns in genomic research have arisen around the question of whether genomic samples contain personal data or should be regarded as physical matter. Concerns also arise because some countries recognize genomic data as personal data (and apply data protection rules), while other countries regard the samples as physical matter and do not apply the same data protection laws to them. The General Data Protection Regulation (GDPR), which has applied since May 2018, has been cited as a potential legal instrument that may better enforce privacy regulations in bio-banking and genomic research. However, ambiguity surrounding the definition of "personal data" in the text of the GDPR, especially regarding biological data, has led to doubts about whether the regulation will be enforced for genetic samples. Article 4(1) defines personal data as "any information relating to an identified or identifiable natural person ('data subject')"