kb/data/en.wikipedia.org/wiki/Biological_network_inference-0.md

title: Biological network inference
chunk: 1/3
source: https://en.wikipedia.org/wiki/Biological_network_inference
category: reference
tags: science, encyclopedia
date_saved: 2026-05-05T14:01:42.011989+00:00
instance: kb-cron

Biological network inference is the process of making inferences and predictions about biological networks. By using these networks to analyze patterns in biological systems, such as food webs, we can visualize the nature and strength of the interactions between species, DNA, proteins, and more. The analysis of biological networks with respect to diseases has led to the development of the field of network medicine. Recent applications of network theory in biology include understanding the cell cycle and a quantitative framework for developmental processes. Good network inference requires proper planning and execution of an experiment, thereby ensuring quality data acquisition. Optimal experimental design refers, in principle, to the use of statistical and/or mathematical concepts to plan data acquisition. This must be done in such a way that the information content of the data is enriched and a sufficient amount of data is collected, with enough technical and biological replicates where necessary.

== Steps ==

The general cycle for modeling biological networks is as follows:

1. Prior knowledge: involves a thorough literature and database search, or seeking an expert's opinion.
2. Model selection: choose a formalism to model the system, usually an ordinary differential equation, a Boolean network, a linear regression model (e.g. least-angle regression), a Bayesian network, or an approach based on information theory. It can also be done by applying a correlation-based inference algorithm, as will be discussed below, an approach that is having increasing success as the size of the available microarray sets keeps increasing.
3. Hypothesis/assumptions
4. Experimental design
5. Data acquisition: ensure that high-quality data is collected, with all the required variables being measured.
6. Network inference: this process is mathematically rigorous and computationally costly.
7. Model refinement: cross-check how well the results meet the expectations. The process is terminated upon obtaining a good model fit to the data; otherwise, the model needs to be re-adjusted.
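As an illustration of one of the formalisms named under model selection, the following is a minimal sketch of a Boolean network with synchronous updates. The three genes and their update rules are hypothetical and chosen only to show the mechanics; they do not describe any real regulatory system.

```python
from typing import Callable, Dict, List

State = Dict[str, bool]

# Hypothetical update rules: each gene's next state is a Boolean
# function of the current state of the whole network.
RULES: Dict[str, Callable[[State], bool]] = {
    "A": lambda s: not s["C"],         # C represses A
    "B": lambda s: s["A"],             # A activates B
    "C": lambda s: s["A"] and s["B"],  # A and B jointly activate C
}


def step(state: State) -> State:
    """Synchronous update: every gene reads the same current state."""
    return {gene: rule(state) for gene, rule in RULES.items()}


def simulate(state: State, n_steps: int) -> List[State]:
    """Return the trajectory of states, starting with the initial one."""
    trajectory = [state]
    for _ in range(n_steps):
        state = step(state)
        trajectory.append(state)
    return trajectory


initial = {"A": True, "B": False, "C": False}
trajectory = simulate(initial, 5)
```

With these particular rules the network returns to its initial state after five updates, so the trajectory is a cycle. Asynchronous update schemes are an equally valid choice and can give different dynamics.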

== Biological networks ==

A network is a set of nodes and a set of directed or undirected edges between the nodes. Many types of biological networks exist, including transcriptional, signalling and metabolic. Few such networks are known in anything approaching their complete structure, even in the simplest bacteria. Still less is known about the parameters governing the behavior of such networks over time, how the networks at different levels in a cell interact, and how to predict the complete state description of a eukaryotic cell or bacterial organism at a given point in the future. Systems biology, in this sense, is still in its infancy. There is great interest in network medicine for modelling biological systems.

This article focuses on inference of biological network structure using the growing sets of high-throughput expression data for genes, proteins, and metabolites. Briefly, methods using high-throughput data for inference of regulatory networks rely on searching for patterns of partial correlation or conditional probabilities that indicate causal influence. Such patterns of partial correlations found in the high-throughput data, possibly combined with other supplemental data on the genes or proteins in the proposed networks, or combined with other information on the organism, form the basis upon which such algorithms work. Such algorithms can be of use in inferring the topology of any network where the change in state of one node can affect the state of other nodes.
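To make the idea of partial correlation concrete, here is a minimal sketch for the three-variable case using the standard first-order formula. The expression values are made up: `z` stands for a hypothetical common regulator that drives both `x` and `y`. This only illustrates the quantity itself and is not a full inference algorithm.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def partial_corr(x: Sequence[float], y: Sequence[float], z: Sequence[float]) -> float:
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Made-up data: z drives both x and y, each with its own small noise.
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
noise_x = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 0.1, -0.1]
noise_y = [0.1, 0.1, -0.1, -0.1, 0.1, 0.1, -0.1, -0.1]
x = [a + e for a, e in zip(z, noise_x)]
y = [a + e for a, e in zip(z, noise_y)]

marginal = pearson(x, y)             # close to 1: x and y look tightly linked
conditional = partial_corr(x, y, z)  # close to 0: the link is explained by z
```

In this toy case the marginal correlation would suggest an edge between `x` and `y`, while the partial correlation points to `z` as the shared cause. With many genes, the same idea is usually applied through the inverse of the covariance matrix rather than this pairwise formula.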

== Transcriptional regulatory networks ==

Genes are the nodes and the edges are directed. A gene serves as the source of a direct regulatory edge to a target gene by producing an RNA or protein molecule that functions as a transcriptional activator or inhibitor of the target gene. If the gene is an activator, then it is the source of a positive regulatory connection; if an inhibitor, then it is the source of a negative regulatory connection. Computational algorithms take as primary input measurements of mRNA expression levels of the genes under consideration for inclusion in the network, and return an estimate of the network topology. Such algorithms are typically based on linearity, independence or normality assumptions, which must be verified on a case-by-case basis.

Clustering or some form of statistical classification is typically employed to perform an initial organization of the high-throughput mRNA expression values derived from microarray experiments, in particular to select sets of genes as candidates for network nodes. The question then arises: how can the clustering or classification results be connected to the underlying biology? Such results can be useful for pattern classification, for example to classify subtypes of cancer or to predict differential responses to a drug (pharmacogenomics). But to understand the relationships between the genes, that is, to define more precisely the influence of each gene on the others, the scientist typically attempts to reconstruct the transcriptional regulatory network.
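The directed, signed structure described above can be represented very simply. The sketch below stores each edge as a (regulator, target) pair with a sign of +1 for activation and -1 for inhibition. The gene names and edges are hypothetical placeholders, not inferred from any dataset.

```python
from typing import Dict, List, Tuple

# Hypothetical edges: (regulator, target) -> sign.
# +1 means the regulator activates the target, -1 means it inhibits it.
EDGES: Dict[Tuple[str, str], int] = {
    ("geneA", "geneB"): +1,
    ("geneA", "geneC"): -1,
    ("geneB", "geneC"): +1,
}

LABELS = {+1: "activator", -1: "inhibitor"}


def regulators_of(target: str) -> List[Tuple[str, str]]:
    """List the regulators of a target gene with their role, sorted by name."""
    return sorted(
        (source, LABELS[sign])
        for (source, tgt), sign in EDGES.items()
        if tgt == target
    )


def targets_of(regulator: str) -> List[str]:
    """List the genes directly regulated by the given gene."""
    return sorted(tgt for (source, tgt) in EDGES if source == regulator)
```

An inference algorithm would estimate a structure of this kind from expression data; here it is written by hand only to show what the output topology looks like.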

== Gene co-expression networks ==

A gene co-expression network is an undirected graph, where each node corresponds to a gene, and a pair of nodes is connected with an edge if there is a significant co-expression relationship between them.
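A minimal sketch of this construction follows. It assumes, purely for illustration, that "significant co-expression" is approximated by an absolute Pearson correlation above a fixed cutoff; real analyses typically use a statistical test or a data-driven threshold instead. The gene names and expression profiles are made up.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def coexpression_edges(
    profiles: Dict[str, Sequence[float]], cutoff: float
) -> List[Tuple[str, str]]:
    """Undirected edges between genes whose |correlation| meets the cutoff."""
    edges = []
    for a, b in combinations(sorted(profiles), 2):
        if abs(pearson(profiles[a], profiles[b])) >= cutoff:
            edges.append((a, b))
    return edges


# Made-up expression profiles across five samples.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [2.0, 4.0, 6.0, 8.0, 10.5],
    "g3": [5.0, 1.0, 4.0, 2.0, 3.0],
}

edges = coexpression_edges(profiles, cutoff=0.8)
```

Because the graph is undirected, each pair is stored once with the gene names in sorted order.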

== Signal transduction ==

Signal transduction networks use proteins for the nodes and directed edges to represent interactions in which the biochemical conformation of the child is modified by the action of the parent (e.g. mediated by phosphorylation, ubiquitylation, methylation, etc.). The primary input into the inference algorithm would be data from a set of experiments measuring protein activation and inactivation (e.g. phosphorylation and dephosphorylation) across a set of proteins. Inference for such signalling networks is complicated by the fact that total concentrations of signalling proteins fluctuate over time due to transcriptional and translational regulation. Such variation can lead to statistical confounding. Accordingly, more sophisticated statistical techniques must be applied to analyse such datasets, a point that is very important in the biology of cancer.
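The confounding effect can be seen in a small made-up example. In the sketch below, two hypothetical proteins have phosphorylated fractions that are unrelated to each other, but their total abundance rises together over time. Dividing the phospho signal by the total is used here only as a naive baseline correction; it is an assumption of this sketch, not one of the more sophisticated techniques the text refers to.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def fraction(phospho: Sequence[float], total: Sequence[float]) -> List[float]:
    """Phosphorylated fraction at each time point."""
    return [p / t for p, t in zip(phospho, total)]


# Made-up time course: total abundance of both proteins rises together.
total = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
phospho_a = [0.5, 0.8, 1.5, 1.6, 2.5, 2.4]
phospho_b = [0.3, 0.6, 1.2, 1.6, 1.5, 1.8]

# Raw phospho signals look strongly correlated because both track `total`.
raw_corr = pearson(phospho_a, phospho_b)

# After normalizing by total abundance the apparent association vanishes.
frac_a = fraction(phospho_a, total)
frac_b = fraction(phospho_b, total)
normalized_corr = pearson(frac_a, frac_b)
```

Here the raw correlation is high while the correlation between phosphorylated fractions is essentially zero, so an edge inferred from the raw signals would be spurious.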

== Metabolic network ==

Metabolite networks use nodes to represent chemical reactions and directed edges for the metabolic pathways and regulatory interactions that guide these reactions. Primary input into an algorithm would be data from a set of experiments measuring metabolite levels.
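A minimal sketch of this representation is shown below, with reactions as nodes and a directed edge from one reaction to another whenever a product of the first is a substrate of the second. The three reactions are a heavily simplified version of the first steps of glycolysis (cofactors such as ATP are omitted), and the edge rule is an illustrative assumption that ignores regulatory interactions.

```python
from typing import Dict, List, Set, Tuple

Reaction = Dict[str, Set[str]]

# Simplified reactions: enzyme abbreviation -> substrates and products.
REACTIONS: Dict[str, Reaction] = {
    "HK": {"substrates": {"glucose"}, "products": {"G6P"}},
    "PGI": {"substrates": {"G6P"}, "products": {"F6P"}},
    "PFK": {"substrates": {"F6P"}, "products": {"F16BP"}},
}


def reaction_edges(reactions: Dict[str, Reaction]) -> List[Tuple[str, str]]:
    """Directed edges (upstream, downstream) linked by a shared metabolite."""
    edges = []
    for up_name, up in reactions.items():
        for down_name, down in reactions.items():
            if up_name != down_name and up["products"] & down["substrates"]:
                edges.append((up_name, down_name))
    return sorted(edges)


edges = reaction_edges(REACTIONS)
```

In an inference setting the reaction list itself would not be known in advance; measured metabolite levels would be used to propose or rank such edges.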

== Protein-protein interaction networks ==