kb/data/en.wikipedia.org/wiki/Biological_network_inference-2.md


| title | chunk | source | category | tags | date_saved | instance |
| --- | --- | --- | --- | --- | --- | --- |
| Biological network inference | 3/3 | https://en.wikipedia.org/wiki/Biological_network_inference | reference | science, encyclopedia | 2026-05-05T14:01:42.011989+00:00 | kb-cron |

=== Centrality Analysis ===

Centrality estimates how important a node or edge is for the connectivity or information flow of the network. It is a useful parameter in signalling networks and is often used when searching for drug targets. It is most commonly applied to protein-protein interaction networks (PINs) to identify important proteins and their functions. Centrality can be measured in different ways depending on the graph and the question being asked. Measures include the degree of a node (the number of edges connected to it), global centrality measures, and random-walk measures such as the one used by the Google PageRank algorithm to assign a weight to each webpage. Centrality measures may be affected by measurement noise and other sources of error. The topological descriptors should therefore be treated as random variables, each with an associated probability distribution that encodes the uncertainty in its value.
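
As a rough illustration of two of the measures mentioned above, the following sketch computes degree centrality and a PageRank-style random-walk score on a small undirected graph. The toy edge list, the function names, and the damping value of 0.85 are assumptions made for this example and are not taken from the article.

```python
from collections import defaultdict

def degree_centrality(edges):
    """Count the edges attached to each node of an undirected graph."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return dict(degree)

def pagerank(edges, damping=0.85, iterations=100):
    """Random-walk centrality by power iteration on an undirected graph."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    nodes = sorted(neighbours)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            # Each neighbour shares its current rank equally among its edges.
            incoming = sum(rank[m] / len(neighbours[m]) for m in neighbours[n])
            new_rank[n] = (1 - damping) / len(nodes) + damping * incoming
        rank = new_rank
    return rank

# Hypothetical toy interaction network in which "A" is a hub.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("D", "E")]
degrees = degree_centrality(toy_edges)
ranks = pagerank(toy_edges)
hub = max(ranks, key=ranks.get)
```

In this toy graph both measures single out node "A". On real interaction data the two rankings can disagree, and, as noted above, both inherit the uncertainty of the underlying measurements.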

=== Topological Clustering ===

Topological Clustering or Topological Data Analysis (TDA) provides a general framework for analyzing high-dimensional, incomplete, and noisy data in a way that reduces dimensionality and is robust to noise. The idea is that the shape of a data set contains relevant information. When this information is a homology group, there is a mathematical interpretation which assumes that features persisting over a wide range of parameters are "true" features, while features persisting over only a narrow range of parameters are noise, although the theoretical justification for this is unclear. This technique has been used for progression analysis of disease, viral evolution, propagation of contagions on networks, bacteria classification using molecular spectroscopy, and other problems in and outside of biology.
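
The persistence idea can be sketched for the simplest case, connected components (0-dimensional homology), without any TDA library. In the sketch below every point starts as its own component, and a component "dies" at the distance scale where it merges into another. The point set and the function name `h0_persistence` are hypothetical, and this covers only a small part of what a full TDA pipeline does.

```python
from itertools import combinations
from math import dist

def h0_persistence(points):
    """Return the scales at which connected components merge (die).

    Every point is born as its own component at scale 0. Pairs are
    processed by increasing distance; each merge ends one component.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = sorted(
        combinations(range(len(points)), 2),
        key=lambda p: dist(points[p[0]], points[p[1]]),
    )
    deaths = []
    for i, j in pairs:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            deaths.append(dist(points[i], points[j]))
    return deaths  # the last surviving component never dies and is omitted

# Hypothetical data: two well-separated groups of points.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
deaths = h0_persistence(toy_points)
# Components that survive past scale 5, plus the one that never dies.
long_lived = sum(1 for d in deaths if d > 5) + 1
```

Here three components die at a small scale and one survives much longer, so under the persistence heuristic the data would be read as two "true" clusters. The cutoff of 5 is an arbitrary choice for this toy data.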

=== Shortest paths ===

The shortest path problem is a common problem in graph theory: find the path between two vertices (or nodes) in a graph such that the sum of the weights of its constituent edges is minimized. Shortest paths can be used to determine the network diameter or redundancy in a network. Many algorithms exist for this problem, including Dijkstra's algorithm, the Bellman–Ford algorithm, and the Floyd–Warshall algorithm.
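
A minimal sketch of Dijkstra's algorithm follows, together with a diameter computed as the largest shortest-path distance between any pair of nodes. It assumes non-negative edge weights and a connected graph stored as a dictionary of dictionaries. The graph and the helper names are illustrative only.

```python
import heapq

def dijkstra(graph, source):
    """Shortest distances from source; graph maps node -> {neighbour: weight}."""
    distances = {source: 0}
    queue = [(0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > distances.get(node, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for neighbour, weight in graph[node].items():
            candidate = d + weight
            if candidate < distances.get(neighbour, float("inf")):
                distances[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return distances

def diameter(graph):
    """Longest shortest-path distance over all reachable pairs of nodes."""
    return max(max(dijkstra(graph, n).values()) for n in graph)

# Hypothetical weighted, undirected toy network.
toy_graph = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}
from_a = dijkstra(toy_graph, "A")
toy_diameter = diameter(toy_graph)
```

Running Dijkstra from every node is adequate for small graphs. For dense graphs or graphs with negative weights, the Floyd–Warshall or Bellman–Ford algorithms named above would be the usual alternatives.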

== Clustering analysis ==

Cluster analysis groups objects (nodes) such that objects in the same cluster are more similar to each other than to those in other clusters. It can be used for pattern recognition, image analysis, information retrieval, statistical data analysis, and more. It has applications in plant and animal ecology, sequence analysis, antimicrobial activity analysis, and many other fields. Cluster analysis algorithms come in many forms, such as hierarchical clustering, k-means clustering, distribution-based clustering, density-based clustering, and grid-based clustering.
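
Of the algorithm families listed, k-means is perhaps the easiest to sketch. The version below is a plain implementation of Lloyd's iteration with caller-supplied starting centroids, so the result is deterministic. The data points and starting centroids are made up for illustration, and a production analysis would more likely rely on an established library.

```python
from math import dist

def k_means(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate an assignment step and an update step."""
    centroids = [tuple(c) for c in centroids]
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its members.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical 2-D measurements forming two visible groups.
toy_points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
final_centroids, final_clusters = k_means(toy_points, [(0, 0), (10, 10)])
```

The number of clusters and the starting centroids are inputs here. In practice both choices affect the outcome and are usually explored rather than fixed in advance.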

== Annotation enrichment analysis ==

Gene annotation databases are commonly used to evaluate the functional properties of experimentally derived gene sets. Annotation Enrichment Analysis (AEA) is used to overcome biases in the overlap-based statistical methods used to assess these associations. It does this by using gene/protein annotations to infer which annotations are over-represented in a list of genes/proteins taken from a network.
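
For context, the kind of overlap statistic that AEA is described as improving on can be sketched as a one-sided hypergeometric test. This is the conventional over-representation calculation, not AEA itself. The function name and the gene counts below are hypothetical.

```python
from math import comb

def overlap_p_value(population, annotated, selected, overlap):
    """Probability of seeing at least `overlap` annotated genes by chance.

    population: number of genes considered in total
    annotated:  genes in the population carrying the annotation
    selected:   size of the gene list drawn from the network
    overlap:    genes in the list that carry the annotation
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    favourable = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / total

# Hypothetical counts: 20 genes, 5 annotated, a list of 5 with 3 annotated.
p_value = overlap_p_value(population=20, annotated=5, selected=5, overlap=3)
```

A small p-value suggests the annotation is over-represented in the list. The test treats genes and annotations as independent draws, which is one of the simplifications that can bias overlap-based methods.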

== Network analysis tools ==

== See also ==

* Cellular model
* Cytoscape tool
* Bayesian probability
* Network medicine
