---
title: "Biological data visualization"
chunk: 5/5
source: "https://en.wikipedia.org/wiki/Biological_data_visualization"
category: "reference"
tags: "science, encyclopedia"
date_saved: "2026-05-05T14:01:39.614425+00:00"
instance: "kb-cron"
---
=== Regular multiple sequence alignment ===
Aligned sequences of nucleotide or amino acid residues are typically represented as rows within a matrix. Gaps are inserted between the residues so that identical or similar characters are aligned in successive columns. Many sequence visualization programs also use color to display information about the properties of the individual sequence elements; in DNA and RNA sequences, this means assigning each nucleotide its own color. In protein alignments, color is often used to indicate amino acid properties, which helps in judging how conservative a given amino acid substitution is.
For multiple sequences, the last row of the matrix is often the consensus sequence determined by the alignment. The consensus sequence is also often represented graphically as a sequence logo, in which the size of each nucleotide or amino acid letter corresponds to its degree of conservation.
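The consensus row and the letter heights of a sequence logo can both be derived column by column. The following is a minimal sketch, assuming a gapped DNA alignment given as equal-length strings; the function name `consensus_and_information` and the toy alignment are illustrative, and the information content is the plain `log2(alphabet size) - entropy` measure without the small-sample correction that logo tools usually apply.

```python
import math
from collections import Counter


def consensus_and_information(alignment, alphabet_size=4):
    """Return (consensus string, per-column information content in bits).

    Gaps ('-') are ignored when counting residues in a column.
    """
    consensus = []
    information = []
    for column in zip(*alignment):  # iterate over alignment columns
        residues = [c for c in column if c != "-"]
        if not residues:  # column made only of gaps
            consensus.append("-")
            information.append(0.0)
            continue
        counts = Counter(residues)
        total = len(residues)
        # Most frequent residue in the column becomes the consensus letter.
        consensus.append(counts.most_common(1)[0][0])
        # Shannon entropy of the column; low entropy means high conservation.
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        information.append(math.log2(alphabet_size) - entropy)
    return "".join(consensus), information


# Hypothetical toy alignment: rows are sequences, columns are aligned positions.
alignment = ["ACGT-A", "ACGTTA", "ACCT-A", "AGGTTA"]
consensus, information = consensus_and_information(alignment)
print(consensus)
print([round(bits, 2) for bits in information])
```

In a logo, a fully conserved column would be drawn at the maximum height (2 bits for DNA), while a mixed column would be drawn shorter.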
=== Circular multiple sequence alignment ===
A common assumption of multiple sequence alignment techniques is that the left- and right-most positions of the input sequences are relevant to the alignment. However, the position where a sequence starts or ends can be entirely arbitrary. For instance, when a circular molecular structure is linearized, the start of the sequence is chosen arbitrarily. This matters, for example, in the multiple sequence alignment of mitochondrial DNA, viroid, viral, or other genomes that have a circular molecular structure.
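The arbitrariness of the start position can be illustrated with rotations of a string. The sketch below is not a circular alignment algorithm; it only shows, under the assumption that a circular molecule is stored as a plain string, that two different linearizations describe the same circle. The helper names `same_circular_sequence` and `canonical_rotation` are hypothetical.

```python
def same_circular_sequence(a, b):
    """True if b is a rotation of a, i.e. both linearize the same circle."""
    return len(a) == len(b) and b in a + a


def canonical_rotation(seq):
    """Return the lexicographically smallest rotation of seq.

    Picking one fixed rotation removes the arbitrary start position.
    """
    if not seq:
        return seq
    doubled = seq + seq
    return min(doubled[i:i + len(seq)] for i in range(len(seq)))


# Two linearizations of the same hypothetical circular sequence.
first = "GATTACA"
second = "TACAGAT"  # the same circle, cut three positions later

print(same_circular_sequence(first, second))
print(canonical_rotation(first), canonical_rotation(second))
```

A linear aligner given `first` and `second` directly would see large end gaps or mismatches, even though the underlying molecules are identical.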
=== Spiral multiple sequence alignment ===
In a spiral alignment, each individual spiral represents one of the sequences being aligned. As in the linear layout, color is used to display information about the properties of the individual sequence elements, and gaps can be inserted so that the sequences fit each other better. The topology of the spiral sequence alignment is equivalent to that of a standard linear matrix, with the advantage that it summarizes very long sequences in a compact way.
=== 3D visualization ===
A common one-dimensional representation of a protein sequence is a list of the amino acids that form it. A 3-dimensional alignment, by contrast, displays how sequences may match each other. The 1D-3D Group Alignment Viewer, from the RCSB Protein Data Bank, supports exploration of multiple sequence alignments (MSA) at the sequence and structure levels for PDB experimental structures and Computed Structure Models (CSMs). Proteins and/or residue regions can be selected from the MSA to view their 3D structures aligned.
RCSB.org clusters protein entities (PDB experimental structures and CSMs) by sequence identity threshold and UniProt accession. For each cluster, the MSA is calculated using Clustal Omega and displayed in the 1D-3D Group Alignment Viewer using specific color schemes. PDB protein sequence positions are shown in blue if the residue was experimentally determined, and in gray if not. CSMs are colored according to their local pLDDT scores.
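The two coloring rules described above can be sketched as simple lookup functions. This is an illustration only, not the viewer's implementation: the blue/gray rule follows the text, while the pLDDT cut-offs (90, 70, 50) are the confidence bands commonly used for AlphaFold models and are an assumption here, as are the function names and the handling of scores that fall exactly on a boundary.

```python
def pdb_position_color(experimentally_determined):
    """Color for a PDB sequence position, following the rule in the text."""
    return "blue" if experimentally_determined else "gray"


def plddt_band(score):
    """Map a local pLDDT score (0-100) to a confidence band.

    Assumption: the commonly used AlphaFold bands; the viewer's exact
    thresholds and colors are not stated in the text.
    """
    if not 0 <= score <= 100:
        raise ValueError("pLDDT scores range from 0 to 100")
    if score > 90:
        return "very high"
    if score > 70:
        return "confident"
    if score > 50:
        return "low"
    return "very low"


# Hypothetical per-residue data for one experimental chain and one CSM.
observed_flags = [True, True, False, True]
plddt_scores = [95.2, 81.0, 63.5, 41.7]

print([pdb_position_color(flag) for flag in observed_flags])
print([plddt_band(score) for score in plddt_scores])
```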
== Phylogenies ==
A phylogenetic tree is a branching diagram, or tree, showing the evolutionary relationships among various biological species or other entities based upon similarities and differences in their physical or genetic characteristics. It is a visual representation of the evolutionary history of a set of species or taxa over a specific period of time.
Two things implicitly occur along the branches of a phylogenetic tree. The first is the passage of time. Deeper nodes are older than the shallower nodes to which they are connected. Thus, deeper nodes indicate both more distant relationships among the terminal taxa that they connect and a greater age for the most recent common ancestor of those taxa. The second is evolutionary modification, that is, the accumulation of hereditary genetic and/or structural changes along these branches. The term "branch length" typically refers to the number of these changes. A tree whose branch lengths measure these changes is also called a phylogram.
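In a phylogram, the amount of change separating the root from a terminal taxon is the sum of the branch lengths along the path between them. A minimal sketch, assuming a tree stored as nested `(name, branch_length, children)` tuples; the representation, the function name `root_to_tip`, and the example values are all hypothetical.

```python
def root_to_tip(node, distance_so_far=0.0, result=None):
    """Return {leaf name: summed branch length from the root to that leaf}."""
    if result is None:
        result = {}
    name, branch_length, children = node
    distance = distance_so_far + branch_length
    if not children:  # terminal taxon
        result[name] = distance
    for child in children:
        root_to_tip(child, distance, result)
    return result


# Hypothetical phylogram: B and C share a more recent ancestor than A does.
phylogram = (
    "root", 0.0, [
        ("A", 0.30, []),
        ("ancestor_BC", 0.10, [
            ("B", 0.20, []),
            ("C", 0.25, []),
        ]),
    ],
)

distances = root_to_tip(phylogram)
for taxon, distance in sorted(distances.items()):
    print(taxon, round(distance, 2))
```

Taxon C ends up farther from the root than B, which a phylogram would draw as a longer branch.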
=== Regular phylogenetic tree ===
A regular phylogenetic tree, generally called a dendrogram, is a diagram that uses straight lines to represent a tree. It shows a column of nodes representing individual taxa, while the remaining nodes represent the clusters to which the data belong; the arrows represent the distance, a measure of how different the clusters are (dissimilarity). The distance between merged clusters is monotone, increasing with the level of the merger: the height of each node in the plot is proportional to the value of the intergroup dissimilarity between its two branches.
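The monotone merge heights can be seen in a small agglomerative clustering run. The text does not say which linkage rule is used; the sketch below assumes single linkage (the distance between two clusters is the smallest pairwise dissimilarity), which is one rule that yields non-decreasing heights. The function names and the dissimilarity values are hypothetical.

```python
from itertools import combinations


def cluster_distance(c1, c2, dissimilarity):
    """Single-linkage distance: smallest dissimilarity between members."""
    return min(dissimilarity[frozenset((a, b))] for a in c1 for b in c2)


def single_linkage(taxa, dissimilarity):
    """Return the list of merges as (cluster, cluster, height) tuples."""
    clusters = [frozenset([taxon]) for taxon in taxa]
    merges = []
    while len(clusters) > 1:
        # Merge the two closest clusters; the distance is the node height.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: cluster_distance(pair[0], pair[1], dissimilarity),
        )
        height = cluster_distance(a, b, dissimilarity)
        merges.append((sorted(a), sorted(b), height))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges


# Hypothetical pairwise dissimilarities between four taxa.
pairs = {("A", "B"): 2.0, ("A", "C"): 6.0, ("A", "D"): 10.0,
         ("B", "C"): 5.0, ("B", "D"): 9.0, ("C", "D"): 4.0}
dissimilarity = {frozenset(pair): value for pair, value in pairs.items()}

merges = single_linkage(["A", "B", "C", "D"], dissimilarity)
heights = [height for _, _, height in merges]
for left, right, height in merges:
    print(left, right, height)
```

Each merge corresponds to one internal node of the dendrogram, drawn at the printed height.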
=== Cladogram ===
A cladogram is also a diagram with straight lines representing a tree. Unlike an evolutionary tree, a cladogram does not show how ancestors are related to descendants, nor how much they have changed. As a result, more than one evolutionary tree may correspond to the same cladogram.
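The many-to-one relationship between trees with branch lengths and cladograms can be shown by discarding the lengths from a Newick string. A minimal sketch, assuming simple unquoted taxon labels that contain no colons; the function name `to_cladogram` and both example trees are hypothetical.

```python
import re


def to_cladogram(newick):
    """Drop ':length' annotations, keeping only the branching pattern.

    Assumption: labels are unquoted and never contain ':' themselves.
    """
    return re.sub(r":[0-9.eE+-]+", "", newick)


# Two hypothetical trees with the same topology but different branch lengths.
tree_one = "((A:0.1,B:0.2):0.3,C:0.4);"
tree_two = "((A:1.0,B:1.0):2.0,C:3.0);"

print(to_cladogram(tree_one))
print(to_cladogram(tree_one) == to_cladogram(tree_two))
```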
=== Circular phylogenetic tree ===
Circular trees are often used to illustrate relationships among members of major groups of extant organisms, and these trees may have many terminal taxa. It might seem counterintuitive, but a circular phylogenetic tree conveys the same information as a regular phylogenetic tree. The topology of the structure remains the same; only the shape changes, so that more information fits in less space.
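One simple way to see that only the shape changes is to compute polar coordinates for an unchanged tree. The sketch below assumes a tree stored as nested `(name, children)` tuples, spreads the leaves evenly around the circle, places each internal node at the mean angle of its children, and uses the depth level as the radius. This is one possible layout rule among many, and all names are hypothetical.

```python
import math


def circular_layout(tree):
    """Return {node name: (radius, angle in radians)} for a nested tree."""
    leaves = []

    def collect_leaves(node):
        name, children = node
        if not children:
            leaves.append(name)
        for child in children:
            collect_leaves(child)

    collect_leaves(tree)
    step = 2 * math.pi / len(leaves)  # even angular spacing between leaves
    leaf_angle = {name: index * step for index, name in enumerate(leaves)}

    layout = {}

    def place(node, depth):
        name, children = node
        if children:
            # Internal node: centered between its children.
            angle = sum(place(child, depth + 1) for child in children) / len(children)
        else:
            angle = leaf_angle[name]
        layout[name] = (depth, angle)
        return angle

    place(tree, 0)
    return layout


def to_cartesian(radius, angle):
    """Convert polar layout coordinates to (x, y) for plotting."""
    return radius * math.cos(angle), radius * math.sin(angle)


# Hypothetical four-taxon tree; the nesting (topology) is never modified.
tree = ("root", [("n1", [("A", []), ("B", [])]),
                 ("n2", [("C", []), ("D", [])])])

layout = circular_layout(tree)
for name, (radius, angle) in layout.items():
    print(name, radius, round(math.degrees(angle), 1))
```

The same nested structure could be passed to a rectangular layout routine; only the coordinate mapping would differ.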
=== 3D visualization ===
In a phylogram, the evolutionary distance is represented on one axis and the genes on the other. To make paralogs visible, a third axis can be added. In a standard (2D) phylogeny layout, it is not always easy to distinguish gene duplication events (paralogs) from speciation branching (species), because only one spatial axis (genes) is available to show both kinds of information. By contrast, they are easily distinguished in 3DPE, because it projects them onto two orthogonal axes: species (X) and paralogs (Z). For instance, the evolution of many paralogs is visually obvious in the 3DPE view of the three eukaryote species, but this pattern is less clear in the 2D representation.
== Visualization software ==
== External links ==
=== Related conferences ===
BioVis: Symposium on Biological Data Visualization
Applications of Information Visualization in Bioinformatics
CIBDV: Computational Intelligence for Biological Data Visualization
IVBI: Information Visualization in Biomedical Informatics Symposium
VMLS: Visualization in Medicine & Life Sciences
VIZBI: Workshop on Visualizing Biological Data