---
title: "BLOSUM"
chunk: 3/3
source: "https://en.wikipedia.org/wiki/BLOSUM"
category: "reference"
tags: "science, encyclopedia"
date_saved: "2026-05-05T14:01:55.384613+00:00"
instance: "kb-cron"
---
==== Reliable prediction of T-cell epitopes ====
A novel input representation has been developed that combines sparse encoding, BLOSUM encoding, and input derived from hidden Markov models. The method has been used to predict T-cell epitopes for the genome of the hepatitis C virus, and its possible applications in guiding rational vaccine design have been discussed.
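As a rough illustration of the difference between sparse and BLOSUM encoding, the following Python sketch encodes a peptide both ways. It is not the published method: the four-residue alphabet is an illustrative subset of BLOSUM62 (a real encoder would use all 20 amino acids), the function names are hypothetical, and the hidden Markov model input is omitted.

```python
# Illustrative 4-residue subset of BLOSUM62 (A, R, N, D).
# Assumption: a real encoder would use the full 20x20 matrix.
ALPHABET = "ARND"
BLOSUM62_SUBSET = {
    "A": [4, -1, -2, -2],
    "R": [-1, 5, 0, -2],
    "N": [-2, 0, 6, 1],
    "D": [-2, -2, 1, 6],
}


def sparse_encode(peptide):
    """Sparse (one-hot) encoding: one indicator vector per residue."""
    return [[1 if aa == ref else 0 for ref in ALPHABET] for aa in peptide]


def blosum_encode(peptide):
    """BLOSUM encoding: each residue becomes its row of substitution scores."""
    return [list(BLOSUM62_SUBSET[aa]) for aa in peptide]


def combined_encode(peptide):
    """Concatenate both encodings per residue (hypothetical combination)."""
    return [s + b for s, b in zip(sparse_encode(peptide), blosum_encode(peptide))]


if __name__ == "__main__":
    print(sparse_encode("AD"))
    print(blosum_encode("AD"))
    print(combined_encode("AD"))
```

The sparse vectors treat all residues as equally different from one another, whereas the BLOSUM rows place residues that substitute for each other frequently closer together.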
=== Use in BLAST ===
BLOSUM matrices are also used as scoring matrices when comparing protein sequences (including translated DNA sequences) to judge the quality of an alignment. This form of scoring system is used by a wide range of alignment software, including BLAST.
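A minimal sketch of how a substitution matrix scores an alignment is shown below. It sums the matrix entries for each aligned pair of residues in an ungapped alignment. The four-residue table is an illustrative subset of BLOSUM62 and the function name is hypothetical; BLAST itself additionally handles gaps, seeding, and statistical significance, none of which is modelled here.

```python
# Illustrative 4-residue subset of BLOSUM62 (A, R, N, D).
ALPHABET = "ARND"
ROWS = [
    [4, -1, -2, -2],   # A
    [-1, 5, 0, -2],    # R
    [-2, 0, 6, 1],     # N
    [-2, -2, 1, 6],    # D
]
# Look-up table keyed by residue pair, e.g. SCORES[("N", "D")] == 1.
SCORES = {
    (a, b): ROWS[i][j]
    for i, a in enumerate(ALPHABET)
    for j, b in enumerate(ALPHABET)
}


def ungapped_score(seq1, seq2):
    """Sum the substitution scores of each aligned residue pair."""
    if len(seq1) != len(seq2):
        raise ValueError("ungapped alignment requires equal-length sequences")
    return sum(SCORES[(a, b)] for a, b in zip(seq1, seq2))


if __name__ == "__main__":
    print(ungapped_score("ARND", "ARND"))  # identical sequences
    print(ungapped_score("ARND", "RADN"))  # every position substituted
```

A higher total indicates an alignment whose residue pairs are more consistent with substitutions observed in related proteins.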
==== Comparing PAM and BLOSUM ====
In addition to BLOSUM matrices, an earlier family of scoring matrices, known as PAM (point accepted mutation), can be used. Both serve the same purpose of scoring amino acid substitutions, but they are derived by different methodologies. BLOSUM looks directly at mutations in motifs of related sequences, while PAMs extrapolate evolutionary information from closely related sequences.
Because PAM and BLOSUM are different methods for expressing the same kind of scoring information, the two can be compared. However, because the scores are obtained in very different ways, a PAM100 matrix is not equivalent to a BLOSUM100 matrix.
===== The relationship between PAM and BLOSUM =====
===== The differences between PAM and BLOSUM =====
=== Availability ===
The "reference" version of BLOSUM is found in the NCBI toolkits. Both the older (deprecated) NCBI C Toolkit and the current NCBI C++ Toolkit provide the BLOSUM45, BLOSUM50, BLOSUM62, BLOSUM80, and BLOSUM90 matrices. Both also offer APIs for making use of the matrices.
The original source code for calculating BLOSUM is also available on the NCBI website, at https://ftp.ncbi.nih.gov/repository/blocks/unix/blosum/. The archive "blosum.tar.Z" represents the original 1992 version, whose miscalculations have been reported to improve search performance. The archive also contains pre-calculated BLOSUM outputs at the following similarity levels: "-2" (blosumn), 30, 40, 45, 50, 55, 60, 62, 65, 70, 75, 80, 85, 90, 95, and 100.
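Pre-calculated matrices are commonly distributed as plain text, so a small parser is often enough to load them. The sketch below assumes a layout with optional `#` comment lines, a header row of residue letters, and one labelled row of integer scores per residue; the files in the archive may differ in detail. The embedded sample is a truncated four-residue excerpt for illustration only, and the function name is hypothetical.

```python
# Truncated, illustrative excerpt in the assumed text layout.
SAMPLE = """\
# illustrative 4x4 excerpt, not a complete matrix
   A  R  N  D
A  4 -1 -2 -2
R -1  5  0 -2
N -2  0  6  1
D -2 -2  1  6
"""


def parse_matrix(text):
    """Parse a whitespace-separated scoring matrix into a {(row, col): score} dict."""
    columns = None
    scores = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if columns is None:
            columns = fields  # the first data line is the column header
            continue
        row, values = fields[0], fields[1:]
        if len(values) != len(columns):
            raise ValueError(f"row {row!r} has {len(values)} scores, "
                             f"expected {len(columns)}")
        for col, value in zip(columns, values):
            scores[(row, col)] = int(value)
    return scores


if __name__ == "__main__":
    matrix = parse_matrix(SAMPLE)
    print(matrix[("N", "D")])
```

Keying the dictionary by residue pair keeps look-ups simple; a nested dictionary or a NumPy array would work equally well for larger workloads.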
==== Software Packages ====
Several software packages in different programming languages allow easy use of BLOSUM matrices. Besides the aforementioned NCBI Toolkits, there are:
blosum module for Python
BioJava library for Java
... and many more.
== See also ==
Sequence alignment
Point accepted mutation
== References ==
== External links ==
Sean R. Eddy (2004). "Where did the BLOSUM62 alignment score matrix come from?". Nature Biotechnology. 22 (8): 1035–6. doi:10.1038/nbt0804-1035. PMID 15286655. S2CID 205269887.
BLOCKS WWW server
Scoring systems for BLAST at NCBI
Data files of matrices including BLOSUM30–100 on the NCBI FTP server.
Interactive BLOSUM Network Visualization Archived 30 January 2017 at the Wayback Machine