What is ReBLOSUM?
The Rectangular BLOSUM (ReBLOSUM) matrices give scores for comparing amino acid groups drawn from different alphabets. In some indexing applications, memory constraints require amino acids to be grouped into a reduced alphabet. The ReBLOSUM matrices can be used when only one of the two alphabets is compressed: for example, a 4x20 matrix, which keeps all 20 amino acids on the uncompressed side, is far more precise than a square 4x4 matrix.
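To make the idea of a rectangular matrix concrete, here is a minimal Python sketch of how log-odds scores between groups and single residues could be obtained by summing pair frequencies over each group. It is an illustration only, not necessarily the procedure used to build the distributed ReBLOSUM matrices: the function name `rectangular_scores`, the three-letter toy alphabet, the grouping and the frequencies are all made up for the example.

```python
import math

def rectangular_scores(pair_freq, groups, scale=2.0):
    """Return rounded log-odds scores for (group, residue) pairs.

    pair_freq[(a, b)] is a symmetric joint probability of seeing residues
    a and b aligned (all entries sum to 1). groups maps a group name to
    the residues it contains. Scores are expressed in 1/scale bit units.
    """
    residues = sorted({a for a, _ in pair_freq})
    # Background probability of each residue (marginal of the joint table).
    p = {a: sum(pair_freq[(a, b)] for b in residues) for a in residues}
    scores = {}
    for g, members in groups.items():
        p_g = sum(p[b] for b in members)  # background probability of the group
        for a in residues:
            # Probability that some member of g is aligned with residue a.
            q_ga = sum(pair_freq[(b, a)] for b in members)
            scores[(g, a)] = round(scale * math.log2(q_ga / (p_g * p[a])))
    return scores

# Toy data (hypothetical): alphabet {A, B, C}, compressed on one side only.
toy_freq = {
    ("A", "A"): 0.30, ("B", "B"): 0.20, ("C", "C"): 0.20,
    ("A", "B"): 0.05, ("B", "A"): 0.05,
    ("A", "C"): 0.05, ("C", "A"): 0.05,
    ("B", "C"): 0.05, ("C", "B"): 0.05,
}
toy_groups = {"x": "AB", "y": "C"}
toy_scores = rectangular_scores(toy_freq, toy_groups)  # a 2x3 rectangular matrix
```

With a real 20-letter alphabet and four groups, the same loop would produce a 4x20 table: the rows are compressed, while the columns keep the full amino acid resolution.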
Alphabet names are those defined in [Edgar 04], which gathers the work of [Li 03] and other papers. The reference BLOSUM implementation is that of [Henikoff 92]. The parameters "lambda" and "K" used for the e-value are computed according to [Karlin 90].
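As a reminder of how "lambda" and "K" are used, the Karlin-Altschul statistics give the expected number of chance hits with raw score at least S as E = K m n exp(-lambda S), where m and n are the lengths of the two sequences compared. A minimal Python sketch follows; the function names are chosen for the example, and the numeric parameters are placeholders, not values taken from any ReBLOSUM matrix.

```python
import math

def e_value(raw_score, query_len, db_len, lam, K):
    """Karlin-Altschul e-value: E = K * m * n * exp(-lambda * S)."""
    return K * query_len * db_len * math.exp(-lam * raw_score)

def bit_score(raw_score, lam, K):
    """Normalized score S' = (lambda * S - ln K) / ln 2, so E = m * n * 2**(-S')."""
    return (lam * raw_score - math.log(K)) / math.log(2)

# Placeholder parameters (hypothetical, for illustration only).
example_lam, example_K = 0.3, 0.1
example_e = e_value(40, query_len=250, db_len=1_000_000, lam=example_lam, K=example_K)
example_bits = bit_score(40, lam=example_lam, K=example_K)
```

Because lambda and K depend on the scoring matrix and on the residue frequencies, each rectangular matrix needs its own pair of values.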
You can get all the matrices using the Li alphabets on a single web page, or in a single zip file.
Get a specific matrix
Reference
Rectangular BLOSUM matrices are a collaboration between the Symbiose team (IRISA, CNRS, INRIA, Univ. Rennes 1) and the Bonsai team (CRIStAL, CNRS, INRIA, Univ. Lille 1). Please cite:
- [Peterlongo 08] Pierre Peterlongo, Laurent Noé, Dominique Lavenier, Van Hoa Nguyen, Gregory Kucherov, Mathieu Giraud. Optimal neighborhood indexing for protein similarity search, BMC Bioinformatics, 9:534, 2008.
Publications
Principal references used in ReBLOSUM (see the article for a full list):
- [Edgar 04] Robert C. Edgar. Local homology recognition and distance measures in linear time using compressed amino acid alphabets, Nucleic Acids Research, 32, 380-385, 2004.
- [Henikoff 92] S. Henikoff and J.G. Henikoff. Amino acid substitution matrices from protein blocks, Proc. Natl. Acad. Sci. 89, 10915-10919, 1992.
- [Karlin 90] S. Karlin and S. Altschul. Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes, Proc. Natl. Acad. Sci. 87, 2264-2268, 1990.
- [Li 03] T. Li, K. Fan, J. Wang, W. Wang. Reduction of protein sequence complexity by residue grouping, Protein Eng., 16, 323-330, 2003.