Hubsm: A Novel Amino Acid Substitution Matrix for Comparing Hub Proteins

Author(s):  
Renganayaki G. ◽  
Achuthsankar S. Nair

Sequence alignment algorithms and database search methods use BLOSUM and PAM substitution matrices constructed from general proteins. These de facto standard matrices are not optimal for accurately aligning proteins with a markedly different amino acid compositional bias. In this work, a new amino acid substitution matrix is calculated for the disorder-rich and low-complexity-rich regions of Hub proteins, based on residue characteristics. Insights into the amino acid background frequencies and the substitution scores obtained from Hubsm unveil residue substitution patterns that differ from those of commonly used scoring matrices. When comparing Hub protein sequences to detect homologs, the Hubsm matrix yields better results than the PAM and BLOSUM matrices. Using the Hubsm matrix can be optimal for database search and for constructing more accurate sequence alignments of Hub proteins.
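The abstract does not spell out how the Hubsm scores are computed. As general background only, matrices in the BLOSUM family are commonly expressed as scaled log-odds scores that compare observed pair frequencies with the frequencies expected from background composition. The Python sketch below illustrates that generic calculation; the function name `log_odds_matrix`, the `scale` parameter, and the toy counts are assumptions for illustration, not the authors' procedure.

```python
import math
from collections import Counter


def log_odds_matrix(pair_counts, scale=2.0):
    """Return {(a, b): score} from counts of aligned residue pairs.

    Generic BLOSUM-style log-odds sketch (hypothetical helper, not Hubsm).
    Keys of the result are alphabetically sorted residue pairs.
    """
    total = sum(pair_counts.values())

    # Observed pair frequencies q_ab, treating (a, b) and (b, a) as one pair.
    q = Counter()
    for (a, b), n in pair_counts.items():
        q[tuple(sorted((a, b)))] += n / total

    # Background frequencies p_a as marginals of the pair distribution.
    p = Counter()
    for (a, b), f in q.items():
        if a == b:
            p[a] += f
        else:
            p[a] += f / 2
            p[b] += f / 2

    scores = {}
    for (a, b), f in q.items():
        if f <= 0:
            continue  # unobserved pairs have no finite log-odds score
        expected = p[a] * p[a] if a == b else 2 * p[a] * p[b]
        scores[(a, b)] = round(scale * math.log2(f / expected))
    return scores


# Toy counts, invented purely to exercise the function.
toy_counts = {("A", "A"): 6, ("A", "S"): 2, ("S", "S"): 2}
toy_scores = log_odds_matrix(toy_counts)
```

A matrix built this way reflects the composition of whatever alignments supplied the counts, which is why a protein class with unusual composition can end up with scores that differ from those of a general-purpose matrix.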

2004 ◽  
Vol 02 (04) ◽  
pp. 719-745 ◽  
Author(s):  
ARUN SIDDHARTH KONAGURTHU ◽  
JAMES WHISSTOCK ◽  
PETER J. STUCKEY

In this paper we demonstrate a practical approach to constructing progressive multiple alignments using sequence triplet optimizations rather than a conventional pairwise approach. Using the sequence triplet alignments progressively provides scope for the synthesis of a three-residue exchange amino acid substitution matrix. We develop such a 20×20×20 matrix for the first time and demonstrate how its use in optimal sequence triplet alignments increases the sensitivity of building multiple alignments. Various comparisons were made between alignments generated using the progressive triplet methods and the conventional progressive pairwise procedure. The assessment of these data reveals that, in general, the triplet-based approaches generate more accurate sequence alignments than the traditional pairwise-based procedures, especially between more divergent sets of sequences.
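To make the idea of an optimal sequence triplet alignment concrete, the following Python sketch computes the best global alignment score of three sequences with a three-dimensional dynamic program. It is a minimal illustration under stated assumptions: `column_score` is a toy sum-of-pairs function standing in for the paper's 20×20×20 matrix, the gap model is linear, and all names and parameter values are hypothetical.

```python
from itertools import product

GAP = "-"


def column_score(a, b, c, match=2, mismatch=-1, gap=-2):
    """Toy stand-in for a three-residue exchange score (sum of pairs)."""
    total = 0
    for x, y in ((a, b), (a, c), (b, c)):
        if x == GAP and y == GAP:
            continue  # a gap-gap pair contributes nothing
        if x == GAP or y == GAP:
            total += gap
        elif x == y:
            total += match
        else:
            total += mismatch
    return total


def triplet_alignment_score(s1, s2, s3, score=column_score):
    """Optimal global alignment score of three sequences by 3-D DP."""
    n1, n2, n3 = len(s1), len(s2), len(s3)
    neg_inf = float("-inf")
    dp = [[[neg_inf] * (n3 + 1) for _ in range(n2 + 1)]
          for _ in range(n1 + 1)]
    dp[0][0][0] = 0

    # Seven moves: each sequence either consumes a residue (1) or takes
    # a gap (0); the all-gap column is excluded.
    moves = [m for m in product((0, 1), repeat=3) if any(m)]

    for i, j, k in product(range(n1 + 1), range(n2 + 1), range(n3 + 1)):
        if (i, j, k) == (0, 0, 0):
            continue
        best = neg_inf
        for di, dj, dk in moves:
            pi, pj, pk = i - di, j - dj, k - dk
            if pi < 0 or pj < 0 or pk < 0:
                continue
            a = s1[pi] if di else GAP
            b = s2[pj] if dj else GAP
            c = s3[pk] if dk else GAP
            best = max(best, dp[pi][pj][pk] + score(a, b, c))
        dp[i][j][k] = best
    return dp[n1][n2][n3]
```

Passing a different `score` callable, for example a lookup into a three-dimensional table, changes the column scoring without touching the recurrence. Time and memory grow with the product of the three sequence lengths, so this exhaustive form is practical only for short sequences.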


2019 ◽  
Vol 88 (2) ◽  
pp. 136-150 ◽  
Author(s):  
Julia A. Shore ◽  
Barbara R. Holland ◽  
Jeremy G. Sumner ◽  
Kay Nieselt ◽  
Peter R. Wills

2015 ◽  
Vol 16 (1) ◽  
Author(s):  
Santiago Rios ◽  
Marta F. Fernandez ◽  
Gianluigi Caltabiano ◽  
Mercedes Campillo ◽  
Leonardo Pardo ◽  
...  

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Tomasz Woźniak ◽  
Małgorzata Sajek ◽  
Jadwiga Jaruzelska ◽  
Marcin Piotr Sajek

Background: The functions of RNA molecules are mainly determined by their secondary structures. These functions can also be predicted using bioinformatic tools that enable the alignment of multiple RNAs to determine functional domains and/or classify RNA molecules into RNA families. However, the existing multiple RNA alignment tools, which use structural information, are slow in aligning long molecules and/or a large number of molecules. Therefore, a more rapid tool for multiple RNA alignment may improve the classification of known RNAs and help to reveal the functions of newly discovered RNAs. Results: Here, we introduce an extremely fast Python-based tool called RNAlign2D. It converts RNA sequences to pseudo-amino acid sequences, which incorporate structural information, and uses a customizable scoring matrix to align these RNA molecules via the multiple protein sequence alignment tool MUSCLE. Conclusions: RNAlign2D produces accurate RNA alignments in a very short time. The pseudo-amino acid substitution matrix approach utilized in RNAlign2D is applicable to virtually all protein aligners.
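The conversion step can be pictured as mapping each (nucleotide, structure symbol) pair to a single letter of the amino acid alphabet, so that a protein aligner sees sequence and structure together. The Python sketch below shows one way such an encoding might look. The abstract does not give the actual alphabet, so the letter assignments in `ENCODE`, the dot-bracket input format, and the function names are all assumptions rather than RNAlign2D's real mapping.

```python
from itertools import product

NUCLEOTIDES = "ACGU"
STRUCTURE = "(.)"       # opening pair, unpaired, closing pair
AMINO = "ACDEFGHIKLMN"  # 12 amino acid letters, one per combination

# Hypothetical mapping: (nucleotide, structure symbol) -> pseudo-amino acid.
ENCODE = dict(zip(product(NUCLEOTIDES, STRUCTURE), AMINO))
DECODE = {aa: pair for pair, aa in ENCODE.items()}


def encode(seq, dot_bracket):
    """Convert an RNA sequence plus dot-bracket structure to pseudo-amino acids."""
    if len(seq) != len(dot_bracket):
        raise ValueError("sequence and structure must have equal length")
    return "".join(ENCODE[(n, s)] for n, s in zip(seq.upper(), dot_bracket))


def decode(pseudo):
    """Recover (sequence, dot-bracket structure) from a pseudo-amino acid string."""
    pairs = [DECODE[aa] for aa in pseudo]
    return "".join(n for n, _ in pairs), "".join(s for _, s in pairs)


pseudo = encode("GCAUGC", "((..))")
```

Because the encoded strings use ordinary amino acid letters, they can be written to a FASTA file and passed to a protein aligner together with a scoring matrix defined over the same letters.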


2006 ◽  
Vol 04 (03) ◽  
pp. 769-782 ◽  
Author(s):  
XIN LIU ◽  
WEI-MOU ZHENG

Amino acid substitution matrices play an essential role in protein sequence alignment, a fundamental task in bioinformatics. The most widely used matrices, such as PAM matrices derived from homologous sequences and BLOSUM matrices derived from aligned segments of PROSITE, did not integrate conformation information in their construction. There are a few structure-based matrices, which are derived from limited structure alignment data. Using the PDB_SELECT and DSSP databases, we create a database of sequence-conformation blocks that explicitly represent the sequence-structure relationship. Members of a block are identical in conformation and highly similar in sequence. From this block database, we derive a conformation-specific amino acid substitution matrix, CBSM60. The matrix shows improved performance in conformational segment search and homolog detection.
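Deriving a matrix from a block database starts with tallying which residues share a column within each block. The Python sketch below shows that counting step only, assuming each block is a list of equal-length, ungapped segments. The function name `count_block_pairs` and the toy block are hypothetical, and any sequence clustering or weighting the authors may apply is omitted.

```python
from collections import Counter
from itertools import combinations


def count_block_pairs(blocks):
    """Count residue pairs that share a column in any block.

    `blocks` is an iterable of blocks; each block is a list of
    equal-length, ungapped segments. Keys are sorted residue pairs.
    """
    counts = Counter()
    for block in blocks:
        for column in zip(*block):  # walk the block column by column
            for a, b in combinations(column, 2):
                counts[tuple(sorted((a, b)))] += 1
    return counts


# Toy block of three two-residue segments, invented for illustration.
toy_counts = count_block_pairs([["AS", "AT", "SS"]])
```

The resulting counts are the raw input to a log-odds calculation. Keeping a separate counter for each conformation class would be one way to obtain conformation-specific scores.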

