Kernel Generative Topographic Mapping of Protein Sequences
Pharmacology is becoming increasingly dependent on advances in genomics and proteomics. The -omics sciences pose the challenge of how to deal, from an intelligent data analysis perspective, with the large amounts of complex data they generate. In this chapter, the authors focus on the analysis of a specific type of protein, the G protein-coupled receptors, which are the target of over 15% of current drugs. They describe a kernel method of the manifold learning family for the analysis of symbolic amino acid sequences of proteins. This method sheds light on the structure of protein subfamilies while providing an intuitive visualization of that structure.