An alignment-free domain architecture similarity search (ADASS) algorithm for inferring homology between multi-domain proteins

Title	An alignment-free domain architecture similarity search (ADASS) algorithm for inferring homology between multi-domain proteins
Authors	Divya P Syamaladevi¹,², Adwait Joshi² & Ramanathan Sowdhamini²*
Affiliation	¹Sugarcane Breeding Institute, Indian Council of Agricultural Research, Coimbatore, India, PIN 641 007; ²National Center for Biological Sciences (TIFR), UAS-GKVK Campus, Bellary Road, Bangalore 560 065, India.
Email	mini@ncbs.res.in; *Corresponding author
Article Type	Hypothesis
Date	Received November 19, 2012; Revised January 01, 2013; Accepted January 02, 2013; Published June 08, 2013
Abstract	Annotation of genes and their products is largely guided by inferred homology. Sequence similarity is the primary measure used for annotation; domain content and order have been given less importance, even though domain insertions, deletions and positional changes can bring about functional variety. Several recently developed methods quantify domain architecture similarity, but they depend on alignments of the sequences and focus only on homologous proteins. We present an alignment-free domain architecture similarity search (ADASS) algorithm that identifies proteins that share very poor sequence similarity yet have similar domain architectures. We introduce a “singlet matching-triplet comparison” method in ADASS, wherein each triplet of domains is compared with other triplets in a pairwise comparison of two domain architectures. Different events in the triplet comparison are scored according to a scoring scheme, and an average pairwise distance score (Domain Architecture Distance score, or DAD score) is calculated between protein domain architectures. We use the domain architectures of a selected domain, termed the centric domain, and cluster them based on the DAD score. The algorithm has a high Positive Predictive Value (PPV) with respect to the clustering of the sequences of selected domain architectures. A comparison of domain architecture-based dendrograms produced by ADASS and by an existing method revealed that ADASS can classify proteins according to the extent of domain architecture-level similarity. ADASS is more relevant for proteins with tiny domains that contribute little to the overall sequence similarity but significantly to the overall function.
Keywords	Domain architecture, Phylogeny, ADASS, Alignment free domain architecture similarity search.
Citation	*Syamaladevi et al.* Bioinformation 9(10): 491-499 (2013)
Edited by	P Kangueane
ISSN	0973-2063
Publisher	Biomedical Informatics
License	This is an Open Access article which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. This is distributed under the terms of the Creative Commons Attribution License.