|
|
|
|
Beyond Bioinformatics |
|
|
|
|
|
|||||||||||||||||||||||||||
|
Title |
|
Being a binding site: Characterizing residue composition of binding sites on proteins
|
|||||||||||||||||||||||||||
|
Authors |
Gábor Iván1, 2, Zoltán Szabadka1, 2, Vince Grolmusz1, 2, *
|
||||||||||||||||||||||||||||
|
Affiliation |
1Protein Information Technology Group, Department of Computer Science, Eötvös University, Pázmány P. stny. 1/C, H-1117 Budapest, Hungary; 2Uratim Ltd. Sóstói út 31/b, H-4400, Nyíregyháza, Hungary
|
||||||||||||||||||||||||||||
|
E-mail |
grolmusz@cs.elte.hu; * Corresponding author
|
||||||||||||||||||||||||||||
|
Article Type |
Hypothesis
|
||||||||||||||||||||||||||||
|
Date |
received October 25, 2007; accepted December 29, 2007; published online December 30, 2007 |
||||||||||||||||||||||||||||
|
Abstract |
The Protein Data Bank (PDB) today contains descriptions of more than 45,000 three-dimensional protein and nucleic-acid structures. The PDB began as a computer-readable depository of crystallographic data complementing printed articles, and the proper interpretation of individual PDB files still frequently requires detailed information found in the citing publication. Consequently, fully automatic processing of the whole PDB is a very hard task. We first cleaned and restructured the PDB data, then analyzed the residue composition of the binding sites in the whole PDB for frequency and for hidden association rules. The main results of the paper are: (i) the cleaning and repairing algorithm; (ii) redundancy elimination from the data; and (iii) the application of association rule mining to the cleaned, non-redundant data set. We found numerous significant relations in the residue composition of ligand binding sites on protein surfaces, summarized in two figures. Association rule mining, one of the classical data-mining methods for exploring implication rules, is capable of finding previously unknown residue-set preferences of bound ligands on protein surfaces. Since protein-ligand binding is a key step in enzymatic mechanisms and in drug discovery, the preferences uncovered in this study of more than 19,500 binding sites may help in identifying new binding protein-ligand pairs.
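To illustrate what association rule mining over binding-site residue sets could look like, the following is a minimal sketch and not the authors' implementation. It assumes each binding site is reduced to a set of residue names, and it only searches for single-residue rules of the form "A implies B". The function name `mine_rules`, the thresholds, and the toy data are all hypothetical and were chosen purely for illustration.

```python
from itertools import combinations


def mine_rules(binding_sites, min_support=0.4, min_confidence=0.8):
    """Find single-residue rules (lhs -> rhs) over a list of residue sets.

    Returns tuples (lhs, rhs, support, confidence), where support is the
    fraction of sites containing both residues and confidence is
    support(lhs and rhs) / support(lhs).
    """
    sites = [frozenset(site) for site in binding_sites]
    n = len(sites)

    def support(items):
        # Fraction of binding sites that contain every residue in `items`.
        return sum(1 for site in sites if items <= site) / n

    residues = sorted(set().union(*sites))
    rules = []
    for a, b in combinations(residues, 2):
        pair_support = support(frozenset((a, b)))
        if pair_support < min_support:
            continue  # the pair is too rare to yield a rule
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / support(frozenset((lhs,)))
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


# Hypothetical toy data: each set lists the residues lining one binding site.
toy_sites = [
    {"HIS", "ASP", "SER"},
    {"HIS", "ASP", "GLY"},
    {"HIS", "ASP"},
    {"GLY", "SER"},
    {"HIS", "GLY"},
]

rules = mine_rules(toy_sites)
for lhs, rhs, sup, conf in rules:
    print(f"{lhs} -> {rhs}: support={sup:.2f}, confidence={conf:.2f}")
```

On this toy input the only rule that passes both thresholds is ASP implies HIS, since every site containing ASP also contains HIS. A production analysis would typically use an Apriori-style search over larger residue sets and would run on the cleaned, non-redundant data set described above.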
|
||||||||||||||||||||||||||||
|
Keywords |
binding site; functions; structural data; protein; association rules
|
||||||||||||||||||||||||||||
|
Citation |
Ivan et al., Bioinformation 2(5): 216-221 (2007)
|
||||||||||||||||||||||||||||
|
Edited by |
J. C. Tong, T. W. Tan & S. Ranganathan
|
||||||||||||||||||||||||||||
|
ISSN |
0973-2063
|
||||||||||||||||||||||||||||
|
Publisher |
Biomedical Informatics Publishing Group
|
||||||||||||||||||||||||||||
|
Copyright |
Publisher
|
||||||||||||||||||||||||||||
|
Copyright Transfer Agreement |
The authors of articles published in Bioinformation automatically transfer the copyright to the publisher upon formal acceptance. However, the authors reserve the right to use the information contained in the article for non-commercial purposes.
|
||||||||||||||||||||||||||||
|
License |
This is an open-access article, which permits unrestricted use, distribution, and reproduction in any medium, for non-commercial purposes, provided the original author and source are credited.
|
||||||||||||||||||||||||||||