Ligation site in proteins recognized in silico.

Recognition of a ligation site in a protein molecule is important for identifying its biological activity. A model for in silico recognition of ligation sites in proteins is presented. The idealized hydrophobic core stabilizing the protein structure is represented by a three-dimensional Gaussian function. Comparing the experimentally observed distribution of hydrophobicity with this theoretical distribution reveals differences, and the area where the differences are high indicates the ligation site. Availability: http://bioinformatics.cm-uj.krakow.pl/activesite


Background:
The classic model of an oil drop representing the hydrophobic core in proteins given by Kauzmann [1] was intended to visualize the importance of hydrophobic interactions responsible for forming and stabilizing the protein tertiary structure [2, 3, 4]. The hydrophilic surface with the hydrophobic center of the molecule is generally accepted [5, 6] as the model according to which the amino acid sequence partitions a protein into its inside and outside [7].
The model presented here aims to localize the area responsible for ligand binding. It is based on the spatial distribution of hydrophobicity, which changes from the protein interior (maximal hydrophobicity) to the exterior (close to zero hydrophobicity) and can be represented by a three-dimensional Gaussian function [8, 9, 10]. A simple comparison of the theoretical (Gaussian function) and empirical spatial distributions of hydrophobicity in a protein allows identification of the areas of high discrepancy, which, as observed in crystal forms of protein-ligand complexes, can be recognized as ligation sites in proteins.
Grid system: The grid system (with constant step size) is constructed for the protein molecule localized with its geometrical center in the origin of the coordinate system (0, 0, 0) and oriented as follows: the longest distance between effective atoms (side chains represented by their geometrical centers) lies along the X-axis, and the longest distance between the projections of effective atoms onto the YZ plane lies along the Y-axis. The size of the ellipsoid can be calculated by taking the maximum and minimum values of the X, Y and Z coordinates found in the molecule, oriented as above.
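The orientation and grid construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`orient`, `build_grid`, `farthest_pair`) are hypothetical, the step size is left as a free parameter because its value is not given in the text, and the effective atoms are assumed to include at least three non-collinear points.

```python
import itertools
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    n = math.sqrt(dot(a, a))
    return tuple(x / n for x in a)

def farthest_pair(points):
    # Brute-force search over all pairs; adequate for a sketch.
    return max(itertools.combinations(points, 2),
               key=lambda pair: dot(sub(*pair), sub(*pair)))

def orient(points):
    """Center effective atoms at the origin and rotate them so that the
    longest pair distance lies along X and the longest distance between
    YZ projections lies along Y."""
    n = len(points)
    center = tuple(sum(p[k] for p in points) / n for k in range(3))
    centered = [sub(p, center) for p in points]
    a, b = farthest_pair(centered)
    ex = unit(sub(b, a))
    # Remove the X component to obtain projections onto the future YZ plane.
    projected = [sub(p, tuple(dot(p, ex) * c for c in ex)) for p in centered]
    c, d = farthest_pair(projected)
    ey = unit(sub(d, c))
    ez = cross(ex, ey)
    return [(dot(p, ex), dot(p, ey), dot(p, ez)) for p in centered]

def build_grid(oriented, step):
    """Constant-step grid covering the bounding box of the oriented atoms
    (it may overshoot the upper bound by at most one step)."""
    lo = [min(p[k] for p in oriented) for k in range(3)]
    hi = [max(p[k] for p in oriented) for k in range(3)]
    axes = [[lo[k] + i * step
             for i in range(int((hi[k] - lo[k]) // step) + 2)]
            for k in range(3)]
    return list(itertools.product(*axes))
```

The brute-force pair search is quadratic in the number of effective atoms, which is acceptable for a single protein but could be replaced by a convex-hull-based search for large structures.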

Theoretical hydrophobicity distribution
The theoretical hydrophobicity value for each grid point can be calculated according to a three-dimensional Gaussian function centered on the molecule, in which σx, σy and σz denote the ellipsoid size (⅓ of the maximum length along each axis, respectively).
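A minimal sketch of such a Gaussian evaluation is given below. Several points are assumptions made for illustration and are not taken from the text: the function names are hypothetical, "maximum length along each axis" is read as the largest absolute coordinate along that axis, and the values are normalized to sum to one over the grid so that they can later be compared with an empirical distribution on the same scale.

```python
import math

def sigmas_from_coords(oriented):
    # Assumption: sigma is 1/3 of the largest absolute coordinate per axis.
    return tuple(max(abs(p[k]) for p in oriented) / 3.0 for k in range(3))

def theoretical_hydrophobicity(grid, sigma):
    """Gaussian centered at the origin, evaluated at every grid point
    and normalized so that the values sum to 1 (normalization is an
    assumption of this sketch)."""
    raw = [
        math.exp(-sum(g[k] ** 2 / (2.0 * sigma[k] ** 2) for k in range(3)))
        for g in grid
    ]
    total = sum(raw)
    return [v / total for v in raw]
```

With this choice of σ, the whole molecule lies within three standard deviations of the center along each axis, so the theoretical hydrophobicity falls close to zero at the molecular surface.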

Empirical hydrophobicity distribution
The empirical hydrophobicity distribution can be calculated using the original function introduced by Levitt
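The sketch below illustrates how an empirical distribution of this kind could be collected on the grid. It rests on assumptions that the text does not confirm: the distance-dependent weight is written in the polynomial form commonly quoted for Levitt's function, the 9 Å cutoff is an assumed default, the function names are hypothetical, and the per-residue hydrophobicity values are taken to come from some non-negative scale that is not specified here.

```python
import math

def levitt_weight(r, cutoff=9.0):
    """Distance-dependent weight: 1 at r = 0, falling smoothly to 0 at
    the cutoff (polynomial form and cutoff value are assumptions)."""
    if r > cutoff:
        return 0.0
    q = (r / cutoff) ** 2
    return 1.0 - 0.5 * (7 * q - 9 * q ** 2 + 5 * q ** 3 - q ** 4)

def empirical_hydrophobicity(grid, residues, cutoff=9.0):
    """residues: list of (position, hydrophobicity) pairs, where position
    is the effective atom of a residue. Values are normalized to sum to 1,
    assuming the total is positive."""
    raw = [
        sum(h * levitt_weight(math.dist(g, pos), cutoff)
            for pos, h in residues)
        for g in grid
    ]
    total = sum(raw)
    return [v / total for v in raw]
```

Each grid point thus collects the hydrophobicity of the residues in its neighborhood, with nearer residues contributing more.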

Conclusion:
The many proteins of unknown biological function, identified on the basis of genome analysis, await a unified automated method for determining their biological activity [12]. The next step is to develop methods able to predict a protein's function from an examination of its structure. Some of the techniques used to identify functionally important residues from the sequence or structure are based on searching for homologues of proteins of known function [13, 14]. However, homologues need not have related activity, particularly when the sequence identity is below 25% [15].
The model presented in this paper aims to localize the area responsible for ligand binding, based on the characteristics of the spatial distribution of hydrophobicity in a protein molecule. It is generally accepted that the core region is not well described by a spheroid of buried residues surrounded by surface residues, because hydrophobic channels permeate the molecule [16, 17]. This being so, regions with high deviation from the ideal model can be identified by a simple comparison of the theoretical (idealized according to the Gaussian function) and empirical spatial distributions of hydrophobicity in a protein. The regions recognized by high hydrophobicity density differences seem to reveal functionally important sites in proteins.
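The comparison itself can be sketched as a point-by-point difference between the two distributions. The function names and the selection rule below are assumptions made for illustration: the text does not state a threshold, so this sketch simply ranks grid points by the absolute difference and keeps a chosen fraction of them.

```python
def hydrophobicity_difference(empirical, theoretical):
    # Both lists must refer to the same grid points in the same order.
    return [ho - ht for ho, ht in zip(empirical, theoretical)]

def high_discrepancy_points(grid, diff, fraction=0.1):
    """Return (grid point, difference) pairs with the largest absolute
    difference; the fraction kept is an arbitrary, assumed parameter."""
    count = max(1, int(len(grid) * fraction))
    ranked = sorted(zip(grid, diff),
                    key=lambda item: abs(item[1]),
                    reverse=True)
    return ranked[:count]
```

Residues whose effective atoms lie near the selected grid points would then be the candidates for a ligation site.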