Comparative sequence-structure analysis of Aves insulin

Normal blood glucose levels depend on the availability of insulin and on its ability to bind the insulin receptor (IR), which regulates the downstream signaling pathway. Insulin sequence and blood glucose level vary among animals in a species-specific manner. Genetic variation of insulin, blood glucose level and the development of diabetic symptoms in Aves are of particular interest because the normal blood glucose level of birds is much higher than that of mammals. It is therefore of interest to study the evolutionary relationship of Aves insulin with mammalian insulin using sequence data. Hence, we compiled 32 Aves insulin sequences from GenBank to compare their sequence-structure features with phylogeny for evolutionary inference. The analysis shows long conserved motifs (about 14 residues) for functional inference. These sequences show a high leucine content (20%) and a high instability index (>40). Amino acid positions 11, 14, 16 and 20 are variable and may contribute to IR binding. We identified functionally critical variable residues in the dataset for possible genetic implication. Structural models of these sequences were developed for surface analysis towards functional representation. These data find application in the understanding of insulin function across species.


Background:
Diabetes mellitus (DM) is one of the predominant diseases. It presently affects ~382 million people worldwide, and this number is expected to increase to 592 million by 2035 (according to the International Diabetes Federation). The insulin level, or the ability of insulin to bind the insulin receptor (IR), is the major determinant of DM. Insulin is a globular protein central to the regulation of vertebrate carbohydrate metabolism. It is one of the most important hormones, carrying messages that describe the amount of available sugar in the blood from moment to moment. Insulin is the primary regulator of carbohydrate homeostasis and also affects lipid and protein metabolism [1,2]. The action of this hormone is mediated by its specific binding to IR [3]. The binding of insulin to IR activates the tyrosine kinase function of the intracellular part of the receptor, followed by transporter activation and increased cellular uptake of glucose [4]. A confirmatory repeat blood sugar level ≥140 mg/100 ml proved valuable in defining a high-risk group for diabetes in humans [5]. Birds, however, have an unusually high blood glucose level without diabetes or any associated consequences (Figure 1). Normal plasma glucose levels in some birds are three to four times higher than in humans [6]. Birds may have an intrinsic mechanism that controls blood glucose levels without producing diabetic symptoms. Comparative analysis of Aves insulin may give some insight into this mechanism.
Insulin is made in the pancreas and released into the blood after meals, when sugar levels are high. This signal then spreads throughout the body to the liver, muscles and fat cells. Insulin tells these organs to take up glucose from the blood and store it in the form of glycogen or fat. The mechanism of insulin binding to the insulin receptor and of signal transduction through the transmembrane domain has a vital role in maintaining blood glucose levels [7]. The structure of insulin is the key to its function and its interaction with IR. The insulin gene and its promoter have evolved over a 450 million-year period [8]. Protein structure analysis can provide detailed information about protein function and related disorders. Wet-lab research relies on trial and error and cannot offer a prediction before the experimental result is obtained. Computational biology can help to overcome this limitation. Alteration in protein structure leads to altered protein function, which in turn leads to the development of disease [9]. The aim of this research is to give an intrinsic view of Aves insulin that may suggest how blood sugar level is controlled and may inform the development of recombinant human insulin [15]. The normal blood glucose level differs drastically among Aves, reptiles and mammals. The Aves glucose level is four times higher than that of reptiles and seven times higher than that of mammals. Around 20% of the amino acids in Aves insulin are leucine. Leucine is a hydrophobic amino acid, and the reason for this high percentage is not yet fully known.

Methodology:
Protein sequence retrieval
Thirty-two Aves insulin and ten mammalian insulin sequences were collected from UniProt (http://www.uniprot.org/). We selected the most commonly available mammalian sequences and all Aves sequences found in the UniProt database until mid-June 2013. These sequences were used for further analysis with online or freely available computational tools.

Analysis of Physico-chemical properties
The ProtParam tool (http://web.expasy.org/protparam/) of ExPASy was used to compute amino acid composition (%), molecular weight, theoretical isoelectric point (pI), number of positively and negatively charged residues, extinction coefficient, instability index, aliphatic index and grand average of hydropathy (GRAVY).
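
Some of these quantities are simple functions of the amino acid composition and can be sketched in a few lines. The example below is a minimal illustration, not the ProtParam implementation: it assumes the Kyte-Doolittle hydropathy scale for GRAVY and the standard aliphatic index formula (mole percent Ala + 2.9 x Val + 3.9 x (Ile + Leu)). The function names are hypothetical, and the human insulin B chain is used only as a familiar input; it is not one of the 32 Aves sequences of this study.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values (assumed scale for GRAVY)
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def composition(seq):
    """Percent composition of each residue present in the sequence."""
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}

def gravy(seq):
    """Grand average of hydropathy: mean hydropathy value per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)

def aliphatic_index(seq):
    """Aliphatic index from the mole percent of Ala, Val, Ile and Leu."""
    comp = composition(seq)
    ala, val, ile, leu = (comp.get(aa, 0.0) for aa in "AVIL")
    return ala + 2.9 * val + 3.9 * (ile + leu)

def charged_residues(seq):
    """Return (positive, negative) counts as (Arg + Lys, Asp + Glu)."""
    positive = sum(seq.count(aa) for aa in "RK")
    negative = sum(seq.count(aa) for aa in "DE")
    return positive, negative

# Illustrative input only: human insulin B chain
HUMAN_B_CHAIN = "FVNQHLCGSHLVEALYLVCGERGFFYTPKT"

if __name__ == "__main__":
    print("Leu %:", round(composition(HUMAN_B_CHAIN)["L"], 1))
    print("GRAVY:", round(gravy(HUMAN_B_CHAIN), 3))
    print("Aliphatic index:", round(aliphatic_index(HUMAN_B_CHAIN), 1))
    print("(+, -) charged:", charged_residues(HUMAN_B_CHAIN))
```

The instability index is omitted from this sketch because it requires a table of dipeptide weight values that is too long to reproduce here.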

Analysis of Secondary structural properties
Secondary structural properties of the protein, including alpha helix, 3₁₀ helix, pi helix, beta bridge, extended strand, beta turn, bend region, random coil, ambiguous states and other states, were computed with the SOPMA (Self-Optimized Prediction Method with Alignment, http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=/NPSA/npsa_sopma.html) tool of NPS (Network Protein Sequence Analysis) [10].

Prediction of functional properties
Motif prediction was carried out with the PROSITE tool of ExPASy (http://prosite.expasy.org/). For functional analysis, the insulin protein sequences were submitted in FASTA format and scanned against PROSITE patterns to identify motifs.
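
Scanning a sequence against a PROSITE pattern amounts to translating the pattern syntax into a regular expression. The sketch below shows that idea under stated assumptions: it handles only residues, `x` wildcards, `[...]` classes, `{...}` exclusions and `(n)` or `(n,m)` repeats, and it reports non-overlapping matches only. The pattern string is an illustrative insulin-family-style signature written for this example and should be checked against the PROSITE entry before use. The function names are hypothetical, and the human insulin A chain serves only as a test input.

```python
import re

def prosite_to_regex(pattern):
    """Translate a simple PROSITE pattern into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # Split an element such as x(3) into its core and optional repeat
        match = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        core, repeat = match.group(1), match.group(2)
        if core == "x":
            token = "."                      # any residue
        elif core.startswith("{"):
            token = "[^" + core[1:-1] + "]"  # any residue except those listed
        else:
            token = core                     # single residue or [ABC] class
        if repeat:
            token += "{" + repeat + "}"
        parts.append(token)
    return "".join(parts)

def scan(pattern, sequence):
    """Return (1-based start, matched segment) for each non-overlapping hit."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(sequence)]

# Illustrative pattern (assumption, not taken from the study)
EXAMPLE_PATTERN = "C-C-{P}-{P}-x-C-[STDNEKPI]-x(3)-[LIVMFS]-x(3)-C"
# Illustrative input only: human insulin A chain
HUMAN_A_CHAIN = "GIVEQCCTSICSLYQLENYCN"

if __name__ == "__main__":
    print(prosite_to_regex(EXAMPLE_PATTERN))
    print(scan(EXAMPLE_PATTERN, HUMAN_A_CHAIN))
```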

Identification of Signature Logo using Web tool
The logo of Aves insulin was generated using the WebLogo tool (http://weblogo.berkeley.edu/). In a sequence logo, the overall height of the stack indicates the sequence conservation at that position, while the heights of the symbols within the stack indicate the relative frequency of each amino acid at that position.
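
The stack and symbol heights described above can be computed directly from an alignment. The following is a minimal sketch, assuming the usual information-content definition (log2 of the alphabet size minus the Shannon entropy of the column) and ignoring the small-sample correction that WebLogo applies. Gap characters are counted as ordinary symbols here, and the function names and toy alignment are hypothetical.

```python
import math
from collections import Counter

def column_information(alignment, alphabet_size=20):
    """Information content (bits) of each alignment column."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    max_bits = math.log2(alphabet_size)
    info = []
    for column in zip(*alignment):
        total = len(column)
        entropy = -sum(
            (n / total) * math.log2(n / total)
            for n in Counter(column).values()
        )
        info.append(max_bits - entropy)  # stack height at this position
    return info

def symbol_heights(alignment, position, alphabet_size=20):
    """Height of each symbol in one column: relative frequency x stack height."""
    column = [seq[position] for seq in alignment]
    stack = column_information(alignment, alphabet_size)[position]
    total = len(column)
    return {aa: (n / total) * stack for aa, n in Counter(column).items()}

if __name__ == "__main__":
    toy_alignment = ["AC", "AC", "AG", "AT"]  # toy data, not insulin
    print(column_information(toy_alignment))
    print(symbol_heights(toy_alignment, 1))
```

A fully conserved column reaches the maximum height of log2(20) bits, whereas a variable column gives a shorter stack.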

Sequence alignment
Insulin sequences were aligned using MEGA5.1 to identify the variable regions that may be responsible for functional activity at high plasma glucose levels. In the direct comparison of the human and turkey insulin sequences, boxes mark the amino acid changes. The comparison indicates completely different types of amino acid changes (in terms of hydrophobic and hydrophilic character) between these two species.
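
A pairwise comparison of this kind can be sketched as a scan over two aligned sequences that records each substitution and whether it switches between hydrophobic and hydrophilic character. The hydrophobic residue set below is a coarse assumption chosen for illustration, and the function name and toy sequences are hypothetical; they are not the human and turkey sequences of the study.

```python
# Assumed coarse classification; other groupings are equally defensible
HYDROPHOBIC = set("AVILMFWC")

def compare_aligned(seq_a, seq_b):
    """List substitutions between two aligned sequences.

    Returns tuples of (1-based position, residue in seq_a, residue in seq_b,
    True if the change switches hydrophobic/hydrophilic class).
    Positions where either sequence has a gap are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    changes = []
    for position, (a, b) in enumerate(zip(seq_a, seq_b), start=1):
        if a == b or "-" in (a, b):
            continue
        class_switch = (a in HYDROPHOBIC) != (b in HYDROPHOBIC)
        changes.append((position, a, b, class_switch))
    return changes

if __name__ == "__main__":
    # Toy sequences for illustration only
    for change in compare_aligned("GIVEQ", "GISEL"):
        print(change)
```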

Phylogenetic analysis
Thirty-two Aves insulin sequences were aligned with the ClustalW tool (http://www.ebi.ac.uk/Tools/msa/clustalw2/), and the output file of this program was used to generate a phylogenetic tree by the Neighbor-Joining method.
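
For readers unfamiliar with Neighbor-Joining, the textbook algorithm can be sketched compactly. The code below is an illustrative implementation under stated assumptions and is not the ClustalW code used in this study: distances are plain p-distances (fraction of differing ungapped sites), ties are broken by input order, and the result is an unrooted tree in Newick format. All function names and the toy distance matrix are hypothetical.

```python
def p_distance(seq_a, seq_b):
    """Fraction of differing sites, ignoring columns with a gap."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    return sum(a != b for a, b in pairs) / len(pairs)

def neighbor_joining(names, matrix):
    """Build a Newick string from a symmetric distance matrix (list of lists)."""
    nodes = list(names)
    dist = {(a, b): matrix[i][j]
            for i, a in enumerate(names) for j, b in enumerate(names)}
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node from all others
        r = {a: sum(dist[a, b] for b in nodes) for a in nodes}
        # Choose the pair that minimises the Q criterion
        a, b = min(
            ((x, y) for i, x in enumerate(nodes) for y in nodes[i + 1:]),
            key=lambda p: (n - 2) * dist[p] - r[p[0]] - r[p[1]],
        )
        # Branch lengths from the joined pair to the new internal node
        len_a = 0.5 * dist[a, b] + (r[a] - r[b]) / (2 * (n - 2))
        len_b = dist[a, b] - len_a
        new = f"({a}:{len_a:.3f},{b}:{len_b:.3f})"
        # Distances from the new node to every remaining node
        for c in nodes:
            if c not in (a, b):
                d = 0.5 * (dist[a, c] + dist[b, c] - dist[a, b])
                dist[new, c] = dist[c, new] = d
        dist[new, new] = 0.0
        nodes = [c for c in nodes if c not in (a, b)] + [new]
    a, b = nodes
    return f"({a}:{dist[a, b]:.3f},{b}:0.000);"

if __name__ == "__main__":
    # Toy distance matrix for five taxa, for illustration only
    taxa = ["a", "b", "c", "d", "e"]
    toy = [
        [0, 5, 9, 9, 8],
        [5, 0, 10, 10, 9],
        [9, 10, 0, 8, 7],
        [9, 10, 8, 0, 3],
        [8, 9, 7, 3, 0],
    ]
    print(neighbor_joining(taxa, toy))
```

In practice the matrix would be filled by calling `p_distance` on every pair of aligned sequences, and the Newick string can be opened in any tree viewer.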

Results & Discussion:
The physicochemical characterization, secondary structure analysis, motif search and phylogenetic analysis were carried out with different computational tools for 32 Aves insulin sequences. Leucine accounts for around 20% of the amino acids in these insulin sequences, which is considerably higher than any other residue (Figure 2). The total numbers of positively (Arg + Lys) and negatively (Asp + Glu) charged residues were nearly equal, which explains why the pI was ~7. The extinction coefficient was high for all insulin sequences. A high extinction coefficient indicates a higher content of cysteine, tryptophan and tyrosine. This estimate is useful in the study of protein-protein interactions. A higher aliphatic index indicates higher thermostability and reflects a larger relative volume occupied by the aliphatic side chains of alanine, valine, isoleucine and leucine. Whether a protein is stable can be estimated from its instability index. The instability index of insulin is in most cases higher than 40, which classifies these proteins as unstable [11].
The grand average of hydropathy (GRAVY) was computed for all members. GRAVY values ranged from -0.006 to 0.304 for insulin (Table 1, see supplementary material). SOPMA analysis of all insulin members showed a high value for random coil in every sequence (Table 2, see supplementary material). A high random coil content is significant for the study of protein tertiary structure and related functions. Functional analysis of these proteins included the identification of important motifs (Table 3, see supplementary material). Only eight of the 32 sequences showed a functional motif. These motifs were 14 amino acids in length. Such motifs arise because specific residues and regions that are important for the biological function of a group of proteins are conserved in both structure and sequence during evolution [12]. MEGA5.1 software was used to observe the variability of the Aves insulin sequences (Figure 3). A sequence logo was generated with WebLogo (weblogo.berkeley.edu) to show the variable and constant amino acid positions (Figure 4). Amino acid positions 11, 14, 16 and 20 vary from species to species. The phylogenetic tree was constructed with the distance-based Neighbor-Joining method, and a number of clusters were found (Figure 5). The 3D structures of turkey and human insulin are shown in Figure 6. Proteins in close evolutionary relationship may be analyzed together for their involvement in similar biological processes. Another important finding is the structural difference between the insulin receptors. Sequence deletions were found between positions 743 to 755 and 1007 to 1012 in Aves IR (comparison between human and turkey). These differences in IR and insulin sequences may play a major role in binding affinity as well as in the intracellular signaling pathway that controls blood glucose level.

Conclusion:
In this study, detailed information on Aves insulin was systematically identified using various computational tools. Insulin is related to diabetes, a group of metabolic diseases in which a person has high blood sugar, either because the pancreas does not produce enough insulin or because cells do not respond to the insulin that is produced [13]. The study was also motivated by reports that South Asian people have higher blood glucose levels than white European people [14] and that SNP alleles in the IR gene are associated with typical migraine [15]. The present investigation may provide a possible explanation for high blood glucose in Aves as well as for the species specificity of insulin. This information will help to design effective recombinant insulin for therapeutic application. However, these findings are not enough to establish the hypothesis, and further study and validation by experimental approaches are needed.

Table 2 (fragment): SOPMA secondary structure states per species

Species                   Alpha helix  3₁₀ helix  Pi helix  Beta bridge  Extended strand  Beta turn  Bend region  Random coil  Ambiguous states  Other states
Uria aalge                20           0          0         0            11               5          0            19           0                 0
Pagodroma nivea           22           0          0         0            12               4          0            17           0                 0
Fulmarus glacialis        22           0          0         0            12               4          0            17           0                 0
Rissa tridactyla          20           0          0         0            11               5          0            19           0                 0
Pica pica                 18           0          0         0            13               7          0            17           0                 0
Melopsittacus undulatus   19           0          0         0            12               7          0