Prediction of functions for two LEA proteins from mung bean.

LEA (late embryogenesis abundant) proteins are associated with tolerance to water stress resulting from desiccation and cold shock. Although various functions have been proposed for LEA proteins, their precise role is not fully defined. In silico analysis of the amino acid sequences of two LEA proteins (early methionine-labeled Vigna, EMV) from the tropical legume crop Vigna radiata identified a 20-residue motif, 'GGQTRKQQLGSEGYHEMGRK', characteristic of group 1 LEA proteins. Structural analyses suggest that these proteins function like DNA/RNA-binding proteins, protecting macromolecules and stabilizing membranes during dehydration.


Background:
Some proteins are highly expressed during the late stage(s) of seed development and are referred to as LEA (late embryogenesis abundant) proteins. These proteins are found in a wide range of plant species and are suggested to be involved in desiccation tolerance on the basis of their accumulation and physicochemical properties. [1] LEA-type proteins fall into a number of families with diverse structures and functions, which differ in the arrangement and number of conserved motifs. These proteins are also hydrophilic in nature and are transcriptionally regulated in response to ABA. [2] Prediction of secondary structures suggests that these proteins exist as largely unfolded molecules in their native state, although a few members do exist as dimers or tetramers. [3] The precise function of LEA-type proteins is largely unknown. However, their considerable synthesis during the late stage of embryogenesis, their induction by stress and other biophysical characteristics, such as hydrophilicity, random coils and repeating motifs, permit prediction of some of their possible functions. LEA-type proteins are reported to act as water-binding molecules, in ion sequestration and in macromolecule and membrane stabilization. [2,4] LEA proteins are ubiquitous among photosynthetic organisms and have been reported in mono- and dicot plants as well as in nematodes, yeast, bacteria and cyanobacteria. [5] These proteins are encoded by multigene families with different numbers of conserved residue motifs arranged in tandem, as reported in cotton, maize, barley, Arabidopsis, mung bean, soybean etc. [6] Our earlier work showed the occurrence of early methionine (Em)-labeled proteins in mung bean (Vigna radiata) axes, referred to as EMV proteins, the first report of such proteins in the Fabaceae family. The cDNAs encoding these proteins were isolated and characterized, and were found to show a certain level of similarity with other Em/LEA proteins.
The results of in silico analyses of these proteins, based on their deduced amino acid sequences, indicate that they belong to the group 1 LEA proteins and have a possible DNA/RNA-binding function. Such a property may facilitate hydrogen bonding of these proteins with essentially any macromolecule or membrane, thus protecting the internal structures of the cell from damage caused by altered physiological conditions.

Results and Discussion:
Sequence homologues and presence of 20-residue conserved motifs
WU-BLAST2 analysis of EMV proteins revealed the highest sequence identity to a group 1 LEA protein from the black locust, Robinia pseudoacacia (87 %; UniProt: P93509), followed by several other group 1 LEA sequences, including those from Arabidopsis thaliana (83 %; UniProt: Q42489), a water-stress protectant protein from Gossypium hirsutum (80 %; UniProt: Q03791), and Em-like protein from Daucus carota (78 %; UniProt: Q5KTS7) and Quercus robur (79 %; UniProt: Q7XBA7). Analysis of the EMV2 protein sequence showed maximum sequence identity (87 %) to the Glycine max Em protein (UniProt: P93165), followed by the Em protein from Robinia pseudoacacia (80 %; UniProt: P93510), a LEA protein from Arachis hypogaea (83 %; UniProt: Q4U4M1) and the Arabidopsis thaliana Em-like protein GEA6 (81 %; EM6). In addition, sequences with similarity to EMV proteins were detected in the moss Physcomitrella patens and in Bacillus subtilis, the latter corresponding to the gsiB gene, which encodes a stress-related protein identified on the basis of its glucose-starvation inducibility. [16] The other matches include hypothetical proteins from different organisms with less significant E-values.
Examination of the 20-residue repeat found in group 1 LEA proteins (Table 1) identified the 20-mer motif 'GGQTRKQQLGSEGYHEMGRK' at positions 44-63 and 64-83 in EMV1 and at positions 44-63 in EMV2. This motif is characteristic of group 1 LEA proteins in plants and other organisms, indicating that the EMV proteins belong to the group 1 LEA family under the revised classification system proposed by Wise [17]. This remarkable conservation points to an important role of LEA proteins in stress adaptation.
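As an illustration of how repeat positions such as those reported above can be located, the following minimal Python sketch scans a protein sequence for the 20-mer motif and reports 1-based, inclusive coordinates. The function name find_motif_positions, the max_mismatches parameter and the placeholder toy_sequence are assumptions introduced here for illustration. The placeholder is synthetic and is not the EMV1 sequence, and the sketch is not the procedure used in this study.

```python
MOTIF = "GGQTRKQQLGSEGYHEMGRK"  # 20-mer motif quoted in the text


def find_motif_positions(sequence, motif=MOTIF, max_mismatches=0):
    """Return 1-based inclusive (start, end) pairs where the motif occurs.

    A window counts as a hit when it differs from the motif at no more
    than max_mismatches positions (simple Hamming distance, no gaps).
    """
    sequence = sequence.upper()
    width = len(motif)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        mismatches = sum(a != b for a, b in zip(window, motif))
        if mismatches <= max_mismatches:
            hits.append((start + 1, start + width))
    return hits


# Synthetic placeholder (NOT the real EMV1 sequence): 43 filler residues,
# two tandem copies of the motif, then a short filler tail.
toy_sequence = "A" * 43 + MOTIF * 2 + "A" * 10

print(find_motif_positions(toy_sequence))
```

For the placeholder, the scan reports the two tandem copies at 44-63 and 64-83, mirroring the arrangement described for EMV1. Raising max_mismatches would allow degenerate copies of the repeat to be picked up, at the cost of more spurious hits.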

EMV proteins are hydrophilic and belong to pfam00477 Cluster
The accumulation of hydrophilic transcripts was demonstrated in E. coli and S. cerevisiae as well as in nematode and moss. Hydropathy plots revealed that EMV proteins are highly hydrophilic, with over 95 % of residues falling in the hydrophilic regions with negative scores. The structure prediction programs and intrinsic disorder prediction suggest that EMV proteins may be largely or entirely unstructured in solution. Firstly, the consensus identified for EMV proteins by structure prediction programs revealed a predominantly random coil structure with two small regions of β sheets and five distinct helical blocks (Figure 2). Secondly, the GLOBPLOT analysis (Figure 3) revealed intrinsically disordered regions in EMV1 (residue positions 20-29, 40-46, 49-66 and 80-91) and EMV2 (residues 20-28, 40-46, 60-71 and 93-97). Secondary structures were also observed as hydrophobic clusters and corresponded with the 1D and 3D representations (Figure 4). Low hydrophobicity combined with a relatively high overall charge is associated with a lack of compactness in proteins under physiological conditions, resulting in a natively unfolded structure [18].
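The hydropathy analysis described above can be sketched in a few lines of Python. The example below assumes the Kyte-Doolittle scale and a 9-residue sliding window; the text does not state which scale or window was used, so both are assumptions, as are the function names gravy, windowed_hydropathy and fraction_hydrophilic. It is applied only to the 20-mer motif quoted earlier, not to the full EMV sequences.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the text does not name one).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

MOTIF = "GGQTRKQQLGSEGYHEMGRK"  # 20-mer motif quoted in the text


def gravy(sequence):
    """Mean hydropathy over all residues; negative means hydrophilic."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def windowed_hydropathy(sequence, window=9):
    """Average hydropathy of every full sliding window along the sequence."""
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def fraction_hydrophilic(sequence, window=9):
    """Fraction of sliding windows whose average score is negative."""
    profile = windowed_hydropathy(sequence, window)
    return sum(score < 0 for score in profile) / len(profile)


print(round(gravy(MOTIF), 2))
print(fraction_hydrophilic(MOTIF))
```

A negative mean score and a high fraction of negative windows are what a strongly hydrophilic profile looks like numerically. Applying the same functions to a full-length sequence would give a profile comparable in spirit to a hydropathy plot, although the exact values depend on the scale and window chosen.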

Polypeptide chain flexibility and conserved double glycine residues
The internal hydrophilic motif of EMV proteins is flanked by conserved double glycine residues at intervals of approximately 20 amino acids. This gives a pattern in which the entire sequence of the mung bean LEA proteins can be viewed as consisting of 20-residue domains separated by structurally flexible double glycine residues. The variable number of hydrophilic motifs suggests a higher water-binding capacity, since the repeats are the most hydrophilic part of the mung bean LEA proteins. This observation is in good agreement with the findings of [19] for the barley B19 LEA proteins.
According to the model evaluation (Table 2), the EMV1 model is 'fairly good' while that of EMV2 is a 'good' hypothetical protein model. The homology models of mung bean LEA proteins generated in this study could aid in determining the mechanistic function of this important class of proteins.
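The spacing of double glycine residues described in this section can be checked with a short script. The sketch below is illustrative only: the function names double_glycine_positions and spacings are assumptions, and toy_repeats is a synthetic string of three tandem copies of the 20-mer motif rather than an EMV sequence.

```python
import re

MOTIF = "GGQTRKQQLGSEGYHEMGRK"  # 20-mer motif quoted in the text


def double_glycine_positions(sequence):
    """1-based position of the first G in every 'GG' pair (overlaps allowed)."""
    # A zero-width lookahead lets overlapping pairs (e.g. in 'GGG') be counted.
    return [m.start() + 1 for m in re.finditer(r"(?=GG)", sequence.upper())]


def spacings(positions):
    """Distances between consecutive positions."""
    return [b - a for a, b in zip(positions, positions[1:])]


# Synthetic placeholder (NOT an EMV sequence): three tandem motif copies.
toy_repeats = MOTIF * 3

positions = double_glycine_positions(toy_repeats)
print(positions, spacings(positions))
```

For the placeholder, the double glycines fall at positions 1, 21 and 41, giving a constant spacing of 20 residues. On a real sequence the spacing would be expected to be only approximately regular.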

Conclusion:
Vigna radiata EMV proteins are classified as group 1 LEA proteins on the basis of their extreme hydrophilicity and the predominantly random-coil arrangement of their residues, together with the adoption of helical conformation, as revealed by ab initio secondary structure predictions. Function assignments for these two LEA proteins suggest that they are involved in DNA/RNA binding, like other group 1 LEA proteins. Such a property may facilitate hydrogen bonding of these EMV proteins with essentially any macromolecule or membrane, thus protecting the internal structures of the cell from damage caused by altered physiological conditions. The consistent spatial arrangement of the EMV proteins hence points to the possibility that they have a functional role in the plant's response to dehydration.