Structure modeling and comparative genomics for epimerase enzyme (Gal10p).

Gal10p (UDP-galactose 4-epimerase) is a key enzyme of D-galactose metabolism that catalyzes the interconversion of UDP-galactose and UDP-glucose. Understanding how Gal10p carries out its function requires knowledge of its structure, its neighboring interacting partners and its functional residues, yet these aspects remain largely unexplored for the epimerase. The structure of the enzyme has already been determined in S. cerevisiae and E. coli, but no structural information is available for K. lactis. We used homology modeling to build the structure of Gal10p from K. lactis. Functional residues were then predicted for the modeled protein, and the strength of interaction between Gal10p and other Gal proteins was estimated by protein-protein interaction studies. These studies revealed that the affinity of Gal10p for other Gal proteins varies between organisms. Sequence and structure comparison of the epimerase showed that the orthologs in K. lactis and S. cerevisiae are more similar to each other than to the ortholog in E. coli. These results should contribute to a better understanding of galactose metabolism. The same approach may be applied to human Gal10p, where it could give useful insight into galactosemia.


Background:
Saccharomyces cerevisiae and Kluyveromyces lactis share several features of galactose metabolism. The gene cluster comprising GAL7, GAL10 and GAL1, which code for the enzymes galactose-1-phosphate (gal-1-P) uridyl transferase, UDP-galactose 4-epimerase and galactokinase respectively, is common to both yeast species as well as E. coli [1,2]. When galactose is the sole carbon and energy source, these organisms use the Leloir pathway to utilize it. Galactose is taken up from the medium by different permeation systems, and its metabolism depends on three essential enzymatic reactions, as shown in the supplementary material [3,4].
Gal10p carries two enzymatic activities: mutarotase and UDP-galactose 4-epimerase [5]. These activities underlie the catabolism of galactose: the mutarotase converts beta-D-galactose into its alpha form, and the 4-epimerase catalyzes the reversible conversion between UDP-galactose and UDP-glucose [5]. Crystal structure analysis has revealed that the 4-epimerase domain, located in the N-terminal part of the protein, is separated from the C-terminal mutarotase domain by a simple Type II turn [6]. Loss of Gal10p activity prevents cell growth when galactose is the sole carbon source [7]. The Saccharomyces cerevisiae epimerase encoded by the GAL10 gene is about twice the size of either the bacterial or the human protein but is nearly the same size as Gal10p of K. lactis [5]. In S. cerevisiae Gal10p is bifunctional, whereas in most organisms the mutarotase and epimerase activities are carried out by different proteins. The 3D structure of Gal10p from K. lactis is not known, although its sequence is available in a public repository (GAL10 K. lactis, CAG98170.1). We therefore modeled the 3D structure of Gal10p by homology modeling. We submitted the modeled Gal10p to functional site prediction servers such as PINTS [8], ProFunc [9] and Q-SiteFinder [10] to find putative interacting amino acid residues and to assess their interaction with other GAL proteins. We further determined the homology among Gal10p from S. cerevisiae, K. lactis and E. coli at both the sequence and the structure level. Finally, we derived the interaction network of Gal10p from S. cerevisiae, K. lactis and E. coli and estimated, by a protein-protein interaction method, the strength of interaction between the different GAL proteins within the same organism that are required for regulation of the Leloir pathway in galactose metabolism.

Methodology:
Input file:
The protein sequences of Gal10p from S. cerevisiae (NP_009575.1), K. lactis (CAG98170.1) and E. coli (ACA78806.1) were downloaded from the GenBank database. The work was divided into the following steps: (1) development of a model structure of Gal10p from K. lactis; (2) prediction of the amino acid residues required for protein-protein interaction; (3) determination of the evolutionary relationship among Gal10p of S. cerevisiae, K. lactis and E. coli; (4) generation of a putative protein interaction map among the GAL proteins of K. lactis, S. cerevisiae and E. coli and estimation of their interaction affinity.

Homology Modeling:
First, the protein sequence of Gal10p from K. lactis was submitted to the SWISS-MODEL software for 3D model development [11]. Table 1 shows the details of the homology modeling. The quality of the resulting Gal10p model was then assessed with PROCHECK and ProSA (https://prosa.services.came.sbg.ac.at/prosa.php). Note that a model structure is generated only when the sequence similarity to the template is more than 30%.
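The 30% criterion above can be illustrated with a short calculation. The following is a minimal sketch, not the procedure used by SWISS-MODEL: it assumes the threshold is applied to percent identity computed over aligned, non-gap columns of a pairwise alignment (other definitions of the denominator exist), and the two aligned sequences are hypothetical toy strings rather than the Gal10p alignment.

```python
MIN_IDENTITY = 30.0  # threshold quoted in the text (assumed here to mean percent identity)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where both sequences carry a residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are excluded from the denominator
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


def is_usable_template(aligned_target: str, aligned_template: str,
                       cutoff: float = MIN_IDENTITY) -> bool:
    """True when the target/template identity exceeds the cutoff."""
    return percent_identity(aligned_target, aligned_template) > cutoff


if __name__ == "__main__":
    # Hypothetical toy alignment, for illustration only.
    target = "MKV-LTGAGY"
    template = "MKVALSGA-Y"
    print(percent_identity(target, template))    # 87.5
    print(is_usable_template(target, template))  # True
```

In practice the aligned strings would come from the target-template alignment reported by the modeling server.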

Model Optimization:
The model was further optimized by energy minimization with GROMOS96, as incorporated in the Swiss-PdbViewer software. Additionally,

Results:
We obtained the 3D structure of Gal10p from Kluyveromyces lactis by homology modeling with the SWISS-MODEL software [11] (Figure 2). The protein sequence of Gal10p was submitted to SWISS-MODEL with default parameters. The software built the 3D structure of K. lactis Gal10p using chain A of 1Z45, a protein whose 3D structure is known and deposited in the Protein Data Bank, as the template (Table 1). The template protein 1Z45 is Gal10p from S. cerevisiae. The protein sequence of Gal10p from K. lactis showed a sequence identity of 54.191% and an e-value of 0.00e-1 with 1Z45. The modeled structure is a mixture of helices and sheets (Table 1, see supplementary material).
Ramachandran plot analysis with PROCHECK was used to assess model quality and indicated an overall accuracy of 98.20% for the model, with the majority of amino acid residues in the most favored [A,B,L] and additional allowed [a,b,l,p] regions. Only one bad contact per 100 residues was found. ProSA-web analysis showed that the modeled structure falls within the region occupied by native protein structures of similar size determined by X-ray crystallography, with a Z-score of -9.78 (Figure 3). Energy minimization with GROMOS96 (via Swiss-PdbViewer) stabilized the modeled Gal10p structure, lowering its energy from 3281.895 kJ/mol to -23676.174 kJ/mol.
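A Ramachandran plot places each residue by its backbone dihedral angles phi (atoms C of residue i-1, N, CA, C of residue i) and psi (N, CA, C of residue i, N of residue i+1). The sketch below shows how one such dihedral can be computed from four atomic positions. It is an illustrative helper only, not the PROCHECK implementation; the function name `dihedral_deg` and the test coordinates are hypothetical.

```python
import math


def _sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points p0-p1-p2-p3."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    if norm == 0.0:
        raise ValueError("p1 and p2 must be distinct points")
    b1 = tuple(c / norm for c in b1)  # unit vector along the central bond
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Hypothetical coordinates chosen so the geometry is easy to check by hand.
    print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))  # cis arrangement
    print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))  # quarter turn
```

Applying the function to the backbone atoms of every residue in a model yields the (phi, psi) pairs that a Ramachandran analysis then assigns to favored, allowed or disallowed regions.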
We performed structure-structure superposition with Swiss-PdbViewer and calculated the root mean square deviation (RMSD) to assess structural similarity among the Gal10p proteins. Gal10p of S. cerevisiae superposed onto Gal10p of K. lactis with a lower RMSD (0.28 Å) than onto Gal10p of E. coli (0.98 Å); the RMSD between Gal10p of K. lactis and Gal10p of E. coli was 1.03 Å (Table 2, see supplementary material).
The D-galactose pathway is regulated by several proteins that are known to interact with each other and to regulate the synthesis of galactose-metabolizing enzymes. Gal10p may also interact with its neighboring proteins to carry out its function; we therefore estimated the affinity between Gal10p and the other GAL proteins present in K. lactis, S. cerevisiae and E. coli. To estimate the strength of interaction between Gal10p and the other Gal proteins within each organism, we used the PatchDock software for protein-protein interaction studies. Before estimating the interactions, we also built 3D models of Gal1p, Gal3p, Gal4p, Gal7p and Gal80p with SWISS-MODEL for S. cerevisiae, K. lactis and E. coli (Table 1, see supplementary material). Gal10p of S. cerevisiae showed greater affinity for its Gal3p, with a PatchDock score of 18718, than for its other Gal proteins (Table 3, see supplementary material). Gal10p of K. lactis, on the other hand, showed greater affinity for its Gal7p, with a PatchDock score of 14986. Gal10p of E. coli showed the strongest interaction with Gal1 (galK), with a PatchDock score of 16562 (Table 3, see supplementary material).
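The RMSD values used for the structural comparison in this section follow the usual definition: the square root of the mean squared distance between paired atoms. The sketch below assumes the two structures are already superposed and that equivalent atoms (for example C-alpha atoms) are given in the same order; it does not perform the fitting step that Swiss-PdbViewer carries out, and the coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples, in input units."""
    if len(coords_a) != len(coords_b):
        raise ValueError("both structures need the same number of paired atoms")
    if not coords_a:
        raise ValueError("at least one atom pair is required")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Hypothetical paired coordinates (in Å): every atom is displaced by 1 Å along z.
    model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    template = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print(rmsd(model, template))  # 1.0
```

Lower values indicate closer structural agreement, which is how the comparison among the three Gal10p structures is read.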

Discussion and Conclusion:
Gal10p is a UDP-glucose 4-epimerase that participates in the Leloir pathway of D-galactose metabolism. We determined the 3D structure of Gal10p of K. lactis by comparative homology modeling. This is the first report of a putative structure of Gal10p from K. lactis. With the structure available, the functional residues and the putative interacting partners, along with their strength of affinity, can be predicted. These studies will help in understanding the mechanism of action of the epimerase. The information can also be used in the biotechnology industry, where Gal proteins are used for protein production or in drug design. Our 3D model may help biologists to better understand the role of Gal10p in the K. lactis galactose pathway. We also carried out a comparative analysis using the modeled 3D structure and confirmed that Gal10p of S. cerevisiae and K. lactis share common features. Furthermore, the functional site prediction for Gal10p of K. lactis supports the protein-protein interaction analysis and provides information about the residues involved in mutual interaction with other GAL proteins within K. lactis. This study will improve knowledge of the mechanism of GAL protein interactions and may provide useful insight into the regulation of the galactose pathway. The same approach may be applied to human Gal10p, where it could give useful insight into galactosemia. The protein-protein interaction studies reported here may find application in industry, where the GAL pathway is used for protein production.