Analysis of correlated mutations in Ras G-domain

Ras GTPases are the most prevalent proto-oncogenes in human cancer. Mutations in Ras remain untreatable more than three decades after their initial discovery. At the amino acid level, some residues under physical or functional constraints exhibit correlated mutations; such positions are also known as coevolving or covariant residues. Revealing intra-molecular co-evolution between amino acid sites of proteins has become an emerging area of research because it highlights the importance of variable regions. Here, I have identified and analyzed the coevolving residues in the Ras GTP binding domain (G-domain). The covariant residue positions obtained correlate well with known experimental data on functionally important residues. It is therefore of interest to understand these residue co-variations for designing protein engineering experiments and for targeting oncogenic Ras GTPases.

Hidden information about protein structure and function can be extracted by examining the correlated mutational behavior of amino acid residue positions [7]. An amino acid substitution that partly destabilizes protein structure or function can be compensated by a substitution at a different site [8]. Such a change in one amino acid relative to another is known as correlated mutation, co-evolution or co-variation. Residues at some sites strongly affect the evolution of certain other sites in the three-dimensional structure of the protein. Residue co-evolution allows a protein to maintain its overall structural and functional integrity while enabling it to acquire specific functional modifications [7,8].
Here, coevolving residue positions were identified and mapped onto the Ras G-domain. These residue positions were calculated from a multiple sequence alignment of Ras superfamily (Ras, Rab, Ran, Rho and Arf) members. A comprehensive literature survey using PubMed and PubMed Central was performed to retrieve experimentally verified functional information for the predicted covariant residue positions. The correlated mutation data presented here should be of interest to wet-lab experimentalists seeking to understand the mechanism of action of the so far undruggable Ras.

Retrieval of Ras G-domain data:
The Ras protein sequence was retrieved from the UniProt database (UniProtKB accession number: P78460_HUMAN). The PSI-BLAST search tool of NCBI was used to identify sequences homologous to Ras GTPases. The PSI-BLAST results were manually screened to include Ras superfamily members such as Rab, Ran and Rho. These sequences were aligned using ClustalX 2.1 (http://www.clustal.org/clustal2/). For analysis, only the GTP binding domains that had conserved G1, G2, G3 and G4 motifs were retained, after deleting neighboring domains on their N- and/or C-terminal sides. For the structural study, the experimental structure of Ras (PDBID: 5P21) was retrieved from the Protein Data Bank (http://www.rcsb.org/pdb).

Prediction of coevolving sites:
To predict the coevolving sites, the multiple sequence alignment (MSA) file containing the G-domain sequences of Ras superfamily members was submitted to the InterMap3D server (http://www.cbs.dtu.dk/services/InterMap3D/). InterMap3D predicts co-evolving pairs of amino acids from an alignment of protein sequences [9]. Here, the "RCW MI" method was chosen to predict the coevolving positions. The identified coevolutionary positions were mapped onto the three-dimensional structure of the Ras G-domain (PDBID: 5P21) using UCSF Chimera (https://www.cgl.ucsf.edu/chimera/). A PubMed and PMC search was performed to retrieve experimentally verified functional information for the predicted covariant residue positions.
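The idea behind mutual-information-based coevolution scoring can be illustrated with a minimal sketch. The code below computes plain mutual information (MI) between every pair of alignment columns; it does not reproduce the row-column weighting of the "RCW MI" method or any other InterMap3D internals, and the function names and the toy alignment are hypothetical. Gaps, if present, are simply treated as one more symbol, which is a simplifying assumption.

```python
import math
from collections import Counter
from itertools import combinations


def column_mutual_information(col_a, col_b):
    """Return the mutual information (in bits) between two alignment columns."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must come from the same alignment")
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), joint in count_ab.items():
        # p(a,b) / (p(a) * p(b)) simplifies to joint * n / (count_a * count_b)
        ratio = joint * n / (count_a[a] * count_b[b])
        mi += (joint / n) * math.log2(ratio)
    return mi


def pairwise_mutual_information(alignment):
    """Score every pair of columns (0-based indices) in a list of aligned sequences."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    columns = list(zip(*alignment))
    return {
        (i, j): column_mutual_information(columns[i], columns[j])
        for i, j in combinations(range(len(columns)), 2)
    }


# Hypothetical toy alignment: columns 0 and 1 vary together, column 2 varies independently.
toy_alignment = ["GVK", "GVR", "AIK", "AIR"]
scores = pairwise_mutual_information(toy_alignment)
for pair, score in sorted(scores.items()):
    print(pair, round(score, 3))
```

In this toy case the first two columns always change together and receive the highest score, while the third column carries no information about the other two. Raw MI is known to be sensitive to alignment size and phylogenetic background, which is the kind of bias that weighted variants such as RCW MI are intended to reduce.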

Results and Discussion:
Conserved residue positions are usually used to identify functionally important sites in proteins, and little attention has been given to residues other than the conserved ones. At the primary structure level, some amino acid residues under physical or functional constraints exhibit correlated mutations, or coevolution. Revealing intra-molecular coevolutionary sites has become an emerging area of research because it highlights the importance of variable regions [7,8]. The coevolving residues in the Ras GTP binding domain were identified using InterMap3D (see Methods) and mapped onto the Ras G-domain structure (PDBID: 5P21) (Figure 1). A total of 35 coevolving residue pairs were identified (Table 1). Analysis of the covariant residues revealed that Switch I (residue positions 25-40) and Switch II (residue positions 57-75) of the G-domain harbor seven (H27, V29, E31, D33, E37, S39 and Y40) and four (E63, R68, D69 and Q70) covariant residues, respectively. The coevolving residues G13, V29, E31 and D33 lie within 5 Å of the ligand GNP (Figure 1 and Table 2). Notably, G13 of the P-loop coevolved with V29 of Switch I, and four residue pairs spanning Switch I and Switch II (V29-D69, V29-Q70, S39-E63 and S39-R68) coevolved with each other. Conformational changes at the switches distinguish the active and inactive states of the Ras signaling process [6]. Switch I facilitates GTP hydrolysis through GAP molecules, whereas Switch II selectively binds GEFs to carry out the exchange of GTP and GDP [3,4]. The presence of coevolving residue positions around the catalytic pocket therefore suggests a role in imparting functional diversity. Moreover, the larger number of covariant sites in Switch I (seven) than in Switch II (four) suggests a higher vulnerability and a larger role for Switch I than for Switch II in the regulation of the GTPase cycle and cellular signaling.
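The assignment of covariant residues to the switch regions can be reproduced with a short sketch. It uses only the region boundaries (Switch I: 25-40, Switch II: 57-75) and residue labels quoted in the text; the function names and the "outside switches" label are hypothetical, and no P-loop range is encoded because the text does not give one.

```python
import re

# Region boundaries as stated in the text (inclusive residue numbers).
SWITCH_REGIONS = {
    "Switch I": range(25, 41),   # residues 25-40
    "Switch II": range(57, 76),  # residues 57-75
}


def residue_number(label):
    """Extract the sequence position from a label such as 'V29'."""
    match = re.fullmatch(r"[A-Z](\d+)", label)
    if match is None:
        raise ValueError(f"unrecognised residue label: {label!r}")
    return int(match.group(1))


def assign_region(label):
    """Return the switch region containing the residue, or 'outside switches'."""
    number = residue_number(label)
    for name, positions in SWITCH_REGIONS.items():
        if number in positions:
            return name
    return "outside switches"


def group_by_region(labels):
    """Group residue labels by region, preserving input order."""
    grouped = {}
    for label in labels:
        grouped.setdefault(assign_region(label), []).append(label)
    return grouped


# Covariant residues named in the text for the switches, plus G13 and K104.
covariant = ["G13", "H27", "V29", "E31", "D33", "E37", "S39", "Y40",
             "E63", "R68", "D69", "Q70", "K104"]
grouped = group_by_region(covariant)
for region, members in grouped.items():
    print(region, len(members), members)
```

Run on these labels, the grouping gives seven residues in Switch I and four in Switch II, matching the counts reported above, with G13 and K104 falling outside both switches.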
It will be of interest to explore the role of correlated amino acid residues that are located away (more than 5 Å) from the GTP binding site (Table 2). These residues are: T20, I21, R41, V45, I46, T50, E92, D93, H95, R98, E99, V103, K104, T124, P140, E153, T158, E162 and I163 (Figure 1 and Table 2). To understand and verify the specific molecular and functional roles of the reported covariant positions, a comprehensive literature search was performed. The PubMed and PMC search revealed that most of the identified covariant residues are associated with the regulation of Ras function (Table 3). As shown in Table 3, covariant residues of the G-domain have been implicated in regulation of the GTPase cycle, effector binding to the Switch I and Switch II regions, mediation of a water molecule in hydrolysis, and formation of allosteric sites [10-22]. Intriguingly, V29 of Switch I and K104, which is located away from the pocket (Figure 1A), showed a coevolutionary pattern with eight and seven other residue positions, respectively (Table 1 and Table 3). Analysis of Ras (PDBID: 5P21) revealed that V29 interacts with the sugar moiety of GNP (a GTP analog) (Figure 1A), and V29 is also known to coordinate a conserved water molecule in the catalytic pocket that is essential for the hydrolase activity of GTPases [12]. In addition, acetylation of K104 affects the conformational stability of the Switch II domain, which is critical for the ability of Ras to interact with guanine nucleotide exchange factors [20]. However, the coevolutionary pressure implied by the eight and seven coevolving pairs associated with V29 and K104, respectively, indicates a larger role for these positions and raises a question for future investigation.
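Counting how many distinct partners each residue has in a list of coevolving pairs is how positions such as V29 and K104 stand out. The sketch below shows the bookkeeping on the five pairs explicitly named in the text, not on the full 35-pair list of Table 1, so the counts it produces are for this subset only; the function names are hypothetical.

```python
from collections import defaultdict


def partner_map(pairs):
    """Map each residue to the set of residues it coevolves with."""
    partners = defaultdict(set)
    for first, second in pairs:
        partners[first].add(second)
        partners[second].add(first)
    return partners


def rank_by_partner_count(pairs):
    """Return (residue, partner count) tuples, most connected first, ties alphabetical."""
    partners = partner_map(pairs)
    counts = [(residue, len(others)) for residue, others in partners.items()]
    return sorted(counts, key=lambda item: (-item[1], item[0]))


# Only the pairs explicitly named in the text (a subset of Table 1).
named_pairs = [
    ("G13", "V29"),
    ("V29", "D69"),
    ("V29", "Q70"),
    ("S39", "E63"),
    ("S39", "R68"),
]
ranking = rank_by_partner_count(named_pairs)
for residue, count in ranking:
    print(residue, count)
```

Applied to the complete pair list, the same ranking would be expected to place V29 and K104 at the top with eight and seven partners, as reported above.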
Here, the covariant residues T20, H27, E31, I46, T50, Q70, E91, D92, E98, V103, T124, P140, T158 and E162 were identified as novel sites (Table 3). To the best of my knowledge, no experimental report is available for these sites. It is therefore of interest to understand the role of these residue positions in the function of the Ras superfamily and in targeting oncogenic Ras.

Conclusion:
Coevolving residue positions are functionally important sites, and point mutations at these sites result in conformational changes in Ras. Here, residues T20, H27, E31, I46, T50, Q70, E91, D92, E98, V103, T124, P140, T158 and E162 were identified as novel covariant sites whose functional implications are yet to be discovered. In addition, understanding the role of covariant residues with a large number of correlated mutation pairs, such as V29 and K104, might open new avenues for designing experiments to target the Ras oncogene.