Analysis of salt-bridges in prolyl oligopeptidase from Pyrococcus furiosus and Homo sapiens

Hyperthermophilic archaea not only tolerate high temperature but also operate their biochemical machinery normally under these conditions. However, the structural signatures in proteins that account for hyperthermostability relative to their mesophilic homologues remain poorly understood. We present comparative analyses of sequences, structures and salt-bridges of prolyl oligopeptidase from Pyrococcus furiosus (pfPOP - PDB ID: 5T88) and human (huPOP - PDB ID: 3DDU). A similar level of hydrophobic and hydrophilic residues is observed in pfPOP and huPOP. A low level of interactions between shell-waters and atom-types in pfPOP indicates that their contribution to hyperthermophilic features is negligible. Salt-bridge-forming residues (sbfrs) are more abundant in both the core and the surface of pfPOP. The increased sbfrs largely indicate that specific electrostatic interactions are important for the thermostability of pfPOP. Four classes of sbfrs are found, namely positionally non-conservative (PNCS), conservative (PCS), unchanged (PU) and interchanged (PIC) types of substitution. PNCS-sbfrs constitute 28% and are associated with the topology of pfPOP at high temperature. PCS helps to increase the salt-bridge population. It is also found that PU maintains similar salt-bridges at the active site and other binding sites, while PIC abolishes mesophilic patterns.

©Biomedical Informatics (2019)
organisms, it has been demonstrated that not only the frequency but also the stability of salt-bridges is greater in the former [1] than in the latter. The stabilizing and destabilizing roles of salt-bridges depend largely on their location, either in the protein's core or on its surface. Computational and experimental studies have demonstrated that buried salt-bridges are more stabilizing than exposed ones [1, 2, 12]. Conversely, it has been claimed that buried salt-bridges fail to overcome the desolvation cost and are thus mostly destabilizing [13]. Alternatively, if salt-bridge forming residues are substituted by hydrophobic isosteres, stability increases [13]. This conjecture is supported by a designed experiment in which removing a (networked) salt-bridge triad from the core of Arc repressor with various combinations of hydrophobic isosteres not only increased the net stability by 2-4 kcal/mol relative to the wild-type protein, but also left the specificity of the mutant variants similar to that of the wild-type protein [14].
Comparative analysis of the electrostatic contributions of orthologous thermophilic and mesophilic proteins has been an important and active research area [15]. The stabilizing and destabilizing effects of buried and networked salt-bridges still remain an enigma [2, 13, 14]. The dielectric constant under hyperthermophilic conditions is drastically low (~55), yet hydrophobic force was proposed to be the dominant contributor to thermostability. On the other hand, at high temperature the solvation of charged amino acids is severely affected, which would facilitate desolvation of the salt-bridge partners and thereby make the salt-bridge stabilizing [1]. The role of specific electrostatic interactions in prolyl oligopeptidase from the hyperthermophilic archaeon Pyrococcus furiosus and from human (pfPOP and huPOP) is yet to be understood. Prolyl endopeptidase is a serine protease that cleaves peptides at internal proline residues. While the catalytic triad residues (SER, HIS and ASP) are the same as in other serine proteases (trypsin, subtilisin, etc.), the overall structure of the enzyme is different [16]. POP's molar mass is about 3 times higher than that of other proteases. Unlike huPOP, pfPOP has a temperature optimum of 85°C. At this temperature (>85°C), pfPOP is stable for 12 hours [17]. pfPOP is shorter by 94 residues compared to huPOP [18]. The structures of pfPOP and huPOP are solved at 1.9 Å and 1.56 Å resolution, respectively. The protein has two domains, the catalytic domain and the β-propeller domain. The catalytic domain, which is constituted by two different sequence segments of POP, is present at the C-terminal end. The β-propeller domain is situated between these two segments [19]. Although the structure of pfPOP is solved at high resolution, no published structural description of it is yet available.
In this study, we undertake an extensive analysis of the salt-bridge pattern of pfPOP in comparison to its human homologue (huPOP). The study uses sequence, structure and evolutionary criteria along with detailed binary items of salt-bridges to gain insight into the structural features responsible for the thermophilic adaptation of pfPOP. It also highlights the substitution pattern of salt-bridge forming residues in the aligned sequences, and thereby the evolutionary effects of thermophilic adaptation in pfPOP. Overall, our study involves a comparative analysis of salt-bridges, which we believe has potential applications in protein engineering and structural bioinformatics.

Dataset:
The 3D structures of pfPOP and its homologue huPOP are procured from the Research Collaboratory for Structural Bioinformatics (RCSB) Protein Data Bank (PDB) [20]. The sequence identity between pfPOP and huPOP is 29%, although the two are functionally identical. A few important structural features procured from the RCSB summary pages are shown in Table 1.

General comparative analysis:
Detailed analysis of physicochemical and sequence properties, along with preparation of the sequence BLOCK of pfPOP and huPOP, is done using PHYSICO [22] and PHYSICO2 [23]. The sequence BLOCK of pfPOP and huPOP is used for the analysis of evolutionary parameters using the APBEST program [24]. Salt-bridges are computed using the SBION [25] and SBION2 [26] programs. Notably, although salt-bridge analysis is possible using other analytical programs [27], residue-specific binary items could only be analyzed by SBION2. The core and surface compositions of the crystal structures of POP are extracted using COSURIM [28].
Structural analysis of huPOP (3DDU) is performed on the A chain. In 5T88 (pfPOP) there are two chains (A and B), which are separated before any structure-related analysis. Unless mentioned otherwise, the structures of huPOP and pfPOP are minimized for 1000 steps using AUTOMINv1.0 [29]. The shell-water interactions are analyzed on the structures of huPOP and pfPOP using POWAINDv1.0 [30].
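Two of the sequence-level quantities compared in this study, the aliphatic index and GRAVY, can be sketched as follows. This is a minimal illustration using the standard definitions (Ikai's aliphatic index and the mean Kyte-Doolittle hydropathy); it is an assumption that PHYSICO computes them in exactly this way, and the function names are hypothetical.

```python
from collections import Counter

# Kyte-Doolittle hydropathy scale (standard published values).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def aliphatic_index(seq):
    """Aliphatic index = X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)),
    where X is the mole percent of each residue in the sequence."""
    seq = seq.upper()
    counts = Counter(seq)

    def mole_percent(aa):
        return 100.0 * counts[aa] / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


def gravy(seq):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue.
    More negative values indicate a more hydrophilic sequence."""
    seq = seq.upper()
    return sum(KD[aa] for aa in seq) / len(seq)
```

Applied to the full FASTA sequences of the two proteins, a higher aliphatic index and a more negative GRAVY for one sequence would correspond to the trends discussed for pfPOP.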
Figure 1: Category (isolated vs networked) and class-based (core vs surface) normalized frequencies of the binary items of salt-bridges from pfPOP (5T88_B; red) and huPOP (3DDU_A; black). Here the sum of the binary items (e.g. HB and nHB) of any class is equal to the total frequency (Q) of that class.

Salt-bridge's binary items:
Extraction of salt-bridges is performed on the un-minimized structures of huPOP (3DDU) and pfPOP (5T88). The salt-bridges thus obtained per structure are divided into two categories: isolated and networked [31, 2]. Each of these categories is then divided into two classes: core and surface [28]. Each of the resulting four classes (namely isolated-core, isolated-surface, networked-core and networked-surface) is further grouped into binary items: single-bonded vs multiple-bonded, local vs non-local, salt-bridges in secondary structure (helix and strand) vs salt-bridges in coil, and hydrogen-bonded vs non-hydrogen-bonded. Some further sub-classes are also made, such as intra-helix vs intra-strand salt-bridges and inter-helix vs inter-strand salt-bridges. Although all these terms are extracted directly in an automated manner [26], a qualitative check of the protein is performed prior to the binary item analysis using the earlier version of the program [25].
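The first two levels of this classification (detecting salt-bridges, then separating isolated from networked ones) can be sketched as below. This is only an illustration and not the SBION2 implementation: the 4.0 Å cutoff between side-chain charged atoms is a commonly used criterion that is assumed here, and the function names and input layout are hypothetical.

```python
import math
from collections import defaultdict
from itertools import product

CUTOFF = 4.0  # Å; assumed distance criterion, not taken from this paper


def find_salt_bridges(acids, bases, cutoff=CUTOFF):
    """acids / bases map a residue label to the coordinates of its charged
    side-chain atoms (e.g. carboxylate O, or amino/guanidinium N).
    Returns (acid, base, n_bonds) for every pair with at least one
    atom-atom distance within the cutoff."""
    bridges = []
    for acid, acid_atoms in acids.items():
        for base, base_atoms in bases.items():
            n_bonds = sum(
                1 for p, q in product(acid_atoms, base_atoms)
                if math.dist(p, q) <= cutoff
            )
            if n_bonds:
                bridges.append((acid, base, n_bonds))
    return bridges


def split_isolated_networked(bridges):
    """A salt-bridge is isolated if neither partner takes part in any other
    salt-bridge; otherwise it belongs to a network."""
    degree = defaultdict(int)
    for acid, base, _ in bridges:
        degree[acid] += 1
        degree[base] += 1
    isolated, networked = [], []
    for bridge in bridges:
        acid, base, _ = bridge
        if degree[acid] == 1 and degree[base] == 1:
            isolated.append(bridge)
        else:
            networked.append(bridge)
    return isolated, networked


# Toy coordinates (hypothetical, in Å) to exercise the two functions.
acids = {
    "D10": [(0.0, 0.0, 0.0), (2.2, 0.0, 0.0)],
    "E50": [(20.0, 0.0, 0.0)],
    "D99": [(50.0, 0.0, 0.0)],
}
bases = {
    "K14": [(3.0, 2.5, 0.0)],
    "R60": [(22.0, 0.0, 3.0), (24.0, 0.0, 3.0)],
    "K70": [(17.0, 0.0, 0.0)],
}
bridges = find_salt_bridges(acids, bases)
isolated, networked = split_isolated_networked(bridges)
```

In this toy input, D10-K14 is an isolated, multiple-bonded salt-bridge, while E50 bridges both R60 and K70 and therefore forms a network. The `n_bonds` value gives the single- vs multiple-bonded binary item directly; the remaining items (core/surface, secondary structure, local/non-local) would need per-residue annotations that are not modeled here.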

Alignment and homologous positional analysis of salt-bridge forming residues:
The sequences of huPOP (UniProt ID: P48147) and pfPOP (UniProt ID: Q51714) are extracted from the UniProt database [32]. The FASTA files are aligned using the T-COFFEE program [33]. The alignment is then used manually to position the salt-bridge forming residues, which are procured from the supplementary table of SBION2. The substitutions in pfPOP are divided into four classes based on the type of substitution relative to huPOP. If the substitution is hydrophobic to hydrophilic, it is taken as the NCS (non-conservative substitution) type (marked by blue shade). If the substitution is hydrophobic to hydrophobic or hydrophilic to hydrophilic, it is taken as the CS (conservative substitution) type (marked by green shade). If the substitution is acidic to basic or basic to acidic, it is taken as a design-changer (marked by red shade). Unchanged partners with respect to 3DDU (upper sequence) are shown by cyan shade. In 3DDU, salt-bridge residues are shown in red (acidic) and blue (basic) colors. Surface/core (e/b) and helix/strand/coil (H/S/C) characteristics of each residue position are also shown in this alignment.
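The four-way classification described above can be expressed as a small function over aligned residue pairs. This is a hedged sketch: the positioning in this study is done manually, the hydrophobic residue set below is an assumed grouping (the paper does not list one), and the names `classify` and `tally` are hypothetical.

```python
from collections import Counter

ACIDIC = set("DE")
BASIC = set("KRH")
# Assumed hydrophobic grouping; every other residue is treated as hydrophilic.
HYDROPHOBIC = set("AVLIMFWPGC")


def classify(hu_res, pf_res):
    """Classify one aligned position (huPOP residue, pfPOP residue).
    PU   = positionally unchanged
    PIC  = interchanged (acidic <-> basic)
    PCS  = conservative (same hydrophobic/hydrophilic class)
    PNCS = non-conservative (class changes)"""
    if hu_res == "-" or pf_res == "-":
        return "gap"
    if hu_res == pf_res:
        return "PU"
    if (hu_res in ACIDIC and pf_res in BASIC) or (hu_res in BASIC and pf_res in ACIDIC):
        return "PIC"
    same_class = (hu_res in HYDROPHOBIC) == (pf_res in HYDROPHOBIC)
    return "PCS" if same_class else "PNCS"


def tally(aligned_hu, aligned_pf, columns):
    """Count classes over the alignment columns (0-based) that hold
    salt-bridge forming residues of pfPOP."""
    return Counter(classify(aligned_hu[i], aligned_pf[i]) for i in columns)
```

For example, a huPOP leucine aligned with a pfPOP lysine is counted as PNCS, whereas a huPOP lysine aligned with a pfPOP glutamate is counted as PIC. The interchange test is applied before the conservative test because both partners of an acid/base swap are hydrophilic and would otherwise be counted as PCS.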

Results: General characteristics of pfPOP and huPOP:
Prolyl oligopeptidase is a typical protease in that it possesses the same catalytic triad, constituted by SER-HIS-ASP residues. However, despite having the same catalytic residues, it differs remarkably from other proteases such as trypsin, chymotrypsin and subtilisin [17]. The enzyme, which cleaves at internal proline residues, is quite abundant in human and in the hyperthermophilic organism Pyrococcus furiosus. Unlike huPOP, pfPOP functions optimally at 85°C to 90°C in the cytoplasm [16]. Do these enzymes differ in sequence and structural properties? We are interested in identifying the sequence and structural features that differ and in correlating these differences with the gain in stability under high-temperature conditions. To this end, we have made a detailed comparison of sequence and structure for these two proteins, the results of which are presented in Table 2. The following points are noteworthy. First, huPOP is longer than pfPOP by about 100 residues; alignment of the two sequences showed 13 insertion regions. Second, although pfPOP is shorter, its hydrophobic and hydrophilic composition is similar to that of huPOP. Third, the aliphatic index of both proteins is quite high (slightly higher in pfPOP); this parameter is an indicator of protein stability [34]. Fourth, the pI of pfPOP is lower than that of huPOP, indicating that acidic residues are more abundant in the former. GRAVY shows that the hyperthermophilic pfPOP is more hydrophilic than huPOP. The NCS:CS ratio is higher in pfPOP than in huPOP; a higher ratio indicates more incorporation of non-conservative substitutions (i.e. hydrophobic to hydrophilic and vice versa). Fifth, the homologous positions of these two proteins show a remarkable difference (74.9%). Notably, this difference is not reflected in the overall compositions of these proteins (see above). How do the class compositions (hydrophobic and hydrophilic residues) vary in the structures of these proteins?
In pfPOP, the surface is relatively less hydrophobic and more hydrophilic than in huPOP. Surprisingly, the cores of both pfPOP and huPOP possess a high amount of hydrophilic residues. In pfPOP, acidic and basic residues are more abundant in both the core and the surface than in huPOP. To check the contribution of shell-waters to the stability of these proteins, we made a detailed comparison between huPOP and pfPOP. Notably, the latter functions near the boiling point of water. The number of detected waters is much lower in pfPOP than in huPOP. It is seen from the table that all types of interactions (at a distance ≤3.2 Å) are much lower in pfPOP than in huPOP. It is of interest how many of these interactions occur in the core, since interactions in the interior and in cavities of a protein are indicators of stability [35]. These fractions of shell-water and protein interactions are also much lower in pfPOP.
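The distance criterion used for the shell-water comparison can be illustrated with a short contact count. This is a minimal sketch assuming that a contact is any protein atom within 3.2 Å of a water oxygen; it is not the POWAIND procedure, and the function name and input layout are hypothetical.

```python
import math


def water_contacts(protein_atoms, water_oxygens, cutoff=3.2):
    """protein_atoms: list of (atom_type, (x, y, z)).
    water_oxygens: list of (x, y, z) for crystallographic water oxygens.
    Returns the number of atom-water pairs within the cutoff (Å),
    grouped by protein atom type."""
    counts = {}
    for atom_type, position in protein_atoms:
        for water in water_oxygens:
            if math.dist(position, water) <= cutoff:
                counts[atom_type] = counts.get(atom_type, 0) + 1
    return counts


# Toy coordinates (hypothetical, in Å).
protein = [("N", (0.0, 0.0, 0.0)), ("O", (5.0, 0.0, 0.0)), ("C", (10.0, 0.0, 0.0))]
waters = [(0.0, 2.9, 0.0), (5.0, 3.0, 0.0), (20.0, 0.0, 0.0)]
contacts = water_contacts(protein, waters)
```

To compare structures of different size, the resulting counts would still need to be normalized, for instance by the number of residues, and restricted to core atoms when only interior interactions are of interest.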

Binary characteristics of salt-bridges
To check the pattern of salt-bridges for these two proteins (3DDU of huPOP and 5T88 of pfPOP), we have investigated the binary items, the results of which are shown in Figure 1. Salt-bridges are divided into two categories, i.e. isolated and networked. Each category is then divided into two classes, i.e. core and surface. For comparison, the absolute frequency for each class is normalized, as the lengths of huPOP and pfPOP are 710 and 616 residues, respectively. Several points are noteworthy from the figure. First, the normalized frequency (Q) is higher in 5T88 than in 3DDU in the isolated-core (Figure 1, a1), isolated-surface (a2), networked-core (a3) and networked-surface (a4) cases. Similarly, binary items such as single (SQ) vs multiple (MQ) bonded (Figure 1, b1-b4), local (L) vs non-local (nL) (Figure 1, c1-c4), secondary-structured (SS) vs coiled-structure (CC) (Figure 1, d1-d4) and hydrogen-bonded (HB) vs non-hydrogen-bonded (nHB) (Figure 1, e1-e4) salt-bridges are higher in 5T88 (pfPOP) than in 3DDU (huPOP). Second, although 5T88 largely shows a higher proportion of binary items, there are a few exceptions. In the networked-surface case, SQ is lower but MQ is higher in hyperthermophilic 5T88 (pfPOP) (Figure 1, b4). In the isolated-core/surface and networked-core/surface cases, secondary-structured salt-bridges are much higher in 5T88 than coiled ones (Figure 1, d1-d4). In some cases, the latter is lower in 5T88.
There are nine combinations of secondary structures (S, H and C): HH, HC, CH, SC, CS, SS, HS, SH and CC. HH and SS can also be of INTRA (hh, ss) and INTER (HH, SS) types. How are intra- and inter-type salt-bridges populated in huPOP and pfPOP? To check this, we present Figure 2. Because the frequency of the other types is low or zero, we have compared only the hh, HH, ss and SS populations between 5T88 and 3DDU. Several points are noteworthy from the figure. First, although both hh and ss are absent in the isolated-core class, hh is present in the isolated-surface class (Figure 2, f1-f2). Here, intra-helical salt-bridges are much higher in 5T88 than in 3DDU. Second, in the networked-core and networked-surface classes, although 5T88 shows these types with moderate to high frequency, they are almost absent in 3DDU except for hh in isolated-core class (Figure 2, f3-f4). Third, in the isolated-core and isolated-surface classes, inter-helical (HH) salt-bridges are completely absent in both proteins (Figure 2, g1-g2). However, in this case the inter-strand type (SS) is present with much higher frequency in 5T88 (Figure 2, g1-g2). In the networked-core and networked-surface populations, both HH and SS are present. Interestingly, in these cases 5T88 shows a higher relative population of these salt-bridges than 3DDU (Figure 2, g3-g4).
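The length normalization used for these comparisons can be illustrated with the overall salt-bridge counts reported in this paper for the two structures (19 isolated and 28 networked in 3DDU; 27 isolated and 39 networked in 5T88). The scale chosen here, salt-bridges per 100 residues, is an assumption: the text states only that frequencies are normalized by sequence length.

```python
LENGTH = {"3DDU": 710, "5T88": 616}  # residues, as stated in the text
COUNTS = {
    "3DDU": {"isolated": 19, "networked": 28},
    "5T88": {"isolated": 27, "networked": 39},
}


def per_100_residues(count, length):
    """Normalized frequency: salt-bridges per 100 residues (assumed scale)."""
    return 100.0 * count / length


NORMALIZED = {
    pdb_id: {
        category: per_100_residues(count, LENGTH[pdb_id])
        for category, count in categories.items()
    }
    for pdb_id, categories in COUNTS.items()
}

for pdb_id, categories in NORMALIZED.items():
    for category, value in categories.items():
        print(f"{pdb_id} {category}: {value:.2f} per 100 residues")
```

Because 5T88 has both higher absolute counts and a shorter chain, the normalized values are higher for 5T88 in both categories whatever constant scale factor is chosen.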

Isolated and networked salt-bridges:
In this study, we have compared the salt-bridge architecture of two proteins (3DDU and 5T88) that function in two different environments. Salt-bridges are divided into two categories: isolated and networked. The details of these salt-bridges are the secondary structure type (one of eleven possible combinations: HH, hh, SS, ss, HC, CH, SC, CS, HS, SH, CC), average distance, bond multiplicity (one or more bonds between bridging partners), inter-residue distances, core/surface location (Co/Su), hydrogen-bonded/non-hydrogen-bonded (HB/nHB), local/non-local (L/nL) and, if local, its type. Table 3 and Table 4 show details of the isolated and networked salt-bridges of 3DDU and 5T88, respectively. There are 19 isolated and 28 networked salt-bridges in 3DDU, which is 710 residues long. Although the hyperthermophilic protein (5T88) is shorter by about 100 residues, it has 27 isolated and 39 networked salt-bridges. Surprisingly, although HIS-mediated salt-bridges are frequent in both the isolated and networked types of 3DDU, they are rare in hyperthermophilic 5T88.
Examples of these salt-bridges are shown in Figure 3. A networked salt-bridge (Figure 3a) is formed by more than one acidic and basic group. In the intra-helical salt-bridge, both the acidic and basic partners are present on the same side of the helix, with the acidic partner towards the N-terminal end. The basic partner is present at position (i+4), where (i) is the position of the acidic partner. SBION2 [26], the program that extracts this type of salt-bridge from the crystal structure, identifies it as orientation-I (Figure 3b). In the inter-helix salt-bridge (Figure 3c), the basic partner is present in one helix and the acidic residues in the other. These three candidates together form a networked salt-bridge. A typical inter-strand salt-bridge is shown in Figure 3d.
It is present partly in the core and is thus shown with its accessible surface. Here, the basic partner is present in one strand and, of the two acidic partners, one is present in a strand and the other in the coil. It is similar to the inter-helix salt-bridge (Figure 3c) in terms of arrangement. Again, the three candidates together form a networked salt-bridge.
The alignment of 3DDU and 5T88 is shown in Figure 4 with details of salt-bridges and their core/surface and helix/strand/coil characteristics. Several points are noteworthy. First, although 5T88 has many deletions with respect to 3DDU (Figure 4), its frequency of salt-bridge residues is much higher (Tables 3 and 4). At least four kinds of substitution pattern are notable in these salt-bridge forming residues. Salt-bridge forming residues undergo i] non-conservative, ii] conservative and iii] acid to base or base to acid types of substitution with respect to 3DDU. At the same time, about one-fourth of salt-bridge forming residues are kept positionally unchanged relative to 3DDU. Second, the active site residues (orange shade) and their salt-bridge pattern remain largely similar in these two proteins. Third, the secondary structural positions and core-surface locations remain almost similar in both proteins. Fourth, the PNCS, PCS and PU types each constitute 28% of salt-bridge partners. Interestingly, about three-fourths of each of the PNCS and PCS types are present in secondary structures. The remaining 14% is constituted by the PIC type.
thermophiles are living as a pure culture in their ecosystems as mesophiles can't grow there. Thermophilic proteins were shown to start functioning when the temperature of the medium is increased to the level of the growth temperature of these microbes [6]. These observations unequivocally suggest that these organisms and their biomolecules are adapted to their unusual ecosystems via evolution.
Pyrococcus furiosus is one such hyperthermophilic archaeon, which thrives at the boiling point of water. As a consequence, all of its biochemical machinery operates at this high temperature. Because proteins are the biomolecules most exposed in a cell for cellular functions, and because high temperature (~80-100°C) is known to denature mesophilic proteins, understanding the stability of thermophilic proteins has been a major research focus for the last 40 years [36]. Substitution, deletion, insertion, conjugation and endosymbiosis are mechanisms of adaptation, of which thermophiles seem to rely more on substitution/deletion/insertion than on conjugation/endosymbiosis, as a mixed-culture state is the prerequisite for the success of the latter mechanisms. Such a state is unlikely, as mesophiles cannot withstand the ecological niche of thermophiles. Deletion is in general preferred over insertion in thermophiles, as the latter increases chain/loop flexibility and hampers the overall packing of proteins at high temperature [36]. Thus, functionally identical (orthologous) proteins are shorter in thermophiles than in mesophiles. In our comparative analysis, we found that 5T88, which has 13 deletions, is much shorter than 3DDU. Notably, these deletions are not always in loop regions but also overlap with secondary structural positions, which may indicate that this evolutionary decision is related to overcoming topological strain at high temperature. Remarkably, the level of hydrophobic residues in 5T88 is kept almost the same as (slightly lower than) in 3DDU. However, in the sequence of 5T88 both acidic and basic residues are increased, with the larger increase on the surface and the smaller in the core of the protein. What are the implications of maintaining hydrophobic residues at the level of 3DDU while increasing acidic and basic ones on the surface and in the core of 5T88?

The basis of thermostability in 5T88
The central theme of the study is to understand the evolutionary strategy that may have been adopted in 5T88, in comparison to its mesophilic homologue, for stability and functionality under hyperthermophilic conditions. Keeping the level of normalized hydrophobic residues in 5T88 similar to that of the mesophilic one (3DDU) seems to be an evolutionary decision, as at 100°C, where pfPOP functions, the dielectric constant decreases to 55.51 [1]. In such a low dielectric medium, the formation of a typical mesophilic-like hydrophobic core appears to be difficult. Since hyperthermophilic conditions tend to cause more flexibility, additional stabilizing interactions would be necessary to maintain the above-mentioned characteristic balance. Intuitively, it appears that it is not the hydrophilic force that could replenish the deficit of required additional stability under hyperthermophilic conditions.
To understand the contribution of bound waters (especially in the core and cavities) to thermostability, we compared these interactions (isolated, bridged and cavity, in core and surface) between 5T88 and 3DDU. We observed that such interactions between bound waters and atoms are much fewer in 5T88 than in 3DDU. In contrast, we found that the normalized frequency of salt-bridges is much higher in 5T88 than in 3DDU. Similar observations have been reported for many thermophilic proteins [1, 10, 11, 15]. We further partitioned the overall increase of salt-bridges into isolated and networked categories, and core and surface classes. The salt-bridges of each class are further partitioned into different binary items such as single vs multiple bonded, local vs non-local, hydrogen-bonded vs non-hydrogen-bonded, in secondary structure vs in coil, intra-helix vs intra-strand and inter-helix vs inter-strand using an automated procedure [25, 26]. Comparison of each binary item of salt-bridges between 5T88 and 3DDU leads us to conclude that it is salt-bridges, and not water-protein or hydrophobic interactions, that act as the prime force for replenishing the deficit of required additional stability in the former. The increase of salt-bridges at all levels also accounts for the higher melting temperature (Tm) of these proteins. This is analogous to the Tm of a DNA segment, which is always lower for AT-rich DNA than for GC-rich DNA. The increase of salt-bridge interactions at all levels of binary items may account for the enhanced stability and Tm of thermophilic proteins, 5T88 in particular and others in general.

Evolutionary design of salt-bridges in 5T88:
Earlier we pointed out that the partners (acidic and basic residues) of salt-bridges increase both in the core and on the surface. We also pointed out that the difference in homologous positions of 5T88 (hyperthermophilic) from 3DDU (mesophilic) is 75%. What are the types of substitution in this positional difference in terms of salt-bridge partners? In 5T88, 109 partners are involved in forming 27 isolated and 39 networked salt-bridges. From the alignment, we identified and classified these 109 partners into four classes: partners that i] remain positionally unchanged (PU), ii] undergo non-conservative substitutions (PNCS), iii] undergo conservative substitutions (PCS) and iv] are interchanged from acidic to basic or vice versa (PIC). The PU, PNCS, PCS and PIC groups contain 31 (28%), 31 (28%), 31 (28%) and 16 (15%) partners, respectively. In PNCS and PCS, 23 (74%) and 22 (71%) partners, respectively, are present in secondary structures (helices and strands). It has been claimed that NCS has little or no structural role in proteins [37], which contradicts our observations. The appearance of NCS in secondary structure seems to be related to the tuning of the topology of 5T88 [38]. The PIC class seems to be critical in changing the mesophilic design of salt-bridges into a thermophilic one. Salt-bridges at the active site and other binding sites are maintained by the PU class, and the PCS class allows the proportion of salt-bridges to increase (71% in secondary structure) while keeping the overall properties of the protein similar.
Overall, these four classes of salt-bridge partners play critical roles in increasing the frequency (PCS), in producing new design (PNCS), in maintaining (PU) and in abolishing the mesophilic pattern (PIC) of salt-bridges.
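The class percentages quoted in this section follow directly from the partner counts, as the short check below shows. It uses only the numbers given in the text (109 partners in total; 23 PNCS and 22 PCS partners in helices or strands); the variable names are hypothetical.

```python
# Salt-bridge partners of 5T88 per substitution class, as reported in the text.
PARTNERS = {"PU": 31, "PNCS": 31, "PCS": 31, "PIC": 16}
# Partners located in secondary structures (helices and strands).
IN_SECONDARY_STRUCTURE = {"PNCS": 23, "PCS": 22}

total = sum(PARTNERS.values())

# Share of each class among all partners, in percent.
class_percent = {name: 100.0 * n / total for name, n in PARTNERS.items()}

# Share of each class that lies in secondary structure, in percent.
ss_percent = {
    name: 100.0 * n / PARTNERS[name]
    for name, n in IN_SECONDARY_STRUCTURE.items()
}

for name, value in class_percent.items():
    print(f"{name}: {value:.1f}% of {total} partners")
for name, value in ss_percent.items():
    print(f"{name} in secondary structure: {value:.1f}%")
```

The four class counts sum to the 109 partners, and the unrounded PIC share lies between 14% and 15%, which is why it may appear as either value depending on rounding.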

Conclusion:
We performed a comprehensive analysis of salt-bridges in hyperthermophilic prolyl oligopeptidase (PDB ID: 5T88) in comparison to its mesophilic homologue (PDB ID: 3DDU). The majority of the additional acidic and basic residues in the core and on the surface form additional isolated (core and surface) and networked (core and surface) salt-bridges. 5T88 has a higher normalized frequency of salt-bridges than 3DDU. These enhanced levels of salt-bridges are related to the thermostability and higher Tm of the protein. It is further found that 30% of salt-bridge partners are maintained as in the mesophile, at the active site and other sites. Moreover, 28% of salt-bridge partners are due to NCS, 75% of which are in secondary structures. This population of salt-bridges is important for the topology of the protein under hyperthermophilic conditions. The remaining 14% of salt-bridge partners are of the interchanged type (e.g. acid to base and vice versa). Overall, the comparative study of salt-bridges provides insights into thermostability, which have potential implications for protein engineering.