Protein beta-turn assignments

A classical way to analyze protein 3D structures or models is to investigate their secondary structures. Secondary structure predictions are also widely used to help build new 3D models, and hundreds of prediction methods have therefore been proposed. Before any prediction, however, secondary structures must be assigned, a task that is not trivial. Numerous assignment methods have thus been developed, and their results diverge. β-turns constitute the third most important secondary structure. However, no analysis comparing β-turn distributions across different secondary structure assignment methods has ever been done. In this paper, we analyze and evaluate the results of such a comparison. We highlight some important divergences that could have significant consequences for the analysis and prediction of β-turns.

The description of protein structures in terms of secondary structures is widely used for analysis or prediction purposes. Secondary structures are classically described as composed of two repetitive states, the α-helix [1] and the β-sheet [2]. All residues not associated with these states are assigned to the coil state, an undefined state. Numerous research teams have developed their own secondary structure assignment methods (SSAMs), using different criteria to describe the repetitive structures. DSSP remains the most widely used program for secondary structure assignment. It is based on the detection of hydrogen bonds defined by an electrostatic criterion; secondary structure elements are then assigned according to characteristic hydrogen-bond patterns [3]. STRIDE is directly related to DSSP as it also uses hydrogen-bond patterns, even if their definitions are slightly different [4]. In addition, STRIDE takes into account the (Φ/Ψ) angles to assign secondary structures. SECSTR belongs to the same family of methods [5]. XTLSSTR uses distances and angles calculated from the backbone geometry and is concerned with amide-amide interactions [6]. PSEA only considers Cα atoms and is based on distance and angle criteria [7]. DEFINE relies on Cα coordinates only and compares Cα distances with distances in idealized secondary structure segments [8]. KAKSI is a recent approach based on distances between Cα atoms and on dihedral angles [9]. SEGNO also uses the Φ and Ψ dihedral angles, coupled with other angles, to assign the secondary structures [10].

Nonetheless, only half of the residues belong to α-helices and β-strands. A more precise description of protein structures therefore requires the assignment of other local protein structures. β-turns are the most interesting local protein structures alongside α-helices and β-strands. They consist of 4 consecutive residues in which the distance between the Cα atoms of the first and fourth residues is smaller than 7 Å. This restrictive distance imposes a particular geometry on the backbone, which turns back on itself [11]. As they orient α-helices and β-strands, β-turns play a major role in the final protein topology. As an additional requirement, the central residues have to be non-helical in order to distinguish β-turns from α-helices. Numerous analyses of β-turns have been performed and prediction methods developed, but no comparison of β-turn assignments has been performed. In the present paper, we analyze the distribution of β-turn assignments according to different SSAMs.

Description:
Classically, the comparison of SSAMs only focuses on the α-helix, β-strand and coil states [9, 13, 14]. Here, we have added the assignment of β-turns and compared the corresponding distributions. A high-quality non-redundant set of 887 protein structures was selected from the PDB database according to the following criteria: X-ray structures with a resolution of 1.6 Å or better, and no more than 20% pairwise sequence identity. In a first step, the secondary structure assignment was done with DSSP methods. Some methods assigned other states, e.g. turn (using distance or hydrogen-bond criteria between residues i and i+3), bend (using the kappa angle between residues i-2, i and i+2), polyproline II (a helix with dihedral angle values in the β-sheet region of the Ramachandran map) or β-bridge (formation of a single pair of β-sheet hydrogen bonds). The description was therefore reduced as follows: α corresponds to α-, 3₁₀- and π-helix, β corresponds to β-sheet and β-strand, and coil encompasses all the rest. In a second step, β-turns were assigned following classical rules [12], i.e. the distance between the Cα atoms of residues i and i+3 must be less than 7 Å and the central residues of the turn must be non-helical. Table 1 summarizes all the results of this analysis.
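The two-step procedure described above can be sketched in Python as follows. This is only an illustrative sketch, not the code used for this study. It assumes DSSP-style one-letter codes (H, G, I for helices, E for strands) as the raw alphabet, Cα coordinates given as (x, y, z) tuples in Å, and a single chain without breaks. The function names and the labels 'a', 'b', 'c' and 't' are hypothetical. In line with footnote b of Table 1, only residues not already assigned to a helix or a strand are relabelled as turn.

```python
import math

# Assumption: DSSP-style one-letter codes; other SSAMs use their own alphabets.
HELIX_CODES = {"H", "G", "I"}  # alpha-, 3-10- and pi-helix
STRAND_CODES = {"E"}           # extended strand in a beta-sheet


def reduce_states(raw):
    """Step 1: collapse a raw per-residue assignment into 'a', 'b' or 'c' (coil)."""
    reduced = []
    for code in raw:
        if code in HELIX_CODES:
            reduced.append("a")
        elif code in STRAND_CODES:
            reduced.append("b")
        else:
            # Turn, bend, bridge, polyproline II, etc. all fall into coil.
            reduced.append("c")
    return "".join(reduced)


def assign_turns(ca_coords, reduced, cutoff=7.0):
    """Step 2: relabel coil residues that belong to a beta-turn as 't'.

    A window i..i+3 is a beta-turn if the CA(i)-CA(i+3) distance is below
    `cutoff` (in angstroms) and the central residues i+1 and i+2 are non-helical.
    """
    n = len(reduced)
    in_turn = [False] * n
    for i in range(n - 3):
        close = math.dist(ca_coords[i], ca_coords[i + 3]) < cutoff
        non_helical = reduced[i + 1] != "a" and reduced[i + 2] != "a"
        if close and non_helical:
            for k in range(i, i + 4):
                in_turn[k] = True
    # Residues already in a helix or a strand keep their repetitive state.
    return "".join(
        "t" if in_turn[k] and reduced[k] == "c" else reduced[k] for k in range(n)
    )


if __name__ == "__main__":
    # Toy coordinates: the first four CA atoms fold back on themselves.
    ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0),
          (0.0, 3.8, 0.0), (-3.8, 3.8, 0.0), (-7.6, 3.8, 0.0)]
    print(assign_turns(ca, reduce_states("EETSEE")))
```

Because the turn step runs on the reduced assignment, any disagreement between SSAMs on where a helix or a strand ends directly changes which residues can be labelled as turn.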
Repetitive structures correspond to ~60% of the residues for all the SSAMs, ranging between 58.05% and 61.51% (cf. Table 1). Coil frequencies are higher for STRIDE, DSSP, SEGNO, SECSTR and XTLSSTR (between 19 and 21%), while they are clearly lower for KAKSI, PSEA and DEFINE (~15%). DSSP and STRIDE turn frequencies (in parentheses in Table 1, col. 5) are very close to the ones we determined by applying the classical rules. For XTLSSTR, the frequency is very different (+8.43%). Analysis of turn frequencies gives two major clusters. The first is associated with a frequency of turn residues near 20% (STRIDE, DSSP, SEGNO, SECSTR, XTLSSTR and KAKSI), the second with a higher frequency (> 25%, i.e. PSEA and DEFINE). This higher frequency comes at the expense of α-helix assignment. We computed a confusion matrix of β-turn assignment between each pair of methods. Each entry is defined as the number of times a residue assigned as turn by SSAM i is also assigned as turn by SSAM j. The turn confusion matrix is given in Table 1 (right).

Table 1: Distribution of secondary structure states (left) and confusion matrix for turn state assignments (right). (a) Coil state frequency corresponds to residues not associated with α-helix, β-strand or turns. (b) Turn state frequency corresponds to residues assigned as β-turn and not associated with α-helix or β-strand (our assignment). (c) Numbers in parentheses are the frequencies of turns originally given by the corresponding methods (original assignment method). For DSSP, this corresponds to the turn and bend states.
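The definition of the turn confusion matrix can be illustrated with a short Python sketch. It assumes that every SSAM has produced a per-residue state string of the same length for the same protein set, and that turn residues carry the hypothetical label 't'. The function name and the input layout are illustrative choices, not the implementation behind Table 1.

```python
def turn_confusion_matrix(assignments, turn_label="t"):
    """Count residues assigned as turn by both SSAM i and SSAM j.

    `assignments` maps a method name to its per-residue state string.
    Returns the method names and a square matrix of raw counts whose
    diagonal holds the total number of turn residues for each method.
    """
    names = list(assignments)
    if len({len(assignments[name]) for name in names}) > 1:
        raise ValueError("all assignments must cover the same residues")
    matrix = []
    for name_i in names:
        row = []
        for name_j in names:
            shared = sum(
                1
                for state_i, state_j in zip(assignments[name_i], assignments[name_j])
                if state_i == turn_label and state_j == turn_label
            )
            row.append(shared)
        matrix.append(row)
    return names, matrix


if __name__ == "__main__":
    # Toy strings standing in for two methods; not data from this study.
    toy = {"method_A": "aattccbb", "method_B": "aatttcbb"}
    names, matrix = turn_confusion_matrix(toy)
    for name, row in zip(names, matrix):
        print(name, row)
```

The raw counts are symmetric. Dividing each row by its diagonal entry would give the fraction of turn residues of one method that are recovered by another, which is one possible normalization and is assumed here only as a suggestion.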

Conclusion:
This analysis shows that, across the SSAMs used, β-turn frequencies are as stable as those of the repetitive secondary structures. Among residues not assigned to repetitive structures, 20% are in β-turns. Using β-turns is thus of real interest, because less than one fifth of the amino acids are then left in an undefined state. Nonetheless, this study also shows that different SSAMs can give very different β-turn assignments. These divergences are directly related to the strong discrepancies in the assignment of helix and sheet ends, as the turn assignments are performed in a second step. This problem can greatly influence sequence-structure analysis of β-turns and could also have repercussions on prediction methods (e.g. [15]). In future work, we would like to study thoroughly the different β-turn types across SSAMs, and to examine the local environment of misassignments and their consequences on sequence-structure relationships.