Identification and validation of T-cell epitopes in outer membrane protein (OMP) of Salmonella typhi

This study aims to design epitope-based peptides for the utility of vaccine development by targeting outer membrane protein F (Omp F), because two available licensed vaccines, live oral Ty21a and injectable polysaccharide, are 50% to 80% protective with a higher rate of side effects. Conventional vaccines take longer time for development and have less differentiation power between vaccinated and infected cells. On the other hand, Peptide-based vaccines present few advantages over other vaccines, such as stability of peptide, ease to manufacture, better storage, avoidance of infectious agents during manufacture, and different molecules can be linked with peptides to enhance their immunogenicity. Omp F is highly conserved and facilitates attachment and fusion of Salmonella typhi with host cells. Using various databases and tools, immune parameters of conserved sequences from Omp F of different isolates of Salmonella typhi were tested to predict probable epitopes. Binding analysis of the peptides with MHC molecules, epitopes conservancy, population coverage, and linear B cell epitope prediction were analyzed. Among all those predicted peptides, ESYTDMAPY epitope interacted with six MHC alleles and it shows highest amount of interaction compared to others. The cumulative population coverage for these epitopes as vaccine candidates was approximately 70%. Structural analysis suggested that epitope ESYTDMAPY fitted well into the epitope-binding groove of HLA-C*12:03, as this HLA molecule was common which interact with each and every predicted epitopes. So, this potential epitope may be linked with other molecules to enhance its immunogenicity and used for vaccine development.


Background:
Typhoid fever is caused by Salmonella typhi, a Gram-negative bacterium, which is an obligatory human pathogen [1].It has three outer membrane proteins-OmpC, OmpF and OmpA, those are good as vaccine candidate [1].Among these three, OmpF, a major general diffusion porin of Salmonella typhi, has more conserved sequences and higher antigenicity than others [1,2].In spite of serious health complications, the scientific knowledge of the epidemiology and ecology of this virus is limited [3,4,5] to design an universal vaccine.Bioinformatics, especially immunoinformatics, is an emerging field in vaccine design.The combination experimental and in silico methods are crucial to solve complex problems such as revealing immune responses and vaccine design [6].Available bioinformatics tools provide the searching option to scan for probable epitope candidate from huge sets of protein antigens which are encoded by complete viral genomes.This computational vaccine design approach has proven very effective in fighting against few diseases such as multiple sclerosis [7], malaria [8], and tumors [9].The most crucial job in this experiment is the identification of HLA ligands and T-cell epitopes [10].Through T-cell epitope prediction tools to identify allele-specific binding peptides, it is possible to reduce the number of potential peptides considered as vaccine candidates.Along with these tools, different types of methods have been developed for the identification of proteosomal peptide cleavage sites, major histocompatibility complex (MHC) binding peptides and transporters associated with antigen presentation (TAP) molecules [6, 11-14].
T lymphocytes only recognize processed peptides or antigens and usually these peptides are presented by antigen presenting cells in association with HLA molecules.As a result, an epitope will only able to elicit immune response in human individual if he or she expressing this particular HLA and capable of binding with it efficiently [15].In addition, over thousands of different HLA allelic variants have been identified so far [16].And specific HLA alleles are prevalent in different ethnic group.Therefore, binding of predicted epitopes with different HLA results increased population coverage.It has been found that as much as 90% population coverage with different ethnic groups can be obtained by targeting different HLA molecules [17].Due to the tedious and expensive nature of experimental screening procedures, computational compound screening has been pursued extensively in recent years [18], which make ligand-protein interaction more vivid in knowledge before any expensive wet lab trial.

Methodology: Retrieving the protein sequences
The sequences of outer membrane protein F (OmpF) of Salmonella typhi were retrieved from the NCBI protein database (http://www.ncbi.nlm.nih.gov/protein/)from different isolates.

Multiple sequence alignment (MSA)
Retrieved sequences were used as a platform to generate multiple sequence alignments, which were the basis of finding the conserved regions of the protein sequences.The MEGA5 software package (www.megasoftware.net) was used to build the alignment.

T cell epitopes processing prediction
To evaluate the immunogenicity of the conserved peptide, different bio-computational tools were employed.In order to discover the immunogenicity of the conserved peptide, firstly a reverse immunogenic approach was employed for the selection of candidate epitopes.The T-cell epitopes from the conserved peptide were predicted using the NetCTL (http://tools.immuneepitope.org/stools/netchop/netchop.do) prediction method, which is based on neural network architecture and predicts the epitope candidates based on the processing of the peptides in vivo.The threshold was set at 0.50 during analysis.A combined algorithm integrating MHC class I binding, transporter of antigenic peptides (TAP) transport efficiency and proteosomal cleavage prediction was involved to predict a total/overall score.Based on this overall score the best epitope candidates were selected for further analysis.
The Immuno Epitope Database (IEDB) MHC-1 binding prediction tool made the option of using a different prediction method available.Stabilized Matrix Method-based (SMM) prediction was used to calculate IC50 values for peptides binding to specific MHC molecules.Note that all the alleles were selected and length was set at 9.0 prior to the prediction.For the selected epitopes a web based tool "Proteosomal cleavage/TAP transport/MHC class I combined predictor" was used to predict proteosomal scores, TAP score, processing score and MHC-1 binding score using the SMM based prediction method for each individual peptide.The parameters for immunogenicity detection (TAP score, proteosomal score, IC50 values) were normalized on a scale of 0 to 1 and were given a weighted score to prioritize the parameters in order to determine immunogenicity.The binding energies for HLAepitope interactions were predicted using the MHC-1 peptide binding energy predictor tool.

Epitope conservancy analysis
Epitope conservancy analysis was done with Epitope Conservancy Analysis of Immuno Epitope Database (http://tools.immuneepitope.org/tools/conservancy/iedb_input) by comparing the conservancy of predicted epitope with the sequence of Omp F.

Retrieving 3D structure of HLA
The 3D structure of HLA-C*12:03 (PDB ID: 1EFX) was downloaded from the Protein Data Bank Database (www.rcsb.org/) and visualized in the PYMOL molecular graphics system.Prior to docking, all the water molecules were removed from the 3D structure of epitope free HLA-C*12:03.

Designing the 3D epitope structure
For the docking study, the ESYTDMAPY epitope was chosen because it showed the maximum interactions with different HLAs.The 3D structure of ESYTDMAPY was designed using the PEP-FOLD server: De novo peptide structure prediction (http://bioserv.rpbs.univ-paris-diderot.fr/PEP-FOLD/).

HLA-epitope binding prediction
The AutoDock tool from the MGL software package (version 1.5.6) was employed for docking purpose.Both the protein [HLA-C*12:03] and ligand (epitope ESYTDMAPY) files were firstly converted to PDBQT format to use them for the docking study.The grid space for center was set manually to facilitate the binding of epitope at the binding groove of HLA-C*12:03.

Population coverage
Population coverage for individual epitope was predicted by the population coverage tool (http://tools.immuneepitope.org/ tools/population) from the IEDB analysis resource.The allele frequency of the interacting alleles was used to measure the population coverage for the corresponding epitope.In case of epitopes from Omp F, the maximum coverage was found for the population of Europe, South Africa and East Asia.

B-cell epitope prediction
Linear B-cell epitopes were predicted from the given protein sequence using the B-cell epitope prediction tool from the IEDB analysis resource (http://tools.immuneepitope.org/ tools/bcell/iedb_input).The Kolaskar and Tongaonkar antigenicity prediction method was employed for prediction in which, predictions are based on a table that reflects the occurrence of amino acid residues in experimentally known segmental epitopes.This method can predict antigenic peptides with approximately 75% accuracy.

Results: Identification of evolutionary conserved proteins/regions
From the output of multiple sequence alignment for Omp F protein, it has been found that the sequence is conserved over all of the sequences.But, it"s not true for Omp A and Omp C. That"s why we exclude those proteins from out study.

Immunogenicity of conserved peptides
The NetCTL prediction tool predicted 355 overlapping potential epitopes from the given sequence, but only 16 potential T-cell epitope candidates were chosen on the basis of a total score (above 0.5) that was combined result of different parameters Table 1 (see supplementary material).SMM based IEDB MHC-1 binding prediction tool retrieved 784 possible MHC-1 allele interactions with the 16 different T-cell epitope those were selected before.The MHC-1 alleles for which the epitopes showed higher affinity (IC50<100) were selected for further analysis Table 2 (see supplementary material).Among

Epitope conservancy analysis
Epitope conservancy analysis with all of the Omp F protein sequences from different strains revealed that all 14 out of 16 peptides have a 100% protein sequence match.The results are summarized in Table 4 (see supplementary material).

Molecular docking study of HLA-epitope interaction
AutoDock Vina predicted nine possible binding models.On the basis of higher binding energy with HLA-C*12:03, the best output model for ESYTDMAPY epitope was found to have a binding energy of -5.4 kcal/mol.The interacting and binding of HLA-epitope is illustrated in Figure 1.

Population coverage
The population coverage for different population is summarized in Table 5 (see supplementary material).And the graph shows that Europe, South Africa and East Asia have highest population coverage and it"s over 60% (Figure 2).

Prediction of B-cell epitopes
The antigenic determinant plot was presented in Figure 3; the x-axis shows sequence position and the y-axis shows antigenic propensity.Average antigenic propensity of this protein is 1.003.There are 13 antigenic determinants in the sequence and shown in Table 6 (see supplementary material).The average for the whole protein is above 1.0; all residues above 1.0 are potentially antigenic.

Conclusion:
In vitro and in vivo studies are required to determining the actual effectiveness of the peptide for mounting an immune response.Binding chip assay for HLA and peptide would also be useful to determine the binding affinity of the peptide as a whole.The best predicted T cell epitope (ESYTDMAPY) is nonamer and it covers 126 to 134 positions of amino acids.At the same time, there is a potential B-cell (YGIVYDVESY) epitope which covers 119 to 128 positions of amino acids.If we take peptide from 119 to 134 amino acid positions and add potential molecules to enhance immunogenicity, then it would be possible to design a universal epitope based vaccine against Salmonella typhi.

Figure 1 :
Figure 1: Docking to predict the binding of predicted epitopes to MHC class I molecule, HLA-C*12:03.Binding of "ESYTDMAPY" to the binding grooves of the retrieved structure of HLA-C*12:03 (binding energy: -5.4Kcal/mol).The yellow colored portion and green portion in both figure (A & B) represent HLA-C*12:03 molecule and ESYTDMAPY epitope, respectively.

Figure 2 :
Figure 2: Population coverage by MHC Class I + II restricted epitopes from outer membrane protein F (OmpF) of Salmonella typhi.In case of epitopes from Omp F, the maximum coverage was found for the population of Europe, South Africa and East Asia.

Figure 3 :
Figure 3: Linear B-cell epitope prediction using Kolaskar and Tongaonkar prediction tool of IEDB.The pick shows above the threshold corresponding to those sequence position which can act as B-cell epitope to elicit humoral immune response.

Table 1 :
Selected epitopes on the basis of total score (above 0.5) predicted by NetCTL.

Table 2 :
Allele selection for different peptide sequences.The proteasome complex contains enzymes that cleave peptide bonds, converting proteins into peptides.The antigenic peptides from proteasome cleavage associate with class I MHC molecules, and the peptide-MHC complexes are then transported to the cell membrane.TAP transports the peptide to the endoplasmic reticulum.

Table 3 :
Selected epitopes and the list of HLA molecules with those they interact.The highlighted epitope interacts with six HLA molecules and no other epitope interact with so many HLA molecules.

Table 5 :
Population coverage for best predicted epitope is summarized here.

Table 6 :
Potential linear peptides predicted to be antigenic determinants