A comprehensive analysis of predicted HLA binding peptides of JE viral proteins specific to north Indian isolates

Japanese encephalitis (JE), a viral disease, has increased significantly worldwide, especially in developing regions, owing to challenges in immunization and vector control and the lack of appropriate treatment methods. An effective but expensive heat-killed vaccine is available for the disease. Therefore, the design and development of short peptide vaccine candidates is promising. We used immunoinformatics methods to perform a comprehensive analysis of the entire JEV proteome of a north Indian isolate and to identify conserved peptides that bind known specific HLA alleles among the documented JEV genotypes 1, 2, 3, 4 and 5. The prediction analysis identified 102 class I (using Propred I) and 118 class II (using Propred) binding peptides at a 4% threshold value. These predicted HLA allele binding peptides were further analyzed for conserved regions using IEDB (the Immune Epitope Database and Analysis Resource). This analysis shows that 78.81% of the class II binding peptides are conserved in genotype 2 and 76.47% of the class I binding peptides are conserved in genotype 3. The peptides IPIVSVASL, KGAQRLAAL and LAVFLICVL, and the peptides FRTLFGGMS and VFLICVLTV, are top ranking with potential super antigenic property, binding to all HLA allele members of the B7 and DR4 supertypes, respectively. These data find application in the design and development of short peptide vaccine candidates and diagnostic agents for JE following adequate validation and verification.

of India in 2005, which affected 5,737 people and caused 1,344 deaths [6]. JE is characterized by several primary and secondary clinical symptoms, such as inflammation of the brain membranes and continuous fever, which cause irreversible neuronal damage, as well as psychiatric and neurological disorders with limb paralysis [4,5,7]. The JE virus has five known genotypes, which are distributed across various geographical areas worldwide (Table 1, see supplementary material) [8,9,10,11,12].
The Nakayama JE virus strain, which belongs to genotype III, is widely used in JE vaccines. Genotype III is the most widely distributed genotype and is the only genotype isolated from the Indian subcontinent. Furthermore, the JE disease burden is increasing steadily in developing countries owing to the impracticality of immunization and vector control methods and the lack of therapeutic treatment [2,13,14]. As a result, vaccination is the only way to prevent JE [15]. At present, a number of vaccines have been developed in several countries, but the inactivated mouse-brain-derived Nakayama strain vaccine is the most widely used commercial vaccine [16,17]. More recently, the Nakayama strain vaccine has been replaced by a Vero-cell-derived JE vaccine (IXIARO), which can effectively boost immunity [18,19]. This vaccine has various drawbacks, such as production shortages, high cost and neurological adverse effects, which, especially in low-income countries, increase the disease burden of JE over time [17,20,21,22].
Compared with killed, attenuated and cell-culture-derived vaccines, an epitope vaccine is reported to be more potent, to give better immunity and to be devoid of the adverse effects of entire viral proteins [23,24]. The majority of currently available vaccines involve only structural proteins, but nonstructural proteins cannot be ignored. As reported earlier, nonstructural proteins are produced by the live virus, elicit a good immune response [25] and can act as a major target for human anti-JEV-specific T cells produced during natural infection [26,27].
The development of epitope-based vaccines generally requires knowledge of the adaptive immune system. TH cells and TC cells recognize antigen when it is bound to MHC class II and class I molecules, respectively [28,29]. The Major Histocompatibility Complex (MHC), known as Human Leukocyte Antigen (HLA) in humans, is a membrane glycoprotein and is extremely polymorphic. HLA molecules can bind a spectrum of linear antigenic epitopes derived from antigen processing, which initiates an immune response, although HLA binding alone does not guarantee a T-cell immune response. Peptide binding specificity varies among HLA alleles in a combinatorial manner across ethnic populations. It has been reported that the majority of alleles can be covered by a few HLA supertypes, where different members of a supertype bind similar peptides; these similar peptides are called super antigens. Recently, nine major HLA class I supertypes (HLA-A1, A2, A3, A24, B7, B27, B44, B58 and B62) and seven HLA class II supertypes (main DR, DR4, DRB3, main DQ, DQ7, main DP and DP2) have been determined by comparing peptide-binding data [30,31]. Conserved peptides that exhibit super antigenic property by binding to a maximum number of HLA alleles or HLA supertypes can surmount the problem of HLA allele population coverage and reduce the chance of antigen escape related to antigenic drift or shift. Therefore, the present study was designed as a comprehensive analysis of predicted HLA binding peptides of JE viral proteins specific to north Indian isolates.

Methodology:
The complete genome and protein sequences of a JEV strain of north Indian origin (Accession No. ABU94628) were obtained from the NCBI sequence database (http://www.ncbi.nlm.nih.gov/entrez). The DDBJ database was used to calculate the number of adenine, cytosine, guanine and thymine bases in the genome. The physicochemical properties of all viral proteins, such as the frequency of amino acids within each protein and the molecular weight, together with the pI values of the predicted epitopes, were analyzed using the ExPASy proteomics analysis platform [32]. In addition, the variation and conservation of envelope protein residues across all five genotypes were assessed using a protein variability server at a Simpson diversity threshold of 0.46. The envelope protein of the SA14-14-2 strain (PDB ID: 3P54) was taken as the base structure on which to map the variable and conserved regions in genotypes 1, 2, 3, 4 and 5 [33]. A flowchart of the methodology is shown in Figure 1.
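The base-counting step can be illustrated with a minimal Python sketch. It is not the DDBJ tool used in the study; the function name base_composition and the short input string are assumptions made for illustration only, and the toy string is not the JEV genome.

```python
from collections import Counter

def base_composition(seq: str) -> dict:
    """Count A, C, G and T and report GC and AT percentages."""
    seq = seq.upper().replace("U", "T")  # treat RNA as its DNA equivalent
    counts = Counter(b for b in seq if b in "ACGT")  # ignore ambiguity codes
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = 100.0 * (counts["G"] + counts["C"]) / total
    return {
        "A": counts["A"], "C": counts["C"],
        "G": counts["G"], "T": counts["T"],
        "GC_percent": gc,
        "AT_percent": 100.0 - gc,
    }

toy = "AGAAGTTTATCTGTGTGAACTT"  # invented fragment, for illustration only
print(base_composition(toy))
```

Applied to a full genome sequence, the difference between GC_percent and AT_percent gives the kind of GC excess reported in the results.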

Screening of T cell epitopes
All structural and non-structural proteins of JEV (Accession No. ABU94628) were screened for possible dominant T cell epitopes using the immunoinformatics tools Propred I and Propred. Propred I and Propred were used at a 4% threshold to analyze the binding of all possible peptides to 47 class I and 51 class II HLA alleles, respectively. These tools are highly valuable for recognizing antigenic HLA binding peptides [34,35].
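The general idea of matrix-based nonamer screening can be sketched as follows. This is a simplified illustration, not the Propred I or Propred algorithm: the scoring matrix, the protein string and the helper names (nonamers, matrix_score, top_fraction) are all hypothetical, and the 4% cut-off is applied here to the windows of the input protein, whereas the actual servers use their own allele-specific matrices and cut-off calibration.

```python
import math

def nonamers(protein: str, k: int = 9):
    """Yield (1-based start, peptide) for every overlapping k-mer."""
    for i in range(len(protein) - k + 1):
        yield i + 1, protein[i:i + k]

def matrix_score(peptide: str, matrix: dict) -> float:
    """Sum per-position residue weights; unlisted residues score 0."""
    return sum(matrix.get((pos, aa), 0.0)
               for pos, aa in enumerate(peptide, start=1))

def top_fraction(protein: str, matrix: dict, fraction: float = 0.04):
    """Return the best-scoring windows, keeping ceil(fraction * n) of them."""
    scored = [(matrix_score(p, matrix), start, p)
              for start, p in nonamers(protein)]
    scored.sort(key=lambda t: (-t[0], t[1]))  # best score first, then position
    keep = max(1, math.ceil(fraction * len(scored))) if scored else 0
    return scored[:keep]

# Hypothetical weights: favour L at position 2 and V at position 9.
toy_matrix = {(2, "L"): 2.0, (9, "V"): 1.5}
toy_protein = "MKLAGTWSDVRLNPEQHVAC"  # invented sequence, not a JEV protein
for score, start, pep in top_fraction(toy_protein, toy_matrix):
    print(start, pep, score)
```

In practice one such matrix would be needed per HLA allele, and a peptide's allele coverage is the number of alleles for which it passes the cut-off.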

Worldwide conserved region study of predicted T cell epitopes
All predicted T cell epitopes of the JEV north Indian strain were subjected to a worldwide conserved region study among JEV genotypes 1, 2, 3, 4 and 5. This study first requires the protein sequences of all genotypes. Therefore, a maximum of 5 sequences of each JEV protein were retrieved at random from the NCBI database for each of genotypes 1, 2, 3, 4 and 5.
The predicted T cell epitopes of each protein of the JEV strain, along with 1 to 5 sequences of the same protein from a single genotype, were submitted to the Immune Epitope Database and Analysis Resource (IEDB) conservancy tool [36]. This cycle was repeated for all five genotypes and for all proteins of the JEV strain.
Nonamer T cell epitopes showing 70-100% conservation with at most one or two mutations were selected, whereas epitopes showing less than 70% conservation with more than two mutations were discarded. After the conserved region analysis, the isoelectric point (pI) of the predicted peptides was calculated for all mutated and conserved epitopes [37].
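The conservancy calculation can be sketched in Python as below. This is an illustration of the idea rather than the IEDB tool itself: the function names are hypothetical, the flanking residues in the example sequences are invented, and treating "at most two mutations" in a nonamer as an identity of at least 7/9 is one possible reading of the selection rule.

```python
def best_identity(peptide: str, sequence: str) -> float:
    """Highest fraction of identical positions over all ungapped windows."""
    k = len(peptide)
    best = 0.0
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        matches = sum(a == b for a, b in zip(peptide, window))
        best = max(best, matches / k)
    return best

def conservancy(peptide, sequences, min_identity=1.0):
    """Percent of sequences containing the peptide at >= min_identity."""
    if not sequences:
        raise ValueError("no sequences supplied")
    hits = sum(best_identity(peptide, s) >= min_identity for s in sequences)
    return 100.0 * hits / len(sequences)

# Epitope taken from the text; the three sequences are invented examples.
peptide = "LVTVNPFVA"
toy_sequences = ["GGLVTVNPFVAKK", "GGLVTINPFVAKK", "GGAAAAAAAAAKK"]
print(conservancy(peptide, toy_sequences))          # exact matches only
print(conservancy(peptide, toy_sequences, 7 / 9))   # up to two substitutions
```

Running the same call once per genotype, with that genotype's retrieved sequences, gives a per-genotype conservancy value for each epitope.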

Results & Discussion:
The results indicate that the JEV genome comprises 10976 base pairs with a GC content of 51.35%, which is 2.7% higher than the AT content. The genome is translated into a polyprotein that is subsequently cleaved into structural and non-structural proteins. The structural envelope protein has the highest molecular weight, 52975.81 Da. The physicochemical properties of all proteins, such as molecular weight, number of amino acids and amino acid frequency, are listed in Table 2 (see supplementary material). The frequency of amino acids in a protein is directly associated with the pI value and with binding to HLA alleles. In the analysis of envelope protein variability among all genotypes, amino acid positions 129, 222, 327 and 369 showed high Simpson variability (Figure 2), which is also shown on the 3D mapped structure (Figure 3).
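How a per-position Simpson diversity value can flag variable residues is sketched below. The sketch uses the Gini-Simpson form 1 - sum(p_i^2), which is an assumption, since the exact formulation used by the variability server is not stated here; the alignment is a toy example and the function names are hypothetical. Only the 0.46 threshold comes from the methodology.

```python
from collections import Counter

def simpson_diversity(column: str) -> float:
    """Gini-Simpson index 1 - sum(p_i^2) for one alignment column."""
    counts = Counter(column)
    n = len(column)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def variable_positions(aligned, threshold=0.46):
    """Return 1-based positions whose diversity meets the threshold."""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to equal length")
    return [i + 1 for i in range(length)
            if simpson_diversity("".join(s[i] for s in aligned)) >= threshold]

# Toy alignment of five invented 4-residue sequences.
toy_alignment = ["MKTA", "MRTA", "MKSA", "MRTA", "MKTA"]
print(variable_positions(toy_alignment))
```

In the toy alignment only the second column, with a 3:2 split between two residues, exceeds the threshold; a column with a single minority residue out of five does not.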
In total, 118 HLA class II binding T cell epitopes were identified with the Propred tool (Table 3, see supplementary material). The envelope protein contributed the highest number of T cell epitopes, comprising 28.81% of all predicted HLA class II epitopes. Predicted envelope protein epitopes such as LVTVNPFVA, VGRLVTVNP, FRTLFGGMS, LKGAQRLAA and FNSIGKAVH were found to be potential binders of 20-50 HLA II alleles.
For HLA class I, a total of 102 binding T cell epitopes were identified using Propred I (Table 3). Again, the envelope protein contributed the highest number of T cell epitopes, comprising 23.51% of all predicted HLA I epitopes.
In the conserved region analysis of the 118 predicted HLA class II binding epitopes, 29 epitopes were found to be 100% conserved in all genotypes. The 118 predicted HLA II peptides were 72% conserved in genotype I, 78.81% in genotype II, 75% in genotype III, 54% in genotype IV and 39.83% in genotype V (Figure 4). Predicted HLA II binding epitopes were thus most highly conserved in genotype II (78.81%). Similarly, in the conserved region analysis of the 102 predicted HLA class I binding epitopes, 21 epitopes were found to be 100% conserved in all genotypes. The 102 predicted HLA I peptides were 70.58% conserved in genotype I, 75.49% in genotype II, 76.47% in genotype III, 62.47% in genotype IV and 52% in genotype V (Figure 4). Predicted HLA I binding epitopes were thus most highly conserved in genotype III (76.47%). The epitopes LMTINNTDI, MINIEASQL, LVTVNPFVA, IPIVSVASL and VLTLATFFL were found to be common binders of HLA class I and II alleles. The LVTVNPFVA epitope of the envelope protein was the best binder in terms of HLA allele coverage and was 100% conserved in all genotypes.
As discussed earlier, the concept of the HLA supertype has a profound role in the understanding of T cell epitope selection, degeneracy and discrimination during the T cell mediated immune response [30]. In the HLA supertype analysis, the IEDB web server was also used to check binding of the best epitopes to HLA alleles that are not included in the Propred server. For example, DR4 HLA II supertype members such as DRB1*0401, 0405 and 0802 are not available in the Propred server. The findings revealed that the epitopes LVTVNPFVA, IPIVSVASL, KGAQRLAAL and LAVFLICVL bind to all members of the B7 HLA I supertype (B*0702, B*3501, B*5101, B*5102, B*5301, B*5401), whereas they bind selectively to some, but not all, members of the other HLA I supertypes. The epitopes FRTLFGGMS and VFLICVLTV bound to all members of the DR4 HLA II supertype (DRB1*0401, 0405 and 0802) but not to all members of the other HLA II supertypes. Therefore, the epitopes LVTVNPFVA, IPIVSVASL, KGAQRLAAL, LAVFLICVL, FRTLFGGMS and VFLICVLTV also show super antigenic property. These predicted, potentially novel epitopes may be sufficient to serve as vaccine candidates in place of whole proteins, because it has been reported that a few epitopes can represent the complete antigenicity of a protein [23]. As in this study, epitope-based vaccines have given promising results against several highly infectious diseases such as H1N1, HIV and tuberculosis [24,38,39]. Thus, in the present study, the Propred I and Propred servers were used to screen for the best T cell epitopes from the proteome of the JEV north Indian isolate, followed by worldwide conserved region analysis in all genotypes (1, 2, 3, 4 and 5). The predicted epitopes were nonamers and could be used as vaccine candidates and diagnostic reagents for JE.
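The supertype check described above amounts to asking whether the set of alleles a peptide binds contains every member of a supertype. A minimal Python sketch follows; the member lists are those quoted in the text, while the function name, the placeholder peptide labels and their bound-allele sets are hypothetical and do not represent predicted binding data.

```python
# Supertype members as listed in the text.
SUPERTYPE_MEMBERS = {
    "B7": {"B*0702", "B*3501", "B*5101", "B*5102", "B*5301", "B*5401"},
    "DR4": {"DRB1*0401", "DRB1*0405", "DRB1*0802"},
}

def supertype_coverage(bound_alleles, supertype):
    """Return (members bound, members total, binds all members?)."""
    members = SUPERTYPE_MEMBERS[supertype]
    covered = members & set(bound_alleles)
    return len(covered), len(members), covered == members

# Hypothetical binding sets for two placeholder peptides.
toy_binding = {
    "PEPTIDE_X": SUPERTYPE_MEMBERS["B7"] | {"A*0201"},
    "PEPTIDE_Y": {"DRB1*0401", "DRB1*0802"},
}
for name, alleles in toy_binding.items():
    for supertype in SUPERTYPE_MEMBERS:
        print(name, supertype, supertype_coverage(alleles, supertype))
```

Under this definition a peptide qualifies for a supertype only when the third value is True; partial coverage corresponds to the selective binding noted for the other supertypes.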

Conclusion:
The design and development of HLA specific short peptide vaccine candidates is necessary. We document the identification of class I and class II HLA specific JE viral peptides at a 4% threshold value using Propred I and Propred, respectively. We report the presence of 29 class II and 21 class I specific peptides that are conserved in all known genotypes. The predicted HLA specific peptides are seen to be highly conserved in genotypes 2 and 3, and less so in genotypes 1, 4 and 5. We further found that the peptides IPIVSVASL, KGAQRLAAL and LAVFLICVL, and the peptides FRTLFGGMS and VFLICVLTV, are top ranking with potential super antigenic property, binding to all HLA allele members of the B7 and DR4 supertypes, respectively. These data find application in the design and development of short peptide vaccine candidates and diagnostic agents for JE following adequate validation and verification.