Implications from predicted B-cell and T-cell epitopes of Plasmodium falciparum merozoite proteins EBA175-RII and Rh5

The leading circumsporozoite protein (CSP)-based malaria vaccine, RTS,S, though promising, has shown limited efficacy in field studies. There is therefore still a need to identify other malaria vaccine targets. Merozoite antigens are potential vaccine candidates, since naturally acquired antibodies generated against them inhibit erythrocyte invasion and in some cases result in clinical protection from disease. We thus used in silico tools (BCPreds, NetMHCcons and NetMHCIIpan 3.0) to predict B-cell epitopes (BCEs) and T-cell epitopes (TCEs) in two merozoite invasion proteins, EBA175-RII and Rh5. Initially, we validated these tools using CSP to determine whether the algorithms could predict the epitopes in the RTS,S vaccine. In EBA175-RII, we prioritised three BCEs (15REKRKGMKWDCKKKNDRSNY34, 420SNRKLVGKINTNSNYVHRNKQ440 and 528WISKKKEEYNKQAKQYQEYQ547), a CD8+ epitope (553KMYSEFKSI561) and a CD4+ epitope (440QNDKLFRDEWWKVIKKD456). Three Rh5 epitopes were prioritised: a BCE (344SCYNNNFCNTNGIRYHYDEY363), a CD8+ epitope (198STYGKCIAV206) and a CD4+ epitope (180TFLDYYKHLSYNSIYHKSSTY200). All these epitopes lie in the regions involved in the proteins’ interactions with their erythrocyte receptors, which enable erythrocyte invasion. Therefore, once their immunogenicity has been validated by ELISA using serum from a malaria-endemic population, antibodies to these epitopes may inhibit erythrocyte invasion. All the epitopes we predicted in EBA175-RII and Rh5 are novel. We also identified polymorphic epitopes that may escape host immunity, as some variants were not predicted as epitopes, suggesting that they may not be immunogenic regions. We present a set of epitopes that, following in vitro validation, provide a set of molecules to screen as potential vaccine candidates.

©2016 Figure 1: Flowchart showing the epitope prediction pipeline starting with a validation of the algorithm using CSP as a control, followed by BCE and TCE predictions in EBA175-RII and Rh5.
The mechanism by which the merozoite selects and successfully invades an RBC is complex, involving various receptor-ligand interactions [13]. The Duffy binding ligands (DBLs) and reticulocyte binding-like homologues (Rhs), located in the micronemes and rhoptries, respectively, are two main families of proteins thought to play key roles in the invasion process [14]. DBL molecules are thought to be essential in the formation of the tight junction, which precedes entry into the RBCs [15]. The first merozoite ligand identified to bind to RBCs was erythrocyte binding antigen-175 (EBA175) [16]. EBA175 interacts with glycophorin A (GypA) on the RBC surface via its erythrocyte binding domain (EBD), or region II (RII). EBA175-RII is a target for invasion inhibitory antibodies [17][18][19][20][21] and the EBA175-GypA interaction is a major RBC invasion pathway [19]. It has also become a leading malaria vaccine candidate [22,23]; immunogenic epitopes within EBA175-RII should therefore be exploited as potential vaccine candidates.
The Rh family includes Rh1, 2a, 2b, 3, 4 and 5. Only the latter two have defined RBC receptors: complement receptor 1 [24] and basigin [25], respectively. Rh5 has recently become a leading malaria vaccine candidate [26], owing to evidence from previous studies that it has a limited number of single nucleotide polymorphisms (SNPs), with only five nonsynonymous SNPs [27]. There has been no demonstration of Rh5 allele-specific immunity [28]. Additionally, Rh5 antibodies were shown to inhibit RBC invasion [29][30][31] and have been associated with protection against malaria [32]. The crystal structures of both EBA175-RII (PDB code 1ZRL) [33] and Rh5 (PDB code 4U0Q) [34] have been published and the residues involved in binding to their respective RBC receptors identified. This makes these two proteins ideal candidates for the in silico discovery of vaccine targets.

Figure 2:
A schematic view of the EBA175-RII predicted epitopes mapped to the full EBA175-RII sequence, not drawn to scale. (A) EBA175-RII predicted BCEs. (B) EBA175-RII predicted CD8+ epitopes. (C) EBA175-RII predicted CD4+ epitopes. The numbers displayed above each epitope and separated by hyphens represent the amino acid regions that each epitope encompasses. The residues in bold and underlined represent polymorphic sites within the respective epitopes. Of the polymorphic epitopes, those marked with " represent the epitopes that were predicted. The epitope positions marked with * represent epitopes that overlap with residues involved in binding to GypA.

The aim of this study was to predict BCEs and TCEs in EBA175-RII and Rh5 and to map them back to their crystal structures to determine their location in the tertiary protein. We validated the prediction tools using immunogenic, in vitro verified circumsporozoite protein (CSP)  , to obtain unique haplotypes (Figure 1).

Validation of B-cell Epitope Prediction Algorithms
The selection of servers, epitope length and antigenic score cut-offs was based on previous studies. We used the BCPREDS server (http://ailab.ist.psu.edu/bcpreds/index.html) for the prediction of BCEs [40,41], selecting two algorithms, AAP and BCPred (Figure 1). The CSP sequence was submitted to the server with the default parameters and a BCE length of 20 residues (20mers) [42]. Predicted BCEs with an antigenic score of >0.8 were selected [40,41], and the CSP BCEs identified by both algorithms were pooled and clustered at 100% identity to exclude duplicates. The final predicted epitopes were then clustered at 50% identity with the in vitro verified NANP3 BCE to identify similar epitopes. This criterion lowers the stringency and identifies a larger number of epitopes, taking into account any limitations of the tools in predicting epitopes.
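The filtering steps described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the input format (peptide, score) and the example peptides are hypothetical, and the 50% identity clustering is approximated here by a simple ungapped comparison against NANP3 rather than by the clustering software actually used, which the text does not name.

```python
NANP3 = "NANPNANPNANP"  # in vitro verified CSP BCE quoted in the text


def select_bces(predictions, min_score=0.8):
    """Keep peptides scoring above min_score and drop exact duplicates
    (equivalent to clustering at 100% identity)."""
    seen, kept = set(), []
    for peptide, score in predictions:
        if score > min_score and peptide not in seen:
            seen.add(peptide)
            kept.append(peptide)
    return kept


def best_ungapped_identity(peptide, reference):
    """Highest fraction of reference residues matched at any ungapped offset.
    Assumption: a stand-in for identity clustering, not the published method."""
    best = 0
    for offset in range(-len(reference) + 1, len(peptide)):
        matches = sum(
            1
            for i, ref_aa in enumerate(reference)
            if 0 <= offset + i < len(peptide) and peptide[offset + i] == ref_aa
        )
        best = max(best, matches)
    return best / len(reference)


# Hypothetical pooled output of two algorithms for illustration only.
pooled = [
    ("NANPNANPNANPNANPNANP", 1.0),  # reported by the first algorithm
    ("NANPNANPNANPNANPNANP", 1.0),  # same peptide from the second algorithm
    ("ACDEFGHIKLMNPQRSTVWY", 0.9),  # unrelated toy peptide
    ("GGGGGGGGGGGGGGGGGGGG", 0.5),  # below the 0.8 cut-off
]
unique_bces = select_bces(pooled)
similar_to_nanp3 = [p for p in unique_bces if best_ungapped_identity(p, NANP3) >= 0.5]
```

With these toy inputs, the duplicate and the low-scoring peptide are dropped, and only the NANP repeat peptide passes the 50% similarity check.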

Figure 3:
A schematic view of the Rh5 predicted epitopes mapped to the full Rh5 sequence, not drawn to scale. (A) Rh5 predicted BCEs. (B) Rh5 predicted CD8+ epitopes. (C) Rh5 predicted CD4+ epitopes. The numbers displayed above each epitope and separated by hyphens represent the amino acid regions that each epitope encompasses. The residues in bold and underlined represent polymorphic sites within the respective epitopes. Of the polymorphic epitopes, those marked with " represent the epitopes that were predicted. The epitope positions marked with * represent epitopes that overlap with residues involved in binding to basigin.

Validation of T-cell Epitope Prediction Algorithms
We selected HLA alleles that were common globally, alleles from malaria-endemic areas and alleles associated with resistance to malaria infection (Table 1). The HLAs that confer protection against malaria were obtained from malaria-endemic regions in Africa and Asia, with the rationale that individuals expressing these alleles are likely to generate an immune response during an infection. We therefore selected 6 class I HLA alleles for cytotoxic T-cell lymphocytes ( We then determined whether these tools could predict the in vitro verified TCEs directly from the full CSP protein sequence (Figure 1). The CSP sequence was submitted to the NetMHCcons server with the default parameters, a peptide length of 8-11mers and the HLA class I alleles mentioned earlier. For NetMHCIIpan 3.0, the parameters for the CSP sequence were similar to those of NetMHCcons, except for the HLA class II alleles and the 15mer epitope length. We then determined how well the prediction algorithms identified the experimentally verified CSP TCEs, first by identifying promiscuous epitopes (those that bound to multiple HLA alleles) and then by clustering them against the experimentally verified CSP TCEs to identify overlaps at a threshold of 50%. This took into consideration the limited number of HLA alleles used and excluded any duplicated epitopes.
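As a rough illustration of the two TCE validation steps just described, the sketch below first collects promiscuous peptides and then scores positional overlap with a verified epitope. All names, allele labels and coordinates are hypothetical placeholders, and measuring the 50% threshold as positional overlap relative to the shorter epitope is an assumption; the text describes this step only as clustering at a 50% threshold.

```python
from collections import defaultdict


def promiscuous_epitopes(binders, min_alleles=2):
    """Return peptides predicted to bind at least min_alleles distinct HLA alleles."""
    alleles_by_peptide = defaultdict(set)
    for peptide, allele in binders:
        alleles_by_peptide[peptide].add(allele)
    return {
        peptide: sorted(alleles)
        for peptide, alleles in alleles_by_peptide.items()
        if len(alleles) >= min_alleles
    }


def overlap_fraction(predicted, verified):
    """Overlap between two inclusive (start, end) ranges,
    as a fraction of the shorter range (assumed definition)."""
    shared = min(predicted[1], verified[1]) - max(predicted[0], verified[0]) + 1
    shorter = min(predicted[1] - predicted[0], verified[1] - verified[0]) + 1
    return max(shared, 0) / shorter


# Hypothetical predictor output: (peptide identifier, HLA allele label).
binders = [
    ("pep_A", "allele_1"),
    ("pep_A", "allele_2"),
    ("pep_B", "allele_1"),
]
promiscuous = promiscuous_epitopes(binders)

# Hypothetical coordinates: a predicted 9mer against a verified 12mer.
passes_threshold = overlap_fraction((10, 18), (14, 25)) >= 0.5
```

Here `pep_A` is retained because it binds two alleles, and the predicted range shares 5 of its 9 positions with the verified one, which clears the 50% threshold under this definition.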

Figure 4:
The crystal structure of EBA175-RII showing (A) the overlap between the predicted CD8+ epitope (aa 553-561) and the glycan binding sites at residues Lys-553 and Met-554, (B) the overlap between the predicted CD4+ epitope (aa 440-456) and the glycan binding sites at residue Asp-442, (C) the overlap between the predicted BCE (aa 15-34) and the glycan binding sites at residues Lys-28, Asn-29, Arg-31, Ser-32 and Asn-33, (D) the overlap between the predicted BCE (aa 420-440) and the glycan binding sites at residue Lys-439 and (E) the overlap between the predicted BCE (aa 528-547) and the glycan binding sites at residues Gln-542 and Tyr-546.

EBA175-RII and Rh5 BCE and TCE predictions
The parameters and cut-offs that gave suitable results for CSP were used to predict BCEs and TCEs in both EBA175-RII and Rh5 (Figure 1). Since the epitopes were generated from the haplotype sequences of both EBA175-RII and Rh5, we aligned them to their respective 3D7 lab isolate sequences to identify the polymorphic epitopes. Multiple epitopes aligning to the same locus were treated as a single epitope. We reduced the number of predicted EBA175-RII and Rh5 epitopes for in vitro validation by clustering them at 100% identity to eliminate duplicates. We then identified epitopes in the regions that are involved in the EBA175-GypA and Rh5-basigin interactions. These epitopes were mapped onto the published Rh5 and EBA175-RII crystal structures using PyMOL version 1.7.2.1 to identify their locations in the folded protein.
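The locus merging and receptor-contact screening described above can be sketched as follows. This is an illustrative reading, not the authors' code: treating "the same locus" as overlapping residue ranges is an assumption, the function names are hypothetical, and the two overlapping 20mers are invented for the example. The GypA contact residues and the EBA175-RII epitope ranges are those listed in the Figure 4 caption.

```python
# GypA-contacting residues of EBA175-RII as listed in the Figure 4 caption.
GYPA_CONTACTS = {28, 29, 31, 32, 33, 439, 442, 542, 546, 553, 554}


def merge_same_locus(ranges):
    """Merge overlapping inclusive (start, end) ranges into single loci.
    Assumption: 'same locus' is interpreted as overlapping residue ranges."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def contact_residues_in(epitope, contacts):
    """List the receptor-contacting residue positions that fall inside an epitope."""
    start, end = epitope
    return sorted(pos for pos in contacts if start <= pos <= end)


# Hypothetical overlapping 20mers collapsing into one locus.
loci = merge_same_locus([(259, 278), (254, 273), (344, 363)])

# EBA175-RII epitope ranges quoted in the text, screened for GypA contacts.
epitopes = {"BCE 15-34": (15, 34), "BCE 528-547": (528, 547),
            "CD8+ 553-561": (553, 561), "CD4+ 38-55": (38, 55)}
contacts = {name: contact_residues_in(rng, GYPA_CONTACTS)
            for name, rng in epitopes.items()}
prioritised = [name for name, hits in contacts.items() if hits]
```

Under these assumptions the conserved CD4+ epitope 38-55 is not prioritised, because none of the listed contact residues fall within it.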

Results: Prediction of BCEs and TCEs in CSP
After clustering the 22 predicted BCEs, 18 unique epitopes remained (Table 2), of which 7 contained the CSP BCE, NANP3. Since all the predicted CSP BCEs had antigenic scores of 1, we used this value in our selection of EBA175-RII and Rh5 BCEs.

EBA175-RII Epitope Predictions
The twelve haplotypes for both EBA175-RII and Rh5 ( predicted as an epitope. All the variants were predicted as epitopes in the other polymorphic regions. The conserved epitope 553-561 overlapped with residues K553 and M554, which are involved in binding to GypA. Three EBA175-RII CD4+ epitopes were predicted (Figure 2C), of which 2 were conserved (38PDRRIQLCIVNLSIIKTY55 and 362DKNLLMIKEHILAIAIYE379) and 1 was polymorphic (440QNDKLFRDEWWKVIKKD456). Four variants, KA, EA, QE and KE at codons 440 and 448, respectively, were identified in the polymorphic epitope, and only the KA and EA variants were predicted as epitopes. This epitope also included residue D442, which interacts with GypA.

Rh5 Epitope Predictions
The 3 predicted Rh5 BCEs (Figure 3A), 40TLLPIKSTEEEKDDIKNGKD59, 254YDISEEIDDKSEETDDETEEVEDSI278 and 344SCYNNNFCNTNGIRYHYDEY363, were all conserved. Epitope 344-363 is in the region shown to interact with basigin, which includes residues F350, N352, N354, R357 and E362. Eleven Rh5 CD8+ epitopes were predicted (Figure 3B). Within the polymorphic epitope 77-112, codon 88 was a singleton SNP encoding either D or N, and only the N variant was predicted as an epitope. Both variants (codon S197Y) in the other polymorphic epitope were predicted, and this epitope also included residue Y200, which interacts with basigin.

Mapping of candidate epitopes to their respective crystal structures
For purposes of selecting candidate epitopes for in vitro validation, we considered the epitopes located in regions previously described as being involved in ligand-receptor interactions. We mapped these epitopes onto the protein tertiary structures to determine their spatial positioning within the erythrocyte binding domains. They included the EBA175-RII CD8+ epitope 553-561 (Figure 4A), the CD4+ epitope 440-456 (Figure 4B) and three EBA175-RII BCEs, 15-34, 420-440 and 528-547 (Figure 4C, 4D and 4E). The Rh5 epitopes included a CD8+ epitope 198-206 (Figure 5A), a CD4+ epitope 180-200 (Figure 5B) and a BCE 344-363 (Figure 5C).
(*) The underlined epitopes highlight the predicted peptides that contained the in vitro verified CSP epitope (NANPNANPNANP). The prediction scores ranged from 1 (most antigenic) to 0 (least antigenic).
was predicted with an antigenic score of 1, the highest possible score for a predicted epitope. This suggests that these epitopes are likely to be the most antigenic in comparison to all other predicted epitopes. We did not predict all the in vitro verified CSP TCEs, perhaps due to the limited panel of 15 class I and II HLA alleles selected. We also did not predict epitopes shown in previous studies to potently inhibit invasion.
We determined the impact of polymorphisms on epitope prediction in EBA175-RII and Rh5. Fewer BCEs and TCEs were predicted in the polymorphic regions than in the conserved regions, and some variants were not predicted as epitopes. For instance, the polymorphic codons 147 and 148 in the Rh5 CD8+ polymorphic epitope 144FLQYHFKEL152 consisted of three variants, YH, YD and HD, and the YD and HD variants were not predicted as epitopes. It appears that, in this in silico analysis, particular amino acid combinations escape prediction as immunogenic epitopes.
The polymorphisms in P. falciparum merozoite antigens are thought to be the result of immune selection, allowing the parasites to escape detection by host immune responses. In natural infections, immune escape has been demonstrated in the polymorphic antigens MSP2 and apical membrane antigen 1 (AMA1) as allele-specific immunity [47,48]. Consequently, immune responses generated to one allele of AMA1 or MSP2 protect only against that allele and not against a different one. Perhaps in silico tools could indicate potential variant epitopes that may escape immunity. Allele-specific immunity has not been described for either EBA175-RII or Rh5, and more recently a study by Gandhi et al. (2014) [49] found no evidence of allele-specific immunity in CSP. Nevertheless, we hypothesize that the polymorphisms in these antigens may be driven by host immunity, resulting in allele-specific immunity, escape from immune detection, or a redirection of the immune response away from important functional regions, such as those that allow the antigen to bind the RBC receptor. In the case of Rh5, it appears that the polymorphic codons 147 and 148 fall outside the region required for the interaction with basigin. In vitro validation is required to test these assumptions.
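To make the Rh5 example concrete, the sketch below builds the three variant peptides of the CD8+ epitope 144-152 from the codon 147 and 148 variants named in the text. The helper name is hypothetical, and the sketch only constructs the peptides; it does not call NetMHCcons or reproduce any prediction result.

```python
def apply_variant(reference, start, substitutions):
    """Return the epitope with residues replaced at absolute sequence positions.
    `start` is the sequence position of the first residue of `reference`."""
    residues = list(reference)
    for position, amino_acid in substitutions.items():
        residues[position - start] = amino_acid
    return "".join(residues)


REFERENCE = "FLQYHFKEL"  # Rh5 CD8+ epitope 144-152 as quoted in the text
START = 144

# Variant names give the residues at codons 147 and 148, in that order.
variant_peptides = {
    name: apply_variant(REFERENCE, START, {147: name[0], 148: name[1]})
    for name in ("YH", "YD", "HD")
}
```

Each resulting 9mer could then be submitted separately to the predictor to see which variants are, or are not, called as epitopes.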

Open access
All BCEs and TCEs predicted for EBA175-RII and Rh5, both polymorphic and conserved, are novel. However, to prioritise epitopes for in vitro validation, we focused on epitopes that would interfere with the functional roles of EBA175-RII and Rh5 in erythrocyte invasion. We reasoned that if we target the regions of the proteins that mediate ligand-receptor interactions, these molecules, if immunogenic, may be effective in preventing parasite invasion and ultimately malaria pathology.
We prioritized 8 epitopes for in vitro validation: three EBA175-RII BCEs (15-34, 420-440 and 528-547), an EBA175-RII CD8+ epitope (553-561), an EBA175-RII CD4+ epitope (440-456), an Rh5 BCE (344-363), an Rh5 CD8+ epitope (198-206) and an Rh5 CD4+ epitope (180-200). These epitopes cover both conserved and polymorphic regions, since we recognize that a combination of both regions is likely to be more effective in inhibiting RBC invasion. We recommend these epitopes for in vitro validation, by testing their immunogenicity using sera from a malaria-endemic population. The TCEs are of particular interest since, to the best of our knowledge, no study has evaluated T-cell responses to Rh5, and only Malhotra et al. (2005) [50] have evaluated T-cell responses to EBA175-RII, but the epitopes were not mapped.

Conclusion:
The BCE and TCE prediction algorithms resulted in multiple putative epitopes. This can be attributed to a lack of sufficient training data to further benchmark these tools and improve their performance. It also highlights the need to couple the use of in silico epitope prediction tools with in vitro validation of predicted epitopes to improve the accuracy of the pipeline and provide the training data required. Nonetheless, in silico tools provide a quick way to identify potential vaccine targets that can then be screened in vitro to determine their immunogenicity and viability as possible malaria vaccine candidates.