A HLA-DRB supertype chart with potential overlapping peptide binding function.

HLA-DRB alleles are class II alleles associated with the CD4+ T-cell immune response. DRB alleles are polymorphic, and about 622 are currently named in the IMGT/HLA sequence database. Each allele binds short peptides with high sensitivity and specificity. However, it has been suggested that the majority of HLA alleles can be covered by a few HLA supertypes, where different members of a supertype bind similar peptides yet show distinct repertoires. Definition of DRB supertypes using binding data is limited to a few (about 29) known alleles (< 5% of all known DRB alleles). Hence, we describe a strategy using structurally defined virtual pockets to group all known DRB alleles with regard to their overlapping peptide binding specificity.


Background:
Class II human leukocyte antigen molecules (HLA II) are glycoproteins that bind various antigenic peptides processed by the endocytic pathway and present them to CD4+ T cells for immune response [1]. The antigen binding groove is made up of two domains (α1 and β1), one from the α chain and one from the β chain [2]. The amino acid residues lining these domains interact with antigenic peptide residues and form a stable HLA II-peptide complex, which is recognized by CD4+ T cells. Several autoimmune diseases (Goodpasture's syndrome, type 1 diabetes etc.) and parasitic diseases (malaria, filariasis etc.) are associated with CD4+ T cells involving HLA II molecules [3]. Class II HLA molecules are highly polymorphic, and 622 DRB alleles are listed in the IMGT/HLA database (release 2.22) [4]. This sequence polymorphism in class II is a challenge in the design of peptide-based vaccines directed against CD4+ T cell associated diseases. An ideal peptide-based vaccine is a cocktail of peptides with broad specificity across different ethnic groups. Thus, it is important to identify peptides with overlapping specificity to multiple alleles covering a wide range of ethnic diversity. It has been suggested that the majority of alleles can be covered by a few HLA supertypes, where different members of a supertype bind similar peptides yet exhibit distinct repertoires [5].
Sette and colleagues (1998) analysed DRB1*0101, DRB1*0401 and DRB1*0701 using a collection of 384 synthetic peptides with competitive binding assay data (IC50). Peptide binding data for these three alleles and nine other DRB alleles (DRB1*1501, DRB1*0405, DRB1*0802, DRB1*0901, DRB1*1101, DRB1*1201, DRB1*1302, DRB5*0101, DRB4*0101) were used to show overlapping peptide binding repertoires [6]. Maillere and colleagues (2002) described a new supertype for HLA DP4 alleles using DPA1*0103/DPB1*0401 (DP401) and DPA1*0103/DPB1*0402 (DP402) binding assay data (IC50 values as the measure of binding affinity) for various peptides derived from allergens, viruses, or tumor antigens [7]. Sette and colleagues (2002) showed that HLA-DQA1*0501/B*0201 (DQ2.3) and DQA1*0301/B*0302 (DQ3.2) share large overlapping peptide binding functions (again using IC50 values as the measure of binding affinity) [8]. They also showed the cross-reactivity of DR-restricted epitopes with DQ alleles. Thus, peptide binding assay data reveal a significant amount of functional overlap between different class II alleles at multiple levels. The number of class II alleles known and defined to date is more than 1000. Therefore, it is important to establish theoretical methods to group alleles exhibiting peptide binding functional overlap. Lund and colleagues (2004) developed specificity weight matrices using a Gibbs sampling algorithm to define nine DRB supertypes [9]. Flower and Doytchinova (2005) clustered class II HLA alleles into twelve supertypes (five DR, three DQ and four DP) based on hierarchical (using similarity fields generated by CoMSIA) and non-hierarchical (k-means) clustering [10]. Clustering of MHC peptide-binding repertoires was utilized elsewhere for supertype definition by Reche & Reinherz (2007) [11]. These results provide frameworks for understanding class II supertypes. The current count of class II alleles at IMGT/HLA is 1000. Therefore, it is important to develop novel methods to group class II alleles with overlapping peptide binding function.
Here, we describe a method based on virtual pockets defined from structural data to group HLA DRB alleles with overlapping peptide binding repertoires. We chose DRB alleles, which constitute 62% of the class II HLA alleles known to date, for this study.

Methodology:
DRB specific peptides from MHCBN database:
We retrieved 1580 DRB specific immunogenic (CD4+ cytotoxicity) peptides from MHCBN, a database of MHC binders and non-binders [12]. We then created a subset of 1064 DRB specific peptides that are documented to bind DRB alleles defined using sequence-based nomenclature. This subset corresponds to 37 DRB alleles. It should be noted that the peptides in this subset have both binding specificity and CD4+ cytotoxicity. The remaining 516 DRB specific peptides are documented to bind DRB alleles defined using serological methods, without sequence-level specificity. Hence, these peptides were excluded from further analysis.
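The split between sequence-based and serologically defined alleles can be illustrated with a minimal sketch. The naming pattern below (an optional "HLA-" prefix, a DRB locus, an asterisk and at least four digits) is an assumption made for illustration and may not match how MHCBN actually labels its entries; the peptide identifiers and records are toy values, not data from this study.

```python
import re

# Assumed pattern for sequence-based names such as "DRB1*0101" or "HLA-DRB5*0101".
# Serological names such as "DR4" do not match it.
SEQUENCE_BASED = re.compile(r"^(?:HLA-)?DRB[1-9]\*\d{4}")


def is_sequence_based(allele_name):
    """Return True if the allele name looks like a sequence-level DRB name."""
    return bool(SEQUENCE_BASED.match(allele_name.strip()))


def split_by_nomenclature(records):
    """Split (peptide, allele) records into sequence-level and serological lists."""
    sequence_level, serological = [], []
    for peptide, allele in records:
        target = sequence_level if is_sequence_based(allele) else serological
        target.append((peptide, allele))
    return sequence_level, serological


# Toy records with hypothetical peptide identifiers.
toy_records = [
    ("pep1", "DRB1*0101"),
    ("pep2", "HLA-DRB5*0101"),
    ("pep3", "DR4"),
]
kept, dropped = split_by_nomenclature(toy_records)
```

In the study itself, this kind of split yields 1064 retained and 516 excluded peptides out of 1580.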

DRB supertypes in MHCBN dataset:
The dataset of 1064 DRB specific peptides was further analyzed to identify functional peptide binding overlap between alleles. This exercise identified 145 peptides that bind to two or more DRB alleles and thus exhibit peptide binding functional overlap (Table 1 in supplementary material). The 145 peptides cover 29 DRB alleles. The grouping of DRB supertypes in MHCBN is illustrated in Figure 1.
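The overlap analysis amounts to collecting, for each peptide, the set of alleles it binds and keeping the peptides whose set has two or more members. The following is a minimal sketch of that idea; the function name, the record layout and the peptide identifiers are hypothetical and are not taken from the original analysis.

```python
from collections import defaultdict


def shared_peptides(records, min_alleles=2):
    """Map each peptide to its set of alleles; keep peptides bound by >= min_alleles."""
    alleles_by_peptide = defaultdict(set)
    for peptide, allele in records:
        alleles_by_peptide[peptide].add(allele)
    return {
        peptide: alleles
        for peptide, alleles in alleles_by_peptide.items()
        if len(alleles) >= min_alleles
    }


# Toy records with hypothetical peptide identifiers; duplicates are tolerated.
records = [
    ("pepA", "DRB1*1101"),
    ("pepA", "DRB1*1104"),
    ("pepB", "DRB1*0101"),
    ("pepC", "DRB1*1301"),
    ("pepC", "DRB1*1302"),
    ("pepC", "DRB1*1301"),
]
overlap = shared_peptides(records)

# Alleles covered by at least one shared peptide.
covered = set().union(*overlap.values())
```

Applied to the 1064-peptide dataset, the corresponding counts reported in the text are 145 shared peptides covering 29 alleles.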

DRB allele specific sequences from IMGT/HLA database:
We downloaded 622 DRB alleles (62% of known class II alleles) from IMGT/HLA for further analysis.

Peptide binding domain:
The peptide binding groove is formed by two domains, one each from the alpha and beta chains (Figure 2). The β1 domain (the first 90 residues at the N terminus) of the beta chain, which forms part of the peptide binding groove, is considered for further analysis.

Virtual pockets in DRB molecules:
The peptide binds to the DRB molecule through interactions between specific peptide residue positions and the pockets in the groove. These virtual binding pockets accommodate the side chains of the amino acid residues of antigenic peptides. Mohanapriya and colleagues (2009) defined nine virtual pockets using 15 non-redundant class II HLA-peptide crystal structures [13]. The HLA amino acid residues lying in these virtual binding pockets show polymorphism, which determines specificity and sensitivity. The residue positions forming the virtual pockets are called highly essential residue positions (HERP), as defined elsewhere by Mohanapriya and colleagues (2009) [13]. Twenty-five HERP are defined for the beta chain of class II molecules.
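Because the HERP are discontinuous positions in the β1 domain, the residues at those positions can be read off and concatenated into a short signature string for each allele. The sketch below illustrates this under stated assumptions: the three positions and the seven-residue sequence are toy values chosen for illustration only, since the 25 actual HERP are defined in [13] and are not listed here.

```python
def herp_signature(beta1_sequence, herp_positions):
    """Concatenate the residues found at the given 1-based positions."""
    if max(herp_positions) > len(beta1_sequence):
        raise ValueError("sequence is shorter than the highest HERP position")
    return "".join(beta1_sequence[pos - 1] for pos in herp_positions)


# Hypothetical positions and a toy sequence; NOT the real 25 HERP from [13].
TOY_POSITIONS = [2, 5, 7]
toy_signature = herp_signature("GDTRPRF", TOY_POSITIONS)
```

Positions are treated as 1-based here to follow the usual residue numbering convention; a real implementation would also need to handle alignment gaps and partial sequences.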

Grouping of HLA DRB alleles for overlapping peptide binding:
We extracted the residues corresponding to the 25 HERP in the 622 DRB alleles in IMGT/HLA. The sequence stretches formed by the discontinuous HERP were compared among the 622 alleles to cluster those having the same sequence stretch. We thus obtained 395 groups such that alleles within the same group share the HERP sequence stretch. Of these, 73 groups contain two or more alleles.

The binding groove in DRB molecules is formed by the alpha and beta chains (Figure 2). It accommodates peptides of length 12-35 [13]. Peptides bound to the groove have an extended conformation in class II, unlike class I. The sequence similarity between defined class II alleles is more than 70% and hence their structural similarity is high (Figure 5). The receptor backbone is highly similar and only the side-chain orientations vary. Therefore, peptide binding specificity is determined by the side chains, which are influenced by polymorphism of the MHC alleles. Mohanapriya and colleagues (2009) defined virtual pockets using HERP extracted from HLA-peptide structural complexes [13]. The hypothesis is that the 25 residues at the HERP forming the virtual pockets determine peptide binding and its specificity. The high degree of sequence homology between known DRB alleles, and hence their structural similarity, suggests that polymorphic residues at the virtual pockets determine peptide specificity (Figure 6).

We thus theoretically grouped the 622 known DRB alleles using virtual pockets defined from structural datasets. The grouping (Figure 4), using the procedure illustrated in Figure 3, produced 73 groups of at least two alleles each, together covering about 300 alleles. Thus, alleles within each of the 73 groups potentially exhibit overlapping peptide binding function. This grouping is supported by known peptides: 32 peptides bind both DRB1*1101 and DRB1*1104, and 6 peptides bind both DRB1*1301 and DRB1*1302, and each of these allele pairs falls within the same group in this study. The data presented here serve in general as a framework for understanding peptide binding overlap, and in particular for HLA-DRB supertype definition and grouping.

Each DRB allele contains nine virtual pockets by definition. Thus, the 622 DRB alleles in the dataset theoretically contain a pool of 5598 virtual pockets made of HERP residues. The current analysis shows that the 622 DRB alleles account for only 569 unique pockets (Figure 7). This is only about 10% of the theoretically possible virtual pockets, which suggests that the remaining 90% are repeats of pockets shared among alleles. Thus, the study demonstrates the possible degree of overlap within virtual pockets, and hence the potential functional overlap among DRB alleles. The described framework finds application in the design of epitopes with cross-reactivity across DRB specific ethnic populations for peptide vaccine development.
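The grouping and pocket-counting steps can be sketched as follows. This is an illustration under assumptions rather than the procedure actually used: alleles are grouped by identical HERP signature strings, and a "unique pocket" is taken to mean a distinct combination of pocket index and residues at that pocket's positions. The signatures, the pocket names and the pocket-to-position mapping are toy values.

```python
from collections import defaultdict


def group_by_signature(signatures):
    """Group allele names that share an identical HERP signature."""
    groups = defaultdict(list)
    for allele, signature in signatures.items():
        groups[signature].append(allele)
    return dict(groups)


def count_unique_pockets(signatures, pocket_indices):
    """Count distinct (pocket, residues) pairs across all alleles.

    pocket_indices maps a pocket name to 0-based indices into the signature.
    """
    seen = set()
    for signature in signatures.values():
        for pocket, indices in pocket_indices.items():
            residues = "".join(signature[i] for i in indices)
            seen.add((pocket, residues))
    return len(seen)


# Toy signatures and a hypothetical two-pocket layout.
toy_signatures = {"A1": "DPF", "A2": "DPF", "A3": "DPY", "A4": "EPY"}
toy_pockets = {"P1": [0, 1], "P4": [2]}

groups = group_by_signature(toy_signatures)
shared_groups = [members for members in groups.values() if len(members) >= 2]

unique = count_unique_pockets(toy_signatures, toy_pockets)
theoretical = len(toy_signatures) * len(toy_pockets)
fraction = unique / theoretical
```

The same arithmetic applies to the reported figures: 622 alleles with 9 pockets each give 622 × 9 = 5598 pockets, and 569 / 5598 is about 0.10. The group counts are also mutually consistent: 395 groups of which 73 have two or more members leaves 322 single-allele groups, so the 73 groups cover 622 − 322 = 300 alleles.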

Figure 5:
The sequence similarity between defined class II alleles is more than 70% and hence their structural similarity is high.

Figure 6:
The high degree of sequence homology between known DRB alleles and hence their structural similarity suggests the influence of polymorphic residues at the virtual pockets to determine peptide specificity. The average virtual pockets are generated using a dataset of 15 structures described elsewhere [13].

Figure 1: The grouping of DRB supertypes in MHCBN is illustrated.

Figure 2: The peptide binding groove is formed by two domains, one each from the alpha and beta chains.

Figure 3: The workflow for the identification of DRB alleles in IMGT/HLA with overlapping function is given.

Figure 4: The workflow for grouping DRB alleles is shown.

Table 3
The binding of peptides to HLA alleles is usually assessed using a competitive binding assay, with results reported as IC50 values. Thus, supertypes are defined using peptide binding IC50 values known for two or more alleles. We use the IC50 values known for two or more alleles that are available in the MHCBN database (a database of allele specific binders and non-binders). We extracted 1064 HLA-DRB (sequence-based definition) specific peptides from MHCBN. Of these, 145 peptides bind more than one allele, together covering 29 alleles. These alleles exhibit supertype-like function. This accounts for only about 4.7% (29 of 622) of the DRB alleles known to date. Therefore, it is important to develop framework charts to group defined DRB alleles into potential supertypes with overlapping peptide binding function. Here, we define a methodology for grouping DRB alleles from sequence data using virtual pockets.
Table 3 (see supplementary material) also shows 6 peptides (retrieved from MHCBN) binding to DRB1*1301 and DRB1*1302 with CD4+ cytotoxicity. Table 2 (see supplementary material) shows that these two alleles (DRB1*1301 and DRB1*1302) fall under the same category. Thus, the peptide data in Table 3 (see supplementary material) validate the theoretical supertype-like grouping in Table 2 (see supplementary material).

Discussion:
HLA-DRB alleles are associated with the CD4+ T-cell immune response. DRB alleles are polymorphic (sequence-level variation) in the population, and about 622 DRB alleles are named in the IMGT/HLA sequence database to date. Each of these alleles binds short peptide antigens with high sensitivity and specificity for the CD4+ T-cell immune response. However, it has been suggested that the majority of alleles can be covered by a few HLA supertypes, where different members of a supertype bind similar peptides yet exhibit distinct repertoires [6]. Peptide binding data for DRB1*0101, DRB1*0401, DRB1*0701 and nine other DRB alleles (DRB1*1501, DRB1*0405, DRB1*0802, DRB1*0901, DRB1*1101, DRB1*1201, DRB1*1302, DRB5*0101, DRB4*0101) were used to show overlapping peptide binding repertoires [6]. However, that study accounts for just 12 alleles, which is far fewer than the 622 known alleles. Flower and Doytchinova (2005) clustered 347 DRB alleles into five DR supertypes based on hierarchical (using similarity fields generated by CoMSIA) and non-hierarchical (k-means) clustering [10].