Structure guided homology model based design and engineering of mouse antibodies for humanization

No universal strategy exists for humanizing mouse antibodies, and most approaches are based on primary sequence alignment and grafting. Although this strategy theoretically decreases the immunogenicity of mouse antibodies, it neither addresses conformational changes nor steric clashes that arise due to grafting of human germline frameworks to accommodate mouse CDR regions. To address these issues, we created and tested a structure-based biologic design approach using a de novo homology model to aid in the humanization of 17 unique mouse antibodies. Our approach included building a structure-based de novo homology model from the primary mouse antibody sequence, mutation of the mouse framework residues to the closest human germline sequence and energy minimization by simulated annealing on the humanized homology model. Certain residues displayed force field errors and revealed steric clashes upon closer examination. Therefore, further mutations were introduced to rationally correct these errors. In conclusion, use of de novo antibody homology modeling together with simulated annealing improved the ability to predict conformational and steric clashes that may arise due to conversion of a mouse antibody into the humanized form and would prevent its neutralization when administered in vivo. This design provides a robust path towards the development of a universal strategy for humanization of mouse antibodies using computationally derived antibody homologous structures.


Background:
Antibodies render higher specificity than small molecules (drugs) against a given biological target.The mouse is the most favored model system for producing specific antibodies to an antigen of interest.Mouse monoclonal antibodies (mAb) are routinely used not only for diagnostic purposes but also as therapeutic agents.However, administration of mouse mAbs in patients can lead to allergic reactions against the therapeutic antibody, otherwise known as the human anti-mouse antibody (HAMA) response [1].To address this problem, chimeric antibodies have been designed, which could lessen the HAMA response but not eliminate it [2].Further engineering was undertaken to replace the mouse scaffold with that of human origin by transplanting mouse complementarity determining regions (CDRs) onto a fully human scaffold [3].As the CDR paratope usually binds and engages the antigen epitope, the mouse CDRs in the humanized antibody are minimally exposed to the human immune system and, in theory, should be non-allergenic in nature.
Humanization of mouse antibodies refers to the replacement of mouse framework regions with similar human germline framework regions based on amino acid sequence similarity.The mouse scaffold is replaced with a human scaffold primarily to reduce immunogenicity of the mouse antibody and prevent its neutralization when administered to humans.This process renders the humanized antibody with a greater therapeutic index than its mouse counterpart.Humanization of mouse antibodies has been shown to prolong the serum half-life, improve interactions with effector cells and reduce immunogenicity.The rat antibody CAMPATH-1G, upon humanization (CAMPATH-1H), demonstrated not only a longer half-life but also improved therapeutic effects [4].
Other strategies that have been used to minimize immunogenicity of the humanized antibody include veneering and super-humanization.In veneering (re-surfacing), the mouse mAb sequence in the crystal structure or homology model is examined for surface exposure or solvent accessibility, and the protruding residues (e.g., lysine) are mutated to human germline residues [5].This process drastically reduces the immunogenic potential of the humanized antibody.In addition, certain mouse CDR residues have been further eliminated via the super-humanization technique, which improved the therapeutic value of the antibodies [6-8].
Current approaches for antibody humanization involves replacement of numerous mouse residues in the scaffold, as well as CDR regions, with human antibody-derived residues, but the consequences is usually a reduction or in some cases complete loss of affinity towards the antigen.Therefore, these methods are not viable options for the conversion of mouse antibodies to the humanized form, and further optimizations are needed to improve binding kinetics of humanized antibodies to match their original parental mouse monoclonal antibodies.Indeed, antibody humanization is an iterative process that requires engineering at all stages of development to produce a successful therapeutic agent.
As an alternative to antibody humanization based on the linear sequence, an approach relying on the 3D structure and conformation provides an ideal platform to address the above concerns in a rational manner.X-ray crystallography and nuclear magnetic resonance (NMR) imaging are currently the only tools that can deduce the 3D structure of antibodies at an atomic level.However, these techniques require elaborate infrastructure and time.The rapid pace of deposition of antibody structures in the Protein Data Bank (PDB) has helped in the creation of various structure based molecular prediction programs for biologics.
De novo antibody structure prediction via homology model building is being used currently for antibody design, engineering and humanization to reduce immunogenicity and restore affinities similar to those of parental mouse antibodies.This process entails PDB searches, simultaneously carried out for both frameworks (scaffold) as well as CDRs for light and heavy chains, for the most homologous 3D antibody structure to the query sequence and results in the creation of a structurebased homology model from the primary sequence of the mouse antibody.These approaches tend to save time from the computational prediction to experimental validation stages.To validate the applicability of structure-based biologic design as a universal strategy for humanization of mouse antibodies, we applied our humanization strategy to 17 unique mouse antibodies.In addition, this study highlights the importance of conformational folding for antibody design given the limitations of the linear sequence method.A threshold filter was placed to consider only mouse antibody structures released in the PDB since 2010 in order to prevent redundancy with previously published studies.Importantly, no benchmark studies on antibody structure predictions and homology model building as a platform for antibody design for the purpose of humanization of mouse antibodies have been reported since that period.
This study involved creation of an antibody homology model from mouse antibody primary sequences and subsequently introduced mutations to match the most highly similar human germline gene sequence.Furthermore, a surface accessibility screen was performed to locate conformationally exposed residues, and they were mutated to minimize or eliminate potential immunogenicity.This humanized model was then subjected to simulated annealing (energy minimization).In order to synchronize the structural disparity between the human scaffold with mouse CDRs, simulated annealing was performed to energetically minimize this hybrid structure.This procedure allowed the homology model to fold systematically and mimic the most favorable native conformation state.Force field errors resulting from this simulation were then observed for further analysis and optimization.Therefore, this study extends our knowledge of antibody design for purposes of converting mouse antibodies to fully accommodate a human germline scaffold for therapeutic drug development.It also demonstrates the advantages of coupling structure-based antibody design with simulated annealing (energy minimizations) for the deduction of important conformational residues required for proper antibody folding, function and affinity.

Methodology: Mouse antibody sequences and structures
Mouse Fab (fragment antigen binding) antibody structures were searched in the Protein Data Bank (PDB).In order to prevent any redundancy in antibody modeling with earlier studies, only mouse antibodies released since 2010 were taken for further evaluation.Out of 104 mouse antibody-antigen crystal structures, 17 unique unbound mouse antibodies were chosen randomly for the final study.

Prediction of Immunoglobulin Structures (PIGS)
Each mouse antibody sequence obtained from the PDB as a FASTA file was truncated to contain only the variable heavy and light chain sequences (Fv, fragment variable format) and submitted to the PIGS server (http: // www.biocomputing.it/pigs/) via single sequence submission.Default server settings were used for template selection and generation.Top scoring heavy and light chain templates were chosen from 20 templates displayed.These mouse homology models were then subjected to a surface accessibility screen set at a threshold of 30% using Swiss-PdbViewer to determine the location of various surface accessible residues, which may be immunogenic in nature [12].

Rosetta antibody modeling (RAM)
Mouse FASTA sequences for the heavy chain and light chain were submitted in the Fv format to the Rosetta Online Server that Includes Everyone (ROSIE) antibody modeling server (http://antibody.graylab.jhu.edu)[13].The first antibody model was selected among the top ten scoring antibody models based on their energy minimization scores.Root mean square deviation (RMSD) scores were calculated using the PyMOL built-in combinatorial extension (CE) module alignment tool to gauge the validity and model prediction properties as shown in Supplementary Table 1 (see supplementary material).In addition, PDBsum structural analysis with PROCHECK and Verify3D programs were used to validate the homology models.All homology models are available upon request.

Humanization
The IMGT DomainGap alignment module was used to find and align each mouse sequence to the most homologous human germline sequence (http://www.imgt.org/3Dstructure-DB/cgi/DomainGapAlign.cgi)[14].Except the CDR regions, mutations were manually introduced in the mouse framework regions of the homology model to match the human germline sequence via the PyMOL mutagenesis wizard tool.Furthermore, a surface accessibility screen was performed, and the identified surface-exposed mouse residues were manually mutated to reflect the human germline sequence.

Numbering of antibodies
The IMGT database unique Lefranc numbering system for antibodies was used for sequence alignment.Antibody crystal structures derived from the PDB have Kabat numbering, and homology models generated via the PIGS server have the Kabat-Chothia numbering scheme.Meanwhile, the RAM server utilizes the Chothia numbering scheme.

Energy minimization
The Swiss-PdbViewer (DeepView) software was downloaded and run locally for energy minimization (simulated annealing) [12].The humanized homology model was subjected to the GROMOS force field energy minimization.Default settings were used, and the output model was further examined for residues with various force field errors, which were displayed by energy minimized model.These residues were individually examined via PyMOL.Mutations were then introduced to rationally correct select steric clashes, which were predicted after simulated annealing.All homology models were again subjected to force field simulated annealing to validate the chosen mutations or substitutions.

Results:
A total of 17 mouse antibody homology models generated via the PIGS and RAM servers were carefully examined for modeling errors.Among these, the 3I75 model obtained via the PIGS server, as well as the 4DCQ and 3MNV models via the RAM server, contained a broken chain or modeling failure.Therefore, they were omitted from the rest of the study (Table 1).The amino acid sequence alignment of the 17 different mouse antibody sequences with the closest human germline sequence found in the IMGT database revealed a high sequence similarity in the heavy and light chains with ~60% identity or higher to the human germline sequence (Figure 1a).This high level of sequence identity was crucial in minimizing the mutations required for humanization of the mouse antibodies.The human germline sequence with the highest Smith-Waterman score obtained via IMGT/DomainGap alignment through the IMGT database was selected as a final template for humanization of the mouse antibody Table 2 (see supplementary material).

Figure 1:
Sequence and structural alignment of mouse antibody 3UJT.A) 3UJT heavy and light chains were sequence aligned to the most homologous human germline genes IGHV1-2*02 and IGKV4-1*01, respectively, via IMGT DomainGap alignment.28 mutations (shaded red) in the heavy chain and 16 in the light chain were made to humanize (green) the mouse antibody 3UJT.CDRs (yellow) were unchanged; B) Structural alignment of the mouse antibody 3UJT crystal structure (gray) and mouse homology models generated via the PIGS (purple) and RAM (blue) servers.RMSD between the Cα backbones of the crystal structure (grey) and PIGS homology model (purple) was 0.9 Å, whereas that for the RAM homology model (blue) was 2.5 Å over 224 residues.
Furthermore, RMSD analysis between the X-ray crystal structure (PDB) and mouse homology models generated from the PIGS and RAM servers via PyMOL revealed a deviation between the Cα backbones (3.3 Å maximum and of 0.7 Å minimum deviation over > 200 residues) as shown in Figure 1b and (Table 1).Therefore, these antibody homology models appeared to be relatively reliable for generating the overall tertiary fold and homologous structures.In addition, certain solvent accessible residues (outside the CDR regions) revealed from a surface accessibility screen of the mouse homology model were manually mutated to the human germline sequence to reduce or avoid immunogenicity (Figure 2).
The humanized homology model exhibited various force field problems when subjected to simulate annealing.Further examination of these problems in the energy minimized model revealed steric clashes, unfavorable geometry and incorrect hydrogen bonding (Figure 3a).These residues were either mutated to the parental mouse sequence (back mutation) or substituted with similar amino acids.The "back mutations" or substitutions did ameliorate these force field errors in the humanized antibodies.As shown in (Figure 3b), the humanized homology model 3UJT generated via the PIGS server indicated a close proximity between residues Val 13 and Leu 78 in the light chain of 3UJT.Therefore, Val 13 was back mutated to Met 13 , its parental mouse residue, which corrected the proximity error.In addition, Asn 57 was substituted to Gln 50 to resolve the steric clash between the Trp 50 and Asn 57 residues in the heavy chain (Figure 3c).
Output data of the GROMOS force field simulated annealing (energy minimization) revealed that most of the force field errors originated from the "non-bonded" parameter.Humanization performed through antibody homology models generated by the PIGS server had significantly more steric clashes compared to that from the RAM server (Supplementary Figure 1).When these residues were closely examined, a suitable mutation or substitution seemed to dramatically lower the non-bonded energies, hence energetically stabilizing the antibody molecule.

Discussion:
The majority of approaches for conversion of a mouse antibody to humanized form are based either on sequence or structure guided methods.These strategies help reduce the immunogenicity of mouse antibodies but fail to address the loss of affinity usually observed using this procedure.To our knowledge, antibody humanization servers are not publicly available.Accordingly, our study lays the platform for generalized structure guided homology model based predictions, coupled with simulated annealing for humanization of mouse antibodies, which can be easily adapted.For the current study, the RAM and PIGS servers were used for the prediction and generation of antibody homology models.For each antibody, the first model among the top 10 models generated via the RAM and PIGS servers was selected for this study.
Most humanization strategies tend to rely heavily on primary sequence differences, rather than conformational folding.The former sequence-based strategy may lead to minimized immunogenicity, but the humanized antibody may be improperly folded, resulting in loss or reduced affinity towards the target.Antibody humanization undertaken by homology model building alone does offer information on the overall folding, conformation and delineation of surface exposed residues.However, humanization of antibodies via sequence alignment and homology model generation with conformational sampling, together with the simulated annealing approach, offer additional knowledge-based information.This approach provides a rational strategy to correct mutations that not only aid in antibody folding but also improve overall kinetics for antigen binding.
Critical Assessment of Protein Structure Prediction (CASP) studies and other investigators have shown independently that all of the publicly available antibody homology servers (i.e., WAM, RAM and PIGS) generate very similar homology models, and none of them can accurately predict the loop conformation of the CDR3 of the heavy chain (CDRH3) [15-16].Simulated annealing is a powerful tool to be used after homology modeling for humanization of a mouse antibody to examine the effects of individual mutations.Most humanization methods omit this important step that can potentially hinder their antibody kinetics against a cognate target.In this study, homology models generated via the RAM server had fewer steric clashes compared to those from the PIGS server upon humanization of the mouse antibodies.This finding may be attributed to additional optimization protocols utilized in the RAM server, which include simultaneous optimizations of the VH/VL orientation, side chain and loop backbone.The PIGS server usually generates rigid homology models, which frequently contain steric clashes.
As with most computational predictive studies, certain limitations exist in this study as well.In silico predictions can reveal potential problems in humanization of mouse antibodies as a proof of concept, but those predictions need to be experimentally validated.

Conclusion:
The intricate process of humanizing mouse antibodies requires rational introduction of amino acid substitutions to not only correct force field errors but also to minimize immunogenic reactions or anti-drug responses against the humanized antibody.This study highlights the importance of certain canonical and non-canonical residues responsible for the loss of affinity and addresses antibody protein folding problems.Our results also confirmed previous studies that revealed the benefit of simulated annealing after humanization of a mouse homology model [7,8].This step pinpoints the importance of certain force fields, as well as contributions of certain residues in the overall folding and kinetics of antibody-antigen interactions.For example, framework residues that are determined to interact and contribute to the overall conformation of the CDR loops may be reverted back to mouse residues during the process of humanization [17][18][19][20].Predictive structure guided homology models such as those developed in this work help to save time and effort through rapid in silico predictions and screening which act as a rational guide for specific experimental validations.Thus, structure-based predictive antibody homology modeling and force field simulated annealing strategies may be adopted universally for antibody humanization due to their ease of use and efficiency.

Figure 2 :
Figure 2: Strategy for humanization of a mouse antibody based on in silico homology modeling and energy minimizations (simulated annealing).The mouse Fv sequence was submitted to the PIGS/RAM server to generate a homology model, and IMGT DomainGap alignment module was used for sequence alignment to identify the most homologous human germline sequence.Mutations were made in the framework regions (red) to humanize the mouse antibody model.The Swiss-PdbViewer (DeepView) energy minimization tool was applied to the humanized homology model to find force field errors in the model.The identified residues were carefully examined and

Figure 3 :
Figure 3: GROMOS force field errors and rationally chosen mutations to correct them.A) Simulated annealing of a humanized antibody (PdbID-3UJT) revealed force field errors highlighted in red/yellow/orange.These errors range from steric clashes, unfavorable geometry and irrelevant hydrogen bonding; B) Upon humanization of the light chain framework1, residue Val 13 was in close proximity with Leu 78 of framework3 in the light chain.Therefore, Val 13 was back mutated to its parental mouse residue Met 13 , corrects the clash; C) Humanized residue Asn 57 in framework3 of the heavy chain clashed with Trp 50 in framework2 of the heavy chain.Therefore, Asn 57 was back mutated (mouse) to Gln 57 .
Homology model-based humanization of antibodies has been successfully demonstrated computationally, as well as validated experimentally in the recent past [7-8, 17].Mader et al. utilized CDR grafting to humanize mouse antibodies along with the above-mentioned methods and demonstrated that conservative CDR grafting via a homology model-based design could restore affinities similar to the parental mouse/chimeric version, whereas aggressive super-humanization led to complete loss or reduced affinity.