Molecular modeling and identification of substrate binding site of orphan human cytochrome P450 4F22.

Cytochrome P450s are superfamily of heme proteins which generally monooxygenate hydrophobic compounds. The human cytochrome P450 4F22 (CYP4F22) was categorized into "orphan" CYPs because of its unknown function. CYP4F22 is a potential drug target for cancer therapy. However, three-dimensional structure, the active site topology and substrate specificity of CYP4F22 remain unclear. In this study, a three-dimensional model of human P450 4F22 was constructed by comparative modeling using Modeller 9v5. The resulting model was refined by energy minimization subjected to the quality assessment from both geometric and energetic aspects and was found to be of reasonable quality. Docking approach was employed to dock arachidonic acid into the active site of CYP4F22 in order to probe the ligand-binding modes. As a result, several key residues were identified to be responsible for the binding of arachidonic acid with CYP4F22. These findings provide useful information for understanding the biological roles of CYP4F22 and structure-based drug design.


Background:
Cytochromes P450 (CYP) which are a superfamily of heme containing enzymes play an important role in detoxification and metabolic activation of large number of xenbiotic chemicals such as drugs, carcinogens and many endogenous compounds like hormones and vitamins.There are 57 CYP isoenzymes found in humans which are well-known for their monooxgenase reaction.However, one quarter of the CYP family remains orphan, which means that their function, expression studies and substrate information are not yet clearly understood.Thirteen members of this superfamily are classified as orphans by Guengerich [1].
Human CYP4F22 was considered as one such orphan CYPs, amino acid sequence of which has been identified recently.The three-dimensional structure of this protein is not yet known.Experimental studies of CYP4 family members are previously reported to metabolize arachidonic acid (AA) [2] like endogenous compounds to its hydroxyl product.CYP family members were found to share their substrates with one another.CYP4F22 gene as a part of a cluster of cytochrome P450 genes on chromosome 19 encodes an enzyme thought to play a role in the 12(R)-lipoxygenase pathway.Mutation in this gene is the cause of ichthyosis lamellar type 3.It was recently found that CYP4F22 was specifically expressed in human breast cancer tissues.For this reason, human CYP4F22 has been suggested to be a potential drug target for cancer therapy.However, to date, information regarding the structure and ligand binding site is not available for CYP4F22 and hence structural information might help us to understand the ligand interaction.
In this study, a 3D model of CYP4F22 was constructed using comparative modeling method, followed by energy minimization to refine the initial model.The quality of the resulting 3D model of P450 4F22 was critically assessed through geometric and energetic assessments.After that, arachidonic acid (AA) was docked into the active site of the proposed 3D structure of CYP4F22 and certain key residues responsible for substrate specificity were identified.Further 2D QSAR study was used to find out the pharmacological characteristics of the ligand.The results obtained in this study will be useful for further elucidation of the biological role and the structure-based drug design of P450 4F22.

Methodology: Sequence retrieval
The protein sequence (531 amino acids) of human CYP4F22 was obtained from the UniProtKB database (accession number: Q6NT55).

Template identification and sequence alignment
For template selection, BlastP

Comparative modeling of human CYP4F22
The alignment was used as the input and the homology model of P450 2F22 was generated using Modeller 9v5

Energy minimization and structural quality assessment
The resulting model was further refined by energy minimization using the Amber9 package

2D QSAR study:
The selected ligand arachidonic acid (AA) is a long-chain fatty acid with a hydrophobic tail.The arachidonic acid was downloaded from Pubchem in Structure Data Format (SDF).Conversion of SDF to Protein Data Bank (PDB) format was carried out using Swiss PDB viewer.The molecular structure of ligand was optimized using the Polak-Ribiere algorithm until the root mean square gradient was 0.01 kcal mol -1 using Hyperchem V 8.09 [12].The molecular descriptors were calculated for the ligand using QSAR properties of Hyperchem.

Prediction of key residues in homology model of CYP4F22:
The homology model of CYP4F22 was submitted to automated ConSurf web server [13] for the identification of functional regions of CYP4F22 by estimating the degree of conservation of the amino-acid sites among their close sequence homologues.

Molecular docking of CYP4F22-ligand complex:
Both the modeled protein and ligand were optimized using "dock prepare" of the USCF chimera  .The sequence conservation and the signature motifs of CYP4F22 were examined using multiple sequence alignment with templates (Figure 1).The essential signature motif FxGxxxCxG is characteristic motif for the CYP superfamily which includes a conserved cystine residue that ligates to the Fe of the heme.The amino acid sequence identity between target and the templates was above 40% which is high enough to construct a 3D model.The initial 3D model of CYP4F22 was energy-minimized and assessed for both geometric and energy aspects.The Procheck evaluation for good stereochemistry of the refined model resulted in 97.8% in the allowed regions and 2.1% in disallowed regions.Verify 3D analysis resulted in a model with about 81% of residues with a score over 0.2, indicating a good model.The Prosa analysis showed that z-score in the CYP4F22 model was negative in most residues, further indicating a reliable model.The 3D structure model of CYP4F22 exhibits a fold similar to those of other P450s and contains the heme group sandwiched between helices (Figure 2a).CYP4F22 model of this study also reveals that a highly conserved threonine is located in the middle of helix I which is similar to other reported human CYP structures.This conserved residue has been suggested to participate in the protein delivery and play an important role in the dioxygen bond cleavage during the catalytic cycle functional residues of CYP4F22 model with high conservation scores were identified Table 1

(see supplementary material).
The energy minimized model of arachidonic acid was also subjected to 2D QSAR study using Hyperchem.Topological (surface area of 603.66 Å2) and constitutional descriptors (LogP of 10.10, molecular weight of 303.46 amu, polarizability of 36.83Å3 and refractivity of 62.82 Å3) were computed and found to be desirable pharmacological characteristics for docking.To determine the key residues responsible for ligand interaction, the homology model of CYP4F22 was docked with arachidonic acid (Figure 2c).The free energy of binding (ΔGbind kcal/mol) for the designed ligand was found to be -3.53kcal/mol.The negative and low value of ΔGbind indicates a strong favorable bond between CYP4F22 protein and the AA ligand in most favorable Several important residues were detected in homology model of P450 4F22.The residue ARG 156 appears to play an important role in substrate specificity.The corresponding TRP 152 and THR 160 pairs were conserved and they were directly in contact with ligands.In addition, the rest of the ligand-interacting residues GLY 471, CYS 475, ILE 476and GLN 478 were mostly located near the key residues (Figure 2d).

Conclusion:
The human P450 4F22 theoretical 3D model was constructed using homology modeling, refined with energy minimization and assessed for sterochemical and energetic aspects.Docking approaches were employed to dock arachidonic acid ligand into the active site of the proposed 3D model in order to probe ligand binding modes.The docking studies as well as the ConSurf analysis revealed the important residues involved in the substrate specificity.The docking and ConSurf analysis are in concurrence with each other reflecting the importance of crucial residues.Interacting key residues gave information on the substrate specificity of the CYP4F22 active site when they interact with arachidonic acid ligand of hydrophobicity.Also these residues might act as ligand-channeling and could help the reaction center of the protein.

[ 3 ]
search was used against Protein DataBank (PDB).The templates having high similarity were selected for the model building.The templates were aligned and examined for conserved sequence with the target by ClustalW [4].

[ 5 ]
. Loops were modeled using ModLoop server [6].The coordinates for heme were obtained from 1TQN and positioned as in the template.

[ 7 ]
and subjected to quality assessment.The Procheck [8] and Verify 3D [9] were utilized for geometric evaluation.The Prosa[10] was employed to examine the energy of residue-residue interactions using a distance-based pair potential approach.The energy was transformed to a score called z-score where residues with negative z-score indicate reasonable side-chain interactions.The final model was then submitted to the protein modeling database [11] as PM0077605.

Figure 1 :
Figure 1: Multiple sequence alignment of target (CYP4F22:Q6NT55) and two templates (1W0E and 1TQN) represented using ClustalW Program.The alignment figure was prepared using Espript 2.2.The red colour indicates the residues conserved in all three sequences.Results and Discussion:The first step in homology modeling is to detect appropriate template proteins whose structures are known that are expected to be similar to the target protein.A BlastP search confirmed that there are several mammalian CYPs structures which could serve as potential templates for P450 4F22 modeling.The topranked two templates are 1W0E [17] and 1TQN[18].The sequence conservation and the signature motifs of CYP4F22 were examined using multiple sequence alignment with templates (Figure1).The essential signature motif FxGxxxCxG is characteristic motif for the CYP superfamily which includes a conserved cystine residue that ligates to the Fe of the heme.The amino acid sequence identity between target and the templates was above 40% which is high enough to construct a 3D model.The initial 3D model of CYP4F22 was energy-minimized and assessed for both geometric and energy aspects.The Procheck evaluation for good stereochemistry of the refined model resulted in 97.8% in the allowed regions and 2.1% in disallowed regions.Verify 3D analysis resulted in a model with about 81% of residues with a score over 0.2, indicating a good model.The Prosa analysis showed that z-score in the CYP4F22 model was negative in most residues, further indicating a reliable model.The 3D structure model of CYP4F22 exhibits a fold similar to those of other P450s and contains the heme group sandwiched between helices (Figure2a).CYP4F22 model of this study also reveals that a highly conserved threonine is located in the middle of helix I which is similar to other reported human CYP structures.This conserved residue has been suggested to participate in the protein delivery and play an important role in the dioxygen bond cleavage during the catalytic cycle[19].The CYP4F22 model was further tested using ConSurf server for the prediction of active residues (Figure2b) that are expected to be accessible, either at the surface or in a pocket of the protein.The [19].The CYP4F22 model was further tested using ConSurf server for the prediction of active residues (Figure 2b) that are expected to be accessible, either at the surface or in a pocket of the protein.The BIOINFORMATION open access ISSN 0973-2063 (online) 0973-8894 (print) Bioinformation 7(4): 207-210 (2011) 209 © 2011 Biomedical Informatics

Figure 2 :
Figure 2: (a) Ribbon schematic representation of the homology model of CYP4F22.The heme is shown in the center of the figure in red stick.Various visible regions of secondary structure elements are labeled in the figure, (b) Predicted active sites of CYP4F22, (c) 2D structure of arachidonic acid and (d) a close view of CYP4F22 with arachidonic acid.Key residues are shown in red spheres and labeled.The substrate arachidonic acid is shown as grey stick.

Table 1 :
The current results might be very helpful for the determination of biological role to provide insights into arachidonic acid.Function regions of CYP4F22 predicted using ConSurf server with high amino acid conservation score (9 -conserved, 1variable).