Identification and classification of detoxification enzymes from Culex quinquefasciatus (Diptera: Culicidae)

Molecular characterization of insecticide resistance has been an active research topic ever since the first genome sequence of a disease-transmitting arthropod (Anopheles gambiae) was unveiled in 2002. The recent publication of the Culex quinquefasciatus genome sequence has opened up new opportunities for molecular and comparative genomic analysis of multiple mosquito genomes to characterize insecticide resistance. Here, we used the whole genome sequence of Cx. quinquefasciatus to identify putatively active members of the detoxification supergene families, namely cytochrome P450s (P450s), glutathione-S-transferases (GSTs), and choline/carboxylesterases (CCEs). The Culex genome analysis revealed 166 P450s, 40 GSTs, and 62 CCEs. Further, the comparative genomic analysis shows that these numbers are considerably higher than those in the other dipteran mosquitoes. These species-specific expansions of the detoxification supergene families support the popular understanding that these gene families protect the organism against many classes of toxic substances during its complex (aquatic and terrestrial) life cycle. Thus, the generated data set may provide a starting point for characterizing insecticide resistance at the molecular level, which could then lead to the development of an easy-to-use molecular marker for monitoring incipient insecticide resistance in the field.


Background:
Present-day sustainable vector control activities depend primarily on the use of chemical insecticides. As a result, almost all mosquito vectors around the globe have developed defenses against the existing insecticides recommended by the WHO. A decade ago, the lack of genome sequence information from disease-transmitting mosquito species made it challenging to understand the molecular aspects of the evolution of insecticide resistance. The publication of the genome sequence of Anopheles gambiae (the African malaria mosquito) in 2002 [1], followed by those of Aedes aegypti in 2007 [2] and, very recently, Culex quinquefasciatus in 2010 [3], has opened up new possibilities for examining insecticide resistance at the molecular level. Many conclusive reports on the candidate genes behind the molecular mechanisms of insecticide resistance in the African malaria vector, An. gambiae, have been published [4-6]; however, translating these studies into practical applications is still pending. The Cx. quinquefasciatus genome sequence further enhances the capability to understand the molecular basis of insecticide resistance through comparative genomic studies. Culex species have served as model organisms for studying the population genetics and evolution of insecticide resistance under both field and laboratory conditions [6-8]. Many of the long-term studies monitoring the role of the different processes that are important for insecticide resistance have been conducted on Culex species [8].
Some of the important aspects of research on Culex species are: the origin of new adaptive mutations against the insecticides used in vector control programs and their interaction with existing insecticide resistance mutations; their interaction with the environment; the cost of a mutation in the presence and absence of an insecticide; the establishment and migration of mutations across wide geographical areas; and the pleiotropic effects of a gene mutation on the fitness characteristics of the mosquito [7]. Because Culex species are urban vectors, many control efforts against them focus on organophosphate (OP) larvicides. For this reason, Culex species have been extensively investigated for the mechanisms behind OP resistance. The established resistance mechanisms for OP compounds include both target site mutations in the acetylcholinesterase (ace) gene and overproduction of detoxification enzymes, mainly esterases, through gene amplification. The resistance mechanisms against insecticides in Culex species are similar to those of other disease-transmitting mosquito vectors (An. gambiae and Ae. aegypti) and fall into two major groups: target site insensitivity and up-regulation of the detoxification enzymes. The detoxification enzymes are encoded by hundreds of genes from three supergene families, namely cytochrome P450s (P450s) or monooxygenases, glutathione-S-transferases (GSTs), and carboxyl/cholinesterases (CCEs). A plethora of publicly available information on insecticide resistance confirms the important role of detoxification enzymes (P450s, GSTs, and CCEs) in the evolution of insecticide resistance. These enzymatic groups are capable of detoxifying virtually all of the myriad classes of xenobiotics found in nature. In contrast, target site mutations confer resistance only against the particular insecticide under selection.
Continuous vector control efforts that place chemical insecticides at center stage have exerted selection pressure from multiple insecticides on mosquito vectors. This situation has resulted in the appearance of mosquito isolates that are resistant to more than one insecticide. One such mosquito with multiple insecticide resistance mechanisms is Culex, which has shown resistance to all four major classes of insecticides, namely organochlorines, organophosphates, carbamates, and pyrethroids, especially in field situations [9]. These aspects, coupled with the availability of a history of chemical control activities and of information on multiple-insecticide resistance, make this species particularly suitable for in-depth investigation of the molecular aspects of insecticide resistance. Given the importance of dissecting the molecular basis of insecticide resistance through analysis of the detoxification supergene families, we used the Cx. quinquefasciatus genome sequence to fish out, in silico, the detoxification enzymes that belong to three major groups: P450s, GSTs, and CCEs. The aim of the present study was to identify the detoxification enzymes in the Culex whole genome sequence and to classify them into their respective gene families so that the information can be easily retrieved for further studies on the delineation of insecticide resistance processes. In addition, a comparative genomic analysis of Culex detoxification genes with those of Drosophila, Aedes, and Anopheles was performed. Apart from the common disease vectors (Aedes and Anopheles, model organisms for host-parasite interaction), Drosophila was selected for comparative genomics because of its importance as a model organism.

Methodology:
As a first step towards finding the putatively active detoxification enzymes, the whole genome sequence of Cx. quinquefasciatus (Cx. quinquefasciatus JHB CpipJ1.2, June 2008 database) was scanned with the published sequences of P450s, GSTs, and CCEs from An. gambiae and D. melanogaster using tBLASTn with default parameters (E-value: 10; word size: 3; similarity matrix: BLOSUM62; gap penalties: opening 11 and extension 1). Next, the special characteristics of each enzyme group were applied to preliminarily confirm the candidature of each hit: the cysteine heme-iron ligand signature, i.e. the conserved FXXGXXXCXG motif, and a protein length of ~500 amino acids (aa) for P450s; the SNAIL/TRAIL motif and a protein length of ~200 aa for GSTs; and the catalytic triad (Ser-His-Glu) and a protein length of ~500 aa for CCEs. Following this, each putatively identified sequence was evaluated for a complete protein domain structure and for the absence of multiple domains, which are characteristics of a functional protein. The final list of detoxification enzymes was tabulated by removing the proteins with an incomplete domain structure (possible pseudogenes) and/or with multiple domains. For this, a Conserved Domain (CD) search was performed against the Conserved Domain Database using the protein sequence as a query. CD-search uses RPS-BLAST to scan pre-calculated position-specific scoring matrices (PSSMs). The results of a CD search are graphical and identify and list the domain architecture present in the given protein sequence (CD-search can be performed through NCBI at http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml). Any partial domains in a given protein can be identified through this procedure.
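The motif and length criteria described above can be illustrated with a minimal Python sketch. This is not the procedure used in the study; the function names and the 25% length tolerance are assumptions introduced here for illustration, since the text gives only approximate lengths. The catalytic triad of CCEs is spread across the sequence and cannot be captured by a simple pattern, so the sketch checks only the length for that group.

```python
import re

# Signature motifs quoted in the text (X = any residue).
P450_HEME_MOTIF = re.compile(r"F..G...C.G")   # FXXGXXXCXG
GST_MOTIF = re.compile(r"SNAIL|TRAIL")

# Nominal protein lengths from the text; the tolerance is a hypothetical choice.
NOMINAL_LENGTH = {"P450": 500, "GST": 200, "CCE": 500}
LENGTH_TOLERANCE = 0.25


def near_nominal_length(seq, family, tol=LENGTH_TOLERANCE):
    """True if the sequence length lies within tol of the nominal length."""
    nominal = NOMINAL_LENGTH[family]
    return abs(len(seq) - nominal) <= tol * nominal


def passes_prefilter(seq, family):
    """Preliminary motif-and-length screen for one candidate protein."""
    seq = seq.upper()
    if not near_nominal_length(seq, family):
        return False
    if family == "P450":
        return P450_HEME_MOTIF.search(seq) is not None
    if family == "GST":
        return GST_MOTIF.search(seq) is not None
    # CCE: the Ser-His-Glu triad is dispersed along the sequence, so it
    # needs an alignment against reference esterases; only length is checked.
    return True


# Toy candidate: 500 residues containing a heme-binding signature.
candidate = "M" * 300 + "FGAGPRNCIG" + "A" * 190
print(passes_prefilter(candidate, "P450"))
```

Candidates passing such a screen would still need the domain-level check described in the text before being accepted.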
The confirmed enzymes were then classified into gene families based on their phylogenetic relationship with the classified gene family members from An. gambiae and/or Ae. aegypti. For the phylogenetic analysis of Cx. quinquefasciatus detoxification enzymes, the An. gambiae P450s and GSTs were downloaded from the VectorBase database (AgamP3 build, available at http://www.vectorbase.org/Anopheles_gambiae/Info/Index) and the D. melanogaster esterase sequences from FlyBase (FB2012_02, available at http://www.flybase.org). The P450 sequences of Culex and Anopheles, the GST sequences of Culex and Anopheles, and the CCE sequences of Culex and Drosophila were analyzed to infer the evolutionary relationships among the genes. All phylogenetic analyses were carried out with MEGA4.0 software as described in Raghavendra et al. [10]. To construct each phylogeny, the final protein multiple sequence alignment was used as input with the Jones-Taylor-Thornton (JTT) evolutionary model to assess the genetic distance between taxa. Finally, the resulting phylogenies were statistically evaluated using the bootstrap test with 500 replicates.
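The idea behind the bootstrap test can be shown with a small Python sketch. It is a simplification and not what MEGA4.0 does: it replaces the JTT model with the plain proportion of differing residues (p-distance), and instead of rebuilding a full tree it only asks how often a focal sequence keeps the same nearest neighbour when alignment columns are resampled with replacement. All names and the toy alignment are hypothetical.

```python
import random


def p_distance(a, b, columns):
    """Proportion of differing residues over the given columns, skipping gaps."""
    compared = differing = 0
    for i in columns:
        if a[i] == "-" or b[i] == "-":
            continue
        compared += 1
        differing += a[i] != b[i]
    return differing / compared if compared else 1.0


def nearest(alignment, focal, columns):
    """Name of the sequence closest to the focal one (ties broken by name)."""
    others = [name for name in alignment if name != focal]
    return min(
        others,
        key=lambda n: (p_distance(alignment[focal], alignment[n], columns), n),
    )


def bootstrap_support(alignment, focal, replicates=500, seed=0):
    """Fraction of column-resampled replicates that recover the same
    nearest neighbour as the original alignment."""
    rng = random.Random(seed)
    length = len(next(iter(alignment.values())))
    reference = nearest(alignment, focal, range(length))
    hits = 0
    for _ in range(replicates):
        # Resample alignment columns with replacement.
        cols = [rng.randrange(length) for _ in range(length)]
        hits += nearest(alignment, focal, cols) == reference
    return reference, hits / replicates


# Toy aligned sequences of equal length (hypothetical data).
toy = {"culex": "MKTLLVAAGG", "anopheles": "MKTLLVAAGG", "drosophila": "QRSPPWCCHH"}
print(bootstrap_support(toy, "culex"))
```

A support value near 1 means the grouping is recovered in almost every resampled data set; low values indicate that the grouping depends on a few alignment columns.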

Discussion:
The recent genome sequences from Anopheles and Aedes species enabled us to develop and standardize a procedure for fishing out the detoxification enzymes. Arensburger et al. published the genome sequence of Cx. quinquefasciatus in 2010 [3], and to the best of our knowledge, information on the Culex detoxification enzymes has not yet been made available. Although the post-genomic era has simplified the analysis of genomic data, manual screening and annotation are still necessary to obtain specific function-related information from genomes [11]. Ever since the first genome of a disease-transmitting mosquito was completed, two areas of mosquito biology research, namely insecticide resistance and the understanding of the processes or basic genomic elements responsible for blocking pathogen growth inside the mosquito, have flourished in comparison with other areas. The genomic and bioinformatic analysis of the Cx. quinquefasciatus genome revealed 166 P450s, 40 GSTs, and 62 CCEs. It is now known that the CYP3 and CYP4 clan members among P450s, the Delta-Epsilon gene family members among GSTs, and the alpha-beta esterases among general esterases are primarily responsible for insecticide resistance [6]. The substantial expansion of detoxification enzymes in Culex might have occurred because of the species' preference for breeding in highly polluted water. As a result, Culex mosquitoes might be exposed to numerous kinds of chemical molecules during the early stages of their development (the aquatic phase of the life cycle), some of which might be structurally similar to the insecticides used in vector control programs. David et al. [12] showed that the larval breeding site has a significant influence on the detoxification responses of mosquitoes to various pesticides.
According to Liska [13], the detoxification process can be divided into two steps: (1) functionalization, in which the foreign compound is oxidized by the phase I detoxification enzymes (P450s and esterases) to create a reactive (electrophilic) site; and (2) conjugation, in which the GSTs add a water-soluble compound to the reactive site created by the phase I system. This action transforms lipophilic xenobiotic compounds into more water-soluble byproducts and thus facilitates their excretion [13,14].
The phase I detoxification enzymes (P450s) are categorized into four clans, viz. CYP2, CYP3, CYP4, and mitochondrial P450s [15]. Of these, the mitochondrial and CYP2 clans are important for developmental regulation, facilitating the production of juvenile hormone, while the CYP3 and CYP4 clan members are important for the detoxification of xenobiotics. Furthermore, each of these four clans is divided into individual gene families based on protein sequence identity (>55% and >40-55% protein identity are used to define a subfamily and a gene family, respectively). Of the 16 P450 gene families identified in An. gambiae, the CYP4, CYP6, CYP9, and CYP325 gene families are important for insecticide resistance in insects [16-22]. Because CYP2 and mitochondrial P450s are involved in developmentally important functions, these gene families are the least prone to gene duplication (Table 1a); in contrast, the CYP3 and CYP4 clan gene families, which are implicated in the metabolism and detoxification of foreign compounds, are greatly expanded in the mosquito genomes (Table 2b). Of the total of 166 Culex P450s, 77 and 66 genes belong to the CYP3 and CYP4 clans, respectively, which together account for 86% of the total P450s (Table 2b, see supplementary material; Figure 1a). The comparative genomic analysis of P450s from the dipteran species revealed that the CYP3 and CYP4 clans alone contribute 64-86% of the P450s (see supplementary material). Furthermore, they have shown 1:1 secure orthologs in the dipteran species (data not shown). The comparative genomic analysis shows that Drosophila has the fewest CYP3 and CYP4 clan members of the dipteran species compared. This may be due to the restricted exposure of Drosophila to pesticides. In contrast, the three disease vectors are primary targets of human disease control interventions, which are centered on the use of insecticides to kill the vectors.
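The identity thresholds and the clan share quoted above can be restated as a short Python sketch. The function name is hypothetical, and the treatment of exactly 55% identity as "same family" is an assumption based on the ">40-55%" wording; the gene counts are those given in the text.

```python
def p450_relationship(identity_pct):
    """Classify two P450s by pairwise protein identity (in percent)."""
    if identity_pct > 55:
        return "same subfamily"
    if identity_pct > 40:
        return "same family"
    return "different family"


# Culex P450 counts quoted in the text.
TOTAL_P450S = 166
DETOX_CLANS = {"CYP3": 77, "CYP4": 66}

detox_total = sum(DETOX_CLANS.values())          # 143 genes
detox_share = 100 * detox_total / TOTAL_P450S    # about 86%

print(p450_relationship(62), p450_relationship(47), p450_relationship(30))
print(f"{detox_total} of {TOTAL_P450S} P450s ({detox_share:.0f}%)")
```

The second line of output reproduces the 86% figure: 77 + 66 = 143 genes out of 166.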
The GST supergene family of insects is divided into eight classes (including one unclassified GST class), namely Delta, Epsilon, Theta, Sigma, Omega, Iota, Unclassified, and microsomal GSTs. Of these, the Delta and Epsilon classes are important for the detoxification of xenobiotics [11,23]. The GST supergene family forms the phase II detoxification system, in which conjugation reactions render xenobiotics more soluble or sequester them so that the xenobiotics or insecticides become inactive in the cell. In Culex, 57% (23/40) of the total GSTs belong to the Delta-Epsilon classes (Table 2b, see supplementary material; Figure 1b). The comparative analysis of the GST supergene family suggests that 55 to 66% of the total GSTs belong to the Delta-Epsilon classes. The Delta-Epsilon classes are primarily responsible for detoxification, while the function of the other GST classes is yet to be elucidated [23]. The comparative genomic analysis shows that the Iota, Unclassified, and Microsomal GST classes are absent from Drosophila, while the Zeta class is absent from Culex. The greatest variation in gene copy number is observed in the Delta and Epsilon classes (Table 1, see supplementary material).
Esterases are classified into two major groups based on their cellular functions: (a) metabolic enzymes (dietary detoxification and hormone and pheromone processing esterases) and (b) esterases with neuro/developmental functions [24,25]. These two groups are further classified into gene families, namely alpha, beta, acetylcholinesterases, neurotactin, neuroligin, gliotactin, glutactin, juvenile hormone, and unknown (still to be classified) gene families [24,25]. Of these, the alpha esterases (phase I detoxification enzymes) are the ones mainly involved in xenobiotic detoxification. Of the total of 62 CCEs identified in the Culex genome, 28 (about 45%) are alpha esterases (Table 2, see supplementary material; Figure 1c). The comparative analysis of dipteran CCEs revealed that 30 to 50% of the total CCEs are alpha esterases. The other gene families identified in the Culex genome and their respective copy numbers are given in Table 1 (see supplementary material). As described in the Methodology section, the Culex esterases were classified based on their phylogenetic relationship with the reference Drosophila esterases (Figure 1c). As with P450s and GSTs, the gene family responsible for the detoxification of xenobiotics, i.e. the alpha esterases, is expanded among the esterases. The highest number of alpha esterases is reported from Culex (28 genes). The comparative genomic analysis shows that, unlike Drosophila (3 genes), the dipteran mosquitoes have lost the integument esterases during evolution. Furthermore, a considerable expansion of the juvenile hormone esterase gene copies is observed in the mosquitoes (10-13), while only three were reported from Drosophila. Interestingly, in all the analyzed species, a single orthologous gene copy classified under uncharacterized esterases is reported (Table 2a, see supplementary material).
Finally, the Cx. quinquefasciatus detoxification enzyme data further corroborate the popular understanding that detoxification enzymes undergo adaptive evolution to meet the organism's need for broad environmental adaptability [26]. Furthermore, it is evident from the analysis that strong Darwinian selection favors the evolution of new functions through extensive gene duplication. Such a mechanism is evident from the significant expansion (locally and globally in the genome) of the CYP4, CYP6, CYP9, and CYP325 cytochrome P450 gene families, the Delta and Epsilon GSTs, and the alpha esterases (Table 2b, see supplementary material), which are implicated in insecticide resistance in the class Insecta.
In conclusion, the present study identified 268 detoxification genes that belong to the P450, GST, and CCE supergene families. This is the first report providing full information about these genes in Cx. quinquefasciatus. These data may serve as raw material for further studies on insecticide resistance. Molecular characterization of the detoxification enzymes involves retrieval, identification, confirmation, and transcriptional profiling of the genes. This requires an expert-curated data set of detoxification genes, and the process involved is not straightforward [3,24,25,27]. The comprehensive listing of the detoxification enzymes along with their groupings may help in the easy in silico retrieval of enzyme-related information for the molecular characterization of insecticide resistance. The present information may also help in understanding the evolution of the detoxification supergene families that are directly and/or indirectly responsible for insecticide resistance in insects. However, because the information generated on Culex detoxification genes is based on in silico analyses, further studies are needed to confirm the
9, 329), and CYP4 clan (CYP4, 325 gene families). Similarly, the GST supergene family has been classified into Delta, Epsilon, Sigma, Theta, Omega, Unclassified, and microsomal GST classes. Likewise, the CCE supergene family is divided into alpha, beta, juvenile hormone processing, glutactin, gliotactins, neuroligins, and neurotactins (please see refs [24,25] for the basis of classification of these supergene families).