3D-QSAR studies on fluroquinolones derivatives as inhibitors for tuberculosis

A quantitative structure activity relationship (QSAR) study was performed on the fluroquinolones known to have anti-tuberculosis activity. The 3D-QSAR models were generated using stepwise variable selection of the four methods - multiple regression (MR), partial least square regression (PLSR), principal component regression (PCR) and artificial neural networks (kNN-MFA). The statistical result showed a significant correlation coefficient q2 (90%) for MR model and an external test set of (pred_r2) -1.7535, though the external predictivity showed to improve using kNN-MFA method with pred_r2 of -0.4644. Contour maps showed that steric effects dominantly determine the binding affinities. The QSAR models may lead to a better understanding of the structural requirements of anti-tuberculosis compounds and also help in the design of novel molecules.


Background:
Tuberculosis is a common and often a deadly infectious disease caused by Mycobacterium tuberculosis H37Rv claiming more lives each year than any other bacterial disease.Despite the availability of effective direct observed treatment or shortcourse chemotherapy (DOTS) and Bacille Calmette-Guerin (BCG) vaccine, the tubercle bacillus continues to claim more lives than any single effective agent [1].Due to the extensive period of treatment, patients fail to complete the therapy leading to the emergence of drug resistance-multidrug resistance (MDR) and extensive drug resistance (XDR) in tuberculosis.The widespread emergences of drug resistant strains together with human immunodeficiency virus (HIV) have been witnessed in the increased incidence of tuberculosis in both developing and industrialized countries [1].It has been estimated that tuberculosis accounts for around 32 per cent deaths in HIV infected individuals [2].The recent rise in tuberculosis cases and especially the increase in number of drug resistant mycobacteria strains indicate an urgent need to develop new anti-tuberculosis drugs.The quinolones are a group of a wide range of synthetic antibiotics derived from nalidixic acid.They are characterized by an atom of fluorine and an aryl substituent at position 6 and 7 respectively and are known to exhibit an antibacterial spectrum including Mycobacterium tuberculosis [3].When using fluroquinolones for the treatment of tuberculosis, individual susceptibility, pharmacokinetic and toxicity profiles should be carefully considered [4].
The present study is aimed to understand the structure activity of the available fluroquinolones and to obtain the predictive quantitative structure activity relationship (QSAR) models.A stochastic search method such as multiple regressions (MR), partial least square regression (PLSR), principal component regression (PCR) and an artificial neural network (kNN-MFA) [5-9] was developed to provide an insight to the various interactive fields of different compounds in concord with the in vitro experimental data.Different QSAR approaches have been developed through the years and the rapid increase in threedimensional (3D) structure of the biological molecules have led to the development of 3D structural descriptors and associated 3D QSAR methods [10].The generated models could provide a valuable reference in the design of pharmaceuticals with improved anti-tubercular activity.

Methodology:
3D QSAR study of the molecules was performed using VLife Molecular Design Suite [11] allowing users to choose the probe, grid size, grid interval and other parameters for generation of descriptors.

Dataset for 3D QSAR
Depending on the antibacterial activity and the toxicity to humans, anti-tuberculosis drugs are classified into first-line drugs as the first choice of treatment in new cases of TB and second-line drugs for treatment of patients with M. tuberculosis resistant to the first-line drugs and these were considered for QSAR.The chemical structures of the molecules were retrieved from Pubchem compound database at National Center for Biotechnology Information (NCBI) [12].Activity of the drugs was taken from reported work given as MIC values [3, 13].The biological activity (IC 50 ) of the molecules were converted to their corresponding pIC 50 values Table 1 (see supplementary material) and used as dependent variables in the QSAR calculations.The molecules were first optimized to their lowest energy state using Merck molecular force field (MMFF) method [14] with RMS gradient of 0.001 and alignment was done based on a template structure.Alignment of the molecules is one of the most important factors in obtaining a reliable model.

Calculation of descriptors and data selection
The aligned molecules were selected for calculation of the descriptors after inserting the biological activity as a dependent variable.The field descriptors were calculated with cutoffs of 10kcal/mol for electrostatic and 30kcal/mol for steric at lattice points of the grid using a methyl probe of charge +1.A grid was generated with grid interval of 2Å in a 3-dimensional lattice around the aligned molecules.The dielectric constant was set to 1.0 and with suitable charge type [11].Invariable columns not contributing to the QSAR were removed.Biological activity was selected as a dependent variable and the descriptors generated were selected as indepenedent variables.The training and test sets were generated using the sphere exclusion method which allows construction of the training set covering all the descriptor space areas occupied by representative points [15] and the dissimilarity values was set to 15.75.The unicolumn statistics of the training and test sets was calculated for the correct selection of the training and test sets.

Variable selection and model building
The aligned molecules were subjected to regression analysis using multiple regressions (MR) which estimates the value of regression coefficient by applying least square curve fitting model, partial least square regression (PLSR), an extension of MR model without imposing any restrictions.Principal component regression (PCR) selects a new set of axis such that for the first set of axis it reflects most of the variations within the data and k-nearest-neighbor (kNN) molecular field analysis (MFA) as building methods.The models were developed using stepwise (forward-backward) variable selection method with cross-correlation limit set to 0.5 with the term selection criteria as r 2 for MR, PLSR and PCR and q 2 for kNN-MFA.F-test 'in' was set 4.0 and f-test 'out' was set to 3.99 and scaling to autoscaling.KNN-MFA parameter setting was within the range of 2-5 and the distance-based weighted average prediction method was selected for the study.

Discussion:
The fluroquinolones which show activity against tuberculosis taken for study were optimized to their lowest energy state and aligned with a template.Of the 33 fluroquinolones considered, 14 quinolones did not align with the template and therefore were not considered for further study.The fluroquinolones aligned to the template (Figure 1  The same data was subjected to PLSR method which results in a correlation of 95% (r 2 = 0.9498) and a low prediction accuracy of 89% (q 2 = 0.8976) with a decrease in external predictivity (pred_r 2 =-1.7493).S_1379 and S_558 contribute negatively and S_498 contributes positively in both MR and PLSR, thus groups should be altered around the points to reduce the interactions.

Model 3 (PCR)
Logp (IC50) = -13.1600*S_1379+ 12.7354*S_560 + 6.8894*S_618 + 0.1260; Optimum Components = 2, n = 13, Degree of freedom = 10, r2 = 0.9700, q2 = 0.8636, F test = 161.7857,r2 se = 0.1188, q2 se = 0.2533, pred_r2 = -1.8522,pred_r2se = 0.7120 To further improve the external predictivity of the model, kNN-MFA analysis of the fluroquinolone groups was performed and found to be statistically significant in terms of the external predictivity of the test with an internal predictivity of 33% (q 2 = 0.3290) measuring the reliability of the prediction to be reliable and accurate.The pred_r2 obtained for the test set showed a decreased but an improved result of the external predictivity of pred_r 2 =-0.4644.Descriptor S_242 contributing to the activity negatively indicates that a negative steric potential would favour for the increase activity.The PCR contour plots (Figure 2) showed the relative position and range of the corresponding important steric field in the model.The experimental and predicted activity of the model is shown as the fitness plot (Figure 3) for PCR.The nearness of the predicted to the observed activity is reported in Table 3 (see supplementary material).The models obtained for QSAR showed that the steric interactions play an important role in determining the biological activity of the models (Figure 4).The negative range in the steric descriptors indicates that the negative steric potential is favourable for the increase in the activity and a less bulky substituent group is favored in the region.

Conclusion:
The present study was aimed at deriving the predictive 3D QSAR models capable of revealing the structural requirement for anti-tuberculosis inhibitors.Comparison of the different statistical parameters of the four models suggests model 1 for further consideration having a better internal predictivity of q 2 = 0.9097 and an external test of pred_r 2 = -1.7535.Model 4 (kNN-MFA) though has a bad internal predictivity of q 2 = 0.3290, the external predictivity (pred_ r 2 =-0.4644) of the test is better as compared to the other models.The models show that steric effects dominantly determine the binding affinities indicating both a positive and a negative contribution suggesting a more bulky group is favored in the region where there is positive contribution and a less bulky group is favourable in the region with negative contribution.Modification of these regions will lead to better and improved compounds for the treatment of ) were considered for QSAR studies.The training and test set selected by sphere exclusion method show that the maximum of the test is less than maximum of training and minimum of test is greater than minimum of training set Table2(see supplementary material) indicating that the test is interpolative and derived within the min-max range of the training set.The mean of the training is higher than the mean of test set showing the presence of relatively less active molecules compared to the active ones.Also, a relatively higher standard deviation in the training set indicates that the training set has a widely distributed activity as compared to the test set.The 3D QSAR study of the fluroquinolones through MR, PLSR, PCR and kNN-MFA analysis using VLifeMDS resulted in the following statistical model.The statistical model (Model 1) shows a significant activity-descriptors relationship accuracy of 95% (r 2 = 0.9507) representing part of variation in the observed data, an activity prediction accuracy of 90% (q 2 = 0.9097) and a decrease in the predictivity for the external test set of about 0.017% (pred_r 2 =-1.7535).

Figure 2 :
Figure 2: Contribution plot of interaction to the MR model

Figure 3 :
Figure 3: Fitness plot of observed vs predicted activities of the MR model tuberculosis.The MR and kNN-MFA models developed show potential predictive ability as determined by testing the external test set.The QSAR models may lead to a better understanding of the structural requirements of anti-tuberculosis compounds and also help in designing of novel molecules.BIOINFORMATIONopen access ISSN 0973-2063 (online) 0973-8894 (print) Bioinformation 8(8): 381-387 (2012) 385 © 2012 Biomedical Informatics Supplementary material: