2D-QSAR model development and analysis on variant groups of anti-tuberculosis drugs.

A quantitative structure-activity relationship (QSAR) study was performed on different groups of anti-tuberculosis drug compounds to establish a quantitative relationship between their biological activity and their physicochemical/structural properties. In recent years, a large number of herbal drugs have been promoted for the treatment of tuberculosis, especially because of the emergence of multidrug-resistant (MDR) and extensively drug-resistant (XDR) tuberculosis. MDR-TB is resistant to the front-line drugs (isoniazid and rifampicin, the most powerful anti-TB drugs), and XDR-TB is resistant to both front-line and second-line drugs. The likelihood of drug-resistant TB increases when a patient does not take the prescribed drugs for the defined time period. Natural products (secondary metabolites) isolated from a variety of sources, including terrestrial and marine plants and animals and microorganisms, have been recognized as having antituberculosis action and have recently been tested preclinically for their growth inhibitory activity towards Mycobacterium tuberculosis or related organisms. QSAR studies were performed to explore antituberculosis compounds among the derivatives of natural products. The theoretical results are in accord with the in vitro experimental data on reported growth inhibitory activity towards Mycobacterium tuberculosis or related organisms. Antitubercular activity was predicted through a QSAR model developed by the forward feed multiple linear regression method with the leave-one-out approach. The correlation measure of the QSAR model was 74% (R² = 0.74) and its predictive accuracy was 72% (RCV² = 0.72). The QSAR studies indicate that dielectric energy and heat of formation correlate well with anti-tubercular activity. These results could offer useful references for understanding mechanisms and directing the molecular design of new lead compounds with improved anti-tubercular activity.
The generated QSAR model revealed the importance of structural, thermodynamic and electrotopological parameters. The quantitative structure-activity relationship provides important structural insight for designing potent antitubercular agents.


Background:
Infectious diseases affect the world through their morbidity and mortality. Tuberculosis (TB), caused by Mycobacterium tuberculosis, is one of the major infectious diseases [1,2] and remains the leading cause of mortality due to a bacterial pathogen. The World Health Organization (WHO) estimated that there were 8.8 million new cases of tuberculosis in 2020. About one-third of the world population is infected with M. tuberculosis, and 10% of those infected will develop the disease at some point in their lives [3]. Four drugs, isoniazid (INH), rifampin, pyrazinamide, and ethambutol, are currently prescribed for the treatment of active TB for a period of at least 6 months. Because of this long treatment period, patients often fail to complete the therapy, which leads to the emergence of multidrug-resistant TB (MDR-TB) and extensively drug-resistant TB (XDR-TB). Susceptibility to active TB increases rapidly with human immunodeficiency virus (HIV) infection, which has exacerbated the situation. WHO has declared TB a global public health emergency [4]. Therefore, the development of new drugs with activity against MDR-TB, XDR-TB, and latent TB is a priority task. New agents that shorten the duration of current chemotherapy are also needed. Because of these factors, tuberculosis remains a leading cause of death worldwide.

Natural products, or their direct derivatives, play crucial roles in the modern-day chemotherapy of tuberculosis [5]. There is currently re-emerging interest in natural products as a source of novel structures for the drug discovery effort and as particularly effective antibacterial leads [6-8]. The present study covers the literature published on natural products or their direct derivatives that exhibit growth inhibitory activity towards mycobacteria, and in particular the causative pathogen of tuberculosis, Mycobacterium tuberculosis. Current natural product and related derivative antimycobacterial agents exhibit wide-ranging in vitro potency towards M. tuberculosis, with minimum inhibitory concentrations (MIC) from 0.2 µg/ml (rifampicin) to 10 µg/ml (cycloserine) [9].

The main objective of the present study was to search for novel natural compounds that show promise as useful antimycobacterial agents. A series of phenolics and quinones, peptides, alkaloids, terpenes (monoterpenes, diterpenes, sesquiterpenes, triterpenes), steroids, and their synthetic and semisynthetic derivatives were selected as novel antimycobacterial agents for 2D-QSAR studies. We screened for potential anti-tubercular activity through QSAR, developing a multiple linear regression model that reproduces the anti-tubercular activity of the different compounds in accord with the in vitro experimental data. The 2D-QSAR model provides the activity-dependent structural descriptors, predicts the effective dose of other derivatives, and suggests their possible toxicity range. The correlation measure of the QSAR model was 74% (R² = 0.74) and its predictive accuracy was 72% (RCV² = 0.72). The druggability of the studied compounds was evaluated using Lipinski's 'rule of five', and in silico ADME analysis was done through bioavailability filters. The QSAR studies indicate that dielectric energy and heat of formation correlate well with anti-tubercular activity. These results could offer useful references for understanding mechanisms and directing the molecular design of new lead compounds with improved antitubercular activity.

Methodology:
The natural and synthetic drug compounds exhibiting potent antimycobacterial activity were taken from reported work [14-29]. The literature values and general structures of the molecules are given in Table 1 (see Supplementary material). The activity data are given as MIC values. The biological activity values [MIC (µg/ml)] reported in the literature were converted to a -log scale and subsequently used as the dependent variable for the QSAR analysis. The -log values of MIC, along with the structures of the 24 compounds in the series, are presented in Table 2 (see Supplementary material). The chemical structures of known drugs were retrieved from the PubChem compound database at NCBI (http://www.pubchem.ncbi.nlm.nih.gov), while the others were drawn in ChemAxon MarvinSketch software.

All computational studies were performed using Scigress Explorer v7.7.0.47. The cleaned molecules were optimized with the MO-G computational application, which computes and minimizes an energy related to the heat of formation. MO-G solves the Schrödinger equation for the best molecular orbitals and geometry of the ligand molecules. Augmented molecular mechanics (MM2/MM3) parameters were used to optimize the molecules to their lowest stable energy state. Energy minimization was continued until the energy change was less than 0.001 kcal/mol or the molecules had been updated almost 300 times. The optimized molecules were then selected for calculation of the physicochemical descriptors, with biological activity entered as the dependent variable. Various 2D descriptors were calculated for the optimized structures using the QSAR module of Scigress Explorer v7.7.0.47. A large number of descriptors were generated, including structural, topological, electrotopological and thermodynamic descriptors. The descriptor pool was reduced by removing invariant columns in Scigress Explorer. The remaining physicochemical descriptors were taken into account for the reported analysis. Manual selection was used for both data selection and variable selection. The forward feed multiple linear regression expression was then used to predict the biological response of other derivatives.

QSAR analysis is a mathematical procedure by which the chemical structures of molecules are quantitatively correlated with a well-defined parameter, such as biological activity or chemical reactivity. For example, biological activity can be expressed quantitatively as the concentration of a substance required to give a certain biological response. When physicochemical properties or structures are also expressed as numbers, one can form a mathematical relationship, or quantitative structure-activity relationship, between the two. The mathematical expression can then be used to predict the biological response of other chemical structures. The most general mathematical form of a QSAR is:

Activity = f (physicochemical properties and/or structural properties)

A QSAR model attempts to find consistent relationships between the variations in the values of molecular properties and the biological activity for a series of compounds, which can then be used to evaluate properties of new chemical entities [10, 11].

Some of the important chemical descriptors used in the multiple linear regression analysis are: atom count (all atoms), atom count (carbon), atom count (hydrogen), atom count (oxygen), bond count (all bonds), conformation minimum energy (kcal/mole), connectivity index (order 0, standard), connectivity index (order 1, standard), connectivity index (order 2, standard), dipole moment (debye), dipole vector X (debye), dipole vector Y (debye), dipole vector Z (debye), electron affinity (eV), dielectric energy (kcal/mole), steric energy (kcal/mole), total energy (Hartree), group count (amine), group count (carboxyl), group count (ether), group count (hydroxyl), group count (methyl), heat of formation (kcal/mole), HOMO energy (eV), ionization potential (eV), lambda max UV-visible (nm), lambda max far-UV-visible (nm), LogP, LUMO energy (eV), molar refractivity, molecular weight, polarizability, ring count (all rings), size of smallest ring, size of largest ring, and solvent accessibility surface area (Å²). Lipinski's rule of five pharmacokinetics filter was used as a drug-likeness test [10].
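The modelling steps described above (conversion of MIC to a -log scale, multiple linear regression, descriptor selection, and leave-one-out validation) can be illustrated with a minimal sketch. It is not the Scigress Explorer implementation: it assumes that "forward feed" corresponds to greedy forward descriptor selection, uses NumPy least squares, and all function names (`neg_log_mic`, `fit_mlr`, `loo_r2`, `forward_select`) are hypothetical.

```python
import math
import numpy as np

def neg_log_mic(mic_ug_per_ml):
    """Convert an MIC in ug/ml to the -log10 scale used as the dependent variable."""
    return -math.log10(mic_ug_per_ml)

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns coefficients, intercept last."""
    A = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(X, coef):
    """Apply fitted coefficients (intercept last) to a descriptor matrix."""
    return np.column_stack([X, np.ones(X.shape[0])]) @ coef

def r2(y, y_hat):
    """Coefficient of determination between observed and predicted activity."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

def loo_r2(X, y):
    """Leave-one-out R^2: each compound is predicted by a model fitted without it."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        coef = fit_mlr(X[keep], y[keep])
        preds[i] = predict(X[i:i + 1], coef)[0]
    return r2(y, preds)

def forward_select(X, y, max_terms=2):
    """Greedy forward selection (assumed meaning of 'forward feed'):
    repeatedly add the descriptor column that most improves leave-one-out R^2."""
    chosen, best = [], -np.inf
    while len(chosen) < max_terms:
        scores = {j: loo_r2(X[:, chosen + [j]], y)
                  for j in range(X.shape[1]) if j not in chosen}
        j, score = max(scores.items(), key=lambda kv: kv[1])
        if score <= best:
            break  # no remaining descriptor improves the cross-validated fit
        chosen.append(j)
        best = score
    return chosen, best
```

Here `X` would hold one row per compound and one column per descriptor, and `y` the transformed activities. Selecting on the cross-validated score rather than the fitted R² is a design choice of this sketch that guards against adding descriptors that only fit noise.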

Results and Discussion:
Quantitative structure-activity relationship (QSAR) modeling: The structure-activity relationship is described by a QSAR model with an activity-descriptor relationship accuracy of 74% (R² = 0.74) and an activity prediction accuracy of 72% (RCV² = 0.72). Initially a total of 79 drugs were used for QSAR modeling against 42 chemical descriptors. Only two descriptors were found to be significant and appear to be responsible for in vitro anti-tubercular activity (Table 4, see Supplementary material). A forward feed multiple linear regression QSAR model was developed using the leave-one-out approach for predicting the biological activity of anti-tuberculosis drug molecules. We looked for simpler descriptors for predicting the in vitro biological activity of the studied class of compounds. The QSAR studies indicate that dielectric energy and heat of formation correlate well with biological activity (Table 4, see Supplementary material). The QSAR model equation derived through the multiple linear regression method, relating in vitro experimental activity (MIC) to the two chemical descriptors, is given below. The best 2D-QSAR model and its statistics are:

Predicted log MIC (µg/ml) = 2.4116 × Dielectric Energy (kcal/mole) - 0.00395099 × Heat of Formation (kcal/mole) + 1.64401

[RCV² = 0.72 (72%) and R² = 0.74 (74%)]

Eighteen drug molecules with reported anti-mycobacterial activity were included in the training data set for comparison, and six molecules were used as the test set to evaluate the prediction accuracy of the QSAR model. The results showed that the predicted activities of oleanolic acid and betulin were comparable with their experimental activities. A plot of experimental against predicted activity for both the training and test sets is shown as a fitness plot (Figure 1). The results indicate that betulin had higher anti-tubercular activity than oleanolic acid.

Compliance of the studied compounds was also verified by Lipinski's rule of five for drug likeness (Table 3, see Supplementary material). The results indicate that the compounds satisfy most of the ADME properties, making them good drug candidates for antimycobacterial and antitubercular activity. This helped in establishing the pharmacological activity of natural compounds for their use as potential drugs. Moreover, when we calculated the topological polar surface area (TPSA) as a chemical descriptor for passive molecular transport through membranes, the natural compounds showed lower TPSA than the standard drugs (Table 3, see Supplementary material). TPSA allows prediction of the transport properties of drugs and has been linked to drug bioavailability. Generally, passively absorbed molecules with a TPSA > 140 Å² are thought to have low oral bioavailability [13]. On the basis of the bioavailability scores, we concluded that the natural compounds have marked antimycobacterial activity but higher log P than the standard drugs.
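The regression equation reported in this section can be evaluated directly for any compound whose two descriptors are known. The sketch below uses the published coefficients and takes the left-hand side as written (log MIC in µg/ml); the function names and the descriptor values in the usage note are hypothetical and are not taken from Table 4.

```python
# Coefficients as reported in the model equation of this section.
COEF_DIELECTRIC_ENERGY = 2.4116       # per kcal/mole
COEF_HEAT_OF_FORMATION = -0.00395099  # per kcal/mole
INTERCEPT = 1.64401

def predicted_log_mic(dielectric_energy, heat_of_formation):
    """Predicted log MIC (ug/ml) from the two descriptors, both in kcal/mole."""
    return (COEF_DIELECTRIC_ENERGY * dielectric_energy
            + COEF_HEAT_OF_FORMATION * heat_of_formation
            + INTERCEPT)

def log_mic_to_mic(log_mic):
    """Back-transform a base-10 log MIC to an MIC in ug/ml."""
    return 10.0 ** log_mic
```

For example, `predicted_log_mic(-0.5, -100.0)` evaluates the equation for an illustrative compound with a dielectric energy of -0.5 kcal/mole and a heat of formation of -100 kcal/mole, and `log_mic_to_mic` converts the result back to a concentration.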

Conclusion:
Twenty-four natural and synthetic drug compounds were evaluated for antimycobacterial activity by 2D-QSAR studies, with oleanolic acid and betulin exhibiting good activity. The 2D technique indicates the importance of the dielectric energy and heat of formation of the compounds for activity. ADME and toxicity predictions indicate that these compounds do not violate Lipinski's rule of five. The dielectric energy and heat of formation physicochemical parameters are common to all four models and show a positive contribution in each; they are therefore considered desirable properties of MTB inhibitors. Comparison of the statistical and validation parameters for models 1-4 suggests model 4 for further consideration. It shows good correlation between biological activity and the parameters, with RCV² = 0.72 (72%) and R² = 0.74 (74%) of the variance in inhibitory activity. The low standard error demonstrates the accuracy of the model. The descriptors used in the significant QSAR model 4, with their values, are given in Table 4. A QSAR model with reliable predictive power for MTB inhibitory activity has been successfully demonstrated. The good correlation between experimental and predicted biological activity for the compounds in the test set further highlights the reliability of the constructed QSAR model. The findings of this study will be helpful in the design of potent MTB inhibitors with anti-tubercular activity.

Supplementary material:
Briefly, Lipinski's rule of five [10] is based on the observation that most orally administered drugs have a molecular weight (MW) of 500 or less, a logP no higher than 5, five or fewer hydrogen bond donor sites, and 10 or fewer hydrogen bond acceptor sites (N and O atoms). In addition, the bioavailability of all derivatives or test compounds was assessed through topological polar surface area analysis. Molinspiration offers free online cheminformatics services (http://www.molinspiration.com/cgi-bin/properties). The structures of the molecules were sketched in Molinspiration, and physicochemical properties such as logP, polar surface area, and the number of hydrogen bond donors and acceptors, as well as bioactivity scores for classes such as GPCR ligands, kinase inhibitors, and ion channel modulators, were predicted [12].

BIOINFORMATION, open access. ISSN 0973-2063 (online), 0973-8894 (print). Bioinformation 7(2): 82-90 (2011). © 2011 Biomedical Informatics.
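The drug-likeness filter described here amounts to counting threshold violations. The following sketch encodes the four rule-of-five limits stated above, together with the TPSA > 140 Å² cutoff for low oral bioavailability mentioned in the Results. The `Compound` class, the function names, and any property values passed to them are hypothetical; in the study the properties were obtained from Molinspiration.

```python
from dataclasses import dataclass

@dataclass
class Compound:
    name: str
    mol_weight: float   # molecular weight
    logp: float         # octanol-water partition coefficient
    h_donors: int       # hydrogen bond donor sites
    h_acceptors: int    # hydrogen bond acceptor sites (N and O atoms)
    tpsa: float         # topological polar surface area, in square angstroms

def lipinski_violations(c):
    """Count how many of the four rule-of-five limits the compound exceeds."""
    checks = [
        c.mol_weight > 500,
        c.logp > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ]
    return sum(checks)

def low_oral_bioavailability_flag(c, tpsa_cutoff=140.0):
    """True when TPSA exceeds the cutoff associated with low oral bioavailability."""
    return c.tpsa > tpsa_cutoff
```

A compound with a high logP but otherwise compliant properties would return one violation, which matches the way violations are tabulated per compound in Table 3.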

Figure 1 :
Fitness plot between experimental and predicted activity (log MIC) of training and test set.

Table 1 :
Biological activity data and calculated -log MIC values for compounds

Table 3 :
Compliance of compounds with computational parameters of drug likeness. Lipinski's Rule of Five and the number of violations (using Molinspiration online tool)

Table 4 :
Comparison of experimental and predicted in vitro activity data calculated through QSAR modeling based on the two most highly correlated chemical descriptors