Molecular docking based screening of compounds against VP40 from Ebola virus

Ebola virus causes severe and often fatal hemorrhagic fever in humans. The 2014 Ebola epidemic affected multiple countries. The viral matrix protein (VP40) plays a central role in virus assembly and budding. Since there is no FDA-approved vaccine or medicine against Ebola virus infection, new compounds with different binding patterns against the virus are needed. We therefore aimed to identify small molecules that target the Arg 134 RNA-binding and active site of the VP40 protein. A total of 1800 molecules were retrieved from the PubChem compound database based on structure similarity to, and conformers of, pyrimidine-2,4-dione. Molecular docking using the Lamarckian genetic algorithm was carried out to find potent inhibitors of VP40 based on calculated ligand-protein pairwise interaction energies. The grid maps representing the protein were calculated using AutoGrid, and the grid size was set to 60 × 60 × 60 points with a grid spacing of 0.375 Å. Ten independent docking runs were carried out for each ligand, and the results were clustered according to a 1.0 Å RMSD criterion. Post-docking analysis showed that binding energies ranged from -8.87 to 0.6 kcal/mol. We report 7 molecules that showed promising ADMET results and LD50 values, as well as H-bond interactions in the binding pocket. The small molecules discovered could act as potential inhibitors of VP40 and could interfere with the virus assembly and budding process.

VP40 is essential for EBOV assembly. Crystallographic, biochemical and cellular microscopy studies indicate that VP40 rearranges into different structures, each with a distinct function required for the ebolavirus life cycle [3]. Figure 1 shows the 3D structure of the matrix protein VP40 from Sudan Ebola virus. The structure revealed that VP40 contains distinct N- and C-terminal domains, both of which are essential for trafficking to and interaction with the membrane. Bornholdt et al. stated that a butterfly-shaped VP40 dimer traffics to the cellular membrane. There, electrostatic interactions trigger rearrangement of the polypeptide into a linear hexamer. These hexamers construct a multi-layered, filamentous matrix structure that is critical for budding and resembles tomograms of authentic virions. A third structure of VP40, formed by a different rearrangement, is not involved in virus assembly but instead uniquely binds RNA to regulate viral transcription inside infected cells [3]. In addition, the crystallographic structure of VP40-RNA reveals that R-134 and F-125 of VP40 are the main residues that interact with RNA [4]. Thus, the RNA binding pocket of VP40 can be considered a drug target site for structure-based drug design [5][6][7].
To find lead compounds for anti-VP40 drug design by virtual screening, one study used the traditional Chinese medicine (TCM) database and the Asinex database [5], another used the ZINC database [6], and a more recent study used the TCM database alone [7].
Pyrimidinediones are a class of chemical compounds characterized by a pyrimidine ring substituted with two carbonyl groups. The compound 1-[(2S,4S,5R)-4-hydroxy-5-methyloxolan-2-yl]-5-methylpyrimidine-2,4-dione was one of 4 compounds selected by virtual screening as possible inhibitors of VP40 [8]. In our study, however, we searched for conformers of pyrimidine-2,4-dione in the PubChem database. We found 1800 conformers and docked them against the crystal structure of matrix protein VP40 from Sudan Ebola virus, targeting the amino acid Arg 134. Docking and ADMET studies were done according to a previously described method [9]. Seven chemical compounds were found to have promising ADMET results and LD50 values, as well as H-bond interactions in the binding pocket. These seven molecules remain to be studied in vitro.

Methodology:
Ligand generation
2D equivalent structural derivatives of 1-[(2S,4S,5R)-4-hydroxy-5-methyloxolan-2-yl]-5-methylpyrimidine-2,4-dione were searched for in PubChem. Using this compound and the built-in similarity fingerprint search, molecules with a Tanimoto score of at least 0.5 were selected. The search generated a total of 3000 ligands, of which only 1800 were tested. ChemSketch was used for sketching and generating MDL Mol V2000 files with 2D coordinates. The ligands were then converted into Protein Data Bank (PDB) format.
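The Tanimoto cutoff used in the similarity search can be illustrated with a minimal sketch. It assumes fingerprints are represented as sets of "on" bit positions; the bit sets and candidate names below are made up for illustration and are not PubChem fingerprints, and `tanimoto` and `filter_by_similarity` are hypothetical helper names.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    # |A ∩ B| / |A ∪ B|, with the union size computed from the set sizes
    return shared / (len(fp_a) + len(fp_b) - shared)


def filter_by_similarity(query_fp, candidates, threshold=0.5):
    """Return the names of candidates whose Tanimoto score meets the threshold."""
    return [
        name
        for name, fp in candidates.items()
        if tanimoto(query_fp, fp) >= threshold
    ]


# Illustrative (made-up) fingerprints.
query = {1, 4, 7, 9}
candidates = {
    "cand_a": {1, 4, 7, 12},  # 3 shared bits, 5 bits in the union
    "cand_b": {2, 3, 5},      # no shared bits
    "cand_c": {1, 4, 7, 9},   # identical to the query
}
selected = filter_by_similarity(query, candidates, threshold=0.5)
print(selected)
```

With a threshold of 0.5, a candidate is kept when at least half of the bits set in either fingerprint are shared by both.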

Active Site
The amino acid Arg 134, which lies in the N-terminal domain, is where the matrix protein interacts with RNA. This residue was chosen as the docking target.

Docking Setup
Protein-ligand docking was performed with AutoDock 4 [11], which combines energy evaluation through grids of affinity potential with various search algorithms to find a suitable binding position for a ligand on a given protein. Docking involved the addition of polar hydrogens to the ligands using the AutoDock hydrogen module and the assignment of Kollman united-atom partial charges. A standard docking procedure was used. It involved randomly placed individuals with a population size of 150. The maximum number of energy evaluations was 2.5 × 10^7, the mutation rate was 0.2 and the crossover rate was 0.80. Elitism was 1 for every generation. The results were clustered according to a 1.0 Å RMSD criterion, with 10 independent docking runs for each ligand. AutoGrid was used to calculate the grid maps representing the protein. The grid size was set to 60 × 60 × 60 points with a grid spacing of 0.375 Å. UCSF Chimera (http://www.cgl.ucsf.edu/chimera/) was used to visualize the coordinates of the docked ligand-protein complexes within a region of 5 Å and the hydrogen bonds stabilizing the ligand-protein interaction.
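The clustering of docking runs by a 1.0 Å RMSD criterion can be sketched as follows. This is an illustrative, energy-ranked, seed-based scheme written for this text, not AutoDock's own code: poses are sorted by binding energy, and each pose joins the first cluster whose lowest-energy member lies within the tolerance, or else starts a new cluster. The function names, the pose representation and the coordinates are assumptions made for the example, and atoms are assumed to be listed in the same order in every pose.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two identically ordered atom lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("poses must contain the same non-zero number of atoms")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


def cluster_poses(poses, tolerance=1.0):
    """Group poses (energy, coords) into clusters of pose indices.

    The first index in each cluster is its seed, i.e. its lowest-energy pose.
    """
    # Visit poses from the most favourable (lowest) energy to the least.
    order = sorted(range(len(poses)), key=lambda i: poses[i][0])
    clusters = []
    for idx in order:
        for cluster in clusters:
            seed = cluster[0]
            if rmsd(poses[idx][1], poses[seed][1]) <= tolerance:
                cluster.append(idx)
                break
        else:
            # No existing seed is close enough, so this pose seeds a new cluster.
            clusters.append([idx])
    return clusters


# Three made-up two-atom poses: (binding energy in kcal/mol, coordinates in Å).
poses = [
    (-5.0, [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]),
    (-4.8, [(0.3, 0.0, 0.0), (1.3, 0.0, 0.0)]),  # about 0.3 Å from pose 0
    (-4.5, [(5.0, 0.0, 0.0), (6.0, 0.0, 0.0)]),  # 5 Å from pose 0
]
clusters = cluster_poses(poses, tolerance=1.0)
print(clusters)
```

In this scheme the seed of the most populated or lowest-energy cluster is the natural pose to carry forward into interaction analysis.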
Molecules showing minimum binding energies were evaluated for drug likeness using Lipinski's "Rule of Five" to detect probable pharmacokinetic problems. In addition, molecular properties and drug likeness scores were calculated for all of them.
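A Rule of Five check can be sketched as below. The four criteria are the standard ones (molecular weight ≤ 500 Da, logP ≤ 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors). Tolerating one violation is a common convention and is an assumption here, not a setting taken from this study; the `Descriptors` class and the descriptor values are hypothetical and do not correspond to any of the reported molecules.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container)."""
    mol_weight: float       # Da
    log_p: float            # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d):
    """Return the list of Rule of Five criteria that the molecule violates."""
    checks = {
        "molecular weight > 500": d.mol_weight > 500,
        "logP > 5": d.log_p > 5,
        "H-bond donors > 5": d.h_bond_donors > 5,
        "H-bond acceptors > 10": d.h_bond_acceptors > 10,
    }
    return [name for name, violated in checks.items() if violated]


def passes_rule_of_five(d, max_violations=1):
    """Assumed convention: a molecule passes with at most one violation."""
    return len(lipinski_violations(d)) <= max_violations


# Illustrative values only.
small_polar = Descriptors(mol_weight=242.2, log_p=-0.9,
                          h_bond_donors=3, h_bond_acceptors=5)
large_greasy = Descriptors(mol_weight=650.0, log_p=6.2,
                           h_bond_donors=2, h_bond_acceptors=8)

print(lipinski_violations(small_polar), passes_rule_of_five(small_polar))
print(lipinski_violations(large_greasy), passes_rule_of_five(large_greasy))
```

Setting `max_violations=0` gives the stricter reading in which any single violation fails the molecule.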
LD50 was determined using PROTOX, a web server for the prediction of oral toxicities of small molecules in rodents (http://tox.charite.de/tox). Toxic doses are often given as LD50 values in mg/kg body weight. The LD50 is the median lethal dose, meaning the dose at which 50% of test subjects die upon exposure to a compound [12].
Toxicity classes are defined according to the Globally Harmonized System of Classification and Labelling of Chemicals (GHS):
(1) Class I: fatal if swallowed (LD50 ≤ 5 mg/kg)
(2) Class II: fatal if swallowed (5 < LD50 ≤ 50 mg/kg)
(3) Class III: toxic if swallowed (50 < LD50 ≤ 300 mg/kg)
(4) Class IV: harmful if swallowed (300 < LD50 ≤ 2000 mg/kg)
(5) Class V: may be harmful if swallowed (2000 < LD50 ≤ 5000 mg/kg)
(6) Class VI: non-toxic (LD50 > 5000 mg/kg)
Toxicity targets are protein targets that have been associated with adverse drug reactions and toxic effects. PROTOX predicts possible binding to toxicity targets using a collection of protein-ligand-based pharmacophores. All molecules were also assessed for drug likeness and for absorption, distribution, metabolism and elimination (ADME) using the Molsoft (http://molsoft.com/mprop/) and ACD/Labs (http://ilab.acdlabs.com/ilab2/) websites.

Results:
Figure 1 shows the 3D crystal structure of matrix protein VP40 from Sudan Ebola virus, acquired from the RCSB. Figure 2 shows the chemical structures of the seven chosen molecules, which had the lowest docking energies for VP40 and the best drug likeness and LD50 scores. They are all conformers of pyrimidine-2,4-dione. All chemical properties of the seven chosen molecules are shown in Table 1. Figure 3 shows the docking interactions of the seven chosen compounds with arginine 134 of matrix protein VP40. Molecules 1 to 6 interact with Arg 134 through hydrogen bonds. Molecule 7 was of particular interest because it formed a hydrogen bond with Asn 136, which allows Asn 136 to form a further hydrogen bond with Arg 134. This leads to almost the same effect: Arg 134 is engaged in a hydrogen bond and is not available for RNA binding. Table 2 shows the minimum binding energies of the seven chosen molecules, which ranged from -5.21 to -4.08 kcal/mol. All obeyed Lipinski's rule except molecule 5. All have a low Ames probability. LD50 ranged from 16877 to 8400 mg/kg, so all seven molecules are non-toxic (class VI).
They are arranged from the highest to the lowest LD50.
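The GHS class thresholds listed in the Methodology can be written as a simple lookup, which makes it easy to confirm that the reported LD50 range falls in class VI. This is a minimal sketch; the function name `ghs_toxicity_class` is hypothetical and the code is not part of PROTOX.

```python
def ghs_toxicity_class(ld50_mg_per_kg):
    """Map an oral LD50 (mg/kg body weight) to GHS toxicity class 1 to 6."""
    if ld50_mg_per_kg < 0:
        raise ValueError("LD50 cannot be negative")
    # (inclusive upper bound in mg/kg, class) in ascending order
    thresholds = [(5, 1), (50, 2), (300, 3), (2000, 4), (5000, 5)]
    for upper_bound, tox_class in thresholds:
        if ld50_mg_per_kg <= upper_bound:
            return tox_class
    return 6  # LD50 > 5000 mg/kg: non-toxic


# The two ends of the LD50 range reported for the seven molecules.
for ld50 in (16877, 8400):
    print(ld50, "mg/kg -> class", ghs_toxicity_class(ld50))
```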

Discussion:
According to CDC guidelines, no FDA-approved vaccine or medicine (e.g., antiviral drug) is available for Ebola to date.

Conclusion:
Based on molecular docking and ADMET studies of 1800 small-molecule derivatives of pyrimidine-2,4-dione, we chose 7 molecules with good minimum binding energy and drug likeness. Six of the seven molecules formed hydrogen bonds with amino acid Arg 134 of Ebola virus matrix protein VP40, and the seventh bonded to Asn 136 in a way that also engages Arg 134, suggesting that they hinder binding to viral RNA and hence could abort the budding process. These seven molecules are promising as drugs against EBOV. They remain to be verified experimentally in vitro and in vivo before they can be considered candidate drugs.

Author contribution:
Alam El-Din HM wrote and revised the manuscript. Alam El-Din HM, Lotfy SA, Fathy N, Elberry M, Mayla AM, and Kassem S each performed the molecular docking and ADMET studies of 300 molecules. Naqvi A trained the other authors in the methods, and chose the protein, its active site, the lead compound and its conformers.