Structure prediction of Bestrophin for the induced - fit docking of anthocyanins

Bestrophin, an integral membrane protein existing in basolateral region of the retina is a propitious target for drug discovery. Mutations in the Bestrophin protein cause Best Vitelliform Macular Dystrophy (BVMD) leading to retinal damages and loss of visual acuity. Owing to the lack of three dimensional structure and related structural homologs in the protein data bank, we modeled the bestrophin protein using Robetta ab initio method. Further, no treatment is available for the disease. In this situation, anthocyanins from natural sources are reported to combat retinal damages. Hence, we identified anthocyanins from Syzygium cumini fruit skin using Electrospray Ionization tandem mass spectrometry. These compounds were docked into the predicted bestrophin model to study the interactions within the active site. The results may provide a valuable insight into the structure of bestrophin and efficacy of anthocyanins in molecular docking studies. Abbreviations PTP - Putative transmembrane proteins, VMD - Vitelliform macular dystrophy, BVMD - Best's vitelliform macular dystrophy, RPE - Retinal pigment epithelium, ESI-MS/MS - Electrospray Ionization Tandem Mass Spectrometry, UNIPROT - Universal Protein Resource, PSIPRED - Protein secondary structure prediction, TMH - Transmembrane Helices, SCFS - Syzygium cumini fruit skin DP - Declustering Potential IFD - Induced Fit Docking.

Bestrophin being a PTP is encoded in the gene VMD2. Mutations in the VMD2 gene causes BVMD, an inherited progressive macular dystrophy, first identified in 1905 by a German ophthalmologist Frederich Best. BVMD advances with juvenile onset causing loss of visual acuity due to atrophic macular changes or choroidal neovascularization associated with sub retinal hemorrhages and fibrosis [2]. Early stages of the disease are characterized by abnormal depositions of lipofuscin-like material at the level of the RPE classically resembling an egg yolk [3]. Disintegration of the central yellow lesions progressively leads to vision loss at a later stage. The elemental cause of the Best disease is the BEST1 gene (previously called VMD2, currently as hbest1), identified in 1998 [4].
Determining the three-dimensional structures of PTPs remains a challenge, although they constitute 15-30% of the entire genome. Bestrophin being a plasma membrane protein has a molecular weight of 68 kDa and consists of 585 amino acids [5]. Computationally, bestrophin is predicted as a transmembrane protein with four membrane-spanning α helical domains, while presence of atleast five helices are reported by few studies [6]. An ecumenical outcome suggests that Bestrophin functions as a chloride ion channel [7]. The homology modeling of Asp-rich domain of hbest1 identified two calcium binding sites, yet the complete structural details of the transmembrane protein are not available [8]. Obtaining the protein structure is crucial to understand protein function which eventually leads to drug designing [9, 10]. Hence the hbest1 protein was modeled using the Robetta web server (http://robetta.bakerlab.org) [11].
Natural compounds are tangible as therapeutic drug targets. More significantly, anthocyanins belonging to the flavonoid group are beneficial in curing visual acuity [12]. Anthocyanins are water-soluble pigments, pre-eminent in a variety of plants mainly imparting colour to flowers and fruits [13]. Innumerable reports exist for the rich resource of anthocyanins in the fruits of Syzygium cumini (L.) Skeels (Black Plum) [14,15]. Our study also involves quantification of anthocyanins from the fruit skin by ESI -MS/MS Mass Spectrometer. Small molecular structures of the separated anthocyanins were obtained from publicly available chemical databases and used for molecular docking with hbest1. According to the literature, Resveratrol and Niflumic acid found to be useful in combating retinal damages was also used in our docking studies [16].

Secondary structure Prediction
The secondary structure of hbest1 was predicted using PSIPRED v3.0 [18,19]. Neural nets are used to convert PsiBlast profile data to secondary structure propensities. A putative secondary structure is obtained for each residue associated with a confidence value for the prediction.

Prediction of Transmembrane regions
The predictors available are inaccurate in predicting the ends of TMHs, or TMHs of unusual length [20]. MemBrain, a predictor based on machine learning approach was used to predict the TMHs of hbest1.

Ab Initio modeling
Pertaining to lack of suitable structural template for hbest1, we resorted to Ab initio modeling of the hbest1 using Robetta, a full chain Protein prediction server. The primary sequence was submitted to the server, and the generated models were received by email.

Structure Validation
Five structures were obtained from Robetta server. The qualities of the models were evaluated using the ProQ webserver [21]. The models were validated using PROCHECK program [22].

Experimental Fruit sample
Mature fruits of Syzygium cumini were collected from Sirumalai hills of Dindigul, Tamilnadu, India and identified by Dr P. Eganathan, M. S. Swaminathan Research Foundation, Chennai. Rotten, damaged fruits were removed; skin was peeled from fruits without pulp and shade dried under room temperature (31°C) until complete disappearance of moisture. Dried skin was stored for further analysis.

Solvents and Reagents
Analytical grade chemicals and solvents purchased from Sigma Aldrich (Saint Louis, MO) were used in the present investigations.

Preparation of extract for ESI-MS/MS analysis
SCFS was ground to a fine powder using a Thomas Wiley Machine (Model 5 USA) at room temperature. Subsequently, 50g of powdered plant material was extracted with 1.5 L methanol using soxhlet apparatus at 65°C for 4 hours consecutively. The solvent was removed in vacuo using a Buchi Heating Bath (B-490) rotavapor to yield dried methanol (purple color) extract. The extract was filtered and evaporated to dryness in a vacuum at 40°C with a rotary evaporator.

ESI -MS/MS Analysis
Crude methanol extract of SCFS was diluted with methanol and filtered with 0.22µm nylon membrane filter and subjected to ESI-MS/MS analysis. Anthocyanin identification was performed on a 3200 QTRAP instrument (ABSciex Instruments, Singapore) equipped with Linear Ion Trap Quadrupole Mass Spectrometer and electrospray ionization (ESI) source. Mass parameters DP was adjusted to get the maximum sensitivity. Data was generated by Analyst 1.4.1 software. The MS-MS conditions are : positive ion mode; gas (N2), curtain gas was set to 15 psi, heater gas and nebulizer gas were set to 10 psi and source temperature maintained at ambient.The positively identified compounds (cyanidin,petunidin,malvidin) were subjected to MS-MS analysis to study the fragment patterns and were found to match with that of the earlier reported compounds.

IFD protocol
Induced fit docking combines Glide and Prime modules to arrive at accurate prediction of ligand binding modes and concomitant structural changes in the receptor [27]. Systematic and Simulation methods are adopted by glide for searching poses and ligand flexibility. Incremental construction for searching is employed by the systematic method, with Gscore being the empirical scoring function [28]. The calculation of GScore in Kcal/mol is: G-Score = H bond + Lipo+ Metal + Site + 0.130 Coul + 0.065vdW -Bury P -RotB. Where Hbond= Hydrogen bonds, Lipo = hydrophobic interactions, Metalmetal binding term, Site = Polar interactions in the binding site, vdW = Vander Waals forces, Coul = coulombic forces, Bury P= penalty for buried polar group, RotB= freezinf rotatable bonds. The prepared protein was docked with the minimized ligands. The active sites in the protein hbest1 were selected to be docked with the ligand. IFD was performed and best conformations were selected based on Glide Score, Glide energy, and Glide emodel scores.

Results & Discussion:
The percentage of secondary structures predicted by PSIPRED showed that 35.90% of the total structures were alpha helices, 3.07% were beta sheets and 59.48% were random coils within the target sequence. The protein was predicted to have 16 helices, 18 coils and 4 beta sheets ( Figure 1A). The percentage of coils is more than the helices in the sequence. The C-terminal region of the hbest1 is composed of about 150 residues indicating the presence of a randomly coiled region. The Nterminal region, predicted to have signal peptide region has also a coiled region. Secondary structure prediction provides valuable information about the content of the protein which provides insights into the tertiary structure of hbest1.
Prediction of TMHs in helical membrane proteins provides valuable information about the protein topology when the high resolution structures are not available. The predictors available are inaccurate in predicting the ends of TMHs, or TMHs of unusual length [20]. The prediction accuracy of MemBrain is 97.9%. N-terminal signal peptides were also detected. Five Transmembrane helices were predicted by MemBrain ( Figure  1B).
The models obtained for hbest1 from Robetta server were evaluated by the ProQ program with the following results  Figure 1C). Validation of the model by PROCHECK presented a Ramachandran plot analysis rendering 98.1% residues in the most favoured regions, 1.9% in the additionally allowed regions indicating the efficacy of Ab initio modeling. Secondary structural elements of the predicted hbest1 were found to be almost similar to the PSIPRED server results of the primary sequence.
Docking studies suggest the ability of the three anthocyanins to bind to active and additional active sites around hbest1. The binding poses and interactions were analyzed by Glide Table 1 (see supplementary material) & (Figure 2). The glide energy, glide score and glide emodel score for Cyanidin 3, 5 glucoside were -64. 34 Kcal/mol and -11.80 Kcal/mol -71.60 Kcal/mol respectively. The compound interacts with Leu 82, Gln59, Arg105, and Arg47 that are the active sites within 20 A° distance of the literature cited residues. Hydrophobic contacts are with Phe82, Phe48, and Thr66. Malvidin 3, 5 diglucoside generated glide energy of -71.61 kcal/mol, glide score of -5.15 Kcal/mol and glide emodel score of -92.42 Kcal/mol. Interacting residues are Arg92, Arg461, Arg534, and Thr 87. Arg92 is the active site in interaction with Malvidin. Residues Ile129, Thr 91, Pro152, Asn 136 are in the hydrophobic pockets of docked structure. Petunidin 3, 7 diglucoside provided glide energy of -68.39 Kcal/mol, a glide score of -7.98 Kcal/mol and glide emodel score of -94.47 Kcal/mol. Asn136, Thr87, Arg461, are the residues in interaction besides Arg92 and Trp94 that are the two active sites involved in the interactions. Hydrophobic interactions comprise of Pro152 and Val90. Peonidin 3, 5 diglucoside resulted in a glide energy of -58.82 Kcal/mol, glide score of -8.47 Kcal/mol and glide emodel score of -85.17 Kcal/mol. Hydrogen bond interactions were exhibited with Asn133, Arg 534, and Gln460. Trp94, Val90, Pro406 residues were found in hydrophobic contacts. A glide energy of -77. 17 Kcal/mol, glide score of -9.18 and a glide emodel score of -107.57 Kcal/mol was generated by delphinidin 3,5 diglucoside. Hydrogen bond interactions include Gly159, Glu382, Ile49, Arg461, and Gln 96, while hydrophobic interactions involve active sites at Arg92, and Trp93. All anthocyanins have