ISOB: A Database of Indigenous Snake Species of Bangladesh with respective known venom composition

At present, no well-structured database is available for the world's venomous snakes and their venom composition, although venom is of immense importance in biomedical research. Searching for a specific venom component in NCBI, PDB or other public databases is troublesome because they contain a huge number of entries. We therefore created “ISOB”, a web-accessible secondary database that represents the first online bioinformatics resource showing the venom composition of snakes. The database provides a comprehensive overview of seventy-eight indigenous snake species, covering descriptions of the snakes supplemented with structural information on the individual venom proteins available for them. We strongly believe that this database will contribute significantly to bioinformatics, environmental research, proteomics, drug development and rational drug design. Availability: The database is freely available at http://www.snakebd.com/


Background:
Snakes are a remarkable group of reptiles that have fascinated human beings since the beginning of mankind. About 3000 snake species have been described on earth; of these, only about 350 are regarded as venomous and dangerous to humans. Although snakes are an important component of our ecosystem and have tremendous importance in biopharmaceutical research, they have not been properly documented, possibly because of the danger and fear associated with handling these reptiles. Snake venom is produced and stored in highly specialized venom glands and is used mainly for defense or to immobilize prey [1]. Each of the proteins and peptides in venom has a profound effect on various physiological systems of the victim [2,3]. More than a century of research has shown that snake venom is a rich resource of pharmacologically active proteins and peptides [4] and a good natural source of drug lead compounds. A number of drugs derived from snake venom proteins are currently on the market, and hundreds more are in different stages of the development pipeline [5]. For example, Captopril®, one of the most extensively used drugs for controlling blood pressure and cardiac diseases, was developed from a venom component of the pit viper Bothrops jararaca. Scientists and clinicians in pharmaceutical development use molecules from snake venom as therapeutic agents for several pathologies, such as hypertension, cancer and thrombosis [6]. Despite this importance, no separate database is available for snake venom proteins. "The Reptile Database" [7] is the only snake-related database, and it covers only the descriptive and taxonomic aspects of the world's snake species. The venom components, which are of great pharmacological interest, are not included in that database.
Searching NCBI, PDB or other public databases for a specific venom component with a particular physiological function is time-consuming and troublesome because these databases contain a huge number of entries. These facts prompted us to develop a separate database for snake venom proteins. Bangladesh is a country in South Asia, located on the fertile "Bengal delta", with rich flora and fauna. Despite their great promise, snakes have received little attention from the planners and professionals of this country. The snake population of Bangladesh is at high risk because of rapid human population growth and the conversion of forest and wildlife habitat to meet the increasing demand for land for housing and food. Only a few documents on the snakes of Bangladesh exist. According to the report published by IUCN in 2000, there are about 98 species of snakes in Bangladesh, although over a dozen of these have not been reported during the last few decades. There is therefore an imperative need for a database covering the venom components of these snakes along with their structural and functional information. This need motivated us to develop a secondary database titled "ISOB: A Database of Indigenous Snake Species of Bangladesh with respective known venom composition", which holds the detailed information available so far for seventy-eight indigenous snake species: descriptions of the snakes along with the available information on their venom proteins. It has a set of search tools that permit users to extract data and carry out specific queries. The aim of ISOB is to offer a unique assortment of data extracted from both public databases and published reports, which helps in the retrieval of information and the analysis of stored data. The database supplies hyperlinks to the corresponding entries of the related external databases.
As of January 2013, ISOB contained a total of 405 snake toxins.

Database design:
ISOB was developed using PHP (Hypertext Preprocessor) as the server-side scripting language, embedded in HTML. MySQL serves as the back-end database management system [14] because of its capability to maintain very large databases.
ISOB is a user-friendly web-based application connected to a database of the indigenous snake species of Bangladesh. The web application offers both simple and advanced search options. In the simple search, a snake can be looked up directly by its family, genus, species or scientific name to retrieve its general information along with its venom components, including their three-dimensional structures (either from PDB or predicted by molecular modeling). The advanced search finds venoms of a particular class across the venomous snake species of Bangladesh for which data are available. To allow selection of multiple sequences from each class, a checkbox is provided beside each sequence. The selected sequences can be exported to a text file in FASTA format, which eases various bioinformatics analyses such as multiple sequence alignment and similar-protein searches using BLAST. The grouping of snake toxins provides a basis for extending and clarifying the existing structural and functional classification. Among these venom components, 102 three-dimensional structures were retrieved from the Protein Data Bank, and the remaining 253 unsolved protein structures were predicted using the molecular modeling method SWISS-MODEL [15]. SWISS-MODEL is a fully automated protein structure homology-modeling server, accessible through the ExPASy web server or from the DeepView (Swiss-PdbViewer) program. The purpose of this server is to make protein modeling accessible to all biochemists and molecular biologists worldwide [16]. The models were created with the help of RasMol software using the PDB template generated by SWISS-MODEL. These structures will ease bioinformatics analyses aimed at identifying sequence patterns associated with specific structural or functional properties of snake toxins. The browser-accessible interface of the ISOB database is shown in Figure 1.
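The simple search, the advanced search and the FASTA export described above can be illustrated with a minimal sketch. ISOB itself is built with PHP and MySQL and its schema is not given in the text, so the example below uses Python with an in-memory SQLite database as a stand-in. The table names, column names, accessions (DEMO0001 and so on) and sequences are hypothetical placeholders, not ISOB data.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table schema.

    The schema and all rows are illustrative assumptions; the sequences are
    made-up placeholders, not real toxin sequences.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE snake (id INTEGER PRIMARY KEY, family TEXT,
                            genus TEXT, species TEXT);
        CREATE TABLE venom (id INTEGER PRIMARY KEY, snake_id INTEGER,
                            accession TEXT, toxin_class TEXT, sequence TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO snake VALUES (?, ?, ?, ?)",
        [(1, "Elapidae", "Naja", "kaouthia"),
         (2, "Viperidae", "Daboia", "russelii")],
    )
    conn.executemany(
        "INSERT INTO venom VALUES (?, ?, ?, ?, ?)",
        [(1, 1, "DEMO0001", "Three-finger toxin", "ACDEFGHIKLMNPQRSTVWY"),
         (2, 1, "DEMO0002", "Phospholipase A2", "YWVTSRQPNMLKIHGFEDCAYWVT"),
         (3, 2, "DEMO0003", "Phospholipase A2", "ACDEFGHIKLMNPQRSTVWYACDEF")],
    )
    return conn


def simple_search(conn, term):
    """Look up a snake by family, genus, species or scientific name."""
    rows = conn.execute(
        """
        SELECT s.genus || ' ' || s.species, v.accession, v.toxin_class
        FROM snake s LEFT JOIN venom v ON v.snake_id = s.id
        WHERE lower(?) IN (lower(s.family), lower(s.genus), lower(s.species),
                           lower(s.genus || ' ' || s.species))
        ORDER BY v.id
        """,
        (term,),
    )
    return rows.fetchall()


def advanced_search(conn, toxin_class):
    """Return (accession, scientific name, sequence) for one venom class."""
    rows = conn.execute(
        """
        SELECT v.accession, s.genus || ' ' || s.species, v.sequence
        FROM venom v JOIN snake s ON s.id = v.snake_id
        WHERE lower(v.toxin_class) = lower(?)
        ORDER BY v.id
        """,
        (toxin_class,),
    )
    return rows.fetchall()


def to_fasta(records, width=60):
    """Format (accession, description, sequence) tuples as FASTA text."""
    lines = []
    for accession, description, sequence in records:
        lines.append(f">{accession} {description}")
        # Wrap the sequence so that no line exceeds `width` residues.
        for start in range(0, len(sequence), width):
            lines.append(sequence[start:start + width])
    return "\n".join(lines) + "\n"
```

In this sketch, the rows returned by `advanced_search(conn, "Phospholipase A2")` can be passed directly to `to_fasta`, which mirrors the checkbox selection followed by export. The resulting text could then be written to a file and used as input for alignment or BLAST tools.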

Features of the ISOB database records:
The ISOB database is an open-access, public-domain secondary web database. It acts as a complete web source for seventy-eight indigenous snake species of Bangladesh along with information on their venom components. Of the 78 species, 31 are venomous, 42 are mildly venomous and 5 are non-venomous. For the 31 venomous species, 405 venom components from 13 species are available to date. All the venom protein sequence information is systematically arranged in the database. Each record in ISOB is a unique documentation of one snake species and contains information about (1) images with their sources, (2) common name, (3) local name, (4) scientific name, (5)  During this investigation, several errors were found in data from public sources, as well as discrepancies between the information that different sources supply for the same toxin. In the ISOB venom gallery, however, each venom has a unique accession number, and sequences that have corresponding entries in public databases such as GenBank, Swiss-Prot and PDB are hyperlinked to those entries in the source databases. A representative entry of the overall data record is given in Figure 3. The protein table, which lists the protein sequences by snake and by class, is a unique feature of this database (Figure 4). Snake venom contains several structurally distinct families of peptides with diverse functions. Because snake bite is a pressing public health issue and the second most common cause of unnatural death in Bangladesh, ISOB includes information on snake bite treatment in the local language. Apart from the technical information, the database also gives the up-to-date availability status of snakes in Bangladesh.
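As a rough illustration of how a record's cross-references might be turned into hyperlinks to the source databases, the Python sketch below maps database names to URL templates. The templates are assumptions about how NCBI, UniProt and RCSB PDB currently address individual entries; they are not taken from ISOB and may change. The accession values used in any call are placeholders supplied by the caller.

```python
# Assumed URL patterns for the public source databases (not ISOB's own
# implementation; the patterns may change over time).
URL_TEMPLATES = {
    "genbank": "https://www.ncbi.nlm.nih.gov/protein/{acc}",
    "swissprot": "https://www.uniprot.org/uniprotkb/{acc}",
    "pdb": "https://www.rcsb.org/structure/{acc}",
}


def cross_reference_links(xrefs):
    """Map {database name: accession} to {database name: hyperlink}.

    Databases without a known URL template are skipped rather than guessed.
    """
    links = {}
    for source, accession in xrefs.items():
        template = URL_TEMPLATES.get(source.lower())
        if template is None:
            continue
        links[source] = template.format(acc=accession)
    return links
```

A record with a PDB identifier and a Swiss-Prot accession would yield two links, while an entry naming an unrecognized database would simply be left without one.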

Discussion:
Venomous snakes are found in the families Colubridae (boomslang, vine snake), Elapidae (cobra, krait, mamba, taipan, tiger snake, coral snake), Hydrophiidae (sea snakes), Viperidae (old world vipers, found in Europe, Africa and Asia but not in America or Australia: saw-scaled viper, Russell's viper, puff adder, Gaboon viper) and Crotalidae (pit vipers, found in America, Asia and Europe: copperhead, cottonmouth, rattlesnakes) [18]. The current study focuses mainly on the snakes found in Bangladesh, which belong mostly to the Elapidae, Viperidae, Hydrophiidae and Colubridae families. Venom is a complex mixture of proteins and peptides (most of which show enzymatic activity), amines, lipids and other components that specifically target different physiological systems of the victim or prey [19]. Because of their high activity and high specificity, venom components are useful for studying highly complex biological mechanisms. A wide array of animals produce venom, including terrestrial animals (ants, wasps, scorpions, spiders and snakes) and marine organisms (jellyfish, sea anemones, puffer fish, cone snails, sea snakes and octopuses). Rough estimates put the number of venom components from an individual snake, scorpion or cone snail in the range of 50-200 different toxins [20]. With the growing number of newly characterized toxins, their organization will require updating and better data management. Until now, information on venom-toxin sequences and on the features, structure and function of venom toxins has been scattered across multiple sources. Only basic sequence information is found in primary databases such as GenBank or SWISS-PROT. These public databases contain a huge number of entries, many of which are unnecessary for bioinformatics analysis of the particular snake venom components of interest.
For example, a basic search for all snake venom phospholipase A2 (svPLA2) in the GenBank and SWISS-PROT databases returns 55% irrelevant and redundant sequence entries (including pancreatic PLA2, bacterial PLA2 and PLA2 from other non-venomous sources), which must be filtered out before analysis. A fresh and comprehensive database, comprising descriptions of the snakes supplemented with structural information on the individual venom components available so far, will therefore be of special interest to bioinformatics researchers. To our knowledge, only a few specialized venom databases currently serve as major sources for the study of venom toxins. These databases include entries collected from different sources that have been cleaned, organized, analyzed and classified according to their structure-function relationships. The SCORPION database [21] holds about 300 scorpion toxin sequences, annotated and classified along with their structural and functional properties. The Mollusk database [22] comprises over 450 peptides from cone snail venoms, where each entry has distinct fields that allow easy comparison of conotoxin entries. Functionally annotated entries of svPLA2 toxins are found in the svPLA2 database [23]. However, there is currently no separate database covering snake venom components as a whole. The secondary database ISOB has therefore been developed in the current project, with the detailed information available so far for 78 snake species of Bangladesh. In this database, the venoms are grouped into classes to ease searching for a particular type of venom component within and across species.
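The kind of filtering that the svPLA2 search example calls for can be sketched in Python as follows. The text does not say how irrelevant and redundant entries were identified, so the three criteria used here (a snake lineage, a venom source annotation and a not-yet-seen sequence) are assumptions chosen for illustration, and the `Entry` fields are hypothetical rather than a real GenBank or SWISS-PROT record layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical, simplified view of one search hit."""
    accession: str
    lineage: tuple   # taxonomic lineage of the source organism
    source: str      # free-text tissue or source annotation
    sequence: str


def filter_snake_venom(entries):
    """Drop irrelevant and redundant hits, keeping the first copy of each
    distinct sequence that comes from snake venom.

    Assumed criteria: the lineage contains 'Serpentes', the source
    annotation mentions 'venom', and the sequence has not been seen before.
    """
    seen_sequences = set()
    kept = []
    for entry in entries:
        if "Serpentes" not in entry.lineage:
            continue  # e.g. pancreatic or bacterial PLA2
        if "venom" not in entry.source.lower():
            continue  # snake protein, but not from venom
        if entry.sequence in seen_sequences:
            continue  # redundant copy of an entry already kept
        seen_sequences.add(entry.sequence)
        kept.append(entry)
    return kept
```

Exact-match deduplication is the simplest possible choice; a curated database would more likely cluster near-identical sequences as well, which this sketch does not attempt.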
The three-dimensional (3D) structure of a molecule is of enormous importance in the biological sciences. It offers a complementary approach to site-directed mutagenesis for identifying functional residues in venom toxins [24], aids structure-function studies and helps in rational drug design. Structure determination by NMR is limited to small peptide molecules, whereas crystallization of macromolecules is a slow and complex process that involves optimizing various interdependent physical, chemical and biological parameters. Predicting the 3D structures of proteins from their primary structures by comparative protein modeling is therefore an attractive alternative for a large number of toxins.

Conclusion:
A database of the indigenous snakes of Bangladesh has been created, and for the first time all available information on their venom proteins has been integrated and arranged under one umbrella. Three-dimensional structures of 253 venom proteins without solved structures, predicted by homology modeling, are also incorporated in the database. This large number of predicted 3D structures will help in studying structure-function relationships, pharmacological action and protein evolution, in developing mini-protein scaffolds and in numerous other bioinformatics analyses. We strongly believe that scientists throughout the world will benefit from this secondary database and that it will contribute to a range of bioinformatics investigations. We envision the database as a template for managing molecular data on the toxins of various venomous snake species. It is specialized for molecular analysis of snake toxins and will serve multidimensional research purposes. It will also serve as a basic guideline for future field-based wildlife surveys of the reptiles of Bangladesh, making it easier for field researchers to identify the snake species found there. Finally, it helps to build public awareness of primary management and treatment of snake bite.

Future Development:
In the future, the database will be refined and updated periodically to ensure that users get the latest information. We also intend to add more sophisticated bioinformatics tools to the database to facilitate in-depth structure-function studies, characterization of toxin interaction sites and drug discovery.