open access www.bioinformation.net Database Volume 7(1) Antagomirbase: A putative antagomir database

The accurate prediction of a comprehensive set of messenger putative antagomirs against microRNAs (miRNAs) remains an open problem. In particular, a set of putative antagomirs against human miRNA is predicted in this current version of database. We have developed Antagomir database, based on putative antagomirs-miRNA heterodimers. In this work, the human miRNA dataset was used as template to design putative antagomirs, using GC content and secondary structures as parameters. The algorithm used predicted the free energy of unbound antagomirs. Although in its infancy the development of antagomirs, that can target cell specific genes or families of genes, may pave the way forward for the generation of a new class of therapeutics, to treat complex inflammatory diseases. Future versions need to incorporate further sequences from other mammalian homologues for designing of antagomirs for aid in research. 
 
Availability 
The database is available for free at http://bioinfopresidencycollegekolkata.edu.in/antagomirs.html


Background:
MicroRNAs (miRNAs) are a class of endogenous, small regulatory RNA averaging 22 nucleotides in length that mediate the post-transcriptional regulation of messenger RNAs.They bind to target messages in a sequence-specific manner, and induce translational repression or endonucleolytic cleavage.Recently, a novel class of chemically engineered oligonucleotides, termed antagomirs, is efficient and specific silencers of miRNAs.Moreover, the recent employment of synthetic analogues of these small RNA molecules termed 'antagomirs' has shown that microRNAs of interest can be specifically targeted [1, 2].Functions have only been experimentally assigned to a small fraction of the experimentally designed antagomirs.These putative antagomirs are 20 nucleotides in length and complementary to the mature target human-miRNA.They specifically silenced miRNA expression (miR-122) up-regulating expression of hundreds of genes predicted to be repressed by miR [3].Paradoxically, antagomir treatment also revealed a significant number of down-regulated genes that may be activated as opposed to repressed by miR [4, 5, 6].Computational approaches are thus likely to remain an important means of studying antagomir targets for the foreseeable future, not least as a means of directing wet-lab experiments.Several algorithms have been used to predict antagomir targets in animal species; these are listed in Table 1 (see Supplementary material).We built an experimental antagomir in a way that it could adhere to several humiR, respectively.Interactions are viewed in a global context by predicting folds for the entire miRNA, rather than just its 3'-UTR or seed match.The stem-loop sequences of humiR have been used in our experiment.The probability profile displays predicted accessible sites on the target RNA.Because an accessible site can be targeted by a number of antisense oligos, selection of the "optimal" one can be based on binding energy, together with other empirical rules such as GC content, avoidance of GGGG (or more stringent GGG) motifs, etc. Stronger binding is indicated by smaller binding energy (stacking energies are negatively valued).For example, an antisense oligo with a binding energy of -10 kcal/mol is more effective than an oligo with a binding energy of -5 kcal/mol.The antisense oligo binding energy is a weighted sum of the antagomir/miRNA stacking energies for the hybrid formed by the antagomir and the targeted sequence.For a base-pair stack, the weight for the sum is calculated by the probability of the unpaired dinucleotide in the target sequence that is involved in the stack.This weighting scheme accounts for the structural variation at the target site among the structures in the sample (Figure 1).

Implementation: Database:
The antagomir database is based on the following assumptions.

Tool:
A tool has been integrated in this database.The tool accepts a sequence of 20-25 nuleotides and if it finds a match with the existing putative antagomirs in the database then it returns the respective target i.e. humiR ID along with the secondary structure of antagomirs.

Statistical analysis of predicted targets: Negative normalized free energy:
The occurrence of favourable hybridizations of short antagomirs with long miRNAs can frequently be attributed to chance: the longer the miRNA, the more likely the incidence.In order to eliminate the effect of sequence length on our measure of free energy [7, 8], we define the negative normalized free energy where m is the length of the target sequence searched, and n is the length of the antagomir as shown in equation 1: gn = -(g/log(mn)).

Extreme value statistics:
Extreme value distributions (EVDs) are limiting distributions that describe the minimum or maximum of independent random variables [9].If we consider the antagomir-miRNA duplex energy estimation to be essentially an optimization procedure that produces a minimum, the negative normalized free energy described above is a corresponding maximum, and can be described by an EVD having a distribution function of the form; P[G≤t]=D(t)=exp(-exp(a-t/b)) (equation 2).A transformation then converts this distribution function into a straight line: Log(-log(D))=(a-t)/b=(-1/b)t + a/b (equation 3).By scanning for targets of random antagomir sequences in the miRNA sequences in the dataset, we obtain a set of negative normalized free energies, which we expect will follow an EVD.We then transform the distribution function of the empirical EVD into a straight line, as in Equation 3, and estimate the parameters of the EVD by a linear least squares fit to the line y = mx + c, obtaining b=-1/m (equation 4) and a=cb (equation 5).We can now compute, for each predicted antagomir-miRNA duplex, a p-value, the probability that the same or a more favorable free energy is observed due to chance: P[Z≥ gn]=1exp(-exp((a-gn)/b)) (equation 6), where a and b are estimated EVD parameters, and gn is the negative normalized free energy from equation 1 [7].

Results:
Antagomir database is the central online repository for antagomirs sequence data, annotation and target prediction.The current release (ver.1)contains 22 humiRNA from Homo sapiens, expressing 53 distinct putative antagomir sequences (Table 2, see Supplementary material).The humiRID and the secondary structure of antagomir are included in the database.The best four matches of the secondary structure with respect to free energy are given in Figure 2.

Figure 1 :
Figure 1: Screenshot of the database main page

( 1 )
Based on the humiR sequence the following details about the putative antagomirs are obtained with the help of Sfold.Column 1: target position (starting -ending) Column 2: target sequence (5' 3') Column 3: antisense oligo (5' 3') Column 4: GC content Column 5: oligo binding energy (kcal/mol) (2) The database also includes the secondary structure of antagomirs along with the other details mentioned above.The secondary structures have been designed with the help of mfold.

Figure 2 :
Figure 2: Screenshot of the database showing the UID (User Interface Design)Conclusion:The prediction of antagomir targets is an open and difficult problem in spite of a few years of the existing research.Antagomir Database designed based on human miR targets, is shown to provide predictions characterized by favorable sensitivity, which comes at a price of an increased number of predictions.Our future work will concentrate in the inclusion of more miRNAs from different species as targets of antagomirs.

Table 1 :
Algorithms used to predict antagomir targets

Table 2 :
List of putative antagomirs and their target sequence Seq No.