Analysis of microRNAs and their targets from onion (Allium cepa) using genome survey sequences (GSS) and expressed sequence tags (ESTs)

MicroRNAs (miRNAs) are small non-coding RNAs of 21-24 nucleotides in length that act as important modulators of gene expression in numerous biological processes, including development and defense response, in eukaryotes. However, only limited reports on onion (Allium cepa) miRNAs are available, and their role in the growth and development of onion is not yet clear. Therefore, it is of interest to identify miRNAs and their targets in Allium cepa using genome survey sequences (GSSs) and expressed sequence tags (ESTs) and to deduce the functions of the target genes using gene ontology (GO) terms. We report 14 potential miRNAs belonging to 13 different families (miR162, miR168, miR172c, miR172e, miR398, miR400, miR414, miR1134, miR1223, miR6219, miR7725, miR8570, miR8703 and miR8752). Target analysis using the psRNATarget server predicted 39 potential targets for the identified miRNAs, the majority of which were transcription factors implicated in plant growth, development, hormone signaling and stress responses. These data form the basis for further analysis and verification towards understanding the miRNA-mediated regulatory mechanism in Allium cepa.


Background:
MicroRNAs (miRNAs) are a group of small endogenous RNA sequences, 21-24 nucleotides (nt) long, that act as negative regulators of gene expression and play significant modulatory roles in numerous biological processes such as growth, development and response to biotic and abiotic stresses [1]. They are transcribed from endogenous MIRNA genes located within the intronic and intergenic regions of eukaryotic genomes in the form of a stem-loop primary miRNA (pri-miRNA) structure. The pri-miRNA is processed by Dicer-like 1 (DCL1), Hyponastic Leaves 1 (HYL1) and Serrate (SE) proteins into a hairpin pre-miRNA and subsequently into the mature miRNA:miRNA* duplex, which is exported into the cytoplasm by the HASTY1 (HST1) protein. In the cytosol, the mature miRNA from the duplex binds the endonuclease ARGONAUTE (AGO) protein to form the RNA-induced silencing complex (RISC), which regulates gene expression through cleavage or translational inhibition of the target transcript [2]. Although there are different small non-coding RNAs in plants, miRNAs are unique in that they (1) are specifically encoded by MIRNA genes, (2) possess a typical stem-loop structure with negative minimal folding free energy (MFE), (3) have a distinct miRNA* sequence, and (4) exhibit a high degree of sequence complementarity with their specific targets. While a single miRNA can modulate the expression of multiple genes, several miRNAs may also be involved in the regulation of a single gene [3]. As such, identification of complementary targets is fundamental to understanding the modulatory roles of miRNAs. During the last couple of decades, notable advances have been made in revealing the fundamental role of miRNAs in plant growth, hormone signaling, organogenesis, floral differentiation and myriad stress responses [4].
Mature miRNAs are well conserved throughout the plant kingdom [5], which makes sequence conservation a useful instrument for the identification of novel miRNAs through a homology-search-based comparative genomics approach. Several plant miRNAs have been identified through high-throughput computational strategies including direct cloning and next-generation deep sequencing. However, in cases where the whole genome sequence is not available, similarity search using the Basic Local Alignment Search Tool for nucleotide sequences (BLASTn) within the highly conserved regions of the pre-miRNAs and mature miRNAs, together with matching of the secondary hairpin structure, can be effective criteria for miRNA identification in plant species. This is possible by making use of multiple data sources including expressed sequence tags (ESTs) and genome survey sequences (GSSs).
©Biomedical Informatics (2019)
Bulb onion (Allium cepa L.) is an economically important vegetable crop cultivated in most parts of the world. Besides being an ingredient with high food value, onion is also credited with numerous medicinal properties, including use in the treatment of cardiovascular disorders, chicken pox, measles and myriad cancers [11]. As per global data, onion is one of the five most important fresh-market vegetable crops [12]. India is the second largest producer of onion, with an area of 0.52 million hectares producing about 6.50 million tonnes. However, the productivity of onion is gradually decreasing, accompanied by price rises, due to several environmental factors such as drought, salinity and biotic stresses including infection by pests and pathogens [13]. Emerging evidence indicates that miRNAs and the related RNA interference pathway components are significant elements in the modulation of plant responses to biotic and abiotic stresses [4].
A systematic study of miRNAs and their targets in onion could provide novel insights into the molecular and biochemical mechanisms of onion development, growth and response to environmental stimuli. Onion ESTs and GSSs deposited in the National Center for Biotechnology Information (NCBI) databases could be used as the starting material for predicting miRNAs in this economically important plant species. A previous study reported 9 onion miRNAs using EST datasets [14]. In the present study, we used a robust homology-based comparative approach for the detection of onion miRNAs and their targets from the EST and GSS datasets. Further, the target genes of the identified miRNAs were functionally annotated to understand their role in plant development and metabolic processes.

Materials & Methods:
Sequence database and reference set for miRNA identification:
All mature miRNA sequences from the Viridiplantae group were retrieved from the miRNA database miRBase (http://www.mirbase.org/) [15]. These mature miRNAs had previously been identified in different plant species by initial computational identification followed by validation through different experimental approaches including cloning, sRNA sequencing, northern blotting and qPCR. The mature miRNAs were made non-redundant by removing duplicates to prevent overlapping of miRNA sequences. Taking these unique mature miRNA sequences as the reference, candidate miRNA sequences were identified from onion ESTs and GSSs by homology search. The publicly available 20204 ESTs and 10725 GSSs (as of December, 2019) of onion were downloaded from NCBI (www.ncbi.nlm.nih.gov/) using the keyword "Allium cepa".
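The step of making the reference set non-redundant can be illustrated with a minimal Python sketch. It is not the procedure used in the study; the function names (`parse_fasta`, `unique_mature_mirnas`) and the toy records are hypothetical, and the sketch assumes the miRBase sequences are available as FASTA text in the RNA alphabet.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def unique_mature_mirnas(records):
    """Collapse records that share an identical sequence.

    Sequences are upper-cased and U is converted to T so that the
    RNA-alphabet reference can be compared with DNA-alphabet ESTs/GSSs.
    Returns a dict mapping each unique sequence to the IDs that carry it.
    """
    unique = {}
    for header, seq in records:
        key = seq.upper().replace("U", "T")
        unique.setdefault(key, []).append(header.split()[0])
    return unique


# Toy records (made-up IDs and sequences, for illustration only).
demo_fasta = """>toy-miR1a
ACGUACGUACGUACGUACGUA
>toy-miR1b
acguacguacguacguacgua
>toy-miR2
UUGGCCAAUUGGCCAAUUGGC
"""
reference_set = unique_mature_mirnas(parse_fasta(demo_fasta))
```

Keeping the list of IDs per unique sequence means a later BLASTn hit can still be traced back to every miRNA family member that shares that mature sequence.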

Prediction of A. cepa miRNA:
The prediction process for putative miRNAs from Allium cepa is represented in Figure 1. Sequences from the locally developed onion EST and GSS databases were BLAST searched against the GenBank database (www.ncbi.nlm.nih.gov/genbank) and the Rfam database ver 12.0 (www.rfam.xfam.org). The resulting sequences were further analyzed with BLASTx [16] to identify and eliminate the coding sequences. The filtered sequences were used for homology search against the known mature and non-redundant plant miRNAs in miRBase (Release 22; http://www.mirbase.org/search.shtml). Sequence alignments of the ESTs and GSSs against the known miRNAs were obtained through BLASTn with a threshold E value of 10, low-complexity sequence filtering and a word-match size of 7 between the query and the database. Homologous candidate miRNAs were identified based on the following parameters: EST/GSS sequences with a miRNA matching region of at least 18 nucleotides with no gap, and no more than 3 base mismatches (≤ 3) between the predicted sequences and the known miRNAs. The Zuker algorithm in the MFOLD program was used to predict the secondary loop structures of the miRNA precursors [17]. The hairpin structures of the precursors were confirmed using the following criteria: the hairpin should have at least an 18 nt mature miRNA in one arm of the stem-loop; 50% of bases should be paired; <4 nt bulge between miRNA and miRNA*; minimum bulge size of 1 or 2 bases and 1 or fewer asymmetric bulges within the miRNA/miRNA*; 30-70% content of A + U; and a high negative MFE and minimal folding free energy index (MFEI) of the predicted secondary structure. The negative MFE value of each potential precursor miRNA was determined from the ΔG value (−kcal/mol) of its stem-loop structure, which is directly correlated with the sequence length.
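The sequence-level criteria above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the pipeline used in the study: the function names are hypothetical, the mismatch criterion is read as an upper bound of 3, and the MFEI is computed with a commonly used definition, MFEI = (|MFE| / length × 100) / (G+C)%, which is not spelled out in the text. The MFE itself is taken as an input, for example the ΔG reported by MFOLD, and no MFEI cut-off is applied because none is stated.

```python
def normalise(seq):
    """Upper-case a sequence and use the RNA alphabet (T -> U)."""
    return seq.upper().replace("T", "U")


def count_mismatches(candidate, reference):
    """Count differences in an ungapped, end-to-end comparison.

    Positions beyond the shorter sequence are counted as mismatches.
    """
    a, b = normalise(candidate), normalise(reference)
    differing = sum(1 for x, y in zip(a, b) if x != y)
    return differing + abs(len(a) - len(b))


def au_content(seq):
    """Percentage of A and U residues in the sequence."""
    s = normalise(seq)
    return 100.0 * (s.count("A") + s.count("U")) / len(s)


def mfei(mfe_kcal_mol, precursor):
    """MFEI under the assumed definition: (|MFE| / length * 100) / (G+C)%.

    Raises ZeroDivisionError for a precursor without any G or C.
    """
    s = normalise(precursor)
    gc_percent = 100.0 * (s.count("G") + s.count("C")) / len(s)
    amfe = abs(mfe_kcal_mol) / len(s) * 100.0  # adjusted MFE per 100 nt
    return amfe / gc_percent


def passes_sequence_filters(mature, reference, precursor):
    """Apply the length, mismatch and A+U criteria described in the text."""
    return (
        len(mature) >= 18
        and count_mismatches(mature, reference) <= 3
        and 30.0 <= au_content(precursor) <= 70.0
    )


# Toy inputs (made-up sequences and MFE, for illustration only).
toy_precursor = "GGCCAAUU" * 10          # 80 nt, 50% A+U
toy_mature = "ACGUACGUACGUACGUACGUA"     # 21 nt candidate
toy_known = "ACGUACGUACGUACGUACGUU"      # differs at one position
keep = passes_sequence_filters(toy_mature, toy_known, toy_precursor)
toy_mfei = mfei(-40.0, toy_precursor)
```

Structural criteria such as base pairing and bulge size depend on the predicted fold and are deliberately left out of this sketch.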

Functional annotation of the miRNA targets:
The functional aspects of the miRNA targets are crucial to understanding the range of miRNA regulation in the biochemical and molecular mechanisms of plant growth and development. Functional enrichment of the miRNA targets was performed using Blast2GO v3.0 [20] and further verified using the DeepGO prediction tool with the protein GO classes [21].
Identified target genes were categorized in terms of molecular functions, biological processes and cellular components.
... 73.5%, respectively. The (A+U)% is well within the range of 30-70%, which is the gold standard for identification of potential plant miRNAs [26]. All the mature miRNAs were located within the stem of the hairpin loop (Figure 2). While 5 miRNAs (35.7%) were positioned in the 5' arm of the secondary structure, the remaining 9 (64.3%) were found in the 3' arm. This corroborates the previous finding that the majority of onion miRNAs are located on the 3' arm of the secondary hairpin loop structure [14]. An unrooted neighbor-joining phylogenetic tree was constructed from the multiple sequence alignment of the identified onion miRNAs and other members of the same family available in miRBase to determine the evolutionary relationships among them.
Distinct phylogenetic trees were obtained for ace-miR162, ace-miR168, ace-miR172c, ace-miR172e and ace-miR398, which exhibited a high degree of sequence similarity with miRNAs from other plant species (Figure 3). ace-miR172c showed 27-54% sequence similarity to other previously reported miRNAs and was closest to osa-miR172c of Oryza sativa and ppe-miR172c of Prunus persica. On the other hand, the ace-miR172e precursor was highly similar to sbi-miR172e from Sorghum bicolor followed by zma-miR172e from Zea mays. The percentage similarity between the ace-miR162 precursor and miR162 from other plant species ranged between 20% and 79%. The highest similarity of the ace-miR162 precursor was found with osa-miR162 and zma-miR162. Similarly, the ace-miR168 precursor showed 55.4% similarity with bdi-miR168 from Brachypodium distachyon, followed by 40.4% similarity with zma-miR168 and 36.4% similarity with osa-miR168. Phylogenetic analysis of ace-miR398 showed that tae-miR398 from Triticum aestivum and bdi-miR398c have significant evolutionary linkage with ace-miR398. Interestingly, ace-miR162, ace-miR168 and ace-miR398 exhibited greater similarity with members of the same family in monocots. However, no such specific pattern was observed in the case of ace-miR172c or ace-miR172e. This suggests that the evolutionary relationships of the ace-miRNAs differ among families but are, overall, more inclined towards monocotyledonous plants.
miRNAs modulate the expression of target mRNAs through complementary binding and consequent cleavage and/or translational inhibition [33]. To understand the functional role of the identified onion miRNAs, potential miRNA target genes were predicted using the psRNATarget webserver with default parameters. As the target sites of plant miRNAs are mostly situated in the open reading frames (ORFs) of the target genes, onion ESTs in addition to the AGI and TAIR databases were used to search for putative target genes. Based on accepted principles, the algorithm predicted 39 potential targets for 13 miRNAs (Table 4). Unsurprisingly, most of the targets were similar to those validated earlier as miRNA targets in other plant species including Arabidopsis, rice, wheat, maize and garlic [32]. Nine miRNAs (ace-miR162, ace-miR168, ace-miR172c, ace-miR172e, ace-miR398, ace-miR400, ace-miR414, ace-miR1134 and ace-miR6219) were predicted to have between 2 and 8 targets, suggesting that these miRNAs might have diverse functional attributes. Among the 39 targets, 10 were transcription factor (TF) genes including ethylene response factors (ERFs), MADS-box TFs and RING-H2 proteins that have previously been implicated in plant growth regulation and development [34]. Most of the targets exhibited high homology with targets from other plants and presumably show functional redundancy across plant species. For example, the targets of ace-miR162 (Dicer-like (DCL) proteins), ace-miR168 (Argonaute 1 (AGO1) proteins), ace-miR400 (pentatricopeptide repeat protein 1 (PPR1)) and ace-miR1134 (receptor protein kinase PERK1) have previously been implicated in gene regulation and small RNA biogenesis, plant growth, stress response, hormone signalling and host-microbe interactions [31, 32, 35]. ace-miR172c targeted two genes encoding the floral homeotic protein APETALA2, suggesting its involvement in the specification of onion flowers.
A few targets were non-transcription factors such as phosphoenolpyruvate carboxylase (ace-miR1134), protein phosphatase (ace-miR8752) and ribosome inactivating protein 1 (ace-miR6219), indicating a functional role for these miRNAs in plant metabolism, immunity and defense response [36]. Additionally, 8 genes targeted by ace-miRNAs were uncharacterized with unknown function, suggesting that they could be part of unknown biochemical and molecular mechanisms essential for growth and survival of the plant. To delineate the comprehensive network of genes modulated by miRNAs, the identified targets were subjected to gene ontology (GO) term analysis using the Blast2GO program. A total of 31 out of 39 predicted targets were categorized into 8 biological processes, 5 molecular functions and 4 cellular components (Table 5). Among the biological processes, genes involved in metabolic process (10), secondary metabolic process (5), signaling (5), regulation of transcription (5) and response to stress (4) were the most represented. Two target genes (CF451171, CF452115) were specifically involved in immune system process (GO:0002376) and defense response (GO:0006952). Likewise, transcription factor activity (GO:0003700; 7 genes), catalytic activity (GO:0003824; 7 genes) and nucleic acid binding (GO:0003676; 6 genes) were the most represented GO terms in the molecular function category. With regard to the putative target transcripts of miRNAs in the cellular component category, the GO term cell part (GO:0044464) was the most represented with 5 genes, followed by intracellular part (GO:0044424) and organelle (GO:0043226) with 3 genes each. The diversified functions of these target genes suggest that the corresponding miRNAs presumably play important modulatory roles in signalling, growth, development and defense response to myriad stresses in onion.

Conclusion:
In conclusion, a comprehensive computational analysis of onion ESTs and GSSs was performed in the present study to identify 14 potential miRNAs belonging to 13 different families. Phylogenetic analysis of the identified miRNAs confirmed their close homology with conserved miRNAs from other plant species. A total of 39 potential targets were predicted for the identified miRNAs, with an inhibitory effect on expression due to miRNA-mediated cleavage or translational repression. The bulk of the predicted target genes encoded transcription and regulatory factors that are implicated in plant growth, development, hormone signalling and stress responses. GO annotation of the target genes revealed that the miRNAs and their associated components are significant modulators of metabolic processes, plant immunity and defense response. These data will form the basis for further characterization of miRNAs through transient over-expression and knockout studies towards exploration of the miRNA-mediated regulatory mechanism in onion.