Insights on seed abortion (endosperm and embryo development failure) from the transcriptome analysis of the wild type plant species Paeonia lutea

Paeonia lutea is a wild peony (an endangered flowering plant species) found in China. Seed abortion (endosperm and embryo development failure) is linked to several endangered plant species. Therefore, it is of interest to complete a comparative transcriptome analysis between the normal active seeds (Population A) and the aborted seeds (Population H). GO assignments of differentially expressed genes (DEGs) show that "metabolic process", "binding", "cellular process", "catalytic activity", "cell" and "cell part" are the most prevalent terms in these populations. DEGs between the populations are associated with metabolic pathways, biosynthesis of secondary metabolites, purine metabolism and ribosome. We used quantitative RT-PCR to validate 16 DEGs associated with these populations. Histone genes and proline-rich extensin genes are predominant in the common groups. Histone genes (H2A, H2B, H3, H4 and linker histone H1) show 3 to 4 log2FC higher expression in population A than in population H in stage I, unlike in stages II and III. The higher activity of proline-rich extensin genes in population A than in population H implies a link to seed abortion in the latter population. These preliminary data from the transcriptome analysis of the wild type plant species Paeonia lutea provide valuable insights on seed abortion.


Keywords: Paeonia lutea, Seed abortion, Transcriptome
Background:
Tree peony (Paeonia suffruticosa Andrews), which belongs to section Moutan of the genus Paeonia, family Paeoniaceae, is one of the best-known traditional ornamental and medicinal plants in the world, valued for its colorful flowers and medicinal properties [1,2]. Paeonia lutea is clearly distinguished from other tree peony species by its bright yellow flowers and large plant size (1.1-2.3 m). The flowers of most tree peony species are pink, red, purple-red or white [3,4], and bright yellow is rare among tree peony cultivars. Paeonia lutea is therefore considered the most precious resource for tree peony cultivar breeding [1]. It was classified as a rare and endangered plant in China in 1987. It is distributed in central and northwestern Yunnan province, southwestern Sichuan province and Tibet, usually in mountains at elevations of 2500-3500 m, so its distribution area is very narrow. Seedling numbers and total plant numbers of Paeonia lutea in natural environments have reportedly declined year by year over the past 20 years [5]. The species is endangered because of its small population size and narrow distribution. Paeonia lutea reproduces naturally mainly by root suckers and seeds, and most populations can reproduce by seed and maintain normal growth. In some populations, however, seed abortion has been observed: the seeds of these plants are small and thin and show extremely low activity. This severely limits propagation and may accelerate the decline of the species. Nevertheless, very few studies have focused on the mechanisms of seed abortion in Paeonia lutea.
Seed development is regulated by both exogenous and endogenous factors. The two kinds of Paeonia lutea populations in our study (normal populations with active seeds and seed abortion populations) grow in the same environment under similar climatic conditions, so exogenous factors are unlikely to be the major driving force of seed abortion in Paeonia lutea. Endogenous factors, which are key to seed formation, are regulated by gene expression or repression during development. However, no genes or pathways involved in seed development of Paeonia lutea have been reported so far. Therefore, it is of interest to complete a comparative transcriptome analysis between the normal active seeds and the aborted seeds to derive data for explaining seed abortion in Paeonia lutea.

Materials and methods:
Plant materials:
The experiments were conducted in Nyingchi Prefecture (29°34′N, 94°37′E), Tibet, China, using wild Paeonia lutea populations as plant materials. Two populations of wild Paeonia lutea with contrasting seed performance (normal vs. abortion, referred to as Population A and H, respectively) were used for artificial pollination. Ten individuals of each population were chosen randomly for pollination. Flower buds, blooming flowers and pollinated flowers were sampled at three stages: stage I, flower bud three days before blooming; stage II, initial blooming before pollen dispersal; and stage III, eight days after pollination (Figure 1 A-c; Figure 2, H-c). At each stage, the two populations were sampled at the same time with three biological replicates. All samples were immediately frozen in liquid nitrogen and stored at -80 °C for RNA extraction. The workflow for sequencing and bioinformatic analysis is given in Figure 3.

De novo assembly of Paeonia lutea transcriptome:
Firstly, the raw reads were filtered to obtain clean reads by removing adaptor sequences, low-quality reads and reads in which unknown bases (N) made up more than 5%. The clean reads were then assembled into unigenes using the Trinity software with an optimized k-mer length of 25 [6].
©Biomedical Informatics (2020)
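The unknown-base criterion in the read filtering step can be illustrated with a minimal Python sketch. It covers only the "more than 5% N" rule; adaptor trimming and the low-quality criterion are not specified in the text and are left out. The function names, the handling of empty reads and the toy reads are assumptions made for illustration and are not taken from the study's pipeline.

```python
def n_fraction(seq: str) -> float:
    """Return the fraction of unknown bases (N) in a read sequence."""
    if not seq:
        return 1.0  # assumption: treat an empty read as unusable
    return seq.upper().count("N") / len(seq)


def filter_reads(reads, max_n_fraction=0.05):
    """Keep only reads whose share of N bases does not exceed the threshold."""
    return [seq for seq in reads if n_fraction(seq) <= max_n_fraction]


# Toy 20-base reads (illustrative only, not data from the study)
reads = [
    "ACGTACGTACGTACGTACGT",  # 0 of 20 bases are N, kept
    "ACGTNCGTACGTACGTACGT",  # 1 of 20 bases is N (5%, not more than 5%), kept
    "ACNTNCGTACGTACGTACGT",  # 2 of 20 bases are N (10%), removed
]
clean_reads = filter_reads(reads)
print(len(clean_reads))
```

A read with exactly 5% N is retained here because the text removes only reads with more than 5% unknown bases.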

Results:
The Paeonia lutea populations in this experiment originated in Tibet, and the experiment was conducted in Nyingchi Prefecture (29°34′N, 94°37′E), Tibet, China. In this distribution area, some populations have been surveyed for seed abortion problems. According to the survey, almost all seeds of every individual in these populations were aborted (data not shown). The seed coat of normal populations was plump, while the aborted seeds were small, thin and flat (Figure 1 A-d, e; Figure 2 H-d, e). The ovules were aborted completely in group H, while 2 to 4 ovules in each pod developed successfully into active seeds in group A (Figure 1 A-f; Figure 2 H Table 2). The raw sequencing data are available at the NCBI Sequence Read Archive (SRA) database under accession PRJNA545629.

Unigenes functional annotation and classification:
All assembled unigenes were aligned to seven public functional databases to identify their putative functions, with an E-value cut off of 1e. In total, 79,140 (50.83%) unigenes in the de novo transcriptome libraries showed significant similarity to known proteins. Unigene annotation information for the seven databases is shown in Table 3.

Seed formation related DEGs selection and functions annotation in Paeonia lutea:
To identify candidate genes that control seed formation and are differentially expressed between normal populations and seed abortion populations, we performed differentially expressed gene (DEG) analysis with NOIseq [8] (Figure 5).

qRT-PCR validation of core candidate DEGs from RNA-Seq:
To confirm the accuracy and reproducibility of the Illumina RNA-Seq results, 16 core candidate DEGs were verified using qRT-PCR. The RNA-Seq results and qRT-PCR values are displayed in Figure 8 and show consistent expression patterns for these candidate DEGs.

Discussion:
Paeonia lutea, the most precious resource for tree peony cultivar breeding, is endangered because of its small population size and narrow distribution. In the wild, Paeonia lutea reproduces mainly by seeds, yet some populations have been found to have a severe seed abortion problem. In this study, comparative transcriptome analysis between the sexual reproductive abortion population and the normal population of Paeonia lutea was carried out to explore the possible mechanism of seed abortion.
Paeonia lutea belongs to section Moutan of the genus Paeonia, family Paeoniaceae. Compared with model plants, genomic research on it is limited and the related biological information is insufficient. This is the first genomic study of Paeonia lutea. Therefore, de novo assembly was used to assemble the transcripts of Paeonia lutea. The overall annotation rate of unigenes was 50.83%, which is low: nearly half of the genes could not be annotated effectively. This may reflect genomic information unique to the yellow peony. The large amount of gene expression data obtained in this study will greatly enrich the genetic data resources of the yellow peony and provide a basis for its further study at the molecular level.
Seed abortion in wild plants has long been noticed and discussed. Bawa et al. [11] pointed out that several hypotheses have been proposed for seed abortion in natural plant populations: parent-offspring conflict over resource allocation, sibling rivalry, pollen competition and genetic load. These theories successfully explain seed abortion in some plants, with the exception of Dedeckera eurekensis, an endangered member of the Polygonaceae. In this plant, 97.5% of seeds failed to develop [12], and the failure apparently did not occur randomly among the seeds, so it cannot be well explained by any of the above hypotheses. The seed abortion of Paeonia lutea in natural populations in Tibet resembles that of Dedeckera eurekensis: almost 100% of seeds were aborted in some populations. Sun et al. [13] reported that environmental stresses could be key causes of seed abortion. However, the normal populations and the seed abortion populations of Paeonia lutea share the same habitat, which argues against environmental factors. Thus, inherent genetic causes may be involved. The cause needs to be identified urgently before the situation becomes more severe. Hence, in this study, comparative transcriptome analysis was applied between the sexual reproductive abortion population and the normal population of Paeonia lutea to explore the genes or pathways that may explain the seed abortion problem.
Three key stages of the reproductive development process were chosen in this experiment: stage I, flower bud three days before blooming; stage II, initial blooming before pollen dispersal; and stage III, eight days after pollination. The transcriptome data showed stage II to be the most active phase of the whole process.
The results suggest that histone genes may be involved in reproductive development in Paeonia lutea, as a group of DEGs encoding histone proteins was notable in our data (Table 4). As shown in the table, during stage I there were 11 DEGs annotated as histone H2B, histone H2A, histone H3 and histone H1, and the expression of all of them was 3.7-4.3 log2FC higher in group A than in group H. During stage II, there were 8 DEGs annotated as histone H3, histone H1 and histone H2B, but the pattern was opposite to that of stage I: expression was 3.9-4.1 log2FC higher in group H than in group A. During stage III, there were 3 DEGs annotated as histone deacetylase HDT1, histone-binding protein RBBP4 and histone H1, and their expression was significantly higher in group H than in group A. This suggests that histone proteins are produced earlier in plants with normal seed formation than in plants with seed abortion.
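To make the log2FC values easier to interpret, the following minimal Python sketch converts between expression ratios and log2 fold changes. A log2FC of 3.7-4.3 corresponds to roughly a 13- to 20-fold difference in expression. The function names, the pseudocount and the example expression values are assumptions for illustration; the text does not describe how NOIseq derived the reported values.

```python
import math


def log2_fold_change(expr_a: float, expr_h: float, pseudocount: float = 1.0) -> float:
    """log2 ratio of expression in group A over group H.

    The pseudocount is an assumption (not from the source) that avoids
    division by zero when a gene is not detected in one group.
    """
    return math.log2((expr_a + pseudocount) / (expr_h + pseudocount))


def linear_fold(log2fc: float) -> float:
    """Convert a log2 fold change back to a linear fold change."""
    return 2.0 ** log2fc


# Range reported for the stage I histone DEGs (group A over group H)
for value in (3.7, 4.3):
    print(f"log2FC {value} -> about {linear_fold(value):.1f}-fold")

# Hypothetical expression values: (159 + 1) / (9 + 1) = 16-fold, so log2FC = 4
print(log2_fold_change(159.0, 9.0))
```

A negative result from `log2_fold_change` indicates higher expression in group H, which is the direction reported for the histone DEGs in stages II and III.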
Histone proteins include the core histones H2A, H2B, H3 and H4 and the linker histone H1. DNA is wrapped around an octamer of histone proteins to form nucleosomes, and changes in histone proteins lead to the formation and remodeling of higher-order chromatin structure [14,15,16]. Histone modifications, including methylation, acetylation, phosphorylation, ubiquitination and sumoylation, alter nucleosome stability and positioning and thereby affect DNA accessibility for regulatory proteins or protein complexes involved in transcription, DNA replication and repair [17,18,19]. Studies have unraveled diverse epigenetic regulatory mechanisms involved in different processes during floral organogenesis and sexual reproduction in Arabidopsis and rice [1,20]. Histone H3 methyltransferase is required for ovule development in Arabidopsis [21]. It can be inferred that, during the reproductive process, histone activity is highly correlated with the expression of key functional genes in reproductive regulation. In our data, histone genes were uniformly highly induced in group A in stage I, whereas in stages II and III they were uniformly highly induced in group H. The difference in histone protein dynamics between group H and group A may lead to different seed formation processes, but their exact regulatory role in seed development still needs to be explored in future studies.
Plant proline-rich proteins, which belong to a class of proline- and hydroxyproline-rich proteins and are mainly localized in the cell wall, have been reported to act in the seed developmental program and to coordinate physiological events occurring during cellular processes [22,23]. They are expressed specifically in different tissues and developmental stages and have been reported to regulate cell wall structure in plants [23]. In this study, a group of proline-rich extensin proteins were identified as DEGs (Table 5), and they showed different patterns in group A and group H during floral organ development. In general, expression of these genes was much higher in group A than in group H, especially in stage III. Unigene9091 and 7 other DEGs annotated as extensin-like proteins were highly induced in group A. SbPRP1 has been reported to be one of the highly expressed cell wall proteins at the stage of seed coat development [24]. Four days after fertilization, over one hundred genes were identified with exclusively high expression in young seed stages, and most of these genes were annotated as histones and proline-rich proteins [25]. Proline-rich proteins may act as key regulatory factors in seed cell development: their activity was much more intense in group A than in group H, which may cause disordered seed cell development in group H and in turn lead to seed abortion.

Conclusion:
We report the predominant presence, activity and expression of histone genes (H2A, H2B, H3, H4 and linker histone H1) and proline-rich extensin genes linked to seed abortion in Paeonia lutea, based on DEG data, in stage I unlike in stages II and III. These data from the transcriptome analysis of the wild type plant species Paeonia lutea provide valuable insights on seed abortion towards improved crop management.

Author contribution statement:
SSZ conceived and designed the study and wrote the paper; YF assisted in performing the test; FZ and YNC assisted in analyzing the data; SW and YHL assisted in sampling; XLZ conceived the idea and supervised the research.