Comparative analysis of DNA polymorphisms and phylogenetic relationships among Syzygium cumini Skeels based on phenotypic characters and RAPD technique

The Indian black berry (Syzygium cumini Skeels) has great nutraceutical and medicinal properties. As in other fruit crops, fruit characteristics are important attributes for differentiation, and they were determined for different accessions of S. cumini. Fruit weight, length, breadth, length:breadth ratio, pulp weight, pulp content, seed weight and pulp:seed ratio varied significantly among the accessions. Molecular characterization was carried out using the PCR-based RAPD technique. Out of 80 RAPD primers, only 18 produced stable polymorphisms and were used to examine the phylogenetic relationships. A total of 207 loci were generated, of which 201 were polymorphic, giving 97 per cent polymorphism among the jamun accessions. The phylogenetic relationship was also examined by principal coordinates analysis (PCoA), which explained 46.95 per cent of the cumulative variance. The two-dimensional PCoA plot grouped the accessions into four sub-plots, representing clusters of accessions. The UPGMA (r = 0.967) and NJ (r = 0.987) dendrograms constructed from the dissimilarity matrix showed a good degree of fit, as indicated by their cophenetic correlation values. The dendrograms grouped the accessions into three main clusters according to their eco-geographical regions, which gave useful insight into their phylogenetic relationships.

Most of the S. cumini trees available in India are of seedling origin and, owing to the predominance of seed propagation, show enormous variability in tree and fruit morphology, fruit quality, maturity and productivity. This variability provides good scope for breeding programmes [19,20]. Phenotypic characters have traditionally been used to obtain information on variation within plant species, but they are usually influenced by the environment and controlled by many loci. Therefore, the characterization of genetic resources needs to be complemented with molecular markers in order to improve valuable traits [21]. Molecular markers provide a quick and reliable method for evaluating DNA polymorphism and phylogenetic relationships among genotypes of any species without requiring prior knowledge of the target sequence or genome [22,23]. Several classes of molecular markers have been used to evaluate the genetic diversity in collections of genetic resources of horticultural crops. Random amplified polymorphic DNA (RAPD) markers provide an opportunity for direct comparison and identification of diverse genetic material independent of any environmental influences [24,25]. Earlier, molecular markers such as RAPD have been used to assess genetic diversity in various fruit crops [31]. A few reports on the use of RAPD markers to study the phylogenetic relationships among jamun accessions are also available [32]. In this study, an attempt has been made to derive phylogenetic relationships among Syzygium cumini accessions using phenotypic characters and DNA polymorphisms detected by RAPD markers, which could be extremely helpful for germplasm management, crop improvement, varietal selection for breeding programmes and the protection of indigenous crop wealth in India.

Plant materials
Syzygium cumini accessions from different geographical regions comprising Uttar Pradesh, Maharashtra, Gujarat and Tamil Nadu, maintained in the field genebank at the Central Institute for Subtropical Horticulture, Lucknow (U.P.), were used for the study of phenotypic characters and assessment of the phylogenetic relationship based on RAPD markers.

Morphological characteristics
Twelve accessions of S. cumini with uniform growth and vigour were selected to study the phenotypic characteristics. Observations on quantitative characteristics, viz. fruit weight, fruit length, fruit breadth, fruit length:breadth ratio, fruit size, pulp weight, pulp content, seed weight, seed length, seed breadth and pulp:seed ratio, were recorded for the selected S. cumini accessions.

DNA isolation and PCR reaction
Total genomic DNA was extracted from young, disease-free, fresh leaves of the selected accessions following the Qiagen DNeasy mini kit guidelines. The isolated DNA samples were quantified from their absorbance at 260 and 280 nm, and DNA quality was checked on a 0.8 % agarose gel [33-35]. Decamer RAPD primers (GC content > 60 %) from five series, OPA, OPB, OPX, OPD and OPG (Operon Technologies, Alameda, California, USA), were tested for their ability to amplify scorable and reproducible DNA fragments. PCR on each DNA sample was performed in a 25 µl reaction mixture containing 50 ng template DNA, 0.4 mM dNTPs, 1.5 mM MgCl2, 1.5 U Taq DNA polymerase and 0.5 µM primer. Amplification was carried out on a thermocycler (Bio Rad, USA) programmed for 1 cycle of initial denaturation at 94 °C for 2 min, followed by 35 cycles (1 min at 94 °C for denaturation, 1 min at 35 °C for annealing and 2 min at 72 °C for extension) and a final extension at 72 °C for 5 min. The reaction products were mixed with 4 µl of loading dye (0.25 % bromophenol blue, 0.25 % xylene cyanol and 40 % sucrose, w/v) and spun briefly in a microtube before loading. A 100 base pair DNA ladder (Fermentas) was loaded as a molecular size marker. The amplified products were separated by electrophoresis in 1.5 % (w/v) agarose gels in 1X TAE buffer, stained with 0.5 µg/ml ethidium bromide and run at 5 V/cm for 3 h. The gels were visualized under UV light and photographed using a photo documentation system (Alpha Innotech Corp., USA). RAPD analysis was repeated at least three times to ensure that only strong, reproducible bands were used for the analysis.
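The thermal profile described above can be written down as a small data structure, which makes the programmed run time easy to check. The following is a minimal Python sketch; the variable and function names are illustrative, and ramping time between temperatures, which depends on the instrument, is ignored.

```python
# Thermal profile as stated in the text: (step name, temperature in °C, minutes).
INITIAL_DENATURATION = ("initial denaturation", 94, 2)
CYCLE_STEPS = [
    ("denaturation", 94, 1),
    ("annealing", 35, 1),
    ("extension", 72, 2),
]
N_CYCLES = 35
FINAL_EXTENSION = ("final extension", 72, 5)


def total_hold_minutes():
    """Sum the programmed hold times; ramping between temperatures is ignored."""
    per_cycle = sum(minutes for _, _, minutes in CYCLE_STEPS)
    return INITIAL_DENATURATION[2] + N_CYCLES * per_cycle + FINAL_EXTENSION[2]


print(total_hold_minutes())  # 2 + 35 * 4 + 5 = 147
```

The programmed holds therefore add up to 147 min per run, before any instrument-specific ramping.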

Data analysis
The morphological data were used in a multivariate analysis with emphasis on distinctness among the jamun accessions. Data for each phenotypic trait, viz. weight, length, breadth, length:breadth ratio and size of fruit, as well as the physiological parameters, viz. pulp weight, pulp content, seed weight and pulp:seed ratio of jamun fruits and seeds, were analyzed with the software DARwin 5.0 [36]. The Rogers-Tanimoto dissimilarity coefficient was used to summarize variation based on phenotypic characters. A dendrogram was constructed to examine the clustering based on phenotypic relatedness among the twelve Syzygium cumini accessions. The cophenetic value matrix of the NJ clustering was used to test the goodness-of-fit to the dissimilarity matrix [37] by computing the cophenetic correlation (r) with 1000 permutations [38].
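To make the dissimilarity measure concrete, the following minimal Python sketch computes the Rogers-Tanimoto dissimilarity between two accessions. It assumes that each trait has first been coded as a 0/1 state, which is an assumption of this sketch and not a step described in the text; the example vectors are hypothetical, and the function is not DARwin's implementation.

```python
def rogers_tanimoto(u, v):
    """Rogers-Tanimoto dissimilarity between two equal-length 0/1 vectors.

    d = 2 * mismatches / (matches + 2 * mismatches)
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    mismatches = sum(1 for a, b in zip(u, v) if a != b)
    matches = len(u) - mismatches
    denominator = matches + 2 * mismatches
    return 2 * mismatches / denominator if denominator else 0.0


# Hypothetical binary trait states for two accessions.
acc_a = [1, 1, 0, 0, 1]
acc_b = [1, 0, 0, 1, 1]
print(rogers_tanimoto(acc_a, acc_b))  # 2 mismatches, 3 matches -> 4/7
```

Mismatches are given double weight, so two accessions that differ in every state have a dissimilarity of 1 and identical accessions have a dissimilarity of 0.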
DNA polymorphism was evaluated and scored on the basis of the presence or absence of bands (1 or 0), and each RAPD fragment was treated as a unit character. For each primer, the total number of alleles, the number of polymorphic alleles and the percentage of polymorphic bands (PPB) were calculated. The PPB was determined as the percentage of polymorphic bands over the total number of bands obtained. The polymorphism information content (PIC) for each RAPD locus was also calculated from the number of bands per primer using the formula $\mathrm{PIC} = 1 - \sum_i P_i^2$, where $P_i$ is the frequency of the i-th band in the genotypes examined. The dissimilarity coefficients between cultivars were analyzed, and clustering was carried out using the unweighted pair group method with arithmetic average (UPGMA) and the neighbour joining (NJ) method in DARwin (Dissimilarity Analysis and Representation for Windows), version 5.0.148 (http://darwin.cirad.fr/darwin). Principal coordinates analysis (PCoA) was also used to analyze the phylogenetic relationship among the S. cumini accessions. The two-dimensional PCoA was performed on the dissimilarity matrix.
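The band-scoring statistics described above can be sketched in Python for a single primer. The sketch assumes a 0/1 matrix with one row per band and one column per accession, and it interprets P_i as the share of band i among all bands scored for that primer; this interpretation and the example matrix are assumptions made for illustration, not details confirmed by the text.

```python
def ppb(matrix):
    """Percentage of polymorphic bands.

    A band (row) is counted as polymorphic when it is present in at least
    one accession but not in all of them.
    """
    polymorphic = sum(1 for row in matrix if 0 < sum(row) < len(row))
    return 100.0 * polymorphic / len(matrix)


def pic(matrix):
    """PIC = 1 - sum(P_i ** 2), with P_i the relative frequency of band i."""
    counts = [sum(row) for row in matrix]
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


# Hypothetical scores for one primer: 4 bands (rows) x 4 accessions (columns).
scores = [
    [1, 1, 1, 1],  # monomorphic band
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 0],
]
print(ppb(scores))  # 3 of 4 bands are polymorphic -> 75.0
print(pic(scores))  # 1 - (16 + 4 + 4 + 1) / 81 = 56/81
```

Under this interpretation the PIC of a primer with n bands cannot exceed 1 - 1/n, so primers that amplify more bands can reach higher values.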

Morphological characterization
Morphological markers correspond to visually scored qualitative traits as well as physical measurements of quantitative traits, and they are influenced by plant biology and the developmental stage of the plant [39,40]. The selected accessions were characterized phenotypically in this study by comparing tree traits, leaf characteristics and fruit characteristics. The maximum pulp weight was recorded in J-37 (21.29 g), followed by J-36 (19.97 g). The highest pulp content, however, was recorded in J-42 (98.47 %), followed by J-44 (98.10 %), and the lowest in J-49 (66.04 %). Thus, although the maximum fruit weight was observed in J-37 (22.90 g), the maximum pulp content was recorded in J-42 and J-44 because these accessions are seedless. The maximum seed weight was recorded in J-49 (2.65 g), followed by J-51 (2.19 g). The pulp:seed ratio ranged from 4.10 in J-51 and J-55 to 64.40 in J-42, showing a wide range of variability.

Clustering based on phenotypic traits
The weighted neighbor joining (NJ) method based on the Rogers-Tanimoto dissimilarity coefficient was used to determine the variation among S. cumini accessions [38]. The cophenetic value matrix of the clustering was used to test the goodness-of-fit of the clustering to the dissimilarity matrix by computing the cophenetic correlation (r) with 1000 permutations. The cophenetic correlation between the dendrogram and the dissimilarity matrix revealed a good degree of fit (r = 0.9517; p<0.001). The dendrogram obtained from the cluster analysis of phenotypic traits using the weighted NJ method is presented in Figure 1. The genotypes under study were broadly divided into three major clusters. Cluster I was the largest, with 6 accessions, followed by Cluster II (4 accessions) and Cluster III (2 accessions). Bootstrap values above 65 are shown in the dendrogram. Each cluster was further subdivided into subclusters that represent the grouping of the 12 S. cumini accessions according to their phenotypic characteristics.
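The goodness-of-fit test described above can be illustrated with a permutation (Mantel-type) test of the correlation between the dissimilarity matrix and the cophenetic matrix. The following Python sketch is a generic version under stated assumptions: the matrices are small hypothetical examples, the function names are illustrative, and the procedure is not claimed to be the one implemented in DARwin.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def upper_triangle(matrix, order):
    """Upper-triangle values of a symmetric matrix with rows/columns reordered."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def permutation_test(dissim, coph, n_perm=1000, seed=0):
    """Return (r, p): observed correlation and one-sided permutation p-value."""
    rng = random.Random(seed)
    ids = list(range(len(dissim)))
    coph_flat = upper_triangle(coph, ids)
    r_obs = pearson(upper_triangle(dissim, ids), coph_flat)
    hits = 0
    for _ in range(n_perm):
        perm = ids[:]
        rng.shuffle(perm)  # relabel the accessions of one matrix
        if pearson(upper_triangle(dissim, perm), coph_flat) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)


# Hypothetical 4 x 4 dissimilarity matrix and its cophenetic matrix.
d = [[0, .2, .7, .8], [.2, 0, .6, .9], [.7, .6, 0, .3], [.8, .9, .3, 0]]
c = [[0, .2, .75, .75], [.2, 0, .75, .75], [.75, .75, 0, .3], [.75, .75, .3, 0]]
r, p = permutation_test(d, c)
print(round(r, 3), p)
```

With only four objects there are just 24 possible relabellings, so the p-value of this toy example cannot be small; with twelve accessions, as in the present study, 1000 random permutations sample a far larger space.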

Plant DNA extraction
DNA sequences give information about the genetic makeup of S. cumini accessions and make it possible to distinguish them and understand their uniqueness. Total DNA was isolated using the Qiagen plant DNA isolation kit, as it gives a better yield of pure DNA that can be used for further molecular analysis. The quality and quantity of DNA in the samples were determined spectrophotometrically; for all the S. cumini accessions the 260 nm/280 nm ratio was in the range 1.74 to 1.79, which indicated good purity. The quality of the extracted DNA was also checked by electrophoresis on a 0.8 % agarose gel in 0.5x TBE, and the DNA was found suitable for further analysis by PCR amplification.
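The spectrophotometric check described above can be illustrated with a minimal Python sketch that computes the A260/A280 ratio and an approximate DNA concentration. The absorbance readings are hypothetical, and the factor of 50 µg/ml per absorbance unit is the standard approximation for double-stranded DNA in a 1 cm light path; neither is taken from the present data.

```python
def purity_ratio(a260, a280):
    """A260/A280 ratio; a value of about 1.8 is generally taken as pure DNA."""
    return a260 / a280


def dsdna_conc_ug_per_ml(a260, dilution_factor=1.0):
    """Approximate dsDNA concentration: 1.0 A260 unit ~ 50 µg/ml (1 cm path)."""
    return a260 * 50.0 * dilution_factor


# Hypothetical readings for one sample.
a260, a280 = 0.35, 0.20
print(round(purity_ratio(a260, a280), 2))
print(round(dsdna_conc_ug_per_ml(a260), 1))
```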

Molecular discrimination based on polymorphism
Assessment of genetic variability and identification of superior genotypes are fundamental steps in any crop improvement programme, particularly in the era of genomics.
The aim of the present study was to characterize different S. cumini accessions and to assess their phylogenetic relationship on the basis of RAPD fingerprints. Arbitrary 10-mer primers of the OPA, OPB, OPD, OPG, OPE and OPX series were first screened on the twelve accessions of S. cumini, and only 18 informative primers were retained to assess the genetic variability of the selected accessions, based on their ability to produce polymorphic, unambiguous and stable RAPD patterns.
A total of 207 scorable alleles were detected, of which 201 were polymorphic, and fragment size varied from 300 bp to 3500 bp. The number of alleles scored per primer ranged from 7 to 19: the lowest number was produced by OPX-4 and the highest by OPD-18, with an average of 11.5 alleles per primer (207 alleles over 18 primers) (Table 2; see supplementary material).
Reliable and reproducible DNA amplification was obtained with the 18 primers, and the overall polymorphism was 97 per cent (201 of 207 alleles).
An earlier report on Syzygium species endemic to Mauritius, which were explored using both morphological characters and molecular techniques, revealed that the random amplified polymorphic DNA (RAPD) and inter simple sequence repeat (ISSR) techniques could be used successfully to amplify all 6 Syzygium species. The RAPD analysis gave an average of 13.6 markers and 41.2 % polymorphism per primer, compared with an average of 11 markers and 33.3 % polymorphism per ISSR primer [41].
Representative RAPD profiles generated by primers OPD-18, OPA-13 and OPG-13 are shown in Figures 2, 3 and 4. The polymorphic information content (PIC) was used to compare polymorphism levels across the markers and to determine the usefulness of the markers for specific studies. The PIC values, which indicate the discriminating power of the primers, ranged from 0.78 (OPG-13) to 0.94 (OPD-12), with an average of 0.88 (Table 2).

Genetic distance and Cluster analysis
The cluster analysis was based on Jaccard's dissimilarity coefficients generated from 207 RAPD bands, which ranged from 0.284 to 0.844. The maximum dissimilarity was found between accessions J-37 and J-44 (0.844), whereas the minimum was recorded between accessions J-51 and J-55 (0.284). The cophenetic correlation values of the UPGMA (r = 0.967; p<0.001) and NJ (r = 0.987; p<0.001) dendrograms based on the Jaccard dissimilarity coefficient revealed a good degree of fit, suggesting that the cluster analysis strongly represents the dissimilarity matrix. Bootstrap analysis gave an average support of 72.5 % for the two major clusters.
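The chain from band scores to cophenetic correlation can be illustrated with a compact Python sketch: Jaccard dissimilarities are computed from 0/1 band profiles, the accessions are clustered by average linkage (UPGMA), and the resulting cophenetic distances are correlated with the original dissimilarities. The band profiles are hypothetical and the code is a plain re-implementation for illustration, not the DARwin procedure used in the study.

```python
from itertools import combinations


def jaccard_dissimilarity(u, v):
    """1 - (bands shared by both profiles / bands present in either profile)."""
    shared = sum(1 for a, b in zip(u, v) if a and b)
    either = sum(1 for a, b in zip(u, v) if a or b)
    return 1.0 - shared / either if either else 0.0


def dissimilarity_matrix(profiles):
    """Symmetric matrix of pairwise Jaccard dissimilarities."""
    n = len(profiles)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = jaccard_dissimilarity(profiles[i], profiles[j])
    return d


def upgma_cophenetic(d):
    """Cophenetic matrix from average-linkage (UPGMA) clustering of d."""
    n = len(d)
    clusters = [[i] for i in range(n)]
    coph = [[0.0] * n for _ in range(n)]

    def average(a, b):
        # Mean of the original dissimilarities between members of a and b.
        return sum(d[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # Merge the pair of clusters with the smallest average dissimilarity.
        x, y = min(combinations(range(len(clusters)), 2),
                   key=lambda p: average(clusters[p[0]], clusters[p[1]]))
        height = average(clusters[x], clusters[y])
        for i in clusters[x]:
            for j in clusters[y]:
                coph[i][j] = coph[j][i] = height
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return coph


def cophenetic_correlation(d, coph):
    """Pearson correlation between the upper triangles of d and coph."""
    pairs = list(combinations(range(len(d)), 2))
    x = [d[i][j] for i, j in pairs]
    y = [coph[i][j] for i, j in pairs]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical band profiles: 4 accessions x 6 bands.
profiles = [
    [1, 1, 1, 0, 0, 1],
    [1, 1, 0, 0, 0, 1],
    [0, 0, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
]
d = dissimilarity_matrix(profiles)
coph = upgma_cophenetic(d)
print(round(cophenetic_correlation(d, coph), 3))
```

A cophenetic correlation close to 1 means that the heights at which accessions join in the dendrogram reproduce the pairwise dissimilarities well, which is the sense in which the values reported above indicate a good fit.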
The dendrograms calculated from the Jaccard coefficient using the UPGMA (Figure 6A) and neighbour joining (Figure 7A) algorithms showed three main clusters. Cluster I contained five accessions, of which J-51 and J-55 were grouped together and aligned with accessions J-49, J-23 and J-43, which are apparently unrelated. These accessions originate from different eco-geographical regions: J-51 from Godhra, Gujarat, J-55 from Konkan, Maharashtra, J-23 and J-49 from different regions of Lucknow, and J-43 from Varanasi, U.P. All of these accessions are characterized by big canopies and small, bold-seeded fruit with low pulp content, so they perhaps grouped in the same cluster because of their morphological characteristics and a shared gene pool. In a molecular study, Khan [32] reported that ten random primers generated 129 amplified products, with an average of 12.9 bands per polymorphic primer, displaying adequate informativeness for the genetic and ecological diversity among S. cumini accessions. In that study, the dendrogram generated by the UPGMA method grouped the accessions into four clusters with no region-specific grouping. Nevertheless, that study provides a population-level genetic profile that could be correlated with eco-geographical factors in further investigations.
Cluster II comprised S. cumini accessions from two different locations in Lucknow, U.P., and was further divided into two subgroups. One subgroup included three accessions, namely J-37, J-36 and J-34, and the other comprised two accessions, J-40 and J-26. This cluster appears to be quite distinct from the other clusters.
These accessions grouped together according to their geographical origin and have medium-sized trees, broad leaves, bold fruit and small seeds with high pulp content. Cluster III included accessions J-42 and J-44, which do not share a geographical origin, one being from Varanasi, U.P. and the other from Periyakulam, T.N., but grouped together because both are seedless. The phylogenetic trees for the different S. cumini accessions generated by the UPGMA and NJ algorithms are also shown as topological phenograms (Figures 6B and 7B). They reveal that these accessions grouped according to their genetic characteristics. The present investigation categorizes 12 S. cumini accessions, originating from diverse eco-geographical regions and differing in morphological characteristics, into different clusters and gives useful insight into their genetic relationships that may be of value in crop improvement programmes.