Genetic diversity and population structure of Rheum tanguticum (Dahuang) in China

Background Wild Rheum tanguticum (Dahuang in Chinese) has become endangered in China. This study aims to examine the genetic structure and genetic diversity of R. tanguticum at the species level, and the genetic differentiation within and among populations in China. Methods The variability and structure of 19 populations of R. tanguticum were investigated using their chloroplast DNA matK sequences. Genetic diversity indices were calculated with DnaSP, PERMUT, and Arlequin 3.0, and a neighbor-joining (NJ) tree was constructed with MEGA 5.0. Results Fifteen haplotypes were identified from the matK sequence analysis. The mean genetic diversity at the species level was 0.894, and the genetic variability among populations (67.6%) was higher than that within populations (13.88%) according to the AMOVA and PERMUT analyses. The NJ tree and a pairwise difference analysis indicated geographical isolation of R. tanguticum populations. The gene flow among populations was 0.05, indicating genetic drift among some populations, which was also supported by the NJ tree and the haplotype distributions. Furthermore, a mismatch distribution analysis revealed the molecular evolution of R. tanguticum. Conclusion Genetic diversity among and within populations of R. tanguticum in China was demonstrated.

evidence for potential population endangerment [16,17]. Various molecular markers have been used to investigate the genetic diversity of R. tanguticum. Chen et al. [18] found relatively high genetic diversity at the species level and low genetic diversity within populations of R. tanguticum using an SSR marker. These findings agreed with those of Wang et al. [19], which were based on an ISSR marker. However, Hu et al. [20], also using an ISSR marker, reported a similar result at the species level but the opposite result within and among populations of R. tanguticum. These studies of R. tanguticum genetic diversity involved limited materials, and their results were contradictory. Therefore, larger samples and new molecular markers are required to reveal the actual state of R. tanguticum genetic diversity.
The matK gene (approximately 1500 bp) is a molecular marker for plant molecular systematics and evolution, and is located within the intron of the chloroplast gene trnK on the large single-copy section adjacent to the inverted repeat [21]. Compared with various other molecular markers, the matK gene sequence avoids any interference from heterozygosity, and its evolutionary rate is relatively fast [22,23]. Therefore, in recent years, the matK gene has been employed as an important and powerful tool for examining intergeneric and intrageneric genetic diversity because of its high substitution rate [24,25]. This study aims to examine the genetic structure and genetic diversity of R. tanguticum at the species level, and the genetic differentiation within and among populations in China. The genetic diversity of R. tanguticum at the species level and within and among populations was investigated using the matK gene sequences, and the population structure of R. tanguticum was clarified.

Plant materials
A total of 276 R. tanguticum individuals were collected from 19 populations in Sichuan, Gansu, and Qinghai provinces of China (Figure 1). Each population was composed of 10-20 individuals spaced 50 m apart from one another. Tender leaves of each sample were stored in ziplock bags with silica gel. The latitude, longitude, and altitude of each collection site were recorded by an Etrex GIS unit (Garmin, Taiwan). The sample information is listed in Table 1.

Data analysis
Sequences were aligned with ClustalX [27] and manually adjusted in BioEdit v.7.0.9 [28]. All gaps were treated as missing characters. DnaSP 4.0 was used to estimate molecular diversity, including the number of segregating sites (S), number of haplotypes (Nh), haplotype diversity (Hd), and nucleotide diversity (Pi) [29]. DnaSP 4.0 was also used to perform Tajima's test and to calculate the mismatch distributions [30]. PERMUT was used to calculate the average gene diversity within populations (Hs), the total gene diversity (Ht), and two measures of population differentiation, GST and NST (an equivalent coefficient that takes into account sequence similarities among haplotypes) [31]. An analysis of molecular variance (AMOVA) was performed in Arlequin 3.0 to partition the pairwise differences among and within populations [32]. The DNA divergences among populations (Fst) were measured, and their significance was tested using 10,000 permutations [33]. Gene flow between pairs of populations was calculated from the Fst values as Nm = (1 - Fst)/(4 Fst). Statistical Product and Service Solutions (SPSS) was used to calculate the correlation between genetic difference and geographic distance. A molecular phylogenetic tree was constructed by the neighbor-joining (NJ) method in MEGA 5.0, based on 87 samples including all of the haplotypes [34]. Insertions and deletions of base pairs were removed, and branch support was assessed by the bootstrap method with 1000 replicates.
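To make the two within-population indices concrete, the following is a minimal Python sketch of haplotype diversity (Hd) and nucleotide diversity (Pi) for a set of aligned sequences. It is not the DnaSP implementation: the function names and the toy sequences are hypothetical, Hd is computed with Nei's unbiased estimator, and gaps are handled by simply skipping gapped positions in each pairwise comparison, which is an assumption made for illustration.

```python
from collections import Counter
from itertools import combinations

# Characters treated as missing data (assumption for this sketch).
MISSING = {"-", "?", "N"}


def haplotype_diversity(seqs):
    """Nei's haplotype diversity: Hd = n/(n-1) * (1 - sum(p_i^2)).

    Identical strings are counted as the same haplotype.
    """
    n = len(seqs)
    if n < 2:
        return 0.0
    counts = Counter(seqs)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def pairwise_differences(a, b):
    """Number of differing sites between two aligned sequences."""
    return sum(
        1
        for x, y in zip(a, b)
        if x != y and x not in MISSING and y not in MISSING
    )


def nucleotide_diversity(seqs):
    """Pi: mean number of pairwise differences per site."""
    n = len(seqs)
    if n < 2:
        return 0.0
    pairs = list(combinations(seqs, 2))
    total = sum(pairwise_differences(a, b) for a, b in pairs)
    return total / len(pairs) / len(seqs[0])


# Hypothetical toy alignment: four individuals, three haplotypes.
toy = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACTTACGA"]
print(haplotype_diversity(toy), nucleotide_diversity(toy))
```

A population in which every individual carries the same haplotype gives Hd = 0 and Pi = 0 under these definitions, which is the situation reported for the populations with a single haplotype.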

Haplotypes and their distribution analysis
Among the 19 populations, a 1518-bp matK sequence was obtained from 18 populations. The only exception was the TZ population from Gansu province, which produced a 1524-bp matK sequence with a 'TAAACC' insertion at the 1022-bp site. A total of 21 segregating sites were found in the matK sequence of R. tanguticum, and 15 haplotypes were determined (Table 2). There was only one haplotype in 13 populations, two different haplotypes in four populations, and four different haplotypes in the JZ and TK populations (Figure 1, Table 1). Among the 15 haplotypes, three haplotypes, TH3, TH4, and TH5, were each detected in four different populations. Two haplotypes, TH1 and TH2, were each detected in three different populations. TH11 was detected in two populations. The other nine haplotypes, TH6, TH7, TH8, TH9, TH10,

Genetic diversity analysis
The genetic diversity of the matK sequences was relatively low within populations, but relatively high among populations (Table 1, Figure 1). The highest haplotype diversity was observed in population ZN (Hd = 0.667, Pi = 0.0022), while the lowest genetic diversity was observed in 13 populations, e.g., TH4 (Hd = 0, Pi = 0). Pi showed a trend similar to that of haplotype diversity; the only difference was that the highest Pi was found in population QL (Pi = 0.00276), rather than population ZN (Pi = 0.0022). The Hd and Pi values at the species level were 0.894 and 0.00308, respectively, demonstrating a relatively high level of genetic diversity.

Genetic differentiation and genetic difference analysis
The AMOVA results showed high variability among the populations (Table 3). The genetic differentiation among and within populations was 67.6% (FST = 0.82996) and 13.88% (FSC = 0.86121), respectively. The genetic differentiation was therefore mainly observed among populations. According to the results of the PERMUT analysis, the total gene diversity (Ht = 0.918) was much higher than the average gene diversity within populations (Hs = 0.173), which was consistent with the AMOVA results. The value of NST (0.854) was higher than the value of GST (0.812), indicating geographically structured differentiation among populations of R. tanguticum. The pairwise genetic differences from the AMOVA are listed in Table 4. The pairwise Fst values varied from 0 to 1, and most of the pairwise Fst values between populations were significant (P < 0.05). The SPSS analysis demonstrated a significant positive relationship between genetic difference and geographic distance (Figure 2).
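As a quick consistency check on the PERMUT values, GST can be recomputed from Hs and Ht using the standard definition GST = (Ht - Hs)/Ht. The short Python sketch below does this with the values reported above; the function name is hypothetical, and PERMUT's own estimators may differ in detail, so this only shows that the reported figures agree with one another.

```python
def gst(hs, ht):
    """Coefficient of gene differentiation: GST = (Ht - Hs) / Ht."""
    if ht <= 0:
        raise ValueError("Ht must be positive")
    return (ht - hs) / ht


# Values reported in the text (PERMUT analysis).
reported_hs = 0.173  # average gene diversity within populations
reported_ht = 0.918  # total gene diversity

print(round(gst(reported_hs, reported_ht), 3))  # 0.812
```

The recomputed value matches the reported GST of 0.812, so roughly four fifths of the total haplotype diversity lies among populations rather than within them.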

Genetic structure analysis
An NJ-tree was constructed based on the matK gene sequences of 87 R. tanguticum samples (Figure 3). The 87 samples were clustered together into two groups, one including the LQ and TB populations, and the other including the remaining 17 populations, which were further clustered into three subgroups. In general, samples from the same population were clustered together, such as the samples from populations QL, TZ, TD, and GD. However, several samples from the same population were clustered into different subgroups, for example, JZ-1, JZ-2, JZ-3, JZ-4, JZ-5, JZ-6, JZ-7, and JZ-8 were all collected from population JZ, but were clustered with different populations.
The results of the NJ-tree analysis were consistent with those of the genetic difference analysis between populations. The genetic differences between populations YJ and TK, SP and BM, DR and MQ, TB and ZQ, and TD and GD were all zero, and these populations were clustered into one subgroup on the NJ-tree. Meanwhile, the genetic differences between populations GD and DR, MQ and TD, and TZ and DG were significant, and they were clustered into different subgroups on the NJ-tree. However, some populations, such as YJ and ZQ, and LQ and YJ, were clustered into the same subgroups on the NJ-tree, but the genetic differences between them were significant (Fst = 1).

Mismatch distribution analysis
A mismatch distribution analysis was performed with DnaSP, and the multi-peak distribution obtained was consistent with gene exchange among different populations of R. tanguticum (Figure 4). Tajima's test (Tajima's D = 1.09761, P > 0.10) likewise supported the presence of gene exchange among R. tanguticum populations. The average number of migrants (Nm) between populations calculated by AMOVA and DnaSP was 0.05 in both analyses.
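The two quantities in this section can be illustrated with a short Python sketch. The first function applies the gene flow formula given in the Data analysis section, Nm = (1 - Fst)/(4 Fst), to the overall FST from the AMOVA. The second tabulates a mismatch distribution, i.e., the number of sequence pairs at each count of pairwise differences. The function names and the toy sequences are hypothetical, and this is a simplified illustration rather than the DnaSP or Arlequin procedure.

```python
from collections import Counter
from itertools import combinations


def gene_flow(fst):
    """Number of migrants per generation: Nm = (1 - Fst) / (4 * Fst)."""
    if not 0 < fst <= 1:
        raise ValueError("Fst must be in the interval (0, 1]")
    return (1 - fst) / (4 * fst)


def mismatch_distribution(seqs):
    """Map each pairwise-difference count to the number of sequence pairs."""
    counts = Counter(
        sum(x != y for x, y in zip(a, b)) for a, b in combinations(seqs, 2)
    )
    return dict(sorted(counts.items()))


# Overall FST reported in the AMOVA results.
print(round(gene_flow(0.82996), 2))  # 0.05

# Hypothetical toy alignment, for illustration only.
toy = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACTTACGA"]
print(mismatch_distribution(toy))  # {0: 1, 1: 3, 2: 2}
```

With the reported FST of 0.82996, the formula gives Nm of about 0.05, in agreement with the value stated above. A histogram of the mismatch counts with more than one peak corresponds to the multi-peak pattern described in the text.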

Discussion
In this study, a relatively high genetic diversity was found in R. tanguticum, and the genetic diversity among populations was higher than that within populations. Endangered species often show a relatively low level of genetic diversity [35-38], which was not the case in this study. In general, many factors influence genetic diversity, including environmental, genetic, and human factors [39]. R. tanguticum is a self-incompatible [23] herbaceous perennial with a long life history [19], and its pollen is widely spread from Gansu Province to the Tibet Autonomous Region in China, i.e., across different environmental and climate conditions, thereby enhancing gene exchange and leading to high genetic diversity [40-43]. The distribution of the 15 haplotypes and the SPSS analysis results demonstrated a significant positive relationship between genetic difference and geographic distance. On the NJ-tree, the samples from the same population were generally clustered together, and the samples from different populations were clustered into different subgroups. Geographic isolation, e.g., by mountains and rivers, was noted among different populations of R. tanguticum, and explained why the genetic diversity differed among populations. In this study, populations JZ and BM were geographically close, but the difference in their haplotypes was significant.
Haplotypes TH1-TH5 were each present in several populations. In two populations, JZ and TK, however, many different haplotypes were observed together. Although populations ZS and JZ, DR and QL, and TB and LQ were geographically far apart, each pair shared the same genotypes. On the NJ-tree, some samples from the same population did not cluster into the same subgroup, such as the samples from populations JZ and TK. The genetic differentiation of R. tanguticum mainly occurred among different populations. The multi-peak traces and Tajima's test results (Tajima's D = 1.09761, P > 0.10) demonstrated that the evolution of R. tanguticum was consistent with the neutral theory [44], indicating that the species did not experience major environmental changes or rapid expansion. The capacity of a species to adapt to its environment is determined by its genetic diversity, which is also an important index of its long-term survival [45]. As our samples were all collected from untraversed fields without human interference, the gene exchange phenomenon was the result of early accumulation of genetic diversity.

Conclusion
Genetic diversity among and within populations of R. tanguticum in China was demonstrated.