- Open Access
Evaluation of seven DNA barcodes for differentiating closely related medicinal Gentiana species and their adulterants
Chinese Medicine, volume 8, Article number: 16 (2013)
Species identification of living organisms by standard DNA sequences is well accepted. The Consortium for the Barcode of Life (CBOL) recommends the chloroplast regions rbcL and matK as the DNA barcodes for land plants. This study aims to evaluate the feasibility and limitations of rbcL, matK, and five other commonly used regions as DNA barcodes for medicinal Gentiana species and their adulterants, Gentiana rhodantha and Podophyllum hexandrum.
The species differentiation powers of rbcL, matK, the nuclear internal transcribed spacer (ITS) and 5S rRNA intergenic spacer, and the chloroplast trnH-psbA, trnL-F and rpl36-rps8 intergenic spacers were tested for distinguishing medicinal Gentiana species, including Gentiana scabra, Gentiana triflora, Gentiana manshurica and Gentiana rigescens, from common adulterants such as Gentiana rhodantha and Podophyllum hexandrum (a toxic herb producing podophyllotoxin).
All seven tested loci could be used to differentiate medicinal Gentiana species from their adulterants, and to distinguish Guanlongdan from Jianlongdan. In terms of general differentiation powers, rbcL and matK had no significant advantages over the other five loci. Only the 5S rRNA and trnL-F intergenic spacers were able to discriminate the closely related species G. triflora, G. scabra and G. manshurica.
The DNA barcodes rbcL and matK are useful in differentiating closely related medicinal species of Gentiana, but have no significant advantages over the other five tested loci.
The nuclear and chloroplast genomes are the major targets for plant species authentication and phylogenetic studies. Since the rate of evolution varies across each genome, different DNA regions may be selected to resolve different taxonomic levels. The criteria for a useful DNA marker for authentication are: (1) high interspecific divergence; (2) low intraspecific divergence; (3) short PCR product of around 1 kb; and (4) availability of universal primers for amplification [1, 2]. The Consortium for the Barcode of Life (CBOL) set up a standardized sampling method and experimental protocol to analyze agreed-upon “DNA barcodes”. This universal identification system is called DNA barcoding. Recently, the CBOL Plant Working Group recommended that rbcL and matK should be used as the land plant barcodes. The former offers high universality and good discrimination power, while the latter has higher resolution than other loci. However, it is known that the differentiation powers of rbcL and matK may not be sufficient for closely related species. Indeed, many land plants are identified using other DNA regions as markers.
The internal transcribed spacer (ITS) of the nuclear ribosomal cistron consists of ITS1 and ITS2, and has been demonstrated to be useful for phylogenetic studies in many angiosperm families. Recently, over 60,000 ITS sequences of plants and animals from GenBank were compared. At the species level, the success rates of identification were 91.9%, 76.1%, 74.2%, 67.1%, 88.1% and 77.4% for animals, dicotyledons, monocotyledons, gymnosperms, ferns and mosses, respectively. ITS regions can be found in plants, animals and fungi, and fungal ITS regions in medicinal materials are occasionally co-amplified, making direct sequencing of the amplified DNA product unsuccessful. The non-transcribed spacer of 5S rRNA is highly variable, and some studies have illustrated that its resolving power is higher than that of the ITS sequences. In the chloroplast genome, the trnH-psbA spacer is a rapidly evolving region suitable for identification at the species level. Other chloroplast DNA loci, including trnL-F, have been demonstrated to be informative at the generic level. In a recent study, trnL-F was also used to separate Cardiocrinum giganteum from its variety C. giganteum var. yunnanense and their closely related species.
Four medicinal Gentiana species, Gentiana manshurica Kitag., Gentiana scabra Bunge, Gentiana triflora Pall., and Gentiana rigescens Franch., are listed in the Chinese Pharmacopoeia as Gentianae Radix et Rhizoma, or “Longdan” in Chinese. They are common medicinal materials used for treating liver diseases, and are hepatoprotective against acetaminophen-induced acute toxicity. The first three species are mainly distributed in the northeastern part of China and are called “Guanlongdan” (GL), while G. rigescens is found in the southwestern part of China and is called “Jianlongdan” (JL). The genus Gentiana is divided into 12 sections in China. GL and JL belong to the adjacent sections Pneumonanthe (Section III) and Monopodiae (Section IV), respectively. While different plant species may be used for the same medicinal purpose in Chinese medicine (e.g. Gentiana rhodantha Franch. is frequently used as a substitute in southwestern China), the neurotoxic Podophyllum hexandrum Royle in the family Berberidaceae, which has a similar morphology, is deemed an adulterant.
This study aims to evaluate the feasibility and limitations of rbcL and matK and five other commonly used DNA regions for authentication of medicinal Gentiana species and their adulterants, G. rhodantha and P. hexandrum. In particular, the sequence divergences and differentiation powers of the tested regions were determined and compared.
Authentic samples were collected from various regions of China and identified by Dr. Hui Cao on the basis of morphological characters (Table 1). The voucher specimens were deposited in the Institute of Chinese Medicine, The Chinese University of Hong Kong.
The rhizome of each sample (0.05 g) was ground and total DNA was extracted by a modified CTAB extraction method, with the minor modification that the DNA pellet was resuspended in 30 μL of water instead of 50 μL of Tris-EDTA buffer. Polymerase chain reaction (PCR) was performed in a 25-μL mixture. Details of the primer sequences and the respective amplified regions are presented in Table 2. The specific PCR products were isolated from the PCR mixture by a Gel-M™ Gel Extraction System (Viogene, Taiwan). Except for 5S rRNA, the purified PCR products of the DNA barcodes were directly sequenced. The 5S rRNA PCR product was ligated into the pGEM-T Easy vector (Promega, USA) at 25°C for 2 hours. Three to four clones containing the insert were sequenced for each individual sample. A Rapid Plasmid Miniprep System (Viogene, Taiwan) was used for plasmid extraction. The purified PCR products or plasmids were sequenced using a BigDye® Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems, USA). Alignment of the DNA sequences was accomplished by ClustalW using the BioEdit program [19, 20], and manual adjustment of the sequence alignment was performed if necessary. The genetic distances among samples were determined with the Kimura 2-parameter nucleotide model in MEGA5 software. All distances were calculated from pairwise global alignments, in which alignment gaps and missing data were eliminated by choosing the “pairwise deletion” option. If the minimum sequence divergence between two groups of species was larger than the maximum intraspecific sequence divergence within the two groups, the discrimination was considered successful. Phylogenetic trees of the seven loci were constructed in MEGA5 with the neighbor-joining (NJ) method. Bootstrap analyses with 1000 replicates were performed to provide confidence estimates for the tree topologies.
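The distance calculation and the success criterion described above can be illustrated with a minimal Python sketch. It is not the MEGA5 implementation; it only applies the standard Kimura 2-parameter formula, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions, and it treats pairwise deletion as skipping any site with a gap or ambiguous base in either sequence. The function names k2p_distance and is_discriminated are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
VALID = PURINES | PYRIMIDINES


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped
    for this pair only (an assumed reading of "pairwise deletion").
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in VALID or b not in VALID:
            continue  # pairwise deletion of gaps and missing data
        sites += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    # math.log raises ValueError if the sequences are too divergent
    # for the model (argument not positive).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def is_discriminated(min_between, max_within):
    """Success criterion from the text: the smallest divergence between the
    two groups must exceed the largest intraspecific divergence."""
    return min_between > max_within
```

For example, two aligned 10-bp sequences that differ by a single transition give a distance of about 0.1116, slightly above the raw proportion of 0.1 because the model corrects for multiple substitutions at the same site.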
DNA barcode sequence determination
The primers listed in Table 2 amplified the corresponding loci in all samples, except for the trnL-F region of sample 2005-2706b (G. rhodantha) and the matK regions of samples 2005-2703b (G. triflora), 2005-2704a (G. rigescens), 2005-2704b (G. rigescens), 2005-2706a (G. rhodantha), ICM 686 (P. hexandrum) and ICM 2148 (P. hexandrum). We therefore checked the complementarity between the primers and the available Gentiana sequences from the National Center for Biotechnology Information (NCBI). For matK, there were 3–4 nucleotide mismatches between primer 3F_KIM f and the Gentiana sequences (Table 3). We designed a new pair of primers, matK_GSC_F (5′-TATATATTGTATTCGATACAAAC-3′) and matK_GSC_R (5′-TTCTACGAATATTGGAATTGGAA-3′), based on the conserved region of the available Gentiana and P. hexandrum sequences, which successfully amplified all of the Gentiana and P. hexandrum samples. For trnL-F, there was only one nucleotide difference near the 5′ terminus of the primer. Since only sample 2005-2706b (G. rhodantha) could not be amplified, the failure of amplification might have been caused by fragmentation of the template DNA.
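The primer-complementarity check described above can be sketched as a simple ungapped scan that counts mismatches between a primer and every window of a template. This is only an illustration under stated assumptions: the helper names are hypothetical, the template in the usage line is a made-up string built around the matK_GSC_F sequence quoted in the text (not a real Gentiana sequence), and real primer evaluation would also weigh the position of each mismatch, since mismatches near the 3′ end are generally more disruptive.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}


def reverse_complement(seq):
    """Reverse complement, needed to scan for a reverse primer's site."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def count_mismatches(primer, site):
    """Number of positions at which primer and an equal-length site differ."""
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    return sum(1 for p, s in zip(primer.upper(), site.upper()) if p != s)


def best_binding_site(primer, template):
    """Slide the primer along the template (no gaps allowed) and return
    (offset, mismatches) for the window with the fewest mismatches."""
    if len(template) < len(primer):
        raise ValueError("template is shorter than the primer")
    windows = range(len(template) - len(primer) + 1)
    return min(
        ((i, count_mismatches(primer, template[i:i + len(primer)])) for i in windows),
        key=lambda hit: (hit[1], hit[0]),
    )


# Forward primer sequence as quoted in the text.
MATK_GSC_F = "TATATATTGTATTCGATACAAAC"

# Hypothetical template: three flanking bases, a perfect site, four more bases.
toy_template = "GGC" + MATK_GSC_F + "TTGA"
offset, mismatches = best_binding_site(MATK_GSC_F, toy_template)
```

A reverse primer would be scanned against reverse_complement(template) in the same way.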
Sequence divergences of the seven DNA regions
The sizes of the seven loci (excluding the primer-binding sites) of the examined species are shown in Table 4. The sizes ranged from 239 to 940 bp, with most falling between 400 and 800 bp, the optimum range for routine PCR. The lengths of the protein-encoding genes rbcL and matK were identical across the samples, while those of the five intergenic spacers varied.
To show the discriminative powers of the seven DNA regions, we compared the sequence divergence of (1) medicinal Gentiana species (G. scabra, G. manshurica, G. triflora and G. rigescens) and their adulterants (G. rhodantha and P. hexandrum); and (2) GL (G. scabra, G. manshurica and G. triflora) and JL (G. rigescens) (Table 5). When comparing the divergences between medicinal Gentiana species and their adulterants, 5S rRNA had the highest divergence values, both interspecifically and intraspecifically, while rbcL had the lowest values (Table 5). The minimum divergence values of rbcL, matK, trnH-psbA, trnL-F, rpl36-rps8, ITS and 5S rRNA between medicinal Gentiana and P. hexandrum were 0.0995, 0.3399, 0.3908, 0.3888, 0.2211, 0.4781 and 0.6154, while the maximum intraspecific divergence values were only 0.0018, 0.0042, 0.0101, 0.0026, 0.0033, 0.0058 and 0.0939, respectively. For medicinal Gentiana and the adulterant G. rhodantha, the minimum divergence values between these regions were 0.0128, 0.0597, 0.1349, 0.1562, 0.0958, 0.0862 and 0.3098, while the maximum intraspecific divergence values were 0.0018, 0.0042, 0.0101, 0.0026, 0.0033, 0.0058 and 0.0914, respectively. Since the maximum intraspecific divergences of the seven loci were lower than the interspecific divergences, all of them could be employed to discriminate between medicinal Gentiana species and their adulterants.
The DNA sequences differed markedly between GL and JL. The minimum divergence values of rbcL, matK, trnH-psbA, trnL-F, rpl36-rps8, ITS and 5S rRNA between these two groups were 0.0109, 0.0521, 0.0780, 0.0332, 0.0392, 0.0462 and 0.4897, while the maximum intraspecific divergence values were 0.0018, 0.0042, 0.0101, 0.0026, 0.0000, 0.0043 and 0.0914, respectively. Therefore, GL and JL could be distinguished from each other using any of the seven DNA loci (Table 5). On the other hand, the genetic variability among the three GL species was extremely low for all loci. Only 5S rRNA could differentiate between G. manshurica and G. triflora, while trnL-F could distinguish G. scabra from G. triflora. Table 4 shows the selected polymorphic sites for differentiating among the three GL species. G. triflora, G. scabra and G. manshurica are genetically closely related and have interchangeable medicinal applications.
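The comparisons reported in the two preceding paragraphs can be checked mechanically. The sketch below transcribes the minimum between-group and maximum intraspecific divergence values quoted in the text and applies the success criterion (minimum between-group divergence greater than maximum intraspecific divergence) to each locus. Ranking the loci by minimum between-group divergence is an assumption about how the rankings mentioned in the Discussion were derived; with these numbers it does reproduce them, for example matK third and trnH-psbA second for GL versus JL. The names LOCI, COMPARISONS and evaluate are illustrative.

```python
LOCI = ["rbcL", "matK", "trnH-psbA", "trnL-F", "rpl36-rps8", "ITS", "5S rRNA"]

# Each entry: (minimum divergence between groups, maximum intraspecific
# divergence), in LOCI order, transcribed from the values quoted in the text.
COMPARISONS = {
    "medicinal Gentiana vs P. hexandrum": (
        [0.0995, 0.3399, 0.3908, 0.3888, 0.2211, 0.4781, 0.6154],
        [0.0018, 0.0042, 0.0101, 0.0026, 0.0033, 0.0058, 0.0939],
    ),
    "medicinal Gentiana vs G. rhodantha": (
        [0.0128, 0.0597, 0.1349, 0.1562, 0.0958, 0.0862, 0.3098],
        [0.0018, 0.0042, 0.0101, 0.0026, 0.0033, 0.0058, 0.0914],
    ),
    "GL vs JL": (
        [0.0109, 0.0521, 0.0780, 0.0332, 0.0392, 0.0462, 0.4897],
        [0.0018, 0.0042, 0.0101, 0.0026, 0.0000, 0.0043, 0.0914],
    ),
}


def evaluate(min_between, max_within):
    """Return {locus: (discriminated, rank)}.

    rank 1 is the locus with the largest minimum between-group divergence.
    """
    order = sorted(range(len(LOCI)), key=lambda i: min_between[i], reverse=True)
    rank = {LOCI[i]: position + 1 for position, i in enumerate(order)}
    return {
        locus: (min_between[i] > max_within[i], rank[locus])
        for i, locus in enumerate(LOCI)
    }


RESULTS = {name: evaluate(*values) for name, values in COMPARISONS.items()}

if __name__ == "__main__":
    for name, table in RESULTS.items():
        print(name)
        for locus, (ok, rank) in table.items():
            print(f"  {locus:<11} discriminated={ok} rank={rank}")
```

Every locus passes the criterion in all three comparisons, and rbcL has the smallest between-group divergence in each.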
To confirm the effectiveness of rbcL and matK in the identification of Gentiana species, we included all available Gentiana sequences in NCBI in the analysis of these two barcodes. In total, 14 rbcL sequences (including 10 sequences generated in this study) of 9 Gentiana species and 68 matK sequences (including 10 sequences generated in this study) of 23 Gentiana species and subspecies were aligned. For rbcL, the maximum intraspecific divergence value was 0.00215, while the minimum interspecific divergence value was 0. We found that the rbcL sequences of Gentiana andrewsii (HQ590117.1) and Gentiana pneumonanthe (JN891473.1) were identical. For matK, the maximum intraspecific divergence value was 0.01032, while the minimum interspecific divergence value was 0. Twenty sequences were identical, including 12 samples of Gentiana verna (EF552088.1–EF552099.1), one sample of Gentiana schleicheri (EF552087.1), three samples of Gentiana pumila subsp. delphinensis (EF552084.1–EF552086.1) and four samples of Gentiana brachyphylla subsp. favratii (EF552080.1–EF552083.1). These results indicated that rbcL and matK could not resolve all Gentiana species well.
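A minimum interspecific divergence of 0 means that at least one sequence is shared, base for base, by more than one species. A minimal sketch of how such cases could be flagged is shown below. It assumes the sequences have already been aligned and trimmed to the same region, so that identical strings correspond to identical barcodes; the function name, record layout, accessions and toy sequences are all hypothetical and are not data from this study.

```python
from collections import defaultdict


def shared_sequences(records):
    """Find sequences that occur in more than one species.

    records: iterable of (accession, species, sequence) tuples, with all
    sequences trimmed to the same aligned region (an assumption).
    Returns {sequence: sorted list of species sharing it}.
    """
    species_by_sequence = defaultdict(set)
    for _accession, species, sequence in records:
        species_by_sequence[sequence.upper()].add(species)
    return {
        sequence: sorted(species)
        for sequence, species in species_by_sequence.items()
        if len(species) > 1
    }


# Toy records with placeholder accessions and species labels.
toy_records = [
    ("acc1", "species A", "ACGTACGT"),
    ("acc2", "species A", "ACGTACGT"),
    ("acc3", "species B", "acgtacgt"),  # same barcode as species A
    ("acc4", "species C", "ACGTTCGT"),
]
unresolved = shared_sequences(toy_records)
```

Any species listed together in the result cannot be separated by that locus alone, which is the situation reported above for the rbcL and matK sequences.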
As shown in Figures 1 to 7, the NJ trees of the seven barcodes revealed that medicinal Gentiana species were clearly differentiated from P. hexandrum. Among the Gentiana species, the three GL species clustered together as a clade and were separated from JL and G. rhodantha with high bootstrap support (>70%), suggesting that GL, JL and G. rhodantha can be well resolved by the seven DNA barcodes.
This study performed a comparative assessment of the discriminative powers of seven DNA regions for the authentication of genetically closely related medicinal Gentiana species and their adulterants. rbcL and matK are the two recommended DNA barcodes that can resolve 72% of land plants when used in combination. In our study, however, rbcL provided the lowest intraspecific and interspecific divergences. Only 6 bp out of 553 bp differed between GL and JL. It has also been shown that rbcL is the least divergent locus among 11 DNA barcode candidates for differentiating species in Solanaceae.
The other CBOL-recommended barcode, matK, had higher sequence divergence, but was difficult to amplify by PCR. There were mismatches between the primer and the published Gentiana sequences, indicating that the recommended matK primers might not be applicable to all land plants. A recent study of medicinal plants in southern Morocco shows that the success rate of PCR amplification of matK is less than 30%. Regarding the resolving power, matK had the third-highest value for differentiating between GL and JL (Table 5). Nevertheless, it ranked only fifth and sixth for distinguishing medicinal Gentiana species from the adulterants P. hexandrum and G. rhodantha, respectively.
trnL-F had the longest DNA sequence among the tested loci (Table 4). One sample (G. rhodantha, 2005-2706b) could not be amplified, probably because of fragmentation of the template DNA, although other causes cannot be excluded. trnL-F had a high resolving power, and was the only locus capable of differentiating G. scabra from G. triflora (Table 4), suggesting that trnL-F is a good locus for differentiating the closely related Gentiana species.
The rpl36-rps8 region was among the smallest of the seven loci (Table 4). The PCR product of P. hexandrum was about 200 bp larger than those of Gentiana. Thus, the size difference could be used as a marker to distinguish Gentiana from P. hexandrum without DNA sequencing. Like rbcL, rpl36-rps8 had low interspecific and intraspecific divergences, although its ranking was slightly higher than that of rbcL. Its major drawback was the limited number of reference sequences in GenBank.
The size of the trnH-psbA region ranged from 399 to 646 bp, which was moderate among the seven DNA regions (Table 4). There was a significant size difference between Gentiana and Podophyllum. In terms of resolving power, trnH-psbA ranked second for differentiating GL from JL, and provided higher resolving power than matK and rbcL. This intergenic spacer has also shown a good amplification success rate and discrimination power among nine loci tested in another study. Among 19 species in seven families of angiosperms, trnH-psbA shows nearly three-fold higher divergence than other tested chloroplast regions, while the ITS region exhibits two-fold higher divergence than trnH-psbA.
Some studies [26–28] show that nuclear ITS is an appropriate DNA marker for herbal authentication and plant phylogenetic studies. In our study, the ITS region was the third longest region across Gentiana and P. hexandrum, and the sizes differed slightly from one another (Table 4). The divergence ranking was average among the five Gentiana species, but increased to the second highest for distinguishing medicinal Gentiana and P. hexandrum (Table 5), indicating that the ITS regions among the studied Gentiana species were quite conserved.
The size of the 5S rRNA intergenic spacer regions ranged from 239 to 457 bp, which was the smallest but most varied (Table 4). Among the tested regions, only 5S rRNA could distinguish G. triflora from G. manshurica and G. scabra. Our study showed that the intraspecific divergence was high, which was probably due to the non-homogeneity of the different copies of the 5S rRNA gene spacer. It is essential to clone the amplified PCR product prior to sequencing to overcome the sequence degeneration issue.
Jiang et al. established chemical profiles of Gentiana species. The four medicinal Gentiana species involved are closely similar in their chemical compositions, in that they all contain loganic acid, 6-O-β-D-glucopyranosylgentiopicroside, swertiamarin, gentiopicroside, and sweroside. The compound 2-(o,m-dihydroxybenzyl)-sweroside is found only in G. rigescens and can be used to differentiate GL and JL. Among the three GL species, only G. triflora contains gentiotrifloroside. The chemical profiles of G. manshurica and G. scabra are nearly identical, except that the former has a higher sweroside content. The chemical profiles therefore support our observations in the DNA barcode analyses.
All the tested loci could differentiate medicinal Gentiana species from their adulterants, and distinguish GL from JL. The two official DNA barcodes, rbcL and matK, have no significant advantages over the remaining five loci examined.
CBOL: Consortium for the Barcode of Life
NCBI: National Center for Biotechnology Information
Kress WJ, Wurdack JK, Zimmer EA, Weigt LA, Janzen DH: Use of DNA barcodes to identify flowering plants. Proc Natl Acad Sci USA. 2005, 102: 8369-8374. 10.1073/pnas.0503123102.
Schoch CL, Seifert KA, Huhndorf S, Robert V, Spouge JL, Levesque CA, Chen W, Fungal Barcoding Consortium: Nuclear ribosomal internal transcribed spacer (ITS) region as a universal DNA barcode marker for Fungi. Proc Natl Acad Sci USA. 2012, 109: 6241-6246. 10.1073/pnas.1117018109.
Consortium for the Barcode of Life. http://www.barcodeoflife.org/
CBOL Plant Working Group: A DNA barcode for land plants. Proc Natl Acad Sci USA. 2009, 106: 12794-12797.
Ren BQ, Xiang XG, Chen ZD: Species identification of Alnus (Betulaceae) using nrDNA and cpDNA genetic markers. Mol Ecol Resour. 2010, 10: 594-605.
Li M, Cao H, But PPH, Shaw PC: Identification of herbal medicinal materials using DNA barcodes. J Syst Evol. 2011, 49: 271-283. 10.1111/j.1759-6831.2011.00132.x.
Yao H, Song J, Liu C, Luo K, Han J, Li Y, Pang X, Xu H, Zhu Y, Xiao P, Chen S: Use of ITS2 region as the universal DNA barcode for plants and animals. PLoS One. 2010, 5: e13102. 10.1371/journal.pone.0013102.
Cronn RC, Zhao X, Paterson AH, Wendel JF: Polymorphism and concerted evolution in a tandemly repeated gene family: 5S ribosomal DNA in diploid and allopolyploid cottons. J Mol Evol. 1996, 42: 685-705. 10.1007/BF02338802.
Kondo K, Shiba M, Yamaji H, Morota T, Zhengmin C, Huixia P, Shoyama Y: Species identification of licorice using nrDNA and cpDNA genetic markers. Biol Pharm Bull. 2007, 30: 1497-1502. 10.1248/bpb.30.1497.
Kohjyouma M, Lee IJ, Iida O, Kurihara K, Yamada K, Makino Y, Sekita S, Satake M: Intraspecific variation in Cannabis sativa L. based on intergenic spacer region of chloroplast DNA. Biol Pharm Bull. 2000, 23: 727-730. 10.1248/bpb.23.727.
Li M, Ling KH, Lam H, Shaw PC, Cheng L, Techen N, Khan LA, Chang YS, But PP: Cardiocrinum seeds as a replacement for Aristolochia fruits in treating cough. J Ethnopharmacol. 2010, 130: 429-432. 10.1016/j.jep.2010.04.040.
The Pharmacopoeia Editorial Committee: Pharmacopoeia of the People’s Republic of China. 2010, Beijing: Chemical Industry Press
Anonymous: Zhong Yao Zhi. 1988, Beijing: The People’s Medical Publishing House
Wang AY, Lian LH, Jiang YZ, Wu YL, Nan JX: Gentiana manshurica Kitagawa prevents acetaminophen-induced acute hepatic injury in mice via inhibiting JNK/ERK MAPK pathway. World J Gastroenterol. 2010, 16: 384-391. 10.3748/wjg.v16.i3.384.
He TN: Gentianaceae Dictionary of Chinese Botany. 1988, Beijing: Science Press
But PP, Tomlinson B, Cheung KO, Yong SP, Szeto ML, Lee CK: Adulterants of herbal products can cause poisoning. BMJ. 1996, 313: 117-
Jiangsu New Medical College: Zhong Yao Da Ci Dian. 1977, Shanghai: Shanghai Science and Technology Publishing House
Kang HW, Cho YG, Yoon UH, Eun MY: A rapid DNA extraction method for RFLP and PCR analysis from a single dry seed. Plant Mol Biol Rep. 1998, 16: 1-9.
Thompson JD, Higgins DG, Gibson TJ: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994, 22: 4673-4680. 10.1093/nar/22.22.4673.
Hall TA: BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp Ser. 1999, 41: 95-98.
Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S: MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 2011, 28: 2731-2739. 10.1093/molbev/msr121.
Taberlet P, Gielly L, Pautou G, Bouvet J: Universal primers for amplification of three non-coding regions of chloroplast DNA. Plant Mol Biol. 1991, 17: 1105-1109. 10.1007/BF00037152.
White TJ, Burns T, Lee S, Taylor J: PCR Protocols: A Guide to Methods and Applications. 1990, San Diego: Academic Press
Carles M, Lee T, Moganti S, Lenigk R, Tsim KWK, Ip NY, Hsing IM, Sucher NJ: Chips and Qi: microcomponent based analysis in traditional Chinese medicine. Fresenius J Anal Chem. 2001, 371: 190-194. 10.1007/s002160100964.
Kool A, de Boer HJ, Krüger A, Rydberg A, Abbad A, Björk L, Martin G: Molecular identification of commercialized medicinal plants in southern Morocco. PLoS One. 2012, 7: e39459. 10.1371/journal.pone.0039459.
Lau DTW, Shaw PC, Wang J, But PPH: Authentication of medicinal Dendrobium species by the internal transcribed spacer of ribosomal DNA. Planta Med. 2001, 67: 456-460. 10.1055/s-2001-15818.
Kitaoka F, Kakiuchi N, Long CF, Itoga M, Mitsue A, Mouri C, Mikage M: Molecular characterization of Akebia plants and the derived traditional herbal medicine. Biol Pharm Bull. 2009, 32: 665-670. 10.1248/bpb.32.665.
Li M, Jiang RW, Hon PM, Cheng L, Li LL, Zhou JR, Shaw PC, But PPH: Authentication of the anti-tumor herb Baihuasheshecao with bioactive marker compounds and molecular sequences. Food Chem. 2010, 119: 1239-1245. 10.1016/j.foodchem.2009.09.013.
Jiang RW, Wong KL, Chan YM, Xu HX, But PPH, Shaw PC: Isolation of iridoid and secoiridoid glycosides and comparative study on Radix Gentianae and related adulterants by HPLC analysis. Phytochemistry. 2005, 66: 2674-2680. 10.1016/j.phytochem.2005.09.028.
This research was supported by the Strategic Investment Scheme Fund (Scheme A) of the Chinese University of Hong Kong.
The authors declare that they have no competing interests.
KLW generated the DNA barcode sequences and performed data analyses. PPHB collected herbal materials and designed the study. PCS coordinated the study. KLW and PCS wrote the manuscript. All authors read and approved the final manuscript.