PlantID – DNA-based identification of multiple medicinal plants in complex mixtures

Background: An efficient method for the identification of medicinal plant products is now a priority as the global demand increases. This study aims to develop a DNA-based method for the identification and authentication of plant species that can be implemented in the industry to aid compliance with regulations, based upon the economically important Hypericum perforatum L. (St John’s Wort or Guan ye Lian Qiao).
Methods: The ITS regions of several Hypericum species were analysed to identify the most divergent regions and PCR primers were designed to anneal specifically to these regions in the different Hypericum species. Candidate primers were selected such that the amplicon produced by each species-specific reaction differed in size. The use of fluorescently labelled primers enabled these products to be resolved by capillary electrophoresis.
Results: Four closely related Hypericum species were detected simultaneously and independently in one reaction. Each species could be identified individually and in any combination. The introduction of three more closely related species to the test had no effect on the results. Highly processed commercial plant material was identified, despite the potential complications of DNA degradation in such samples.
Conclusion: This technique can detect the presence of an expected plant material and adulterant materials in one reaction. The method could be simply applied to other medicinal plants and their problem adulterants.


Background
The quality of Chinese medicines (CM) has been questioned because of "continuing evidence of an international trade in herbal remedies made to an unreliable standard" [1][2][3], and there is increasing international demand for regulation of phytomedicines and definitive quality standards [4,5]. In the European Union (EU), the Traditional Herbal Medicines Directive (Directive 2004/24/EC) regulates medicinal plant products for human use, and all medicinal plant products must now hold a Traditional Herbal Registration (THR) certified by a logo on all packaging. To gain a THR, many factors must be certified, such as the identification and authentication of the medicinal plant material upstream of manufacturing and processing. This is currently achieved by morphological and chemical methods, both of which are time-consuming and cost-intensive [6].
DNA-based methods are a preferred alternative, because they are more efficient, less expensive and less time-consuming, require less plant material, and can reliably distinguish materials to the species level [6][7][8][9]. However, the current identification methods (DNA-based and chemical) are not capable of resolving mixtures of several plant species simultaneously.
Identification of species independently and concurrently in complex mixtures has become increasingly important for regulators based on two main factors. First, adulteration and contamination occur, particularly with rare and expensive medicinal plant species that are substituted with less valuable alternatives [10,11]. These alternatives may have no biological activity or might have detrimental health implications. Therefore, the target plant and the dangerous adulterant must be identified to assess the safety of products. Second, synergistic polyherbal formulations are fundamental to the practices of CM, in that the use of specific combinations of medicinal plant materials results in an enhanced outcome, i.e. the whole being greater than the sum of its parts. Each of the plants included must be identified to authenticate such preparations.
As a pilot study for the design of industry-standard DNA-based identification assays, Hypericum perforatum L. (St John's Wort or Guan ye Lian Qiao), which is used for its anti-inflammatory and antimicrobial properties in CM and for the treatment of mild to moderate depression in Europe [12], was selected as the target for the design of a new assay.
The nuclear ribosomal internal transcribed spacer (nrITS) regions were used as targets for the design, and the sequences from 15 Hypericum species were aligned and analysed to identify the most divergent regions. PCR primers were then designed to anneal specifically to these regions of the different species. Candidate primers were selected such that the species-specific product of each PCR differed in size and could be resolved by capillary electrophoresis with the fluorescently labelled primers.
This method is capable of detecting the closely related species Hypericum androsaemum, Hypericum athoum, Hypericum ascyron and Hypericum perforatum individually and in any combination, from within a mixture of DNA from seven Hypericum species. This technique has the power to both confirm the presence of the expected plant material and detect the adulterant material in one reaction. The method of design can be replicated for any other medicinal plants and their problem adulterants.
This study aims to develop a DNA-based method for the identification and authentication of plant species that can be implemented in the industry to aid compliance with regulations, through the discrimination of several different Hypericum species, based on a similar design to that used by Tobe and Linacre [13] for identifying mammalian species. These species represent a worst-case scenario for discrimination, as they are extremely closely related.

Primer validation
The initial testing of the species-specific primers was conducted by conventional PCR using the High-Fidelity PCR amplicons as DNA templates. The products from these amplifications were diluted to a suitable working concentration (H. athoum, 1 × 10⁻⁴ dilution; remainder, 1 × 10⁻⁵ dilution). This enabled all possible combinations to be tested against vouchered DNA samples, which were in limited supply. To test cross-amplification, non-target DNA panels were created with the remaining six Hypericum species (Table 2). This meant that six species could be screened for cross-amplification in one reaction. As each species represented one-sixth of the DNA present in the panel (e.g. 5 ng each, totalling 30 ng), the DNA used to check for amplification of the target DNA was diluted to the same amount as a single species in the panel (e.g. 5 ng).
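The panel logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the three non-target species names are placeholders (the text names only the four targets), and the function names are invented for this example. The 5 ng per species figure is the example amount given in the text.

```python
# Placeholder names for the three non-target species, which are not named here.
ALL_SPECIES = [
    "H. androsaemum", "H. ascyron", "H. athoum", "H. perforatum",
    "non-target 1", "non-target 2", "non-target 3",
]
TARGETS = ALL_SPECIES[:4]


def non_target_panel(all_species, absent):
    """Return the panel named after `absent`: every species except that one."""
    if absent not in all_species:
        raise ValueError(f"unknown species: {absent}")
    return [s for s in all_species if s != absent]


def panel_masses(panel, ng_per_species=5.0):
    """Equal DNA mass per species, as in the 5 ng example from the text."""
    return {species: ng_per_species for species in panel}


# One non-target panel per target species, plus the full multiplex panel.
panels = {target: non_target_panel(ALL_SPECIES, target) for target in TARGETS}
panels["Multiplex"] = list(ALL_SPECIES)

perforatum_panel_total_ng = sum(panel_masses(panels["H. perforatum"]).values())
```

Each non-target panel holds six species, so a primer pair that amplifies nothing from its own panel has been screened against all six non-targets in a single reaction.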
The reactions consisted of Green GoTaq® Flexi Buffer (Promega, USA) (1×), MgCl₂ (2.5 mM), GoTaq® DNA Polymerase (Promega, USA) (1.25 U), relevant primers (0.1 μM each), dNTPs (0.1 μM each) and template DNA (1 μL of appropriate sample dilution) made up to a final volume of 25 μL with nuclease-free water in 0.2-mL […]
The manufacturer's instructions were followed, except that two disruption steps of 1 min at 30 Hz were conducted after the addition of 400 μL of Buffer AP1 and 4 μL of RNase A to the sample at the beginning of the procedure. Dried material (0.02 g) from within the capsules was used and the resultant DNA samples were stored at 2-5°C.

Multiplex PCR
A DNA panel was created that included all seven Hypericum species available to optimise the multiplex PCR.
Table 2 caption: The initial PCR products were diluted by 1 × 10⁻⁵ (except H. athoum, 1 × 10⁻⁴) and equal volumes of each sample were added to the panel mixtures, as indicated by a tick. The panels are described by the name of the species that is not present, indicated by a cross. These panels were then used to test the species-specific primers for each target for cross-amplification, allowing six DNA samples to be tested simultaneously. The panel used to test the multiplex reaction contained all the available samples, and is named Multiplex.
Figure 1 ITS amplification products from all seven Hypericum samples were included as templates in a multiplex PCR. The multiplex panel composition is described in […]
Table 3 caption: The selected pairs are indicated in bold. Candidate selection was based on specificity, and further analysis of theoretical interactions in the multiplex PCR and discrimination of amplicons by capillary electrophoresis were used to make the final selection.

Capillary electrophoresis
The fluorescently labelled multiplex PCR products were analysed in an ABI Prism™ 3130 Genetic Analyzer (Applied Biosystems, USA) using a 30-cm capillary and Performance Optimised Polymer 4 (Applied Biosystems, USA). The run module used consisted of a 12-s injection at 1.2 kV, followed by electrophoresis at 60°C and 15 kV for 25 min. Each multiplex PCR product (1 μL) was diluted with 8.5 μL of Hi-Di™ Formamide and 0.5 μL of GeneScan™-500 ROX™ size standard (Applied Biosystems, USA) before the capillary electrophoresis. GeneMapper® ID v3.2 fragment analysis software (Applied Biosystems, USA) was used.
The candidate primers were introduced to the multiplex reaction. This was carried out with the mixture of all seven Hypericum species nrITS sequences (Table 2), and the products were separated by capillary electrophoresis (Figure 1). The concentration of each primer pair was optimised to account for the efficiency differences found when introducing the primers into a multiplex reaction. Panels and bins were created in GeneMapper® ID v3.2 (Applied Biosystems, USA) following the instructions in the software, to allow for automatic recognition and labelling of amplicons falling within ± 0.5 bp of the determined fragment size for each species of interest.
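The binning rule (a peak is labelled when it falls within ± 0.5 bp of the fragment size determined for a species) can be illustrated with a short Python sketch. This is not GeneMapper's implementation, and the fragment sizes below are hypothetical placeholders, because the determined sizes are not listed in this section.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical fragment sizes in bp; the real values are not given in the text.
EXPECTED_SIZES: Dict[str, float] = {
    "H. androsaemum": 102.0,
    "H. ascyron": 118.0,
    "H. athoum": 131.0,
    "H. perforatum": 145.0,
}
BIN_TOLERANCE_BP = 0.5  # bin half-width stated in the text


def call_peak(size_bp: float,
              bins: Dict[str, float] = EXPECTED_SIZES,
              tol: float = BIN_TOLERANCE_BP) -> Optional[str]:
    """Return the species whose bin contains the peak, or None if off-bin."""
    for species, expected in bins.items():
        if abs(size_bp - expected) <= tol:
            return species
    return None


def call_profile(peak_sizes: Iterable[float]) -> List[str]:
    """Return the sorted list of species detected in one electropherogram."""
    calls = (call_peak(size) for size in peak_sizes)
    return sorted({species for species in calls if species is not None})
```

For example, `call_profile([102.2, 131.3, 160.0])` labels the first two peaks and ignores the third, which falls in no bin.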

Results
The nrITS sequences from seven Hypericum species were aligned and analysed to provide the basis for the species-specific primer design. PCR primers were designed for regions where the sequences of the four target species (H. androsaemum, H. ascyron, H. athoum and H. perforatum) differed from all the other species. The sequences of each primer pair were species-specific, because they matched only one sequence from all the input sequences. A total of 19 primer pairs were designed, all of which produced amplicons of the expected size when tested with the target DNA template in conventional PCR: six pairs for H. androsaemum; four pairs for H. ascyron; four pairs for H. athoum; and five pairs for H. perforatum (Table 3). The results for the six H. androsaemum pairs are shown in Figure 2.
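The specificity criterion stated above (a primer pair is species-specific when it matches only one of the input sequences) can be checked computationally. The following Python sketch uses exact string matching only, which is a simplification: real primer design also considers mismatches, melting temperature and 3'-end stability. The sequences and primers are invented toy data, not Hypericum nrITS sequences.

```python
from typing import Dict, List


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def species_matched(primer: str, sequences: Dict[str, str],
                    is_reverse: bool = False) -> List[str]:
    """List the species whose top-strand sequence contains the primer site."""
    site = reverse_complement(primer) if is_reverse else primer.upper()
    return [name for name, seq in sequences.items() if site in seq.upper()]


def pair_is_specific(forward: str, reverse: str,
                     sequences: Dict[str, str], target: str) -> bool:
    """True when both primers match the target sequence and no other."""
    return (species_matched(forward, sequences) == [target]
            and species_matched(reverse, sequences, is_reverse=True) == [target])


# Invented toy alignment: three sequences that differ at two positions.
TOY_SEQUENCES = {
    "species_1": "ACGTTGCAAGGCTTACCGATAGGCT",
    "species_2": "ACGTTGCATGGCTTACCGTTAGGCT",
    "species_3": "ACGTTGCACGGCTTACCGCTAGGCT",
}
```

Here `pair_is_specific("TGCAAGGC", "CCTATCGG", TOY_SEQUENCES, "species_1")` holds because both primers span a variable position, whereas a forward primer placed in the conserved region ("ACGTTGCA") matches all three sequences and is rejected.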
The primer pairs were tested for cross-amplification against a panel of nrITS sequences from closely related Hypericum species, the non-target panel (Table 2). In each case, panels were constructed using High-Fidelity PCR amplifications of the nrITS regions from vouchered species specimens. Primer pairs that gave a product with the nrITS of the target DNA, but not with the non-target DNA panel, were candidates for the multiplex reaction. Of the candidate primer pairs found for each of the four species, one primer pair per species was selected for the multiplex reaction (shown in Figure 3, highlighted in Table 3). These pairs were chosen based on analysis with the AutoDimer v.1 software, which indicated a low possibility of interactions between the candidate primers when all were introduced into a multiplex system. The candidate primers were also selected based on the size of the resultant amplicons, because the products must be sufficiently different in size to facilitate their separation by capillary electrophoresis. Consequently, a minimum length difference of 5 bp between amplicons was applied.
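The size-based part of this selection can be sketched as a small search: choose one candidate pair per species such that every pair of amplicons differs by at least 5 bp. The Python below is an illustrative sketch only. The candidate names and amplicon sizes are hypothetical, and the primer-interaction screening performed with AutoDimer is not modelled.

```python
from itertools import combinations, product
from typing import Dict, Iterable, Optional

MIN_GAP_BP = 5  # minimum amplicon length difference stated in the text


def sizes_resolvable(sizes: Iterable[int], min_gap: int = MIN_GAP_BP) -> bool:
    """True when every pair of amplicon sizes differs by at least min_gap."""
    return all(abs(a - b) >= min_gap for a, b in combinations(list(sizes), 2))


def choose_pairs(candidates: Dict[str, Dict[str, int]],
                 min_gap: int = MIN_GAP_BP) -> Optional[Dict[str, str]]:
    """Return the first one-pair-per-species choice with resolvable sizes."""
    species = list(candidates)
    for combo in product(*(candidates[s].items() for s in species)):
        if sizes_resolvable([size for _, size in combo], min_gap):
            return {s: name for s, (name, _) in zip(species, combo)}
    return None  # no combination satisfies the spacing rule


# Hypothetical candidates: {species: {primer pair name: amplicon size in bp}}.
CANDIDATES = {
    "H. androsaemum": {"and_1": 120, "and_2": 98},
    "H. ascyron": {"asc_1": 122, "asc_2": 110},
    "H. athoum": {"ath_1": 124},
    "H. perforatum": {"per_1": 140},
}
selection = choose_pairs(CANDIDATES)
```

With these placeholder values the search rejects the combinations in which two amplicons lie within 4 bp of each other and settles on and_2, asc_2, ath_1 and per_1 (98, 110, 124 and 140 bp).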
PCR products were used as the templates for the design and optimisation of the assay, but genomic DNA is likely to be the eventual target for the method. The assay was therefore conducted with mixtures of genomic DNA from each of the target species to ensure that it was equally efficient for this type of sample (Figure 4). The genomic DNA templates produced a higher-quality profile of peaks with fewer baseline anomalies.
The working assay was used to test DNA extracted from commercial preparations of St. John's Wort sold in capsule or tablet form, produced by companies A, B and C. The three DNA samples were previously confirmed by conventional PCR of the ITS regions [17]. DNA degradation and/or shearing was observed in the samples from companies A and B, as the 750-bp ITS region could not be amplified, while a smaller amplicon (160 bp) within this region was amplified. This may have arisen because of the age and processed nature of the plant material. The sample from company C produced the full 750-bp ITS amplicon.
The multiplex reaction was carried out using DNA extracted from the three samples. The DNA extracted from the samples of companies A and B was identified as H. perforatum, whereas the DNA from the sample of company C was not identified as any of the Hypericum species in the assay (Figure 5). The sample from company C is labelled as a mixture of plant extracts, and the amplified DNA may come from another species. However, H. perforatum was stated as the dominant plant component, and would have been expected to be the dominant DNA component.

Discussion
The developed assay, PlantID, can detect four closely related species (i.e. the number of sequence differences between species within the nrITS region is minimal) simultaneously in one reaction within a mixture of seven species. Variation in the sequences used for the design confers the possibility for successful species-specific primer design, which is essential for this assay. The plant species in genuine cases of misidentification or adulteration may not be this closely related, and are likely to contain more DNA sequence variation.
During the design process, amplicons were kept as short as possible, thereby optimising the technique for use with degraded or fragmented samples, following the miniSTR (short tandem repeat) approach introduced by Butler in 2003 [18]. This enables the assay to be used with commercial products containing highly processed plant material. This is of particular importance when testing tablet forms of herbal medicines, as well as ingested samples collected during post-mortem examination [19].
The number of species identified using this technique could be dramatically increased by altering a few parameters. Although only one type of fluorescent label was used in this study, the system is capable of simultaneous detection of five fluorophores in a run. One of these fluorophores is reserved for the size standard, leaving four options for primer labelling, which could increase the number of detectable species four-fold. In addition, the nrITS region alone was the basis for this "proof of concept" assay design. The use of more DNA regions would increase the possibilities for unique annealing positions for primers, thereby increasing detection. The use of multiple DNA sequence regions for the assay design could also confer greater reliability. The technique could be further developed such that each species is identified by the presence of several peaks, which would greatly reduce the possibility of a false-positive result. Further optimisation of this assay will aim to balance the multiplex reaction so that each peak produced is of equal intensity when the input DNA templates are at the same concentration. This would then produce a semiquantitative assay, with relative peak heights indicating which DNAs are present at the highest and lowest concentrations.
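The two scaling ideas in this paragraph reduce to simple arithmetic, sketched below in Python. The function names are invented for illustration. The capacity estimate assumes that each dye channel could resolve as many amplicons as the single channel used in this study, and the relative peak heights are only meaningful once the reaction has been balanced as described above.

```python
from typing import Dict


def max_species(slots_per_dye: int, dyes_total: int = 5,
                dyes_for_size_standard: int = 1) -> int:
    """Species detectable if every remaining dye carries slots_per_dye amplicons."""
    return slots_per_dye * (dyes_total - dyes_for_size_standard)


def relative_peak_heights(heights: Dict[str, float]) -> Dict[str, float]:
    """Express each peak height as a fraction of the summed peak heights."""
    total = sum(heights.values())
    if total <= 0:
        raise ValueError("peak heights must sum to a positive value")
    return {species: height / total for species, height in heights.items()}


# Four species were resolved with one dye in this study, so four dyes
# would give 4 x 4 = 16 species under the stated assumption.
capacity = max_species(slots_per_dye=4)
```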
The evaluation of polyherbal preparations would benefit from the development of this type of assay, for which no other techniques can confirm the presence of each individual species. For example, Ayurvedic preparations, such as Dashmool, contain many different plant species that are highly processed, making them impossible to identify morphologically. Chemical analyses of such a mixed preparation containing many compounds from different species produce a highly complex profile. Substitution of the raw materials in this preparation is common and has led to the development of a DNA-based assay that can identify one species that should be in the preparation (Desmodium gangeticum) and two species that are often found as adulterants (Desmodium velutinum and Desmodium triflorum) [20]. Each of these is identified by an individual PCR, the product of which is then analysed by gel electrophoresis. However, the multiplex PlantID system could potentially identify all 10 different species that should be present in the preparation, and also test for species that are known to be used as adulterants, in one reaction.

Conclusion
This technique can detect the presence of an expected plant material and adulterant materials in one reaction. The method could be simply applied to other medicinal plants and their problem adulterants.
Figure 5 Multiplex PCR product profiles from DNA extracted from the products of three different companies (companies A, B and C, panels labelled respectively). The results show a positive peak for H. perforatum for companies A and B, but not for company C.