Forensically informative nucleotide sequencing (FINS) for the authentication of Chinese medicinal materials

Chinese medicinal materials may be authenticated by molecular identification. As a definitive approach to molecular identification of medicinal materials, forensically informative nucleotide sequencing (FINS) comprises four steps, namely (1) DNA extraction from biological samples, (2) selection and amplification of a specific DNA fragment, (3) determination of the sequence of the amplified DNA fragment and (4) cladistic analysis of the sample DNA sequence against a DNA database. Success of the FINS identification depends on the selection of DNA region and reference species. This article describes the techniques and applications of FINS for authenticating Chinese medicinal materials.


Background
The World Health Organization estimates that 70-80% of the population in the developed countries have used some form of alternative or complementary medicine [1]. Adulteration and misuse of Chinese medicinal products may be due to (a) accidental substitution due to the similarity of organoleptic characters, (b) inconsistent naming in local areas, (c) intentional substitution of expensive materials by less expensive items and (d) different use of substitutes in local areas. Conventional authentication methods based on organoleptic features and chemical constituents are influenced by various factors such as growing stages, environmental factors and post-harvest processing.
Molecular techniques have been employed to authenticate medicinal materials since the mid 1990s [2]. Molecular techniques, such as DNA fingerprinting, DNA sequencing and DNA microarray, have been applied extensively to authenticate Chinese medicinal materials with a number of these applications having been patented and commercialized [3]. DNA sequencing can retrieve the maximum molecular information from a particular DNA region. Polymorphism of nucleotide sequences provides information to distinguish closely related species from distantly related species and between genuine medicinal materials and adulterants.
Forensically informative nucleotide sequencing (FINS), a technique that combines DNA sequencing and phylogenetic analysis, is used to identify samples based on informative nucleotide sequences. The concept of FINS was first proposed by Bartlett and Davidson in 1992 to identify the origin of animal food products and has since been extensively applied in forensic investigations [4,5]. In the past decade, FINS has been applied to identify and authenticate Chinese medicinal materials with species-specific DNA regions [6][7][8].
This article describes the techniques and applications of FINS in authenticating Chinese medicinal materials.

Performing FINS
A defined DNA sequence from the examined specimen is obtained and compared with suitable reference sequences from a reliable database using a phylogenetic analysis to identify the tested material [4]. Four basic steps are involved in FINS, namely (1) DNA extraction from biological samples, (2) selection and amplification of a specific DNA fragment, (3) determination of nucleotide sequences and (4) identification using a phylogenetic analysis against a sequence database. Materials used to construct the reference database should be properly identified fresh materials or authentic preserved specimens. The total DNA can be isolated by either DNA extraction or DNA release [9]. A general workflow of FINS is given in Figure 1.

DNA extraction from biological samples
DNA extraction refers to an invasive method that extracts DNA from tissues and cells via physical disruption and/or chemical fractionation. Cetyl trimethylammonium bromide (CTAB) and phenol/chloroform extraction [10] are widely used, and a number of commercial kits are also available for DNA extraction. For example, DNA has been extracted from highly processed Chinese medicinal materials, such as the donkey skin extract Asini Corii Colla (Ejiao) [11].
DNA release, a non-invasive method that allows DNA to be released from a sample into a solution without destroying the sample, is particularly useful for obtaining DNA from important voucher specimens. DNA release is also used to investigate samples by analyzing the environmental DNA or the preservative, as demonstrated by recent studies of DNA detection from the water in which frogs (Rana catesbeiana) live and from worms (Hypopta agavis) preserved in 95% ethanol [12,13]. The quantity and quality of the obtained DNA are a major concern with this method. While purification may be achieved by commercially available kits, the yield of DNA is minute. The material should be stored in safe conditions (freezing, cyanide and ethanol), and chemicals that can damage DNA, such as ethyl acetate or formaldehyde, should be avoided [14].

Selection and amplification of a specific DNA fragment
Usually, only a small amount of DNA can be extracted or released from highly processed or improperly stored Chinese medicinal materials. Polymerase chain reaction (PCR) can produce a sufficient amount of a specific DNA fragment obtained from a tiny amount of DNA extract. The selection of a DNA region for amplification is one of the crucial factors for FINS because the resolution of FINS depends heavily on the variability and the number of informative sites of the DNA sequences of the tested samples and reference materials. As the evolutionary rates of different DNA regions vary, DNA regions with sufficient variability are essential for providing a high resolution result. Rapidly evolving regions among taxonomic groups can be used for the identification at the genus or species level. Slowly evolving regions among groups can be used to differentiate at the section or family level. An ideal DNA region for identifying Chinese medicinal materials should have high inter-specific variation but low intra-specific variation and have sufficient informative polymorphic sites to allow differential sequence alignment among the samples and the reference species. The evolutionary rate of the same DNA region may vary among animals, plants and fungi. For example, mitochondrial cytochrome c oxidase subunit 1 (COI) is suitable for the identification of specific animal species [15]; however, it is not suitable for most plants as few polymorphic sites are found across the 1.4 kb COI sequences [16], probably due to the slow mutation rate [17]. Thus, prior knowledge of the evolutionary rates of various DNA regions facilitates the selection of an appropriate DNA region. In the past few years, short DNA sequences for global barcoding of species have been proposed [15]. 
For example, the DNA barcode for animals is COI and that for fungi is ITS; the core DNA barcodes for plants are the chloroplast large subunit of ribulose-bisphosphate carboxylase gene (rbcL) and the chloroplast maturase K coding region (matK), while the chloroplast trnH-psbA intergenic spacer (trnH-psbA) and the nuclear internal transcribed spacer (ITS) are supplementary DNA barcodes for plants [15,18,19]. Recent studies suggested that ITS should be incorporated into the core DNA barcode for seed plants [20][21][22]. These DNA barcodes have also been commonly applied to identify medicinal materials and should be considered as the primary DNA target regions for FINS [23]. Chinese medicinal materials are often dried or processed, which may affect the quality and quantity of the extractable DNA. A shorter DNA region should be considered for samples with degraded DNA. The universal primers for PCR amplification of some commonly used regions in FINS are listed in Table 1.
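The criterion of high inter-specific but low intra-specific variation can be checked numerically on a candidate region. The following is a minimal sketch, assuming the sequences are already aligned and using the uncorrected p-distance; the function names, the species labels and the toy sequences are hypothetical and serve only as illustration.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def barcoding_gap(records):
    """Return (largest intra-specific, smallest inter-specific) distance.

    records is a list of (species_label, aligned_sequence) tuples.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    return (max(intra) if intra else 0.0,
            min(inter) if inter else None)


# Toy data (hypothetical): two individuals for each of two species.
records = [
    ("species_A", "ACGTACGTAC"),
    ("species_A", "ACGTACGTAT"),
    ("species_B", "ACGAACTTAC"),
    ("species_B", "ACGAACTTAC"),
]
max_intra, min_inter = barcoding_gap(records)
print(max_intra, min_inter)
```

A region is a reasonable candidate when the smallest inter-specific distance clearly exceeds the largest intra-specific distance; how large that margin should be is a judgment that depends on the taxa concerned.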

Determination of nucleotide sequences
DNA sequencing is the most direct approach to obtaining maximum genetic information of the amplified DNA regions. With significantly lowered costs and time, DNA sequencing is now routinely used to identify medicinal materials. The amplified and purified DNA fragments may be sequenced directly; however, molecular cloning may be applied in some cases. Cloning is required if (1) some DNA regions (e.g. ITS and 5S rRNA gene spacer) have non-homogeneous multiple copies or secondary structures [24,25]; (2) non-specific PCR amplification generates multiple amplicons of similar size; (3) there is simultaneous amplification of DNA from samples and fungal contaminants (e.g. due to improper storage); and (4) there is a poly-A/T structure (e.g. in trnH-psbA) interfering with the DNA sequencing [26].
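The four situations listed above can be written down as a simple checklist. The sketch below is illustrative only: the class name, the field names, the set of multi-copy regions and the homopolymer threshold of 10 bases are all assumptions made for the example, not values taken from the cited studies.

```python
import re
from dataclasses import dataclass

# Regions named in the text as potentially having non-homogeneous copies.
MULTICOPY_REGIONS = {"ITS", "5S rRNA gene spacer"}


@dataclass
class AmpliconReport:
    region: str              # e.g. "ITS", "trnH-psbA", "rbcL"
    sequence: str = ""       # direct sequencing read, if one was obtained
    band_count: int = 1      # amplicons of similar size observed
    fungal_hit: bool = False # read or band attributed to a fungal contaminant


def cloning_reasons(report, min_run=10):
    """List the reasons (if any) why cloning should be considered.

    min_run is an assumed threshold for a problematic poly-A/T stretch.
    """
    reasons = []
    if report.region in MULTICOPY_REGIONS:
        reasons.append("multi-copy region")
    if report.band_count > 1:
        reasons.append("multiple amplicons")
    if report.fungal_hit:
        reasons.append("fungal co-amplification")
    if re.search(rf"A{{{min_run},}}|T{{{min_run},}}", report.sequence.upper()):
        reasons.append("poly-A/T run")
    return reasons


report = AmpliconReport("trnH-psbA", "ACGT" + "A" * 12 + "CGT")
print(cloning_reasons(report))
```

An empty list suggests that direct sequencing of the purified amplicon may be sufficient; any entry is a prompt for closer inspection rather than an automatic decision.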

Phylogenetic analysis with reference to a sequence database
A sequence database is necessary because a successful application of FINS relies on the comparison of DNA sequences among the samples and reference species. Phylogenetic analyses of many taxa using various DNA regions have been performed, providing useful reference for FINS. Our group has recently constructed an online Medicinal Materials DNA Barcoding Database http:// With the vast amount of sequence data, it is possible to roughly identify any unknown sample even if the sequence of its source species is not yet available. However, publicly available DNA sequences are sometimes incorrect or derived from wrongly identified species [6,28,29]. Generation of tailor-made reference sequences is essential if the concerned reference sequences do not exist and high resolution identification is required. The original idea of FINS is to perform phylogenetic analysis of unknown samples together with the reference species to trace their source origin [4], which is different from molecular identification based solely on multiple sequence alignment and comparison of polymorphic sites. FINS emphasizes the use of phylogenetic analysis to identify species via phylograms [4]. In general, phylogenetic analysis starts from a carefully constructed sequence alignment in which the informative homologous sites are identified for subsequent analysis. Phylogenetic trees are then constructed using tree construction methods, such as maximum parsimony (MP), maximum likelihood (ML) and Bayesian analysis, to reflect the evolutionary history of the concerned taxa. Available computer programs for constructing multiple sequence alignments and performing phylogenetic analysis include Align-M, ClustalW, BioEdit, PAUP and MEGA [30].
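As an illustration of what an informative site means in a parsimony setting, the sketch below lists the parsimony-informative columns of an alignment, that is, columns in which at least two different bases each occur in at least two sequences. It is a toy example with hypothetical sequence names and data, not a substitute for the programs named above.

```python
from collections import Counter


def informative_sites(alignment):
    """Return 0-based indices of parsimony-informative columns.

    alignment maps sequence names to aligned sequences of equal length.
    Gaps and ambiguous bases are ignored when counting states.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    sites = []
    for i in range(lengths.pop()):
        counts = Counter(seq[i].upper() for seq in alignment.values())
        shared_states = [base for base in "ACGT" if counts[base] >= 2]
        if len(shared_states) >= 2:
            sites.append(i)
    return sites


# Hypothetical alignment: one unknown sample and three reference sequences.
alignment = {
    "sample": "ACGTTA",
    "ref_1":  "ACGTTA",
    "ref_2":  "ATGATA",
    "ref_3":  "ATGACA",
}
print(informative_sites(alignment))  # [1, 3]
```

Column 4 in this example is variable but not informative, because the variant base occurs in only one sequence and therefore cannot group any two sequences together.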
To identify medicinal materials, FINS users are more interested in the identity of a sample than in its phylogenetic relationship with the reference species. The position of the sample in the phylogram is the major concern, while the phylogenetic relationship among the reference species receives less attention in FINS identification. It was suggested that DNA distance-based methods are preferred to phylogeny-based methods in the application of FINS for identification [31]. The DNA distance-based method provides similarities between the reference and the unknown species whereas the phylogeny-based method explores the evolutionary history of the species. The major difference between these two methods lies in the way that the DNA sequences are analyzed. Phylogeny-based methods, such as maximum parsimony and maximum likelihood, use a matrix of discrete phylogenetically informative characters or statistical models to infer the optimal phylogenetic trees of selected taxa. Distance-matrix methods, such as the unweighted pair group method with arithmetic mean (UPGMA) and neighbor-joining (NJ), calculate the genetic distances from multiple sequence alignments to determine the similarities among the sequences. A dendrogram is then constructed from the pair-wise distance values to represent the relationship of similarity. The distance-matrix methods are simple to implement and do not imply any evolutionary relationship, because species that appear similar are not necessarily phylogenetically related (e.g. through convergent evolution).
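The distance-based approach can be sketched in a few lines. The example below computes uncorrected p-distances from an existing alignment, reports the closest reference and clusters all sequences by UPGMA. It is a minimal sketch under stated assumptions: the species labels and sequences are invented, the tree is returned as nested tuples without branch lengths, and no distance correction or bootstrap support is included.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites, ignoring gaps and ambiguous bases."""
    pairs = [(x, y) for x, y in zip(a, b) if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def nearest_reference(sample, references):
    """Name of the reference sequence with the smallest distance to sample."""
    return min(references, key=lambda name: p_distance(sample, references[name]))


def upgma(seqs):
    """Cluster aligned sequences by UPGMA; return the topology as nested tuples."""
    names = list(seqs)
    clusters = {i: (name, 1) for i, name in enumerate(names)}  # id -> (tree, size)
    dist = {(i, j): p_distance(seqs[names[i]], seqs[names[j]])
            for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)
    while len(clusters) > 1:
        # Merge the closest pair; ties are broken by cluster id.
        i, j = min(dist, key=lambda pair: (dist[pair], pair))
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        new_dist = {}
        for k in clusters:
            d_ik = dist[tuple(sorted((i, k)))]
            d_jk = dist[tuple(sorted((j, k)))]
            # Size-weighted average keeps every leaf equally weighted.
            new_dist[(k, next_id)] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        dist = {p: d for p, d in dist.items() if i not in p and j not in p}
        dist.update(new_dist)
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    return next(iter(clusters.values()))[0]


# Hypothetical aligned reference sequences and an unknown sample.
references = {
    "genuine_sp":    "ACGTACGTACGT",
    "adulterant_sp": "ACGAACTTACGA",
    "outgroup_sp":   "TCGAATTTGCGA",
}
sample = "ACGTACGTACGA"

closest = nearest_reference(sample, references)
tree = upgma({"sample": sample, **references})
print(closest)
print(tree)
```

In this toy data set the sample pairs with the genuine species before either joins the adulterant, which is the kind of grouping a FINS user looks for; with real data the grouping should be judged together with the actual distance values.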

Applications of FINS in Chinese medicinal materials
Over 800 medicinal species are officially recorded in the Pharmacopoeia of the People's Republic of China [32]. Some of these Chinese medicinal materials are economically important and ecologically valuable, such as Dendrobii Caulis (Shihu), while some others are highly toxic, such as Aristolochiae Fructus (Madouling) and Radix Tripterygii Wilfordii (Leigongteng). FINS is one of the most definitive methods to ensure they are used safely and to protect consumers from adulteration. Over the years, FINS has been used to identify economically important materials and ecologically valuable species, as well as toxic and commonly used Chinese medicinal materials. Examples of the identification of these Chinese medicinal materials using FINS are given in Table 2.

Requirements for FINS used to authenticate Chinese medicinal materials
FINS has four major requirements on its application in identifying Chinese medicinal materials. Firstly, the success of FINS identification is highly dependent on the quality and number of the reference sequences [6,28,29]. Confirmation of the authenticity of the reference sequences or generation of tailor-made sequences may be costly and time-consuming. Secondly, FINS requires a reference database to identify any single Chinese medicinal material. Therefore, it is important to select and/or construct various databases with different reference species and different DNA regions to identify a mixture of Chinese medicinal materials. Thirdly, similar to other molecular identification techniques, FINS requires a sufficient amount of good quality DNA. Some Chinese medicinal materials are derived from plant parts with low DNA content and have been highly processed (e.g. by heating, boiling or sun-drying). As a result, DNA can be damaged to the point where only very short fragments (< 200 base pairs) are left [11,33]. These short DNA fragments may not possess sufficient informative characters for high resolution FINS identification. ITS2 may be a good region for FINS because of its small size (200-300 base pairs) and its high variability in plants and animals [20], although molecular cloning may be needed to overcome the problem of multiple copies and secondary structure [24,25]. Fourthly, contamination by fungal species is common in Chinese medicinal materials. When nuclear DNA regions, such as ITS and 5S rRNA gene spacer, are used, primers specific to the source material are required so that the contaminants are not amplified.
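Whether a short fragment retains enough polymorphism can be explored on the reference alignment before any primers are designed. The sketch below slides a window of fixed width along an alignment and reports the window containing the most variable columns. The function name, the toy sequences and the window width of 4 are hypothetical; for degraded samples the width would be chosen to match the fragment length that can still be amplified.

```python
def most_variable_window(seqs, width):
    """Return (start, count) of the window with the most variable columns.

    seqs is a list of aligned sequences of equal length. The first window
    reaching the maximum count is reported.
    """
    n = len(seqs[0])
    if any(len(s) != n for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    if not 0 < width <= n:
        raise ValueError("width must be between 1 and the alignment length")
    # A column is variable if more than one character occurs in it.
    variable = [len({s[i] for s in seqs}) > 1 for i in range(n)]
    count = sum(variable[:width])
    best_start, best_count = 0, count
    for start in range(1, n - width + 1):
        # Update the running count as the window moves one column right.
        count += variable[start + width - 1] - variable[start - 1]
        if count > best_count:
            best_start, best_count = start, count
    return best_start, best_count


# Hypothetical reference alignment with variation near the 3' end.
seqs = ["AAAAACGTAA",
        "AAAAATGCAA",
        "AAAAAGGAAA"]
print(most_variable_window(seqs, 4))  # (4, 2)
```

Counting variable columns is only a rough proxy for resolution; a candidate window would still need to be checked for its ability to separate the species of interest.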

Conclusion
Used to authenticate genuine medicinal materials, FINS traces the identities of DNA samples at different taxonomic levels. High resolution FINS is expected to be useful in the authentication and quality control of Chinese medicinal materials.