Network pharmacological identification of active compounds and potential actions of Erxian decoction in alleviating menopause-related symptoms

Background Erxian decoction (EXD) is used to treat menopause-related symptoms in Chinese medicine. This study aims to identify the bioactive compounds and potential actions of EXD by network pharmacological analysis. Methods Two databases, the Traditional Chinese Medicine Systems Pharmacology database and TCM Database@Taiwan, were used to retrieve the reported phytochemicals of EXD. STITCH 4.0 and the Comparative Toxicogenomics Database were used to search for compound–protein and compound–gene interactions, respectively. DAVID Bioinformatics Resources 6.7 and Cytoscape 3.01 with the JEPETTO plugin were used to perform a network pharmacological analysis of EXD. Results A total of 721 compounds were identified in EXD. Of these, 155 exhibited 2,656 compound–protein interactions with 1,963 associated proteins in the STITCH 4.0 database, and 210 had 14,893 compound–gene interactions with 8,536 associated genes in the Comparative Toxicogenomics Database. Sixty-three compounds of EXD satisfied Lipinski's rule with oral bioavailability (OB) ≥30% and drug-likeness (DL) index ≥0.18, of which 20 were related to the 34 significant pathways or the 12 gene-associated disease classes linked to menopause. Conclusions Twenty compounds were identified by network pharmacology as potential effective ingredients of EXD for relieving menopausal symptoms, with acceptable oral bioavailability and druggability. Electronic supplementary material The online version of this article (doi:10.1186/s13020-015-0051-z) contains supplementary material, which is available to authorized users.


Background
By the age of 35 years, the quality and quantity of ovarian follicles decline [1], and the consequent hormonal and symptomatic changes eventually lead to cessation of menses [2]. During menopause, the fluctuating levels of sex hormones, including luteinizing hormone, follicle-stimulating hormone, estrogen, and progesterone [3], can cause osteoporosis and menopausal symptoms, such as hot flushes, depression, nocturnal sweating, uterine bleeding, vaginal dryness, insomnia, and loss of sexual function [4][5][6]. It is estimated that there will be about 1.2 billion menopausal women worldwide by 2030 [7]. Menopause occurs between 44.6 and 52 years of age, varying among races and countries [8]. In the United States, about 6,000 women reach menopause every day, which is more than 2 million per year [7]. The average age of menopause in the United Kingdom and United States is 52 and 51 years, respectively [9,10]. In China, women experience natural menopause at around 50 years of age, and women in the southeast of China reach menopause at an average age of 48.9 years [11,12]; thus, an estimated 0.28 billion women who will be over the age of 50 years by 2030 would have reached menopause [13].
Hormone replacement therapy (HRT) has been used for more than 60 years to relieve menopausal symptoms. However, HRT is associated with many adverse effects [14], such as increased risks of breast cancer, coronary artery disease, endometrial cancer, venous thromboembolism, and stroke [15].
During the past two decades, drug discovery has been dominated by the "one drug, one disease" paradigm. However, many drugs exert therapeutic effects by restoring multiple disease-related targets rather than a single one [36,37]. Network pharmacology, which is based on systems biology, polypharmacology, and molecular network analysis, provides a possible strategy to elucidate the mechanism of action of multi-ingredient medicines from a holistic view [38][39][40]. Molecular networks are constructed from the interactions of target proteins and genes in order to predict their function and facilitate drug discovery, and they provide pharmacological information in a holistic manner [40,41]. Enrichment analysis is an analytical method that assesses functional associations between a set of genes or proteins of interest and a database of known gene or protein sets [42,43]. It can identify significant pathways and their enriched gene or protein sets, and can thereby elucidate multiple pharmacological mechanisms [42,44].
The numerous chemical constituents of EXD and their biological actions have not been fully characterized. This study aims to identify the bioactive compounds and actions of EXD by network pharmacological analysis.

Methods
The constituent compounds of EXD were identified from two phytochemical databases, the Traditional Chinese Medicine Systems Pharmacology (TCMSP) database and TCM Database@Taiwan, as well as from the published literature on EXD [26-30, 35, 45, 46]. The druggability of the identified compounds in EXD was assessed by Lipinski's rule (LR) and by the oral bioavailability (OB) and drug-likeness (DL) indices provided by the TCMSP database. OB is the degree to which a drug or other substance becomes available to the target tissue after oral administration. DL evaluates the potential of a compound to be bioactive by comparing it with well-developed drugs. The significant pathways and gene-associated diseases for the identified compounds were determined by enrichment analysis of the compound-protein interactions (JEPETTO (US): http://apps.cytoscape.org/apps/jepetto) [43] and enrichment analysis of the compound-gene interactions (DAVID 6.7 (US): http://david.abcc.ncifcrf.gov/home.jsp) [47], respectively. The workflow of the network pharmacology study of EXD is summarized in Figure 1.

Druggability analysis by LR, OB and DL properties
Lipinski's rule (LR) [48] was used to identify druggable compounds according to the following criteria: molecular weight (MW) of not more than 500 Da (MW ≤500), no more than five hydrogen bond donors (H-bond donors ≤5), no more than 10 hydrogen bond acceptors (H-bond acceptors ≤10), and an octanol-water partition coefficient, LogP, of not more than 5 (LogP ≤5). A compound that violates two or more of the above conditions is less likely to be an orally active drug [49].
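The four criteria can be expressed as a simple violation count. The following is a minimal sketch, not the procedure used in the study: the `Compound` record and its field names are hypothetical, and the descriptor values are assumed to be precomputed (for example, taken from a database) rather than calculated here. Whether the study required all four criteria or tolerated a single violation is not stated in this section, so the tolerance is left as a parameter.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    """Hypothetical record; descriptor values are assumed to be precomputed."""
    name: str
    mw: float          # molecular weight in Da
    h_donors: int      # number of hydrogen bond donors
    h_acceptors: int   # number of hydrogen bond acceptors
    logp: float        # octanol-water partition coefficient


def lipinski_violations(c: Compound) -> int:
    """Count how many of the four Lipinski criteria the compound fails."""
    return sum([
        c.mw > 500,
        c.h_donors > 5,
        c.h_acceptors > 10,
        c.logp > 5,
    ])


def passes_lipinski(c: Compound, max_violations: int = 1) -> bool:
    """Assumed cutoff: a compound passes if it fails at most max_violations criteria."""
    return lipinski_violations(c) <= max_violations


# Illustrative descriptor values only, not taken from the study's data.
example = Compound("example", mw=302.2, h_donors=5, h_acceptors=7, logp=1.5)
print(lipinski_violations(example), passes_lipinski(example))
```

Setting `max_violations=0` gives the strict reading in which every criterion must hold.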
The phytochemical information of the compounds, together with their OB and DL properties, was retrieved from the TCMSP database, which embeds the OBioavail 1.1 software for OB [50] and a Tanimoto similarity calculation for DL [51]. The DL calculations in the TCMSP database are based on the following formula [51]:

$$T(A, B) = \frac{A \cdot B}{|A|^2 + |B|^2 - A \cdot B}$$

where A represents the molecular properties of the target compound and B represents the average molecular properties of all drugs in the Drugbank database (http://www.drugbank.ca/). A more detailed calculation of the DL index can be found in Tao et al. [51] and Wang et al. [52]. The thresholds used were OB ≥30% and DL index ≥0.18, as recommended by the TCMSP database. The thresholds were selected to efficiently identify bioactive compounds from the large pool of chemical compounds based on the following criteria: (1) the model obtained could be reasonably explained by previous pharmacological data and (2) the compound met the recommended mean DL index of 0.18 (the mean DL index of 6,511 molecules from the Drugbank database (2011) is 0.18) [51,52].
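As a rough illustration of the DL calculation and the threshold screen, the sketch below implements the continuous Tanimoto coefficient on plain descriptor vectors and applies the OB and DL cutoffs stated above. It assumes the standard Tanimoto form given in the code comment; the descriptor vectors, function names, and example values are hypothetical and do not reproduce the TCMSP descriptor set.

```python
from typing import Sequence


def tanimoto(a: Sequence[float], b: Sequence[float]) -> float:
    """Continuous Tanimoto coefficient: A.B / (|A|^2 + |B|^2 - A.B)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a)
    norm_b = sum(y * y for y in b)
    denom = norm_a + norm_b - dot
    # Two all-zero vectors carry no information; treat their similarity as 0.
    return dot / denom if denom else 0.0


def passes_ob_dl(ob_percent: float, dl: float,
                 ob_min: float = 30.0, dl_min: float = 0.18) -> bool:
    """Apply the OB >= 30% and DL >= 0.18 thresholds."""
    return ob_percent >= ob_min and dl >= dl_min


# Hypothetical descriptor vectors: a compound and an average-drug profile.
compound_vec = [1.0, 2.0, 3.0]
drug_mean_vec = [1.0, 2.0, 3.0]
print(tanimoto(compound_vec, drug_mean_vec))  # identical vectors give 1.0
print(passes_ob_dl(ob_percent=45.0, dl=0.25))
```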

Identification of associated proteins and genes
The integrative efficacy of the identified constituents in EXD was assessed by analyzing the chemical-protein and chemical-gene interactions obtained from the Search Tool for Interactions of Chemicals and Proteins (STITCH) database and the Comparative Toxicogenomics Database (CTD), respectively. The STITCH 4.0 database (http://stitch.embl.de/) can be used to study potential interactions between 300,000 chemicals and 2.6 million proteins curated from 1,133 organisms [53]. In this database, the approximate probability of a predicted chemical-protein association is expressed by the confidence score, with a higher score indicating a more reliable interaction (low confidence score ~0.2; medium confidence score ~0.5; high confidence score ~0.75; highest confidence score ~0.95, as provided by the STITCH 4.0 database). The CTD (http://ctd.mdibl.org/) is a publicly available research resource that includes more than 116,000 interactions between 9,300 chemicals and 13,300 genes [54]. Both databases were searched independently by two researchers to minimize bias.
To identify the associated significant pathways, proteins with a chemical-protein interaction confidence score ≥0.5 were selected for enrichment analysis against the KEGG database using JEPETTO, a Java-based plugin for Cytoscape 3.01 [43]. To study the gene-associated diseases, the genes were first ranked by their frequency of occurrence in the chemical-gene interactions, and genes with a frequency ≥1.67 were then chosen for enrichment analysis with the Database for Annotation, Visualization and Integrated Discovery (DAVID) Bioinformatics Resources 6.7 (http://david.abcc.ncifcrf.gov/).
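The protein selection step can be sketched as a filter on the confidence score followed by de-duplication. This is only an illustration: the `Interaction` record is a hypothetical layout rather than the STITCH download format, and the compound names, protein names, and scores are placeholders.

```python
from typing import Iterable, NamedTuple, Set


class Interaction(NamedTuple):
    """Hypothetical record for one chemical-protein interaction."""
    compound: str
    protein: str
    score: float  # confidence score on a 0-1 scale


def proteins_for_enrichment(interactions: Iterable[Interaction],
                            min_score: float = 0.5) -> Set[str]:
    """Unique proteins with at least one interaction scoring min_score or higher."""
    return {i.protein for i in interactions if i.score >= min_score}


# Placeholder records; names and scores are made up for illustration.
records = [
    Interaction("compound_a", "PROT1", 0.92),
    Interaction("compound_b", "PROT1", 0.40),
    Interaction("compound_b", "PROT2", 0.45),
    Interaction("compound_c", "PROT3", 0.50),
]
selected = proteins_for_enrichment(records)
print(sorted(selected))
```

A protein is kept if any one of its interactions reaches the threshold, so `PROT1` survives even though one of its two records scores below 0.5.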

Compounds in EXD
Eight hundred and ninety-five phytochemicals were collected from the six herbs in EXD. From the TCM Database@Taiwan, 203 compounds were identified, comprising 29 in HE, 44 in RC, 38 in RMO, 56 in RAS, seven in CPC, and 29 in RA. From the TCMSP database, 646 compounds were identified, comprising 130 in HE, 78 in RC, 174 in RMO, 125 in RAS, 58 in CPC, and 81 in RA. A further 46 phytochemicals were collected from previous studies in the literature [26-30, 35, 45, 46], comprising 15 in HE, one in RC, five in RMO, five in RAS, 14 in CPC, five in RA, and one in EXD (specific herb unknown). Finally, a total of 721 phytochemicals were identified in EXD after removing overlapping or duplicate compounds from the databases and the literature (Additional file 1).

Identifying druggable compounds by LR, OB, and DL predictions
Of the 150 compounds from HE, 75 (50%) compounds were identified based on LR, 23

Revealing the significant pathways and gene-associated diseases
Overall, 155 of the 721 compounds from EXD were found to have 2,656 chemical-protein interactions. After removing overlapping or duplicate information, 1,963 associated proteins were obtained (Additional file 2). Of these 1,963 proteins, 1,824 had a confidence score exceeding 0.5. Enrichment analysis of the 1,824 associated proteins yielded XD-scores and q values for the pathways. The XD-score is relative to the average distance to all pathways and represents a deviation from the average distance [43]. A larger positive XD-score indicates a stronger association between the input proteins and the molecular interaction network of a pathway. The q value indicates the significance of the overlap (Fisher's exact test) between the input information and the pathways. The graph-based enrichment analysis of XD-scores and q values gave an XD-score threshold of 0.67 in this study; accordingly, 34 pathways were significantly associated with the input set of proteins (Table 3).
In total, 210 of the 721 compounds from EXD were found to have 14,893 compound-gene interactions with 8,536 associated genes in the CTD (Additional file 3). The 8,536 genes were then ranked according to their frequency of occurrence. The number of genes fell abruptly when the frequency of occurrence was small (gene frequency ≤8; Figure 2) and stabilized for gene frequencies between 10 and 19, whereas the number of genes with gene frequencies ≥20 was quite small. Genes with gene frequencies below the average of 1.74 were removed to reduce the number of redundant genes. The remaining 2,183 genes were used for gene enrichment analysis on the DAVID platform. "GENETIC_ASSOCIATION_DB_DISEASE_CLASS" was selected as the annotation category to search for the diseases significantly associated with the input genes, and the association was statistically verified by Fisher's exact test on the DAVID platform [47]. P ≤ 0.01 was taken to indicate significant association or enrichment with the related items. After removing nonspecific diseases, 12 classes of diseases were found to be highly associated with the input genes (Tables 4 and 5). Most of these diseases were related to menopause, such as aging, reproduction, cancer, cardiovascular diseases, and neurological diseases [55][56][57][58].
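The overlap significance mentioned above can be illustrated with a one-sided Fisher's exact test, computed here as a hypergeometric tail probability. This is a sketch under stated assumptions and not JEPETTO's implementation: the XD-score and the adjustment from P values to q values are not reproduced, and the function name and the counts in the example are hypothetical.

```python
from math import comb


def overlap_p_value(universe: int, pathway_size: int,
                    input_size: int, overlap: int) -> float:
    """One-sided Fisher's exact test: P(X >= overlap) under the hypergeometric null.

    universe     -- number of background proteins
    pathway_size -- proteins annotated to the pathway
    input_size   -- proteins in the input set
    overlap      -- proteins found in both the input set and the pathway
    """
    max_overlap = min(pathway_size, input_size)
    total = comb(universe, input_size)
    tail = sum(
        comb(pathway_size, k) * comb(universe - pathway_size, input_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Made-up counts: 20 background proteins, a pathway of 5, an input set of 5,
# and all 5 input proteins falling in the pathway.
print(overlap_p_value(universe=20, pathway_size=5, input_size=5, overlap=5))
```

A small P value means that an overlap at least this large would rarely arise if the input proteins were drawn at random from the background.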
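With the totals reported above, 14,893 interactions spread over 8,536 genes give a mean frequency of about 1.74; because frequencies are whole numbers, removing genes below the mean amounts to keeping genes that occur at least twice. The ranking and filtering step can be sketched as follows. The function name and the (compound, gene) pair layout are assumptions for illustration, and the example pairs are placeholders rather than CTD records.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def frequent_genes(pairs: Iterable[Tuple[str, str]]) -> List[str]:
    """Rank genes by interaction count and keep those at or above the mean count.

    pairs -- (compound, gene) interaction records
    """
    counts = Counter(gene for _, gene in pairs)
    if not counts:
        return []
    mean_freq = sum(counts.values()) / len(counts)
    # most_common() orders genes from highest to lowest frequency.
    return [gene for gene, n in counts.most_common() if n >= mean_freq]


# Placeholder interaction pairs: G1 occurs 3 times, G2 twice, G3 and G4 once.
pairs = [
    ("c1", "G1"), ("c2", "G1"), ("c3", "G1"),
    ("c1", "G2"), ("c2", "G2"),
    ("c1", "G3"),
    ("c1", "G4"),
]
print(frequent_genes(pairs))           # mean is 7 / 4 = 1.75
print(round(14893 / 8536, 2))          # mean frequency from the reported totals
```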

Identifying twenty bioactive compounds related to menopause following the druggability prediction
Eighteen of the 155 compounds with chemical-protein interactions satisfied Lipinski's rule with OB ≥30% and DL index ≥0.18. Thirteen of the 210 compounds with compound-gene interactions satisfied Lipinski's rule with OB ≥30% and DL index ≥0.18. Eleven compounds were related to both chemical-gene and chemical-protein interactions and met the druggability prediction. Overall, 20 compounds related to the 34 significant pathways or the 12 gene-associated disease classes linked to menopause were identified (Table 3).
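The reported counts are consistent with the 20 compounds being the union of the two druggable groups, since 18 + 13 − 11 = 20. This reading is an assumption rather than something the text states outright. A small sketch with placeholder identifiers (not the actual compound names) shows the set arithmetic:

```python
# Placeholder identifiers only; the overlap is constructed to match the counts.
protein_linked = {f"cpd_{i}" for i in range(1, 19)}   # 18 compounds
gene_linked = {f"cpd_{i}" for i in range(8, 21)}      # 13 compounds

shared = protein_linked & gene_linked       # linked to both proteins and genes
candidates = protein_linked | gene_linked   # linked to proteins or genes

print(len(protein_linked), len(gene_linked), len(shared), len(candidates))
```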

Discussion
The actions of bioactive compounds in EXD were investigated by combining a drug prediction method with an enrichment analysis using information from bioinformatics databases at the gene and protein levels. For example, candidate compounds such as berberine, palmatine, and jatrorrhizine, which were identified by the drug prediction method, have been shown to exhibit extensive pharmacological activities [59,60]. From the enrichment analysis based on the available information on compound-protein and compound-gene interactions of EXD, we identified the most significantly related pathways and gene-associated diseases, including pathways related to endocrine function [35], VEGF signaling [61], lipid metabolism [62], and inflammation [34]. Their pharmacological association with EXD was in line with previous publications [34,35,61,62].

Figure 3
Chemical-protein interactions related to steroid hormone biosynthesis pathways. The grey color represents genes in the target set, green relates to the steroid hormone biosynthesis pathway, blue (labeled) is the overlap between the related pathway and the input protein set.
Several pathways involving the endocrine system were also identified, such as steroid hormone biosynthesis, the GnRH signaling pathway, and the adipocytokine signaling pathway, consistent with our group's previous finding that EXD promotes estradiol biosynthesis in an animal study [35]. In the steroid hormone biosynthesis pathway, the EXD compound quercetin promoted the expression of aromatase (CYP19A1), the enzyme responsible for estrogen biosynthesis [63]. This compound also met the druggability criteria. Other important overlapping proteins were HSD11B1, SULT2B1, CYP1A1, COMT, and CYP1B1 (Figure 3).

Figure 4
Chemical-protein interactions related to the VEGF signaling pathway. The grey color represents genes in the target set, green relates to the VEGF pathway, and blue (labeled) is the overlap between the related pathway and the input protein set. Orange marks the expansion of their pathways.
While a previous study showed EXD to have anti-inflammatory activity [34], the present study suggested that the pathways involved include the Toll-like receptor signaling pathway, NOD-like receptor signaling pathway, and Fc epsilon RI signaling pathway [70][71][72]. These findings were consistent with previous studies on the antimetastatic activity of EXD in a human ovarian cancer model [73] and its antiangiogenic properties [61].
Compound-compound interactions were not considered in this study because the available databases could only provide limited information for the six individual herbs. The databases did not cover new compounds formed by chemical reactions during the decoction of EXD's ingredients; these will be investigated by liquid chromatography coupled with mass spectrometry in a further study. The ranking of the compound-gene and compound-protein interaction information was based on published evidence, but the quality of this evidence still needs extensive assessment. This study exemplified how to screen and identify bioactive compounds in CHFs.

Conclusions
Twenty compounds were identified by network pharmacology as potential effective ingredients of EXD for menopause with acceptable oral bioavailability and druggability.