
Predictive QSAR model confirms flavonoids in Chinese medicine can activate voltage-gated calcium (CaV) channel in osteogenesis



Flavonoids in Chinese medicine have been shown in animal studies to aid osteogenesis and bone formation. However, there is no consensus on the mechanism by which these phytochemicals act on bone-forming osteoblasts, and consequently no prediction model for screening chemicals for this specific biochemical function has been established. The purpose of this study was to develop a novel and effective approach for selecting flavonoids by predicting their bone-forming ability via activation and inhibition of osteoblastic voltage-gated calcium (CaV) channels using molecular modelling techniques.


A quantitative structure–activity relationship (QSAR) model built with a supervised machine-learning approach was applied in this study to predict the behavioral manifestations of flavonoids in the CaV channels, and to develop a statistical correlation between the biochemical features and the behavioral manifestations of 24 compounds (training set: Kaempferol, Taxifolin, Daidzein, Morin, Scutellarein, Quercetin, Apigenin, Myricetin, Tamarixetin, Rutin, Genistein, 5,7,2′-Trihydroxyflavone, Baicalein, Luteolin, Galangin, Chrysin, Isorhamnetin, Naringin, 3-Methyl galangin, Resokaempferol; test set: 5-Hydroxyflavone, 3,6,4′-Trihydroxyflavone, 3,4′-Dihydroxyflavone and Naringenin). Based on a statistical algorithm, QSAR provides a reasonable basis for establishing a predictive correlation model from a variety of molecular descriptors that can identify and analyse the biochemical features of flavonoids engaged in activating or inhibiting the CaV channels of osteoblasts.


The model showed that these flavonoids have strong activating effects on the CaV channel for osteogenesis. In addition, scutellarein was ranked the highest among the screened flavonoids, and lower-ranked compounds such as daidzein, quercetin, genistein and naringin followed the same descending order as in previous animal studies.


This predictive modelling study has confirmed and validated the biochemical activity of the flavonoids in the osteoblastic CaV activation.


Flavonoids are polyphenol compounds that are categorized according to their chemical structures into distinct groups, namely flavonols, flavones, flavanones, flavan-3-ols, and isoflavones. These flavonoids are widely present in various agricultural foods, natural products and Chinese medicine, and they can exert various health-promoting effects in the human body depending on their chemical structures [1]. Flavonols are one class of flavonoids, and some of these compounds, e.g. kaempferol, quercetin, quercitrin, rutin and myricetin, are commonly found in Hippophae rhamnoides, Hypericum perforatum, and Cacumen platycladi [2]. In addition, various flavones such as apigenin, isovitexin, luteolin and vitexin can be obtained from Fructus viticis and Perilla frutescens [3, 4]. Flavanones are another class of flavonoids that can be found easily in Radix scutellariae and Cuscuta chinensis. For example, naringenin can reduce cholesterol levels [5], hesperidin can reduce inflammation by suppressing lipopolysaccharide (LPS)-elicited and infection-induced tumor necrosis factor alpha (TNF-α) production [6], and naringin can be used in bone graft material to induce osteogenesis [7]. Flavan-3-ols include the catechins and the catechin gallates. The major compounds are catechin, epicatechin, catechin gallate and epicatechin gallate, which are the active components of green tea leaves (Camellia sinensis) and have antimutagenic, antitumour, anti-inflammatory and free-radical scavenging activities [8]. The main sources of isoflavones are soy cheese, soy flour, soybean and tofu. Daidzein and genistein are among several known isoflavones [9].

Some flavonoids, such as daidzein [10], quercetin [11], genistein [12] and naringin [7], have been shown in animal studies to aid osteogenesis and bone formation. However, there is no consensus on the mechanism by which these chemicals act on bone-forming osteoblasts. Some evidence has shown that bone resorption [13], cell proliferation [14] and cell signal transport [15] are related to the activation of osteoblastic calcium channels. In particular, L-type calcium channels (e.g. CaV1.2) can mediate changes in Ca2+ inside osteoblasts in response to regulatory agents such as parathyroid hormone (PTH) [16] and vitamin D [17]. However, one study also showed that inhibition of the channels might promote osteoblast differentiation [18]. On the other hand, Saponara et al. [19] have recently shown in whole-cell patch-clamp experiments that 24 flavonoids are either activators or deactivators of the CaV1.2 channel current measured in rat tail artery myocytes. Thus, it seems that the actual physiological mechanisms are unclear [20].

To establish a correlation for flavonoids with known biochemical activity as blockers or activators of Ca2+ channels, quantitative structure–activity relationship (QSAR) modeling may be useful for identifying and screening flavonoids, since QSAR can predict the biochemical activities of new or untested flavonoids of the same class from selected molecular characteristics (descriptors) that correlate with the biochemical activity of activating or blocking CaV channels. QSAR has been used in the fields of medicine, biochemistry, molecular biology and biomaterial science for more than three decades. It is particularly useful in screening and predicting biochemical interactions between, for example, enzymes and complex phytocompounds [21]. More recently, predictive QSAR has been applied either to small, focused, high-quality biochemical datasets (the so-called supervised machine-learning approach), which can readily map biological responses to input feature parameters [22], or to sufficiently large datasets in a “black box” manner (the so-called deep learning or descriptor-free approach) [23], which can assist drug design even for chemicals that do not yet exist [24].

Thus, to consider the specific effects of flavonoids on CaV channel current kinetics, a supervised machine-learning QSAR model can be built from the chemical structures of the flavonoids together with suitable biochemical data (molecular descriptors), such as pKa and patch-clamp experiment data, to provide a better understanding of the structure–activity relationship between the compounds and CaV. These descriptors were chosen selectively for the flavonoids, since an increasing number of molecular descriptors, represented by quantum-chemical and various classical parameters, have been designed and tested as potential variables for QSAR modeling. Quantum-chemical parameters represent a special class of molecular properties. They can be obtained from sophisticated ab initio calculations or by relatively inexpensive semi-empirical methods, but for flavonoids such calculations require more time and effort than one-, two- or three-dimensional classical parameters, which can be computed from the molecular structures of flavonoids within a few minutes. However, in contrast to most classical descriptors, quantum-chemical parameters are capable of expressing all the electronic and geometric properties of the flavonoids being analyzed, as well as their interactions [25]. Therefore, the interpretation of quantum-chemical descriptors can provide much deeper insight into the nature of the flavonoids’ biochemical and physicochemical mechanisms than that of classical descriptors. Descriptors calculated by quantum-chemical approaches that account for specific and non-specific solvation effects are of prime importance.

This paper focuses on QSAR studies that apply quantum descriptors, based on biochemical data from the current literature, to predict the effects of flavonoids on the osteoblast’s CaV channel current, in order to demonstrate the biochemical and physicochemical properties that are important for either activating or deactivating the CaV channel for osteogenesis. As such, we anticipate that a novel and effective approach for selecting flavonoids for CaV activation and inhibition could be developed.

Materials and methods

QSAR equation modeling attempted to calculate the mathematical correlations between the tested compounds’ chemical attributes and their biochemical response. The aim was to establish a statistical formalism of the form: biochemical response of flavonoid = f (flavonoid’s biochemical attributes). The flavonoids’ biochemical attributes were derived from their chemical structure and properties. Hence, the QSAR equation was expressed as:

$${\text{Biochemical}}\;{\text{response}} = f({\text{flavonoid's}}\;{\text{chemical}}\;{\text{structure}}\;{\text{and}}\;{\text{property}}).$$

Stated simply, the QSAR equation could be statistically expressed as:

$$Y = \beta_{0} + \beta_{1} X_{1} + \beta_{2} X_{2} + \beta_{3} X_{3} + \cdots + \beta_{n} X_{n} ,$$

where \(\beta_{0}\) was a constant (the intercept); \(\beta_{1}, \beta_{2}, \beta_{3}, \ldots, \beta_{n}\) were the regression coefficients of the descriptors; \(X_{1}, X_{2}, X_{3}, \ldots, X_{n}\) were the descriptors representing different structural features of the flavonoids; and \(Y\) was the biochemical response.
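As a minimal sketch of how the linear QSAR equation is evaluated for one compound, the following Python function computes Y from an intercept, a list of coefficients and a list of descriptor values. The function name and all numbers are hypothetical placeholders for illustration, not values from this study.

```python
def qsar_response(beta0, betas, descriptors):
    """Evaluate Y = beta0 + sum_j beta_j * X_j for one compound."""
    if len(betas) != len(descriptors):
        raise ValueError("need exactly one coefficient per descriptor")
    return beta0 + sum(b * x for b, x in zip(betas, descriptors))


# Hypothetical intercept, coefficients and descriptor values (illustration only).
beta0 = 1.0
betas = [0.5, -2.0, 0.25]
descriptors = [4.0, 0.5, 8.0]

y = qsar_response(beta0, betas, descriptors)
print(y)
```

In a real QSAR table, each row (compound) would supply its own descriptor list, while the intercept and coefficients are shared by all compounds.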

Steps of QSAR modeling

The four basic steps of the QSAR study were (i) data preparation, (ii) data processing, (iii) data prediction and validation, and (iv) data interpretation. The first step arranged the data in a convenient and usable form. Since the biochemical responses of the flavonoids on the CaV channel were considered as the dependent variable, the input data were the flavonoids’ rates of activation and inhibition retrieved from Saponara et al. [19]. The predictor variables (i.e. molecular descriptors) were obtained from the chemical structure and properties of the flavonoids. After the descriptors were determined and computed, a QSAR table was formed: a two-dimensional (2D) array of numbers in which the columns represented the descriptors and the response, and the compounds were listed in successive rows. As QSAR is essentially a statistical approach, the number of observations was kept higher than the number of descriptors used in the final models to achieve sufficient reliability and robustness. To account for intercorrelated and redundant data, a pretreatment procedure was also used in the data-processing step.

In each step of the QSAR model development, several statistical operations were involved, from the generation of descriptors (which encode structural information) to the pretreatment of data, the division of the data set, the development of the model, and the validation and reliability check of the model. Although partial least squares (PLS) and multiple linear regression (MLR) are common statistical tools for developing QSAR models, with a genetic algorithm (GA) serving as the variable selection method, these techniques may be inappropriate if the \(X_{ij}\) are highly correlated or high dimensional, especially relative to the sample size, which can make variable selection procedures unstable in the regression model:

$$Y_{i} = \beta_{0} + \sum \beta_{j} X_{ij} + \varepsilon_{i} .$$

In these cases, the amount of covariate information could be reduced, because the main focus was on future prediction.

Instead, \(X_{\text{i}}\) was expressed in terms of its principal components (PCs) \(\xi_{j}\):

$$X_{\text{i}} = \mathop \sum \limits_{j = 1}^{p} \alpha_{ij} \xi_{j} .$$

Then, the principal components regression (PCR) model was developed as

$$Y_{i} = \beta_{0}^{'} + \mathop \sum \limits_{j = 1}^{{p^{\prime}}} \beta_{j}^{'} \alpha_{ij} + \varepsilon_{i} ,$$

for some \(p^{\prime} < p.\)

PCR had the advantages that the \(\alpha_{ij}\) were uncorrelated, which strengthened the stability of the estimates, and that dimension reduction made variable selection stable. By choosing \(p^{\prime}\), enough PCs could be used to perform variable selection, capture a higher percentage of the variation, and maximize the adjusted \(r^{2}\). The basic workflow of the QSAR analysis with PCR is depicted in Fig. 1. On successful runs of PCR in the QSAR module of the VLifeMDS 4.3 software (VLife Technologies, Pune, India), QSAR equations were generated to statistically analyze and determine the model.
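The PCR procedure described above can be sketched with NumPy as follows. This is an illustrative implementation on synthetic data, not the algorithm used inside VLifeMDS: the function names, the random data and the choice of two retained components are all assumptions. The scores computed here play the role of the \(\alpha_{ij}\) in the equations above.

```python
import numpy as np


def pcr_fit(X, y, n_components):
    """Principal component regression via SVD of the centred descriptor matrix."""
    x_mean = X.mean(axis=0)
    Xc = X - x_mean
    # Rows of Vt are the principal directions of the centred descriptors.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[:n_components].T             # shape (descriptors, p')
    scores = Xc @ loadings                     # shape (compounds, p')
    # Ordinary least squares of y on an intercept plus the retained scores.
    design = np.column_stack([np.ones(len(y)), scores])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return x_mean, loadings, coef, scores


def pcr_predict(model, X_new):
    """Project new compounds onto the retained PCs and apply the regression."""
    x_mean, loadings, coef, _ = model
    return coef[0] + ((X_new - x_mean) @ loadings) @ coef[1:]


# Synthetic example: 20 "compounds", 6 strongly correlated "descriptors".
rng = np.random.default_rng(0)
latent = rng.normal(size=(20, 2))
X = latent @ rng.normal(size=(2, 6)) + 0.01 * rng.normal(size=(20, 6))
y = 1.5 + latent[:, 0] - 0.5 * latent[:, 1] + 0.05 * rng.normal(size=20)

model = pcr_fit(X, y, n_components=2)
scores = model[3]
score_corr = np.corrcoef(scores, rowvar=False)[0, 1]   # close to zero
y_hat = pcr_predict(model, X)
```

Because the retained score columns are mutually uncorrelated, the regression coefficients are estimated without the instability that correlated raw descriptors would cause.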

Fig. 1

Flowchart of the QSAR formalism

Computation of molecular descriptors

First, the QSAR software was applied to align the 3D structural data of the flavonoids, as shown in Fig. 2, in order to investigate the variation in molecular shape of each molecule. Twenty flavonoids were included in the training set for deriving the QSAR model; their chemical structures, classified by flavonoid subclass, are shown in Table 1, and the acid dissociation constant (pKa, i.e. − log10 Ka) values of the hydroxyl groups are shown in Fig. 2. Moreover, the rates of activation and inhibition from Saponara et al. [19] were used as the dataset, in which the current evoked at 0 mV from a holding potential (Vh) of − 50 mV activated and inactivated with τ of activation ranging between 2.2 and 3.1 ms and τ of inactivation between 92.0 and 127.9 ms (Table 2). Molecular descriptors, which characterized specific information about a flavonoid, were the numerical values associated with the biochemical response for correlating chemical structures with various biochemical properties. In other words, the modeled response was represented as a function of quantitative values of structural features or properties, termed the descriptors of the QSAR model. Ab initio derived electronic properties in combination with topological quantum-chemical descriptors (i.e. “k2alpha”, “Id”, “IdwAverage”, “Most+vePotential”, “MomInertiaY” and “DeltaEpsilonC”) were used to help describe the electronic environment of the flavonoids and to locate the molecular regions responsible for the bioactivity of flavonoids on the CaV channel [26].

Fig. 2

Predicted pKa values (in red colour) for the flavonoids used in this study (Source: PubChem)

Table 1 Chemical structure of flavonoids classified by subclass of flavonoids
Table 2 Rate of activation and inactivation of flavonoids on CaV channel under control conditions

The type of descriptors used, and the extent to which they encode the structural features of the molecules that correlate with the response, are critical determinants of the quality of a QSAR model. The ways in which chemical structures were used to calculate descriptors for the QSAR model are illustrated in Fig. 3. The data set of flavonoids constituted a group of small polyphenol compounds that can both block and enhance the Ca2+ current. Firstly, the half maximal activating/inhibitory concentration (IC50) was taken as the activating/inhibitory activity value. The activity data [IC50 (μM)] were then transformed to the logarithmic scale pIC50, i.e. [− log IC50 (μM)], which was used as the response variable to obtain a linear relationship in the QSAR equation. Secondly, the biochemical database of the study was randomly divided into two subsets: a test set of 4 compounds and a training set of 20 compounds (Table 3). Thirdly, different types of theoretical molecular descriptors were computed for each flavonoid by the docking QSAR software. Finally, two models, namely Model A for CaV activation and Model B for CaV inhibition, were generated by PCR after different combinations of descriptors were screened by the genetic algorithm.
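The first two preparation steps can be illustrated with a short Python sketch. It assumes, as stated in the text, that pIC50 is taken as − log10 of the IC50 expressed in μM; the compound labels, the seed and the helper names are placeholders, so the resulting split does not reproduce the assignment in Table 3.

```python
import math
import random


def pic50_from_ic50_um(ic50_um):
    """Return -log10(IC50), with IC50 given in micromolar as in the text."""
    if ic50_um <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_um)


def random_split(items, n_test, seed=0):
    """Randomly split items into (training, test) with n_test test items."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder compound labels (not the flavonoids of this study).
compounds = [f"compound_{i:02d}" for i in range(1, 25)]
train, test = random_split(compounds, n_test=4)

# An IC50 of 10 uM maps to a response value of -1 on this scale.
example = pic50_from_ic50_um(10.0)
```

Fixing the seed keeps the split reproducible, which matters when a model is later compared across descriptor combinations.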

Fig. 3

Flowcharts describing the ways of chemical structures used to calculate descriptors for QSAR model

Table 3 Flavonoids classified by its corresponding PubChem CID and training/test set

Validation on QSAR models

Although no confirmatory experiments were performed to validate the simulation results, these in silico data could be confirmed by other QSAR methods such as comparative molecular field analysis and support vector machines. The QSAR model was validated on the flavonoids to assess the precision of its predictions. The leave-one-out cross-validation technique was the main method used to validate the sample (n = 24). To check the reliability of the QSAR model for predicting the response property, the original data set was divided into 6 subsets, each of size 4. The validation process was repeated using 5 subsets as the training set and the remaining subset of 4 compounds as the testing set. The training set was employed for model development, while the ability of the model to predict the response values of the flavonoids was assessed using the testing set. The developed models were subjected to statistical validation tests to establish their reliability. The steps of the validation methods are shown in Fig. 4. The following metrics were used to determine QSAR quality, as well as for internal and external validation:

Fig. 4

Steps of validation methods for the QSAR model
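The partition of 24 compounds into 6 subsets of 4, with each subset serving once as the testing set, can be sketched in plain Python as follows. The shuffling seed and function name are assumptions made for illustration; the study's actual assignment of compounds to subsets is not reproduced here.

```python
import random


def k_fold_indices(n_samples, n_folds, seed=0):
    """Yield (train_idx, test_idx) pairs for equally sized folds."""
    if n_samples % n_folds != 0:
        raise ValueError("n_samples must be divisible by n_folds")
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    fold_size = n_samples // n_folds
    for k in range(n_folds):
        # Fold k is held out; the other five folds form the training set.
        test_idx = order[k * fold_size:(k + 1) * fold_size]
        train_idx = order[:k * fold_size] + order[(k + 1) * fold_size:]
        yield train_idx, test_idx


folds = list(k_fold_indices(n_samples=24, n_folds=6))
for train_idx, test_idx in folds:
    print(len(train_idx), len(test_idx))
```

Each compound appears in exactly one testing set, so every observation contributes one out-of-sample prediction.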

a. Metrics for determination of quality of QSAR model

Determination coefficient (\(r^{2}\)):

$${\text{r}}^{2} = 1 - \frac{{\sum \left( {Y_{obs} - Y_{calc} } \right)^{2} }}{{\sum \left( {Y_{obs} - \overline{{Y_{obs} }} } \right)^{2} }} .$$

Adjusted \(r_{a}^{2}\):

$${\text{r}}_{a}^{2} = \frac{{\left( {N - 1} \right) \times r^{2} - p}}{N - 1 - p}.$$

Variance ratio (F):

$${\text{F}} = \frac{{\frac{{\sum (Y_{calc} - \bar{Y})^{2} }}{p} }}{{\frac{{\sum (Y_{obs} - Y_{calc} )^{2} }}{N - p - 1} }} .$$

Standard error of estimate (s):

$$s = \sqrt {\frac{{\sum (Y_{obs} - Y_{calc} )^{2} }}{N - p - 1}} ,$$

where N is the number of compounds and p is the number of descriptors in the model.
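The four quality metrics can be computed directly from observed and calculated responses, as in the sketch below. The numbers are invented for illustration, the mean in the F numerator is taken as the mean of the observed values (an assumption that holds for least-squares fits with an intercept), and the function name is hypothetical.

```python
import math


def fit_metrics(y_obs, y_calc, p):
    """Return (r2, adjusted r2, F, s) for a model with p descriptors."""
    n = len(y_obs)
    mean_obs = sum(y_obs) / n
    ss_res = sum((o - c) ** 2 for o, c in zip(y_obs, y_calc))
    ss_tot = sum((o - mean_obs) ** 2 for o in y_obs)
    ss_reg = sum((c - mean_obs) ** 2 for c in y_calc)
    r2 = 1 - ss_res / ss_tot
    r2_adj = ((n - 1) * r2 - p) / (n - 1 - p)
    f_stat = (ss_reg / p) / (ss_res / (n - p - 1))
    s = math.sqrt(ss_res / (n - p - 1))
    return r2, r2_adj, f_stat, s


# Invented observed and calculated responses (not data from this study).
y_obs = [1.0, 2.0, 3.0, 4.0, 5.0]
y_calc = [1.1, 1.9, 3.2, 3.8, 5.0]
r2, r2_adj, f_stat, s = fit_metrics(y_obs, y_calc, p=1)
```

The adjusted value is always at most the unadjusted one, and the gap widens as more descriptors are added for a fixed number of compounds.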
b. Validation metrics for QSAR model

i. Internal validation

Leave-one-out (LOO) cross-validation

$$PRESS = \sum \left( {Y_{obs} - Y_{pred} } \right)^{2} ,$$
$${\text{SDEP}} = \sqrt {\frac{\text{PRESS}}{n}} ,$$
$${\text{q}}^{2} = 1 - \frac{{\sum \left( {Y_{{obs\left( {train} \right)}} - Y_{{pred\left( {train} \right)}} } \right)^{2} }}{{\sum \left( {Y_{{obs\left( {train} \right)}} - \bar{Y}_{training} } \right)^{2} }} = 1 - \frac{PRESS }{{\sum \left( {Y_{{obs\left( {train} \right)}} - \bar{Y}_{training} } \right)^{2} }} .$$
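A minimal leave-one-out sketch in plain Python is given below. For brevity it refits a single-descriptor least-squares line in each round rather than a PCR model, which is a simplifying assumption; the data values and function names are likewise illustrative only.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x (single descriptor)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return my - b1 * mx, b1


def loo_metrics(xs, ys):
    """Return (PRESS, SDEP, q2) from leave-one-out cross-validation."""
    n = len(xs)
    press = 0.0
    for i in range(n):
        # Refit without compound i, then predict compound i.
        b0, b1 = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (b0 + b1 * xs[i])) ** 2
    mean_y = sum(ys) / n
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return press, math.sqrt(press / n), 1 - press / ss_tot


# Invented descriptor and response values (not data from this study).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.2, 1.9, 3.1, 3.9, 5.2, 5.8]
press, sdep, q2 = loo_metrics(xs, ys)
```

Since each prediction is made for a compound left out of the fit, PRESS is never smaller than the ordinary residual sum of squares, so \(q^{2}\) cannot exceed \(r^{2}\) for the same model.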

The \(r_{m}^{2}\) metric for internal validation

$$\overline{r_{m}^{2}} = \frac{r_{m}^{2} + r_{m}^{\prime 2}}{2},$$
$$\Delta r_{m}^{2} = \left| r_{m}^{2} - r_{m}^{\prime 2} \right|,$$
$$r_{m}^{2} = r^{2} \times \left( 1 - \sqrt{r^{2} - r_{0}^{2}} \right),$$
$$r_{m}^{\prime 2} = r^{2} \times \left( 1 - \sqrt{r^{2} - r_{0}^{\prime 2}} \right),$$
$${\text{Scaled}}\;Y_{i} = \frac{{Y_{i} - Y_{{\text{min} \left( {obs} \right)}} }}{{Y_{{\text{max} \left( {obs} \right)}} - Y_{{\text{min} \left( {obs} \right)}} }}.$$
ii. Metrics for external validation

Predictive \(r^{2}_{pred}\)

$${\text{r}}_{pred}^{2} = 1 - \frac{{\sum \left( {Y_{{obs\left( {test} \right)}} - Y_{{pred\left( {test} \right)}} } \right)^{2} }}{{\sum \left( {Y_{{obs\left( {test} \right)}} - \bar{Y}_{training} } \right)^{2} }} .$$

Root mean square error in prediction (RMSEP)

$${\text{RMSEP}} = \sqrt {\frac{{\sum \left( {y_{{obs\left( {test} \right)}} - y_{{pred\left( {test} \right)}} } \right)^{2} }}{{n_{ext} }}} .$$
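The two external validation metrics above translate directly into code. The sketch below uses invented test-set values and an invented training-set mean; note that, following the formula, the reference sum of squares is taken about the training mean rather than the test mean.

```python
import math


def external_metrics(y_obs_test, y_pred_test, y_mean_training):
    """Return (r2_pred, RMSEP) for an external test set."""
    ss_res = sum((o - p) ** 2 for o, p in zip(y_obs_test, y_pred_test))
    ss_ref = sum((o - y_mean_training) ** 2 for o in y_obs_test)
    r2_pred = 1 - ss_res / ss_ref
    rmsep = math.sqrt(ss_res / len(y_obs_test))
    return r2_pred, rmsep


# Invented values for a four-compound test set (not data from this study).
y_obs_test = [2.0, 3.0, 4.0, 5.0]
y_pred_test = [2.5, 2.5, 4.5, 4.5]
r2_pred, rmsep = external_metrics(y_obs_test, y_pred_test, y_mean_training=3.0)
```

With only four test compounds, a single poorly predicted compound can shift \(r^{2}_{pred}\) substantially, which is worth keeping in mind when reading such values.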
iii. Molecular descriptors

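“k2alpha” is a descriptor indicating the second-order kappa alpha shape index (\(^{2}\kappa_{\alpha}\) or k2alpha):

$${\text{k}}2{\text{alpha}} = \frac{{\left( {A + \alpha - 1} \right)\left( {A + \alpha - 2} \right)^{2} }}{{\left( {P + \alpha } \right)^{2} }},$$

where A is the number of non-hydrogen atoms, P is the number of two-bond paths in the hydrogen-suppressed molecular graph, and α is the atom-size correction term.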
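As a worked illustration, the sketch below evaluates the index in the standard Kier form, \((A + \alpha - 1)(A + \alpha - 2)^{2} / (P + \alpha)^{2}\), for two toy carbon skeletons of five atoms. Setting α to zero (all atoms treated as sp3 carbon) and the function name are assumptions made for this example.

```python
def kappa2_alpha(n_atoms, n_paths2, alpha=0.0):
    """Second-order kappa alpha index: (A+a-1)(A+a-2)^2 / (P+a)^2."""
    a = n_atoms + alpha
    return (a - 1) * (a - 2) ** 2 / (n_paths2 + alpha) ** 2


# Unbranched chain of 5 heavy atoms: 3 two-bond paths.
chain = kappa2_alpha(n_atoms=5, n_paths2=3)

# Star-shaped skeleton (one centre, four neighbours): 6 two-bond paths.
star = kappa2_alpha(n_atoms=5, n_paths2=6)

print(chain, star)
```

The more branched skeleton gives the smaller value, which is how the index encodes molecular shape.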

“Id” and “IdwAverage” are information-theory-based descriptors of distance equality. The total information content on the distance equality (Id) is:

$${\text{Id}} = \frac{{A\left( {A - 1} \right)}}{2}log_{2} \left( {\frac{{A\left( {A - 1} \right)}}{2}} \right) - \mathop \sum \limits_{g = 1}^{G} flog_{2} \left( f \right),$$

where f is the number of distances with equal value g in the triangular D submatrix, and D is an A × A matrix that contains the graph distances between atoms. The graph distances are calculated as \(1/(\text{number of bonds between atoms})^{2}\).

The mean information content on the distance equality (IdwAverage):

$${\text{IdwAverage}} = - \mathop \sum \limits_{g = 1}^{G} \frac{2f}{{A\left( {A - 1} \right)}}log_{2} \left( {\frac{2f}{{A\left( {A - 1} \right)}}} \right).$$
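Both information indices can be computed from a molecular graph by counting how many atom pairs share the same graph distance. The sketch below groups pairs by bond count; since the inverse-square distance stated above is a one-to-one function of the bond count, the grouping (and hence f) is the same. The adjacency-list input format and function names are assumptions, and the graph is assumed to be connected.

```python
import math
from collections import Counter, deque


def graph_distances(adjacency):
    """Bond-count distances for all atom pairs i < j (breadth-first search)."""
    n = len(adjacency)
    pairs = []
    for start in range(n):
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        # Assumes a connected graph, so every atom j was reached.
        pairs.extend(dist[j] for j in range(start + 1, n))
    return pairs


def distance_equality_information(adjacency):
    """Return (Id, IdwAverage) from the counts f of equal distances."""
    distances = graph_distances(adjacency)
    total = len(distances)                  # equals A(A-1)/2
    counts = Counter(distances).values()    # f for each distinct distance g
    i_d = total * math.log2(total) - sum(f * math.log2(f) for f in counts)
    i_d_mean = -sum(f / total * math.log2(f / total) for f in counts)
    return i_d, i_d_mean


# Three-atom chain 0-1-2 as a hydrogen-suppressed toy graph.
chain3 = {0: [1], 1: [0, 2], 2: [1]}
i_d, i_d_mean = distance_equality_information(chain3)
```

For this toy graph there are three atom pairs, two at distance one and one at distance two, and the total index equals the mean index multiplied by the number of pairs.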


As listed in Table 4, for Model A, the QSAR model not only had internal predictive ability (\(q^{2}\) = 6.93%) and external predictive ability (\(r^{2}_{pred}\) = 95.86%), but could also explain 31.82% of the total variance (\(r^{2}\) = 0.3182) in the training dataset. The F-test value of 3.9664 indicated that Model A was statistically significant with p-value < 0.001, i.e. the model had less than 0.1% probability of making an error. For Model B, although the QSAR model had an external predictive ability of only 52.27% (\(r^{2}_{pred}\)) and could explain only 8.45% of the total variance (\(r^{2}\) = 0.0845) in the training dataset, its internal predictive ability (\(q^{2}\) = 11.48%) was relatively higher than that of Model A. Similarly, the F-test value of 1.5682 indicated that Model B was statistically significant with p < 0.001, which meant that Model B’s likelihood of committing an error was less than 5%. Therefore, both Model A and Model B were justified in terms of their internal and external predictive ability.

Table 4 The QSAR model with the corresponding parameters of estimates

Tables 5 and 6 list the observed and predicted values of the activation (Model A) and inhibition (Model B) activity of flavonoids on the CaV channel, respectively. Figure 5a shows the goodness-of-fit graph of the observed and predicted activity of the flavonoids activating the CaV channel, indicating how well the actual training dataset was fitted by the predicted PCR equation. The radar plots in Fig. 5b, c show that the PCR equation fitted to the training data set predicted the test data set well. Hence, the predictive ability of Model A could be confirmed. Similarly, Fig. 6a shows the goodness-of-fit plot of the observed and predicted activity of the flavonoids inhibiting the CaV channel, indicating how well the actual training dataset was fitted by the predicted PCR equation. The radar plots in Fig. 6b, c also show that the PCR equation fitted to the training data set predicted the test data set well. Therefore, the predictive ability of Model B could also be confirmed.

Table 5 Observed and predicted activity of the flavonoids on the CaV activation (Model A)
Table 6 Observed and predicted activity of the flavonoids on the CaV inhibition (Model B)
Fig. 5

a Model A’s graph of goodness of fit indicating observed and predicted activity of polyphenols on CaV activation by QSAR equations along with the residuals, b Model A’s Radar plot depicting closeness between the actual and predicted activity of the flavonoid compounds of training set, c Model A’s Radar plot depicting closeness between the actual and predicted activity of the test set’s compounds

Fig. 6

a Model B’s graph of goodness of fit indicating observed and predicted activity of polyphenols on CaV inhibition by QSAR equation along with the residuals, b Model B’s Radar plot depicting closeness between the actual and predicted activity of the flavonoid compounds of training set, c Model B’s Radar plot depicting closeness between the actual and predicted activity of test set’s compounds

Table 7 shows the ranking of the flavonoids by their activation of the CaV channels, in descending order: scutellarein > morin > daidzein > myricetin > apigenin > quercetin > (±)-taxifolin > 5,7,2′-trihydroxyflavone > genistein, and so on.

Table 7 The order of the flavonoids ranked by relative contribution of individual descriptors using pIC50 in model A and the order of the compounds ranked by the proportion of increase in in vivo new bone formation


The results in Table 7 show the molecular structures, contribution graphs and parameters of the selected descriptors of the flavonoids as predicted by the CaV activation model with QSAR equations (Model A). The results were surprisingly consistent with the order of the flavonoids ranked by the percentage increase in new bone formation, i.e. daidzein (602%) > quercetin (556%) > genistein (520%) > naringin (490%), as reported in a series of in vivo animal studies by Wong et al. [7, 10,11,12]. This partial external validation, although not a full comparison, gives good support for the present predictive supervised machine-learning QSAR model, which can rank and predict the effects of flavonoids on osteogenesis based on the existing biochemical information on CaV [19].

The activation and inhibition activities of flavonoids were investigated based on their ability to maintain the balance between activation and inactivation of the CaV by binding to the beta subunit receptor of CaV. In the case of activation, the activating activity was mainly the outcome of electronic interactions between atomic charges within the flavonoids and possible receptor-like structures in the CaV. In the case of inhibition, the binding affinities of selected flavonoids to the CaV receptor were highly dependent on the physicochemical properties involved in the interactions. Statistically, Model A presented the characteristics of the molecular structure of flavonoid compounds required for activating the CaV channel, in which electrostatic fields were estimated using “k2alpha”, a descriptor signifying the second-order alpha-modified shape index, and “Id” and “IdwAverage”, which are information-theory-based descriptors. In contrast, Model B showed the different structural features required for flavonoids to inhibit the CaV channel. Its electrostatic fields were estimated using “Most+vePotential”, a descriptor indicating the highest value of positive electrostatic potential on the van der Waals surface area of the flavonoids; “MomInertiaY”, a steric descriptor signifying the moment of inertia about the Y-axis; and “DeltaEpsilonC”, an electronegativity descriptor signifying differences between the frontier molecular orbital energies. Hence, of the six descriptors selected for the principal component regression models, one was related to the electronic properties (“Most+vePotential”) and two to the physicochemical properties (“MomInertiaY” and “DeltaEpsilonC”) of the whole molecules, and three (“k2alpha”, “Id”, “IdwAverage”) described electronic properties of individual atoms. All of these selected descriptors showed analogous behavior, in terms of tendency, to that observed in the QSAR analysis of CaV channel activation and inhibition by the flavonoids.

Furthermore, the QSAR model was chosen according to the parameter estimates of \(r^{2}\), \(q^{2}\), \(r^{2}_{pred}\), F-statistic and p-value. Although the \(q^{2}\) of Model A (0.0693) was lower than that of Model B (0.1148), the \(r^{2}\) of Model A (0.3182) was higher than that of Model B (0.0845), and the \(r^{2}_{pred}\) of Model A (0.9586) was higher than that of Model B (0.5227). Model A was therefore selected as the better QSAR model, supporting the argument that the CaV channel was more likely to be activated than inhibited by the flavonoids. On this basis, the QSAR approach met its objectives of understanding the biochemical effects of the flavonoids on the CaV channel and of providing practical suggestions for screening optimal flavonoids, as demonstrated by the investigation of their activating and inhibitory activity on the CaV channel.

Apparently, the electronic properties of the flavonoids were found to be significant after exploring the entire pool of classical and electronic variables when screening for a QSAR model, for which thousands of parameters from experiment and in silico calculations are available as potential independent variables (descriptors) in the statistical analysis. However, this QSAR study also confirmed that using an excessive number of descriptors leads to over-fitting of QSAR models and/or increases the risk of chance correlations. Despite the existence of rules for building successful and meaningful QSAR models, the increasing complexity of the biochemical mechanisms of the flavonoids on the CaV creates the need to consider a large variety of variables, which makes a knowledge-based approach to identifying the most significant descriptors extremely difficult in this particular case. This is the main reason for applying PCR, which reduces the data by generating linear combinations of molecular descriptors [27]. The PCR method identifies correlated variables, groups them into linear combinations, and generates uncorrelated, orthogonal variables called principal components. The data transformation is given by \({\text{X}} = TP^{T}\), where X represents the initial data matrix, T is a score matrix that defines the position of the data points in the new coordinate system, and P is a loadings matrix. The loadings indicate how much each original descriptor contributes to the corresponding PC. Scores and loadings allow the data points to be mapped into the new vector space defined by the PCs [28].

The correlation between the independent and dependent variables was determined statistically by fitting a PCR line to the data to obtain a best-fit equation. The goodness of fit of the PCR equation was then estimated from its standard deviation and correlation coefficient, and its level of statistical significance was represented by the F-statistic with its corresponding p-value. In the PCR analysis, enough PCs could be used for variable selection by choosing \(p^{\prime}\) so as to maximize the adjusted \(r^{2}\). We note that a limitation of this study is that it was based on only 24 compounds, and the predictive power may be reduced with such a small dataset. Nevertheless, various QSAR studies [29,30,31] have used 10–25 compounds to generate predictive models that appear to be quite successful. Regarding the results, although it is generally recommended that \(r^{2}\) should be > 0.7 and \(q^{2}\) should be > 0.5, these are not stringent guidelines, and the predictive power of QSAR should not rely solely on \(r^{2}\) and \(q^{2}\) [32]. In particular, we have also cross-compared the results with the in vivo animal studies (Table 7) for daidzein, quercetin, genistein and naringin [7, 10,11,12]. If necessary, further time-consuming in vivo studies could be done to validate the model completely.

In this study, the combination of electronic and physicochemical descriptors helped to identify molecular shape, hydrophobicity and electronic properties as three major factors responsible for the activation and inhibition activity of flavonoids on the CaV. The use of QSAR in screening bioactive flavonoids for tissue engineering applications is relatively new. The success of this QSAR modeling in the accurate determination of the electronic properties of biochemically significant flavonoids may initiate QSAR studies in tissue engineering that focus specifically on the exploration of bioactive growth factors for cells. QSAR provides an invaluable tool for calculating quantum-chemical descriptors that show high potential for generating predictive QSAR models without the addition of a large number of descriptors for various groups of growth factors [33, 34]. Since osteogenesis is a very dynamic process, other factors related to the CaV channel, such as runx2 activation, ALP secretion, osteocalcin level, angiogenesis and mineralization, can also be incorporated where appropriate in future studies. Nevertheless, caution should be exercised with QSAR studies because QSAR is only an approximating method. When many physicochemical properties are involved, it is not always possible to vary one property without affecting another. Moreover, QSAR does not provide in-depth insight into the mechanism of biological action of flavonoids. There may also be some risk of inaccurate predictions of the biological activity of this type of flavonoid compound.

Through this QSAR study, we have established a predictive molecular modeling method that allows the properties of flavonoids as bioactive compounds to be estimated at a much lower cost, and in a more environmentally friendly way, than actual laboratory screening. Since both the model’s predictive ability and the scientific insights into biochemical activity on the CaV channel depend on the descriptors selected in the modeling process, this study indicates that the use of quantum-chemical descriptors under supervised machine learning has a clear advantage over experimentally measured properties. Because these descriptors are reproducible within the framework of the chosen approximation, they allow meaningful interpretation of QSAR models in terms of the biochemical mechanism of flavonoids as activators of the CaV channel. Thus, the approach can offer clear guidance for molecule optimization or design of flavonoids as growth factors for osteogenesis.


This predictive QSAR study confirmed and validated the biochemical activity of flavonoids on the CaV channel, indicating that flavonoids can activate the CaV channel in osteogenesis. Scutellarein was predicted to rank the highest among the screened flavonoids.

Availability of data and materials

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Abbreviations

CaV: Voltage-gated calcium
QSAR: Quantitative structure–activity relationship
TNF-α: Tumor necrosis factor alpha
PLS: Partial least squares
MLR: Multiple linear regression
GA: Genetic algorithm
PC: Principal components
PCR: Principal components regression
IC50: Half maximal inhibitory concentration


References

  1. Kim KH, Tsao R, Yang R, Cui SW. Phenolic acid profiles and antioxidant activities of wheat bran extracts and the effect of hydrolysis conditions. Food Chem. 2006;95(3):466–73.
  2. Wang W, Lin P, Ma LH, Xu KX, Lin XL. Separation and determination of flavonoids in three traditional chinese medicines by capillary electrophoresis with amperometric detection. J Sep Sci. 2016;39(7):1357–62.
  3. Cao XC, Zou H, Cao JG, Cui YH, Sun SW, Ren KQ, et al. A candidate Chinese medicine preparation-fructus viticis total flavonoids inhibits stem-like characteristics of lung cancer stem-like cells. BMC Complement Altern Med. 2016.
  4. Ali F, Rahul, Naz F, Jyoti S, Siddique YH. Health functionality of apigenin: a review. Int J Food Prop. 2017;20(6):1197–238.
  5. Lee SH, Park YB, Bae KH, Bok SH, Kwon YK, Lee ES, et al. Cholesterol-lowering activity of naringenin via inhibition of 3-hydroxy-3-methylglutaryl coenzyme A reductase and acyl coenzyme A: cholesterol acyltransferase in rats. Ann Nutr Metab. 1999;43(3):173–80.
  6. Kawaguchi K, Kikuchi S, Hasunuma R, Maruyama H, Yoshikawa T, Kumazawa Y. A citrus flavonoid hesperidin suppresses infection-induced endotoxin shock in mice. Biol Pharm Bull. 2004;27(5):679–83.
  7. Wong RWK, Rabie ABM. Effect of naringin collagen graft on bone formation. Biomaterials. 2006;27(9):1824–31.
  8. Khalatbary AR, Tiraihi T, Boroujeni MB, Ahmadvand H, Tavafi M, Tamjidipoor A. Effects of epigallocatechin gallate on tissue protection and functional recovery after contusive spinal cord injury in rats. Brain Res. 2010;1306:168–75.
  9. Laurenz R, Tumbalam P, Naeve S, Thelen KD. Determination of isoflavone (genistein and daidzein) concentration of soybean seed as affected by environment and management inputs. J Sci Food Agric. 2017;97(10):3342–7.
  10. Wong RWK, Rabie ABM. Effect of daidzein on bone formation. Front Biosci. 2009;14:3673–9.
  11. Wong RWK, Rabie ABM. Effect of quercetin on bone formation. J Orthop Res. 2008;26(8):1061–6.
  12. Wong RW, Rabie AB. Effect of genistin on bone formation. Front Biosci. 2010;2:764–70.
  13. Ritchie CK, Maercklein PB, Fitzpatrick LA. Direct effect of calcium-channel antagonists on osteoclast function—alterations in bone-resorption and intracellular calcium concentrations. Endocrinology. 1994;135(3):996–1003.
  14. Riddle RC, Taylor AF, Genetos DC, Donahue HJ. MAP kinase and calcium signaling mediate fluid flow-induced human mesenchymal stem cell proliferation. Am J Physiol Cell Physiol. 2006;290(3):C776–84.
  15. Jorgensen NR, Teilmann SC, Henriksen Z, Civitelli R, Sorensen OH, Steinberg TH. Activation of L-type calcium channels is required for gap junction-mediated intercellular calcium signaling in osteoblastic cells. J Biol Chem. 2003;278(6):4082–6.
  16. Li W, Duncan RL, Karin NJ, Farach-Carson MC. 1,25 (OH)2D3 enhances PTH-induced Ca2+ transients in preosteoblasts by activating L-type Ca2+ channels. Am J Physiol. 1997;273(3 Pt 1):E599–605.
  17. Bergh JJ, Shao Y, Puente E, Duncan RL, Farach-Carson MC. Osteoblast Ca(2+) permeability and voltage-sensitive Ca(2+) channel expression is temporally regulated by 1,25-dihydroxyvitamin D(3). Am J Physiol Cell Physiol. 2006;290(3):C822–31.
  18. Nishiya Y, Kosaka N, Uchii M, Sugimoto S. A potent 1,4-dihydropyridine L-type calcium channel blocker, benidipine, promotes osteoblast differentiation. Calcif Tissue Int. 2002;70(1):30–9.
  19. Saponara S, Carosati E, Mugnai P, Sgaragli G, Fusi F. The flavonoid scaffold as a template for the design of modulators of the vascular Ca(v)1.2 channels. Br J Pharmacol. 2011;164(6):1684–97.
  20. Blair HC, Schlesinger PH, Huang CL, Zaidi M. Calcium signalling and calcium transport in bone disease. Sub-Cell Biochem. 2007;45:539–62.
  21. Gao H. Predicting tyrosinase inhibition by 3D QSAR pharmacophore models and designing potential tyrosinase inhibitors from traditional Chinese medicine database. Phytomedicine. 2018;38:145–57.
  22. Lo YC, Rensi SE, Torng W, Altman RB. Machine learning in chemoinformatics and drug discovery. Drug Discov Today. 2018;23(8):1538–46.
  23. Chakravarti SK, Alla SRM. Descriptor free QSAR modeling using deep learning with long short-term memory neural networks. Front Artif Intell. 2019.
  24. Popova M, Isayev O, Tropsha A. Deep reinforcement learning for de novo drug design. Sci Adv. 2018.
  25. Todeschini R, Consonni V. In: Mannhold R, Kubinyi H, Timmerman H, editors. Handbook of molecular descriptors. Methods and principles in medicinal chemistry. Hoboken: Wiley; 2008.
  26. Fernandez M, Caballero J. Modeling of the inhibition of the intermediate-conductance Ca2+ Activated K+ channel (IKCa1) by some triarylmethanes using quantum chemical properties derived from Ab initio calculations. QSAR Comb Sci. 2008;27(7):866–75.
  27. de Molfetta FA, Angelotti WFD, Romero RAF, Montanari CA, da Silva ABF. A neural networks study of quinone compounds with trypanocidal activity. J Mol Model. 2008;14(10):975–85.
  28. Kholodovych V, Smith JR, Knight D, Abramson S, Kohn J, Welsh WJ. Accurate predictions of cellular response using QSPR: a feasibility test of rational design of polymeric biomaterials. Polymer. 2004;45(22):7367–79.
  29. Jain SV, Ghate M, Bhadoriya KS, Bari SB, Chaudhari A, Borse JS. 2D, 3D-QSAR and docking studies of 1,2,3-thiadiazole thioacetanilides analogues as potent HIV-1 non-nucleoside reverse transcriptase inhibitors. Org Med Chem Lett. 2012;2(1):22.
  30. Bhadoriya KS, Kumawat NK, Bhavthankar SV, Avchar MH, Dhumal DM, Patil SD, et al. Exploring 2D and 3D QSARs of benzimidazole derivatives as transient receptor potential melastatin 8 (TRPM8) antagonists using MLR and kNN-MFA methodology. J Saudi Chem Soc. 2016;20:S256–70.
  31. Goyal M, Grover S, Dhanjal JK, Goyal S, Tyagi C, Grover A. Molecular modelling studies on flavonoid derivatives as dual site inhibitors of human acetyl cholinesterase using 3D-QSAR, pharmacophore and high throughput screening approaches. Med Chem Res. 2014;23(4):2122–32.
  32. Gramatica P, Sangion A. A historical excursus on the statistical validation parameters for QSAR models: a clarification concerning metrics and terminology. J Chem Inf Model. 2016;56(6):1127–31.
  33. Yu XL, Liu WQ, Liu F, Wang XY. DFT-based theoretical QSPR models of Q-e parameters for the prediction of reactivity in free-radical copolymerizations. J Mol Model. 2008;14(11):1065–70.
  34. Liu WQ, Yi PG, Tang ZL. QSPR models for various properties of polymethacrylates based on quantum chemical descriptors. QSAR Comb Sci. 2006;25(10):936–43.


This work was done in partial fulfillment of the requirements of the degree of Ph.D. for the first author at the Faculty of Dentistry, The University of Hong Kong. Part of the data was presented at the 94th General Session and Exhibition of the International Association for Dental Research (IADR), Seoul, Korea.


Not applicable.

Author information




KC analyzed and interpreted the data, and was a major contributor in writing the manuscript. JT was a major contributor in reviewing the manuscript, and handled the project supervision. HL provided resources in the simulation, and reviewed the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to James Kit-Hon Tsoi.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Chan, K., Leung, H.C.M. & Tsoi, J.K. Predictive QSAR model confirms flavonoids in Chinese medicine can activate voltage-gated calcium (CaV) channel in osteogenesis. Chin Med 15, 31 (2020).


  • QSAR
  • Flavonoids
  • Voltage-gated calcium channels
  • Computer modelling