Table 1 Comparisons between machine learning algorithms

From: Machine learning in TCM with natural products and molecules: current status and future perspectives

| Category | Machine learning algorithm | Advantages | Limitations | Applications |
| --- | --- | --- | --- | --- |
| Deep learning | CNN | High accuracy; Weight sharing; Relieves the model overfitting problem; Shift-invariant features enhance the robustness of the network | The pooling layer discards much valuable information; Substantial hardware and dataset size requirements [41]; No memory function; Shift invariance also prevents the neuron that recognizes an object from being activated when the object changes slightly | Computer vision [40, 41]; Natural language processing [40] |
| | Elman RNN | Strong ability to extract time-series features; Good generalization ability | Prone to gradient exploding and gradient vanishing; Unable to solve long-term dependencies or to train in parallel [42] | Time-series data, e.g., natural language processing [22] |
| | LSTM | Achieves better results on longer sequences; Solves the vanishing-gradient and stability problems of the Elman RNN in the time dimension | Still struggles with very long sequences; Time-consuming [22] | Longer time-series data than the Elman RNN can handle, e.g., in natural language processing [22] |
| | GAN | Generative model; Usable even when the probability density cannot be computed; Good at generalization | Unstable training process; Difficult to reach a Nash equilibrium; Unsuitable for discrete data | Data augmentation [46]; Text-to-image synthesis [26]; Image-to-image translation [26]; Computer vision [25] |
| Traditional machine learning | MLP | Simple model; Easy to implement; Good generalization ability | Architecture is difficult to optimize; Training on large datasets is very time-consuming [27] | Identification, classification, and prediction [32] |
| | SVM | Well suited to small-sample binary classification; Good robustness and generalization ability | Sensitive to parameters and kernel choice; Inappropriate for multi-class problems without optimization | Classification and regression problems [32] |
| | DT | Simple to understand and interpret; Requires little data preparation | Unstable: small variations in the data can produce a completely different tree | Classification and regression problems [32] |
| | RF | Simple structure; Easy to implement; High efficiency [37] | Cannot optimize its own parameters; Overfitting can easily occur when the amount of data is large [45] | Classification and regression problems [44] |
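
To make the traditional half of the comparison concrete, the following minimal sketch fits scikit-learn implementations of the four traditional algorithms in Table 1 (MLP, SVM, DT, RF) to the same synthetic binary classification task; the dataset, hyperparameters, and accuracy readout are illustrative assumptions, not taken from the paper.

```python
# Minimal, illustrative comparison of the four traditional algorithms in
# Table 1. Data and hyperparameters are placeholders, not from the paper.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for a small-sample descriptor dataset.
X, y = make_classification(n_samples=300, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "MLP": MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
    "SVM": SVC(kernel="rbf", C=1.0),               # sensitive to kernel/parameters
    "DT": DecisionTreeClassifier(random_state=0),  # small data changes can reshape the tree
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name}: test accuracy = {model.score(X_test, y_test):.3f}")
```

scikit-learn is used here only because it exposes all four models behind the same fit/score interface, which keeps the comparison to a single loop; any equivalent implementations would serve.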