Evaluation of the accuracy of diagnostic scales for a syndrome in Chinese medicine in the absence of a gold standard
 Xiao Nan Wang^{1},
 Vanessa Zhou^{3},
 Qiang Liu^{4},
 Ying Gao^{5} and
 XiaoHua Zhou^{1, 2}
DOI: 10.1186/s13020-016-0100-2
© The Author(s) 2016
Received: 13 March 2015
Accepted: 27 May 2016
Published: 28 July 2016
Abstract
Background
The concept of syndromes (zhengs) is unique to Chinese medicine (CM) and difficult to measure. Expert consensus is used as a gold standard to identify zhengs and evaluate the accuracy of existing diagnostic scales for zhengs. However, the use of expert consensus as a gold standard is problematic because the diagnosis of zhengs by expert consensus is not 100 % accurate. This study aimed to evaluate the accuracy of standardized diagnostic scales for a syndrome (zheng) in the absence of a gold standard, with application to internal wind (nei feng) syndrome in ischemic stroke patients.
Methods
A total of 204 participants (age 41–84 years) with ischemic stroke were assessed by the stroke syndrome differentiation diagnostic criterion (SSDC), ischemic stroke TCM syndrome diagnostic scale (ISDS), and expert syndrome differentiation (ESD). The diagnostic tests and data collection process were conducted over a 10-month period (February 2008 to November 2008) in 10 hospitals across nine cities in China. The Bayesian method was used to estimate the accuracy of the SSDC, ISDS, and ESD.
Results
For internal wind syndrome, the estimated sensitivities and specificities of the SSDC, ISDS, and ESD without use of a gold standard were, respectively: \(\widehat{Se}_{1}=0.687\), \(\widehat{Sp}_{1}=0.776\); \(\widehat{Se}_{2}=0.884\), \(\widehat{Sp}_{2}=0.875\); and \(\widehat{Se}_{3}=0.813\), \(\widehat{Sp}_{3}=0.922\).
Conclusion
After adjusting for imperfect gold standard bias, we found that both the sensitivity and specificity of the ISDS were higher than those of the SSDC for diagnosis of internal wind syndrome in ischemic stroke patients.
Keywords
CM, Sensitivity, Specificity, Diagnostic scale, Syndrome, Bayesian method, Without a gold standard
Background
The concept of syndromes (zhengs) is unique to Chinese medicine (CM). Syndromes are identifiable from a holistic understanding of a patient’s clinical presentation using the four CM diagnostic methods: observation, listening/smelling, questioning, and pulse analysis [1]. Identification of a syndrome can differ from one CM practitioner to another because of varying medical experience and other related factors. In recent years, the CM community has developed several standardized diagnostic scales for syndromes [2–5]. The accuracies of these scales have been assessed using the diagnostic opinion of CM practitioners as the gold standard. However, an expert diagnosis is largely dependent on clinical experience and educational background, leading to different syndrome differentiation for the same patient by different expert CM practitioners. This results in biased estimates for the accuracy of diagnostic scales because the expert syndrome differentiation (ESD) is imperfect. Such bias is called imperfect gold standard bias [6, 7]. If the diagnostic test and imperfect gold standard are conditionally independent given the true disease status, the sensitivity and specificity of the diagnostic test are underestimated. However, if the diagnostic test and imperfect gold standard are conditionally dependent, the estimated sensitivity and specificity of the diagnostic test can be biased in either direction. The direction of the bias is determined by the degree to which the diagnostic test and imperfect gold standard misclassify the same patients. When this tendency is slight, the accuracy of the diagnostic test is generally underestimated; when the tendency is strong, the accuracy of the diagnostic test is generally overestimated [6].
In recent years, several statistical methods have been developed to correct imperfect gold standard bias. Hui and Walter [8] developed a model for two diagnostic tests within two populations and introduced a maximum likelihood approach that assumes the existence of two population strata with different prevalence rates. In that model, they also assumed that the two tests were conditionally independent. However, the assumption of conditional independence may not be realistic in some applications owing to common factors that can influence both diagnostic tests and true disease status. Sinclair and Gastwirth [9] extended the Hui and Walter model to allow for conditional dependence. Espeland and Handelman [10] and Yang and Becker [11] proposed latent class modeling for conditional dependence; Qu et al. [12] and Hadgu and Qu [13] proposed random effects models; and Albert and Dodd [14] developed latent class modeling approaches for binary tests. Pepe and Janes [15] discussed the latent class analysis method for assessing multiple diagnostic tests without a gold standard, and concluded that a latent class model requires careful justification of the assumptions made about the conditional dependence structure. These researchers also stressed that a formal clinical definition of the disease should be given before evaluating the accuracy of diagnostic tests with the latent class method. Only when the disease has been clearly defined can the estimated parameters be meaningful for diagnostic tests; otherwise, the estimates are meaningless. The above-mentioned methods used the frequentist approach to estimate the parameters in the model when the diagnostic tests were conditionally independent, given the true disease status or given the true disease status and a random effect.
Joseph et al. [16] used Bayesian methods to assess the accuracy of diagnostic tests under conditional independence without a gold standard. Dendukuri [17], Georgiadis et al. [18], and Branscum et al. [19] developed Bayesian models to evaluate the accuracy of diagnostic tests with two conditionally dependent tests. These methods have been widely used for estimation of the accuracy of diagnostic tests without a gold standard in Western medicine research [20–27]. However, they have not been applied for estimation of the accuracy of diagnostic tests for CM syndromes. This study aimed to evaluate the accuracy of standardized diagnostic scales for a syndrome in the absence of a gold standard, with application to internal wind (nei feng) syndrome in ischemic stroke patients.
Methods
Study design and approval
In this study, we evaluated the accuracy of the stroke syndrome differentiation diagnostic criterion (SSDC), ischemic stroke TCM syndrome diagnostic scale (ISDS), and ESD for detecting “internal wind” in ischemic stroke patients, without assuming that the ESD is the gold standard. We mainly focused on comparing the accuracy of the two diagnostic scales (SSDC and ISDS). This study used data from the second round of a diagnostic test study of the ISDS. The diagnostic test and data collection process were performed over a 10-month period (February 2008 to November 2008), after receiving approval (ECSLBDY2008012) from the Ethics Committee of the Dongzhimen Hospital of Beijing University of Chinese Medicine (Additional files 1 and 2).
Inclusion and exclusion criteria
Individuals were eligible if they had a diagnosis of acute ischemic stroke confirmed by computed tomography and magnetic resonance imaging examinations, were aged between 35 and 85 years, had been informed of the objectives and research procedures of the study (for study details, see Additional file 3), and had personally signed a consent form (see Additional file 4) [5]. We excluded individuals with any of the following: transient ischemic attack; cerebral hemorrhage or subarachnoid hemorrhage; stroke caused by brain tumor, traumatic brain injury, or blood disease; severe heart, liver, kidney, or hematopoietic system comorbidity or complication; mental disorder or severe dementia; and severe aphasia that could affect data collection [5].
Study subjects
Table 1 Cross-classified test results of \(T_{1}\), \(T_{2}\) and \(T_{3}\) for internal wind syndrome
\(T_{1}\)  \(T_{2}=1\), \(T_{3}=1\)  \(T_{2}=1\), \(T_{3}=0\)  \(T_{2}=0\), \(T_{3}=1\)  \(T_{2}=0\), \(T_{3}=0\)
\(T_{1}=1\)  69  19  7  12
\(T_{1}=0\)  32  9  5  51
CM syndrome factor scales and syndrome differentiation
The SSDC and ESD were used to diagnose the status of a patient in place of a gold standard before the development of the ISDS. The SSDC was the first recognized scale for diagnosing a CM syndrome in ischemic stroke patients, and has been widely used since its publication in 1994 [2, 3]. The ISDS was first developed in 2007 on the basis of the SSDC and is essentially an updated version of it [3]. The development process of the ISDS has been described in the published literature [3, 4]. Briefly, the ISDS was developed from a two-round Delphi study, which generated a draft pool of 288 items in six syndrome factor dimensions [4]. From this pool, six syndrome factor diagnostic scales were constructed according to logistic regression functions and receiver operating characteristic curve analysis. Each syndrome factor diagnostic scale consisted of 10–20 “yes” or “no” statements. The ESD was completed by three senior physicians with over 10 years of work experience [4, 5]. When the practitioners failed to reach a unanimous decision about a patient’s diagnosis, the majority opinion was used.
Statistical methods
Descriptive statistics were utilized to summarize the characteristics of the subjects in the data set. The latent class model was fitted to the results of the SSDC, ISDS, and ESD for the ischemic stroke patients when a gold standard was not available. The Bayesian method was used to estimate the sensitivity and specificity for every CM diagnostic scale. We followed the guidelines for reporting Bayesian analyses in biomedical journals, as described by Lang and Altman [28].
Using the reporting guidelines, we first described the general Bayesian statistical model. Next, we specified the pretrial probabilities (prior distributions) for the parameters in the proposed model based on the data we wanted to analyze and also explained how the prior distributions were selected. Subsequently, we used Markov chain Monte Carlo (MCMC) techniques to obtain the Bayesian estimated parameters, based on the posterior distribution. The median and credibility interval were used as the posterior summary measures in this study. Finally, we illustrated the sensitivity of the analyses to different prior distributions in the Bayesian model.
IBM SPSS Statistics for Windows [version: 21.0; IBM Corp; NY] was used for the descriptive statistics. WinBUGS software [version: 1.4.3; BUGS project; UK] was used for the Bayesian data analysis (the WinBUGS code for this study can be found in Additional file 5). A detailed description of the proposed Bayesian method for evaluating the accuracy of the diagnostic tests without a gold standard is given below.
Notation
Let \(T_{1}\), \(T_{2}\), and \(T_{3}\) denote the diagnostic results of the two CM diagnostic tests (SSDC and ISDS) and the ESD for one syndrome factor in ischemic stroke patients, where \(T_{1}\), \(T_{2}\), and \(T_{3}\) each take the value 0 or 1, with “1” indicating the presence of the syndrome factor and “0” indicating its absence. Let \(D\) denote the true status of the syndrome factor in an ischemic stroke patient, which is not observed in the study. The parameters of interest include: the prevalence of the syndrome factor in the population, \(\pi \), defined as \(\pi =P(D=1)\); the sensitivity of the \(i\)-th diagnostic test in detecting the syndrome factor, \(Se_{i}\), defined as \(Se_{i}=P(T_{i}=1\mid D=1)\); and the specificity of the \(i\)-th diagnostic test for detecting the syndrome factor, \(Sp_{i}\), defined as \(Sp_{i}=P(T_{i}=0\mid D=0)\), where \(i=1,2,3\).
Bayesian model
Procedure of the analysis
Here \(T_{1}\), \(T_{2}\), and \(T_{3}\) denote the CM diagnostic scales (SSDC and ISDS) and ESD for detecting internal wind syndrome, respectively. The observed data can be represented by \(Y=(69,19,7,12,32,9,5,51)\), as shown in Table 1. We denoted the proposed model as model (I). For comparison purposes, we also included the results obtained by the commonly used naive method, which treats the ESD as the gold standard, and denoted this method as model (II). In the Bayesian analysis, a prior distribution for \(\theta \), the parameter vector of the Bayesian model, had to be chosen.
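The naive method of model (II) can be illustrated directly from the counts in Table 1. The following is a minimal sketch, not the authors' code: it treats \(T_{3}\) (ESD) as if it were the true status and computes the sensitivity and specificity of \(T_{1}\) and \(T_{2}\) as simple proportions. The function names and the Wald-type interval are assumptions made for illustration; the source does not state how the model (II) intervals were obtained.

```python
import math

# Observed counts from Table 1, ordered by (T1, T2, T3) =
# 111, 110, 101, 100, 011, 010, 001, 000.
Y = [69, 19, 7, 12, 32, 9, 5, 51]
CELLS = [(1, 1, 1), (1, 1, 0), (1, 0, 1), (1, 0, 0),
         (0, 1, 1), (0, 1, 0), (0, 0, 1), (0, 0, 0)]


def naive_accuracy(test_index):
    """Sensitivity and specificity of T1 (index 0) or T2 (index 1),
    treating T3 (ESD) as if it were a perfect gold standard."""
    esd_pos = sum(n for n, c in zip(Y, CELLS) if c[2] == 1)
    esd_neg = sum(n for n, c in zip(Y, CELLS) if c[2] == 0)
    true_pos = sum(n for n, c in zip(Y, CELLS) if c[2] == 1 and c[test_index] == 1)
    true_neg = sum(n for n, c in zip(Y, CELLS) if c[2] == 0 and c[test_index] == 0)
    return true_pos / esd_pos, true_neg / esd_neg


def wald_interval(p, n, z=1.96):
    """Normal-approximation interval (an assumption, not stated in the source)."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


se1, sp1 = naive_accuracy(0)   # SSDC against ESD
se2, sp2 = naive_accuracy(1)   # ISDS against ESD
prevalence = sum(n for n, c in zip(Y, CELLS) if c[2] == 1) / sum(Y)

for name, value in [("Se1", se1), ("Sp1", sp1), ("Se2", se2),
                    ("Sp2", sp2), ("prevalence", prevalence)]:
    print(f"{name}: {value:.3f}")
```

Rounded to three decimals, these proportions agree with the model (II) point estimates reported in Table 2.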
Selecting the prior distribution
A prior distribution for \(\theta \) consisted of three sensitivities, three specificities, one prevalence rate, and two conditional covariances. Since the first six parameters range between 0 and 1, we chose a beta distribution \(Beta (\alpha ,\beta )\) for each of them, where \(\alpha \) and \(\beta \) were hyperparameters. We used the method proposed by Dendukuri [17] and Enøe et al. [27] to choose these hyperparameter values from prior information on the mode and a percentile. According to the published literature describing the three diagnostic tests (SSDC, ISDS, and ESD) [2–5], the most probable value of the sensitivities of \(T_{1}\) and \(T_{2}\) for detecting internal wind syndrome was determined as 0.7, and we were 95 % sure that these sensitivities were greater than 0.5. Thus, the prior distribution for the sensitivities of \(T_{1}\) and \(T_{2}\) was chosen to be the beta distribution, Beta(13.322, 6.281). For the specificities of the diagnostic scales \(T_{1}\) and \(T_{2}\) for detecting internal wind syndrome, the most probable value was determined as 0.8, and we were 95 % sure that these specificities were greater than 0.5. Therefore, the prior distribution for the specificities of \(T_{1}\) and \(T_{2}\) was chosen to be the beta distribution, Beta(7.549, 2.637). The best guess value for the sensitivity of \(T_{3}\) was 0.8, and the experts were 95 % sure that the sensitivity of \(T_{3}\) was at least 0.7; hence, the prior distribution for the sensitivity of \(T_{3}\) was chosen to be the beta distribution, Beta(48.283, 12.821). The best guess value for the specificity of \(T_{3}\) was 0.85, and the experts were 95 % sure that the specificity of \(T_{3}\) was at least 0.6; thus the prior distribution for the specificity of \(T_{3}\) was chosen to be the beta distribution, Beta(10.657, 2.704). The uniform distribution on [0, 1] was used for the prior distribution of the internal wind prevalence rate.
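The quoted hyperparameters can be checked against the stated prior beliefs. The sketch below is an illustration using only the Python standard library, not the elicitation procedure of [17] or [27]. It computes the mode of each beta prior and the prior mass lying below the quoted threshold (0.5, 0.5, 0.7, and 0.6). The helper names and the use of Simpson's rule for the beta distribution function are assumptions.

```python
import math


def beta_mode(a, b):
    """Mode of Beta(a, b), valid for a > 1 and b > 1."""
    return (a - 1) / (a + b - 2)


def beta_cdf(x, a, b, n=4000):
    """P(X <= x) for X ~ Beta(a, b), by Simpson's rule with n (even) intervals.
    Assumes a > 1 and b > 1 so the density is zero at the endpoints."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)

    def pdf(t):
        if t <= 0.0 or t >= 1.0:
            return 0.0
        return math.exp(log_norm + (a - 1) * math.log(t) + (b - 1) * math.log(1 - t))

    h = x / n
    total = pdf(0.0) + pdf(x)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * pdf(k * h)
    return total * h / 3


# (label, alpha, beta, most probable value, threshold) as quoted in the text.
PRIORS = [
    ("Se of T1, T2", 13.322, 6.281, 0.70, 0.5),
    ("Sp of T1, T2", 7.549, 2.637, 0.80, 0.5),
    ("Se of T3", 48.283, 12.821, 0.80, 0.7),
    ("Sp of T3", 10.657, 2.704, 0.85, 0.6),
]

for label, a, b, best_guess, threshold in PRIORS:
    print(f"{label}: mode = {beta_mode(a, b):.3f} (stated {best_guess}), "
          f"prior mass below {threshold} = {beta_cdf(threshold, a, b):.3f}")
```

In each case the mode reproduces the stated most probable value, and only a small share of the prior mass lies below the threshold.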
For the last two conditional covariances, \(C_{+}\) and \(C_{-}\), which measured the dependence of \(T_{1}\) and \(T_{2}\) among the diseased and nondiseased statuses, respectively, we have the following constraints: \((Se_{1}-1)(1-Se_{2})\le C_{+} \le \min (Se_{1},Se_{2})-Se_{1}Se_{2}\) and \((Sp_{1}-1)(1-Sp_{2})\le C_{-} \le \min (Sp_{1},Sp_{2})-Sp_{1}Sp_{2}\), respectively. Hence, we chose two uniform distributions for \(C_{+}\) and \(C_{-}\): \(U((Se_{1}-1)(1-Se_{2}),\min (Se_{1},Se_{2})-Se_{1}Se_{2})\) and \(U((Sp_{1}-1)(1-Sp_{2}),\min (Sp_{1},Sp_{2})-Sp_{1}Sp_{2})\).
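To show how the two covariances enter the model, the sketch below computes the probability of each of the eight cells of Table 1 from the parameters. It is an illustration under stated assumptions, not the authors' WinBUGS code. It uses the common covariance parameterisation for two dependent tests (as in [17]) and assumes that \(T_{3}\) is conditionally independent of \(T_{1}\) and \(T_{2}\) given the true status. The function names and the covariance values in the usage lines are hypothetical.

```python
# Cells ordered by (T1, T2, T3) to match Y = (69, 19, 7, 12, 32, 9, 5, 51).
CELLS = [(1, 1, 1), (1, 1, 0), (1, 0, 1), (1, 0, 0),
         (0, 1, 1), (0, 1, 0), (0, 0, 1), (0, 0, 0)]


def cov_bounds(a, b):
    """Admissible range of the covariance between two tests whose
    sensitivities (or specificities) are a and b."""
    return (a - 1) * (1 - b), min(a, b) - a * b


def cell_probabilities(pi, se, sp, c_pos=0.0, c_neg=0.0):
    """P(T1=t1, T2=t2, T3=t3) for every cell; se and sp are 3-element lists."""
    probs = []
    for t1, t2, t3 in CELLS:
        # The covariance is added when T1 and T2 agree, subtracted otherwise.
        sign = 1 if t1 == t2 else -1
        pos12 = (se[0] if t1 else 1 - se[0]) * (se[1] if t2 else 1 - se[1]) + sign * c_pos
        neg12 = (1 - sp[0] if t1 else sp[0]) * (1 - sp[1] if t2 else sp[1]) + sign * c_neg
        pos3 = se[2] if t3 else 1 - se[2]
        neg3 = 1 - sp[2] if t3 else sp[2]
        probs.append(pi * pos12 * pos3 + (1 - pi) * neg12 * neg3)
    return probs


# Usage with the posterior medians of model (I); the covariances are illustrative.
se = [0.687, 0.884, 0.813]
sp = [0.776, 0.875, 0.922]
low, high = cov_bounds(se[0], se[1])
probs = cell_probabilities(0.648, se, sp, c_pos=0.02, c_neg=0.01)
expected_counts = [204 * p for p in probs]
```

Because the covariance terms cancel when summed over the four \((T_{1},T_{2})\) combinations, the eight probabilities always sum to one.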
MCMC techniques for computing the posterior estimator
It was difficult to directly obtain the posterior estimator of each parameter through a numerical integration method in the Bayesian model. Since the joint posterior distribution \(f(\theta \mid Y)\) was complicated and involved high-dimensional integrals, which were often impossible to compute directly, we used the MCMC algorithm to draw a random sample from the joint posterior distribution. We then computed the sample median of the randomly drawn sample to estimate \(\theta \) and its components of interest. In this study, the WinBUGS package was used to perform this MCMC process.
To use the MCMC technique in the Bayesian method, we specified the initial values of the model parameters as follows: \(\pi =0.623\), \(Se_{1}=0.748\), \(Se_{2}=0.945\), \(Se_{3}=0.850\), \(Sp_{1}=0.844\), \(Sp_{2}=0.883\), and \(Sp_{3}=0.935\). We also chose different initial values and obtained similar results. The numbers of iterations and burn-in iterations were determined by the convergence of the Markov chain in estimating the parameters by WinBUGS.
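The idea of drawing from the joint posterior and summarising each parameter by its sample median can be sketched with a simple random-walk Metropolis sampler. This is a stand-in written for illustration only: it is not the algorithm WinBUGS uses and not the code in Additional file 5. The likelihood assumes the covariance parameterisation of [17] with \(T_{3}\) conditionally independent of \(T_{1}\) and \(T_{2}\). The step size, chain length, seed, and zero starting values for the covariances are arbitrary assumptions, and the chain is too short for the output to reproduce the published estimates.

```python
import math
import random
import statistics

Y = [69, 19, 7, 12, 32, 9, 5, 51]          # counts from Table 1
CELLS = [(1, 1, 1), (1, 1, 0), (1, 0, 1), (1, 0, 0),
         (0, 1, 1), (0, 1, 0), (0, 0, 1), (0, 0, 0)]
# Beta priors quoted in the text, in the order Se1, Se2, Se3, Sp1, Sp2, Sp3.
PRIORS = [(13.322, 6.281), (13.322, 6.281), (48.283, 12.821),
          (7.549, 2.637), (7.549, 2.637), (10.657, 2.704)]
NAMES = ["pi", "Se1", "Se2", "Se3", "Sp1", "Sp2", "Sp3", "C+", "C-"]


def log_posterior(theta):
    """Unnormalised log posterior; -inf outside the parameter space."""
    pi, se1, se2, se3, sp1, sp2, sp3, c_pos, c_neg = theta
    if not all(0.0 < v < 1.0 for v in theta[:7]):
        return -math.inf
    lo_p, hi_p = (se1 - 1) * (1 - se2), min(se1, se2) - se1 * se2
    lo_n, hi_n = (sp1 - 1) * (1 - sp2), min(sp1, sp2) - sp1 * sp2
    if not (lo_p < c_pos < hi_p and lo_n < c_neg < hi_n):
        return -math.inf
    # Uniform covariance priors over ranges that depend on Se and Sp.
    lp = -math.log(hi_p - lo_p) - math.log(hi_n - lo_n)
    for value, (a, b) in zip(theta[1:7], PRIORS):
        lp += (a - 1) * math.log(value) + (b - 1) * math.log(1 - value)
    for count, (t1, t2, t3) in zip(Y, CELLS):
        sign = 1 if t1 == t2 else -1
        pos = (se1 if t1 else 1 - se1) * (se2 if t2 else 1 - se2) + sign * c_pos
        neg = (1 - sp1 if t1 else sp1) * (1 - sp2 if t2 else sp2) + sign * c_neg
        p = (pi * pos * (se3 if t3 else 1 - se3)
             + (1 - pi) * neg * (1 - sp3 if t3 else sp3))
        if p <= 0.0:
            return -math.inf
        lp += count * math.log(p)
    return lp


def metropolis(n_iter=6000, burn_in=1000, step=0.02, seed=1):
    rng = random.Random(seed)
    # Initial values quoted in the text; covariances start at 0 (assumption).
    theta = [0.623, 0.748, 0.945, 0.850, 0.844, 0.883, 0.935, 0.0, 0.0]
    current = log_posterior(theta)
    draws = []
    for i in range(n_iter):
        proposal = [v + rng.gauss(0.0, step) for v in theta]
        candidate = log_posterior(proposal)
        # Accept with probability min(1, posterior ratio).
        if math.log(1.0 - rng.random()) < candidate - current:
            theta, current = proposal, candidate
        if i >= burn_in:
            draws.append(theta)
    return draws


draws = metropolis()
medians = {name: statistics.median(d[i] for d in draws)
           for i, name in enumerate(NAMES)}
```

In practice, convergence would be checked with several chains and different starting values before the medians are interpreted.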
Results and discussion
Table 2 Accuracy of diagnostic scales (median) for internal wind syndrome factor in 204 ischemic stroke patients under different models
Parameter  Model I  Model II
\(\hat{Se}_{1}\)  0.687 (0.605,0.765)  0.673 (0.587,0.759)
\(\hat{Se}_{2}\)  0.884 (0.815,0.938)  0.894 (0.837,0.951)
\(\hat{Se}_{3}\)  0.813 (0.742,0.880)  N.A.
\(\hat{Sp}_{1}\)  0.776 (0.652,0.885)  0.659 (0.562,0.756)
\(\hat{Sp}_{2}\)  0.875 (0.739,0.968)  0.692 (0.597,0.787)
\(\hat{Sp}_{3}\)  0.922 (0.831,0.981)  N.A.
\(\hat{\pi }\)  0.648 (0.556,0.731)  0.554 (0.486,0.622)
As shown in Table 2, the respective Bayesian estimated sensitivities of the SSDC, ISDS, and ESD for diagnosing internal wind syndrome without a gold standard were as follows: \(\hat{Se}_{1}=0.687\), \(\hat{Se}_{2}=0.884\), and \(\hat{Se}_{3}=0.813\). The respective estimated specificities of the SSDC, ISDS, and ESD for diagnosing internal wind syndrome in the absence of a gold standard were as follows: \(\hat{Sp}_{1}=0.776\), \(\hat{Sp}_{2}=0.875\), and \(\hat{Sp}_{3}=0.922\). The Bayesian method also gave an estimate of \(\hat{\pi }=0.648\) for the prevalence rate of internal wind syndrome. From these results, we concluded that the sensitivity and specificity of the ISDS were both higher than those of the SSDC when diagnosing internal wind syndrome in ischemic stroke patients. The sensitivity and specificity of the ESD for internal wind syndrome were also high, but not perfect.
To assess the sensitivity of our results to the chosen prior distributions, we selected several different prior distributions for the parameters in model (I). The posterior estimates under these alternative prior distributions were consistent with the posterior estimates reported above.
Conclusion
After adjusting for imperfect gold standard bias, we found that both the sensitivity and specificity of the ISDS were higher than those of the SSDC for diagnosis of internal wind syndrome in ischemic stroke patients.
Abbreviations
CM: Chinese medicine
SSDC: stroke syndrome differentiation diagnostic criterion
ISDS: ischemic stroke TCM syndrome diagnostic scale
ESD: expert syndrome differentiation
MCMC: Markov chain Monte Carlo
Declarations
Authors' contributions
XZ, XW, YG, and QL conceived and designed the study. QL and YG facilitated the data collection in China. XW and XZ analyzed the data. XW, VZ, XZ, YG, and QL interpreted the results. XW, XZ, and VZ wrote the manuscript. XW and XZ revised the manuscript. All authors read and approved the final manuscript.
Acknowledgements
The authors wish to acknowledge the support of the Ministry of Science and Technology of the PRC on a research project entitled “Significant New Drug Development-Construction of Technology Platform used for Original New Drug Research and Development” (2012ZX09303010002). The authors also wish to acknowledge the data provided by the “973 program,” a basic research program supported by the Chinese Government Ministry of Science and Technology that promotes research in China. This research was also supported by the State Foundation for Studying Abroad.
Competing interests
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
References
1. Wang J, Wang P, Xiong X. Current situation and re-understanding of syndrome and formula syndrome in Chinese medicine. Intern Med. 2012;2:1–5. doi:10.4172/21658048.1000113.
2. State Administration of TCM and Acute Encephalopathy Cooperation Group SSSSS. TCM syndrome differentiation diagnosis criterion of stroke. Beijing Zhong Yi Yao Da Xue Xue Bao. 1994;17:42.
3. Stroke Syndromes and Clinical Diagnosis SSSSS. Clinical validation of TCM syndrome diagnostic criterion of stroke. Beijing Zhong Yi Yao Da Xue Xue Bao. 1994;17:41–3.
4. Liu Q, Gao Y. Theory basis of syndrome diagnosis scale. Zhong Hua Zhong Yi Yao Za Zhi. 2010;25:989–92.
5. Gao Y, Bin M, Liu Q, Wang Y. Methodological study and establishment of the diagnostic scale for TCM syndrome of ischemic stroke. Zhong Yi Za Zhi. 2011;52:2097–101.
6. Zhou XH, Obuchowski NA, McClish DK. Statistical methods in diagnostic medicine. New York: John Wiley and Sons; 2011.
7. Hui SL, Zhou XH. Evaluation of diagnostic tests without gold standard. Stat Methods Med Res. 1998;7:354–70. doi:10.1177/096228029800700404.
8. Hui SL, Walter SD. Estimating the error rates of diagnostic tests. Biometrics. 1980;36:167–71. doi:10.2307/2530508.
9. Sinclair MD, Gastwirth JL. On procedures for evaluating the effectiveness of reinterview survey methods: application to labor force data. J Am Stat Assoc. 1996;91:961–9. doi:10.1080/01621459.1996.10476966.
10. Espeland MA, Handelman SL. Using latent class models to characterize and assess relative error in discrete measurements. Biometrics. 1989;45:587–99. doi:10.2307/2531499.
11. Yang I, Becker MP. Latent variable modeling of diagnostic accuracy. Biometrics. 1997;53:948–58. doi:10.2307/2533555.
12. Qu Y, Tan M, Kutner MH. Random effects models in latent class analysis for evaluating accuracy of diagnostic test. Biometrics. 1996;52:797–810. doi:10.2307/2533043.
13. Hadgu A, Qu Y. A biomedical application of latent class models with random effects. Appl Stat. 1998;47:603–16. doi:10.1111/14679876.00131.
14. Albert PS, Dodd LE. A cautionary note on the robustness of latent class models for estimating diagnostic error without a gold standard. Biometrics. 2004;60:427–35. doi:10.1111/j.0006341X.2004.00187.x.
15. Pepe MS, Janes H. Insights into latent class analysis of diagnostic test performance. Biostatistics. 2007;8:474–84. doi:10.1093/biostatistics/kxl038.
16. Joseph L, Gyorkos T, Coupal L. Bayesian estimation of disease prevalence and the parameters of diagnostic tests in the absence of a gold standard. Am J Epidemiol. 1995;141:263–72.
17. Dendukuri N, Joseph L. Bayesian approaches to modeling conditional dependence between multiple diagnostic tests. Biometrics. 2001;57:158–67. doi:10.1111/j.0006341X.2001.00158.x.
18. Georgiadis MP, Johnson WO, Gardner IA, Singh R. Correlation-adjusted estimation of sensitivity and specificity of two diagnostic tests. Appl Stat. 2003;52:63–76. doi:10.1111/14679876.00389.
19. Branscum AJ, Gardner IA, Johnson WO. Estimation of diagnostic-test sensitivity and specificity through Bayesian modeling. Prev Vet Med. 2005;68:145–63. doi:10.1016/j.prevetmed.2004.12.005.
20. Rybicki BA, Peterson EL, Johnson CC, Kortsha GX, Cleary WM, Gorell JM. Intra- and inter-rater agreement in the assessment of occupational exposure to metals. Int J Epidemiol. 1998;27:269–73. doi:10.1093/ije/27.2.269.
21. McDermott J, Drews C, Green D, Berg C. Evaluation of prenatal care information on birth certificates. Paediat Perinat Epidemiol. 1997;11:105–21. doi:10.1046/j.13653016.1997.d014.x.
22. Line BR, Peters TL, Keenan J. Diagnostic test comparisons in patients with deep venous thrombosis. J Nucl Med. 1997;38:89–92.
23. Mahoney WJ, Szatmari P, Maclean JE, Bryson SE, Bartolucci G, Walter SD, Marshall BJ, Zwaigenbaum L. Reliability and accuracy of differentiating pervasive developmental disorder subtypes. J Am Acad Child Adolesc Psychiatry. 1998;37:278–85. doi:10.1097/0000458319980300000012.
24. Chriel M, Willeberg P. Dependency between sensitivity, specificity and prevalence analysed by means of Gibbs sampling. Épidémiologie et Santé Animale. 1997;31/32:12.03.1–3.
25. Georgiadis MP, Gardner IA, Hedrick RP. Field evaluation of sensitivity and specificity of a polymerase chain reaction (PCR) for detection of N. salmonis in rainbow trout. J Aquat Anim Health. 1998;10:372–80. doi:10.1577/15488667(1998) 010<0372:FEOSAS>2.0.CO;2.
26. Singer RS, Boyce WM, Gardner IA, Johnson WO, Fisher AS. Evaluation of bluetongue virus diagnostic tests in free-ranging bighorn sheep. Prev Vet Med. 1998;35:265–82. doi:10.1016/S01675877(98)000671.
27. Enoe C, Georgiadis MP, Johnson WO. Estimation of the sensitivity and specificity of diagnostic tests and disease prevalence when true disease state is unknown. Prev Vet Med. 2000;45:61–81. doi:10.1016/S01675877(00)001173.
28. Lang T, Altman D. Statistical Analyses and Methods in the Published Literature: the SAMPL Guidelines. Science Editors’ Handbook, European Association of Science Editors; 2013.