Objective To compare three propensity score weighting methods for multiple-treatment data under different sample sizes, in terms of their ability to balance covariates and their accuracy in estimating treatment effects. Methods Monte Carlo simulation was used to generate data sets, and three propensity score weighting methods (Logistic-IPTW, Logistic-OW and GBM-OW) were compared on covariate balance and treatment effect estimation. Covariate balance was evaluated with the absolute standardized mean difference. Effect estimation was evaluated with the point estimate of the treatment effect, the root mean square error and the confidence interval coverage. Results In five scenarios in which the covariates were related to the treatment factors and outcome variables with varying degrees of complexity, GBM-OW estimated the treatment effect better than Logistic-IPTW and Logistic-OW and had a smaller root mean square error. All three methods achieved good covariate balance. GBM-OW performed better when the overlap of the propensity score distributions in the multiple-treatment data was relatively low and the covariates had increasingly complex nonlinear relationships with the treatment factors and outcome variables. Conclusion For multiple-treatment data, GBM-OW has advantages over the other two methods when the relationships between the covariates and the treatment factors and outcome variables involve nonlinearity and/or interaction. Its effect estimates are closer to the true value, making it the better choice.
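The two weighting schemes compared above can be illustrated with a minimal sketch. It assumes the generalized propensity scores have already been estimated (for example by multinomial logistic regression or GBM) and uses the generalized overlap weight for multiple treatments, in which each unit receives h(x) / e_z(x) with h(x) = 1 / sum_k (1 / e_k(x)). The function names and the example scores are hypothetical and are not taken from the study.

```python
def iptw_weights(scores, treatments):
    """Inverse probability of treatment weights: 1 / e_z(x).

    scores[i][k] is the estimated probability that unit i receives treatment k;
    treatments[i] is the index of the treatment unit i actually received.
    """
    return [1.0 / row[z] for row, z in zip(scores, treatments)]


def overlap_weights(scores, treatments):
    """Generalized overlap weights: h(x) / e_z(x), h(x) = 1 / sum_k 1 / e_k(x)."""
    weights = []
    for row, z in zip(scores, treatments):
        tilt = 1.0 / sum(1.0 / e for e in row)  # h(x), largest where groups overlap
        weights.append(tilt / row[z])
    return weights


# Hypothetical propensity scores for three treatments (each row sums to 1).
scores = [[0.2, 0.3, 0.5], [0.6, 0.3, 0.1], [0.05, 0.05, 0.9]]
treatments = [0, 1, 2]
print(iptw_weights(scores, treatments))
print(overlap_weights(scores, treatments))
```

Because h(x) never exceeds e_z(x), the overlap weights stay at or below 1, whereas the inverse probability weights grow without bound as e_z(x) approaches 0. This is one reason overlap weighting tends to be more stable when the propensity score distributions overlap poorly.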
This study introduced inverse probability weighting and overlap weighting based on the propensity score, and described how to check covariate balance and estimate the treatment effect after weighting. Four R packages that can be used for propensity score weighting analysis were introduced and compared.
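As a rough illustration of the balance check and effect estimation after weighting, the sketch below computes a weighted absolute standardized mean difference for one covariate and a weighted difference in mean outcomes for a binary treatment. It assumes one common convention, in which the denominator is the square root of the average of the unweighted group variances; software packages differ on this choice. All function names and data are hypothetical.

```python
import math


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def sample_variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def split_by_group(values, groups, weights):
    """Return (value, weight) pairs for the treated (1) and control (0) groups."""
    treated = [(v, w) for v, g, w in zip(values, groups, weights) if g == 1]
    control = [(v, w) for v, g, w in zip(values, groups, weights) if g == 0]
    return treated, control


def abs_std_mean_diff(x, groups, weights):
    """Weighted mean difference scaled by the pooled unweighted SD (assumed convention)."""
    treated, control = split_by_group(x, groups, weights)
    x1, w1 = zip(*treated)
    x0, w0 = zip(*control)
    pooled_sd = math.sqrt((sample_variance(x1) + sample_variance(x0)) / 2)
    return abs(weighted_mean(x1, w1) - weighted_mean(x0, w0)) / pooled_sd


def weighted_effect(y, groups, weights):
    """Difference in weighted mean outcomes between treated and control units."""
    treated, control = split_by_group(y, groups, weights)
    y1, w1 = zip(*treated)
    y0, w0 = zip(*control)
    return weighted_mean(y1, w1) - weighted_mean(y0, w0)


# Hypothetical data: one covariate, one outcome, unit weights.
groups = [1, 1, 0, 0]
weights = [1.0, 1.0, 1.0, 1.0]
print(abs_std_mean_diff([1, 2, 3, 4], groups, weights))
print(weighted_effect([3, 5, 1, 2], groups, weights))
```

In practice the weights would come from the chosen weighting scheme, and the balance statistic would be computed for every covariate before the effect estimate is interpreted.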
Randomized controlled trials (RCTs) are often limited by ethical or operational constraints. Quasi-experimental studies can serve as an alternative to RCTs, supporting causal inference without randomization by controlling for confounding effects. This paper introduced the general statistical analysis methods for quasi-experimental designs, including difference-in-differences models, instrumental variables, regression discontinuity design and interrupted time series, among others. For each method it covered the basic ideas, characteristics, limitations and applications in medicine, in order to provide a reference for future research.
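Of the methods listed, difference-in-differences has the simplest arithmetic: the change in the control group is subtracted from the change in the treated group. The sketch below shows only this basic two-group, two-period point estimate under the usual parallel-trends assumption, with hypothetical function names and data; it does not cover regression adjustment or standard errors.

```python
def mean(values):
    return sum(values) / len(values)


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period difference-in-differences point estimate."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    # The control-group change stands in for what the treated group
    # would have experienced without the intervention (parallel trends).
    return treated_change - control_change


# Hypothetical outcome measurements before and after an intervention.
print(did_estimate([10, 12], [15, 17], [8, 10], [9, 11]))
```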
Objective To elaborate on the statistical analysis methods for evaluating the accuracy of imaging diagnostic tests in a multiple-reader multiple-case (MRMC) design through formula derivation and a real case. Methods This study consisted of two parts: theoretical derivation and a real case study. The theoretical part discussed in detail the principles and procedures of MRMC statistical analysis methods, particularly the Obuchowski-Rockette (OR) and Dorfman-Berbaum-Metz (DBM) methods. The real case included 100 subjects, of whom 67 had the disease. Four readers interpreted all the cases with both traditional film imaging and digital imaging. The OR and DBM methods were employed for data analysis. Results In the real case, the OR and DBM methods showed a high degree of consistency, with only slight differences in the confidence intervals. Conclusion The OR and DBM methods are recommended for the statistical analysis of imaging diagnostic test accuracy, because they ensure that the impact of reader factors on the evaluation results is fully considered. The two methods give similar results; in practice, the choice between them should take into account the specific characteristics of the data and the research design. In addition, challenges remain in applying the OR and DBM methods, such as software implementation and missing data handling, which require further exploration.
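Both the OR and DBM analyses start from an accuracy estimate for each reader under each modality. As a hedged illustration of that first step only, the sketch below computes an empirical (Mann-Whitney) AUC per reader and the reader-averaged AUC difference between two modalities. It does not implement the OR or DBM variance components, test statistics or confidence intervals, and the function names and ratings are hypothetical.

```python
def empirical_auc(diseased, nondiseased):
    """Share of diseased/non-diseased pairs ranked correctly; ties count half."""
    wins = 0.0
    for d in diseased:
        for n in nondiseased:
            if d > n:
                wins += 1.0
            elif d == n:
                wins += 0.5
    return wins / (len(diseased) * len(nondiseased))


def reader_auc(ratings, truth):
    """AUC for one reader; truth[i] is 1 for diseased cases and 0 otherwise."""
    diseased = [r for r, t in zip(ratings, truth) if t == 1]
    nondiseased = [r for r, t in zip(ratings, truth) if t == 0]
    return empirical_auc(diseased, nondiseased)


def mean_auc_difference(ratings_a, ratings_b, truth):
    """Reader-averaged AUC difference (modality A minus modality B).

    ratings_a[j] and ratings_b[j] hold reader j's ratings of the same cases.
    """
    diffs = [reader_auc(a, truth) - reader_auc(b, truth)
             for a, b in zip(ratings_a, ratings_b)]
    return sum(diffs) / len(diffs)


# Hypothetical confidence ratings from two readers on four cases.
truth = [1, 1, 0, 0]
ratings_a = [[4, 3, 1, 2], [5, 5, 1, 1]]
ratings_b = [[2, 3, 1, 4], [5, 5, 1, 1]]
print(mean_auc_difference(ratings_a, ratings_b, truth))
```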
Objective To explore two methods of sample size estimation for multi-reader multi-case studies of radiological diagnostic tests and to implement them in software. Methods Demonstration programs were run in R on the Van Dyke dataset to calculate the required combinations of readers and cases with the OR and DBM methods. These served as pilot results for multi-reader multi-case studies, providing a reference for parameter settings in subsequent formal experiments. Results With an effect size of 0.044, 6 readers and 247 cases yielded a power of 0.80; with an effect size of 0.088, only 6 readers and 44 cases were needed to reach a power of 0.85. The sample sizes calculated with the OR method and the DBM method were consistent, and the same sample size results could be obtained by converting between the two methods. Conclusion For multi-reader multi-case studies of radiological diagnostic tests, R provides a convenient and mature software package for sample size estimation, offering a reference for selecting appropriate sample size estimation and statistical analysis methods.