Background: It is widely recommended that any developed prediction model, whether diagnostic or prognostic, be validated externally in terms of its predictive performance, as measured by calibration and discrimination. When multiple validations have been performed, a systematic review followed by a formal meta-analysis helps to summarize overall performance across multiple settings, and reveals under which circumstances the model performs poorly and may need adjustment.
Objectives: To discuss how to undertake meta-analysis of the performance of prediction models with either a binary or a time-to-event outcome.
Methods: We address how to deal with incomplete availability of study-specific results (performance estimates and their precision), and how to produce summary estimates of the c-statistic, the observed:expected ratio and the calibration slope. Furthermore, we discuss the implementation of frequentist and Bayesian meta-analysis methods, and propose novel empirically based prior distributions to improve estimation of between-study heterogeneity in small samples. Finally, we illustrate all methods using two examples: a meta-analysis of the predictive performance of EuroSCORE II and of the Framingham Risk Score. All examples and meta-analysis models have been implemented in our newly developed open-source R package 'metamisc'.
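To make the steps above concrete, the following is a minimal Python sketch of one common way to restore missing precision estimates and pool validation results. It is an illustration under stated assumptions and not the metamisc implementation: the function names are hypothetical, the c-statistic is assumed to be pooled on the logit scale and the observed:expected ratio on the log scale, standard errors are delta-method or confidence-interval approximations, and a DerSimonian-Laird estimator stands in for the frequentist and Bayesian estimation methods discussed here. The input values are made up and are not taken from the EuroSCORE II or Framingham examples.

```python
import math
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # standard normal quantile, about 1.96


def logit_cstat(c, se_c):
    """Logit-transform a c-statistic; SE approximated via the delta method."""
    return math.log(c / (1 - c)), se_c / (c * (1 - c))


def se_from_ci(lower, upper):
    """Approximate SE from a 95% CI that is already on the analysis scale."""
    return (upper - lower) / (2 * Z95)


def log_oe(observed, expected, n):
    """Log O:E ratio; SE approximated as sqrt(1/O - 1/N), treating E as fixed."""
    return math.log(observed / expected), math.sqrt(1 / observed - 1 / n)


def pool_random_effects(estimates, ses):
    """DerSimonian-Laird random-effects pooling; returns (mean, se, tau2)."""
    w = [1 / s**2 for s in ses]
    fixed = sum(wi * y for wi, y in zip(w, estimates)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, estimates))
    scale = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(estimates) - 1)) / scale)
    w_re = [1 / (s**2 + tau2) for s in ses]
    mean = sum(wi * y for wi, y in zip(w_re, estimates)) / sum(w_re)
    return mean, math.sqrt(1 / sum(w_re)), tau2


def expit(x):
    """Inverse logit, used to back-transform to the c-statistic scale."""
    return 1 / (1 + math.exp(-x))


# Made-up (c-statistic, SE) pairs from three hypothetical validation studies.
cstats = [(0.78, 0.020), (0.82, 0.015), (0.74, 0.030)]
logits = [logit_cstat(c, se) for c, se in cstats]
mean, se, tau2 = pool_random_effects([y for y, _ in logits],
                                     [s for _, s in logits])
summary_c = expit(mean)
ci_c = (expit(mean - Z95 * se), expit(mean + Z95 * se))
print(f"summary c-statistic {summary_c:.3f}, "
      f"95% CI {ci_c[0]:.3f} to {ci_c[1]:.3f}, tau^2 {tau2:.3f}")
```

The same pooling function can be applied to the output of log_oe, with the summary exponentiated to return to the O:E scale. Pooling on a transformed scale keeps the back-transformed interval within the valid range of each measure.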
Results: Frequentist and Bayesian meta-analysis methods often yielded similar summary estimates of prediction model performance. However, estimates of between-study heterogeneity and derived prediction intervals appeared more adequate when we applied Bayesian estimation methods.
Conclusions: Our empirical examples demonstrate that meta-analysis of prediction models is a feasible strategy despite the complex nature of corresponding studies. As developed prediction models are being validated increasingly often, and as the reporting quality is steadily improving, we anticipate that evidence synthesis of prediction model studies will become more commonplace in the near future. The R package metamisc is designed to facilitate this endeavor, and will be updated as new methods become available.
Patient or healthcare consumer involvement: The identification of relevant statistical methods was informed by previous experiences with systematic reviews of prognosis studies.