
Publications


Measuring the performance of survival models to personalize treatment choices

Various statistical and machine learning algorithms can be used to predict treatment effects at the patient level using data from randomized clinical trials (RCTs). Such predictions can facilitate individualized treatment decisions. Recently, a range of methods and metrics were developed for assessing the accuracy of such predictions. Here, we extend these methods, focusing on the case of survival (time-to-event) outcomes. We start by providing alternative definitions of the participant-level treatment benefit; subsequently, we summarize existing and propose new measures for assessing the performance of models estimating participant-level treatment benefits. We explore metrics assessing discrimination and calibration for benefit and decision accuracy. These measures can be used to assess the performance of statistical as well as machine learning models and can be useful during model development (i.e., for model selection or for internal validation) or when testing a model in new settings (i.e., in an external validation). We illustrate methods using simulated data and real data from the OPERAM trial, an RCT in multimorbid older people, which randomized participants to either standard care or a pharmacotherapy optimization intervention. We provide R code for implementing all models and measures.
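To give a flavor of what "discrimination for benefit" means, the sketch below computes a simplified c-for-benefit: treated and control participants are paired by rank of predicted benefit, an observed pair-level benefit is defined as the control outcome minus the treated outcome, and concordance between predicted and observed benefit is computed over pairs of pairs. This is a minimal illustration for a binary outcome, not the survival-outcome measures developed in the paper (which also handle censoring); the function and variable names are our own, and the rank-based pairing is a simplification of the nearest-neighbor matching typically used.

```python
import numpy as np

def c_for_benefit(pred_benefit_treated, y_treated,
                  pred_benefit_control, y_control):
    """Simplified c-for-benefit for a binary outcome (1 = event).

    Pairs treated and control participants by rank of predicted benefit,
    then measures concordance between predicted pair-level benefit and
    observed pair-level benefit (control outcome minus treated outcome).
    """
    # Pair i-th smallest predicted benefit in each arm (rank matching)
    order_t = np.argsort(pred_benefit_treated)
    order_c = np.argsort(pred_benefit_control)
    n = min(len(order_t), len(order_c))

    # Predicted benefit of a pair: average of the two predictions
    pairs_pred = (pred_benefit_treated[order_t[:n]]
                  + pred_benefit_control[order_c[:n]]) / 2
    # Observed benefit of a pair: -1, 0, or 1
    pairs_obs = y_control[order_c[:n]] - y_treated[order_t[:n]]

    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            if pairs_obs[i] == pairs_obs[j]:
                continue  # ties in observed benefit are ignored
            if (pairs_pred[i] - pairs_pred[j]) * (pairs_obs[i] - pairs_obs[j]) > 0:
                concordant += 1
            else:
                discordant += 1
    return concordant / (concordant + discordant)
```

A value of 0.5 indicates no discrimination for benefit, 1.0 perfect discrimination; the paper's survival-outcome analogues replace the binary observed benefit with definitions suited to time-to-event data.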

Journal: Stat Med | Year: 2025

A few things to consider when deciding whether or not to conduct underpowered research

Hernán, using a hypothetical example, argues that policies that prevent researchers from conducting underpowered observational studies using existing databases are misguided, explaining that "[w]hen a causal question is important, it is preferable to have multiple studies with imprecise estimates than having no study at all." While we do not disagree with the sentiment expressed, caution is warranted. Small observational studies are a major cause of distrust in science, mainly because their results are often selectively reported. The hypothetical example used to justify Hernán's position is too simplistic and overly optimistic. In this short response, we reconsider Hernán's hypothetical example and offer a list of other factors, beyond simply the importance of the question, that are relevant when deciding whether or not to pursue underpowered research.

Journal: J Clin Epidemiol | Year: 2021