Welcome to our research page featuring recent publications in the fields of biostatistics and epidemiology. These fields play a crucial role in understanding the causes, prevention, and treatment of various health conditions. Our team contributes to them through innovative studies and statistical analyses. On this page, you will find our research publications describing the development of new statistical methods and their application to real-world data. Please feel free to contact us with any questions or comments.
A key challenge in estimating the infection fatality rate (IFR), along with its relation with various factors of interest, is determining the total number of cases. This total is unknown, not only because not everyone is tested but also, more importantly, because tested individuals are not representative of the population at large. We refer to the phenomenon whereby infected individuals are more likely to be tested than noninfected individuals as "preferential testing". An open question is whether it is possible to reliably estimate the IFR without any specific knowledge about the degree to which the data are biased by preferential testing. In this paper we take a partial identifiability approach, formulating clearly where deliberate prior assumptions can be made and presenting a Bayesian model which pools information from different samples. When the model is fit to European data obtained from seroprevalence studies and national official COVID-19 statistics, we estimate the overall COVID-19 IFR for Europe to be 0.53% (95% CI: 0.38% to 0.70%).
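To make the effect of preferential testing concrete, the following minimal sketch works through the arithmetic with expected counts. It is not the Bayesian model described in the abstract. All inputs (population size, infection rate, testing probabilities) and all function names are hypothetical, and the test is assumed to be perfectly accurate.

```python
def expected_counts(population, infection_rate, ifr,
                    p_test_infected, p_test_noninfected):
    """Expected counts, assuming a perfectly accurate test (a simplification)."""
    infected = population * infection_rate
    noninfected = population - infected
    confirmed = infected * p_test_infected        # infected people who get tested
    negatives = noninfected * p_test_noninfected  # noninfected people who get tested
    return {
        "infected": infected,
        "tests": confirmed + negatives,
        "confirmed": confirmed,
        "deaths": infected * ifr,
    }


def naive_ifr(counts, population):
    """Scale test positivity up to the whole population, then divide deaths by it."""
    positivity = counts["confirmed"] / counts["tests"]
    return counts["deaths"] / (positivity * population)


def case_fatality_rate(counts):
    """Deaths divided by confirmed cases only."""
    return counts["deaths"] / counts["confirmed"]


# Hypothetical scenario: infected individuals are ten times more likely to be tested.
population = 1_000_000
counts = expected_counts(population, infection_rate=0.05, ifr=0.005,
                         p_test_infected=0.20, p_test_noninfected=0.02)
print(f"naive IFR: {naive_ifr(counts, population):.4%}")
print(f"CFR:       {case_fatality_rate(counts):.4%}")
```

In this hypothetical scenario the naive estimate falls below the true IFR of 0.5%, because test positivity overstates the infection rate, while the case fatality rate lies above it. When both groups are tested with equal probability, the naive estimate recovers the true value.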
Background: Accurate risk prediction is needed to provide personalized healthcare for chronic kidney disease (CKD) patients. A large number of prognosis studies are being published, ranging from individual biomarker studies to full prediction studies. We aimed to systematically appraise published prognosis studies investigating multiple biomarkers and their role in risk prediction. Our primary objective was to investigate whether the prognostic models reported in the literature were of sufficient quality and to externally validate them.
Methods: We undertook a systematic review and appraised the quality of studies reporting multivariable prognosis models for end-stage renal disease (ESRD), cardiovascular (CV) events and mortality in CKD patients. We subsequently externally validated these models in a randomized trial that included patients from a broad CKD population.
Results: We identified 91 papers describing 36 multivariable models for prognosis of ESRD, 50 for CV events, 46 for mortality and 17 for a composite outcome. Most studies were deemed of moderate quality. Moreover, they often adopted different definitions for the primary outcome and rarely reported full model equations (21% of the included studies). External validation was performed in the Multifactorial Approach and Superior Treatment Efficacy in Renal Patients with the Aid of Nurse Practitioners trial (n = 788, with 160 events for ESRD, 79 for CV and 102 for mortality). The 24 models that reported full model equations showed great variability in their performance, although calibration remained fairly adequate for most models, except when predicting mortality (calibration slope >1.5).
Conclusions: This review shows that there is an abundance of multivariable prognosis models for the CKD population. Most studies were considered of moderate quality, and they were reported and analysed in such a manner that their results cannot directly be used in follow-up research or in clinical practice.
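The calibration slope mentioned in the results is conventionally obtained by regressing the observed binary outcomes on a model's linear predictor using logistic regression. The sketch below illustrates that general idea in plain Python with Newton-Raphson updates. It is not the code used in the review: the function name, the simulated data and the factor of 2 in the simulation are all assumptions made for illustration.

```python
import math
import random


def calibration_slope(linear_predictors, outcomes, iterations=25):
    """Fit logit(P(y=1)) = intercept + slope * lp by Newton-Raphson."""
    intercept, slope = 0.0, 1.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for lp, y in zip(linear_predictors, outcomes):
            p = 1.0 / (1.0 + math.exp(-(intercept + slope * lp)))
            w = p * (1.0 - p)
            g0 += y - p            # score for the intercept
            g1 += (y - p) * lp     # score for the slope
            h00 += w               # information matrix entries
            h01 += w * lp
            h11 += w * lp * lp
        det = h00 * h11 - h01 * h01
        intercept += (h11 * g0 - h01 * g1) / det
        slope += (h00 * g1 - h01 * g0) / det
    return intercept, slope


# Simulated validation data (hypothetical): the true log-odds are twice the
# reported linear predictor, so the reported predictions are too moderate.
rng = random.Random(0)
reported_lp = [rng.uniform(-1.0, 1.0) for _ in range(5000)]
outcomes = [1 if rng.random() < 1.0 / (1.0 + math.exp(-2.0 * lp)) else 0
            for lp in reported_lp]

intercept, slope = calibration_slope(reported_lp, outcomes)
print(f"calibration intercept: {intercept:.2f}, slope: {slope:.2f}")
```

A slope near 1 indicates that the spread of the predictions matches the data. A slope above 1, as reported for the mortality models, is usually read as predictions that are too moderate, and a slope below 1 as predictions that are too extreme.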
This commentary describes the creation of the Zika Virus Individual Participant Data Consortium, a global collaboration to address outstanding questions in Zika virus (ZIKV) epidemiology by conducting an individual participant data meta-analysis (IPD-MA). The aims of the IPD-MA are to (1) estimate the absolute and relative risks of miscarriage, fetal loss, and short- and long-term sequelae of fetal exposure; (2) identify and quantify the relative importance of different sources of heterogeneity (e.g., immune profiles, concurrent flavivirus infection) for the risk of adverse fetal, infant, and child outcomes among infants exposed to ZIKV in utero; and (3) develop and validate a prognostic model for the early identification of high-risk pregnancies, to inform both communication between health care providers and their patients and public health interventions (e.g., vector control strategies, antenatal care, and family planning programs). By leveraging data from diverse populations across the world, the IPD-MA will provide a more precise estimate of the risk of adverse ZIKV-related outcomes within clinically relevant subgroups and a quantitative assessment of the generalizability of these estimates across populations and settings. The ZIKV IPD Consortium effort is indicative of the growing recognition that data sharing is a central component of global health security and outbreak response.
As real-world evidence on drug efficacy involves non-randomised studies, statistical methods adjusting for confounding are needed. In this context, prognostic score (PGS) analysis has recently been proposed as a method for causal inference. It aims to restore balance across the different treatment groups by identifying subjects with a similar prognosis for a given reference exposure ('control'). This requires the development of a multivariable prognostic model in the control arm of the study sample, which is then extrapolated to the different treatment arms. Unfortunately, large cohorts for developing prognostic models are not always available. Prognostic models are therefore subject to a dilemma between overfitting and parsimony, the latter being prone to a violation of the assumption of no unmeasured confounders when important covariates are ignored. Although it is possible to limit overfitting by using penalization strategies, an alternative approach is to adopt evidence synthesis. Aggregating previously published prognostic models may improve the generalizability of PGS while taking account of a large set of covariates, even when limited individual participant data are available. In this article, we extend a method for prediction model aggregation to PGS analysis in non-randomised studies. We conduct extensive simulations to assess the validity of model aggregation, compared with other methods of PGS analysis for estimating marginal treatment effects. We show that aggregating existing PGS into a 'meta-score' is robust to misspecification, even when elementary scores wrongly omit confounders or focus on different outcomes. We illustrate our methods in a setting of treatments for asthma.
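The basic PGS workflow described above (fit a prognostic model in the control arm, extrapolate it to all subjects, then compare treatment groups among subjects with a similar score) can be sketched as follows. This is a deliberately simplified illustration and not the model aggregation method of the article: it assumes a single confounder, a continuous outcome, a linear prognostic model and stratification on score quintiles, and every name and simulation setting is hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def fit_ols(xs, ys):
    """Simple linear regression of ys on xs; returns (intercept, slope)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def pgs_stratified_effect(x, t, y, n_strata=5):
    # Step 1: develop the prognostic model in the control arm only.
    xc = [xi for xi, ti in zip(x, t) if ti == 0]
    yc = [yi for yi, ti in zip(y, t) if ti == 0]
    intercept, slope = fit_ols(xc, yc)
    # Step 2: extrapolate the model to every subject (predicted outcome under control).
    score = [intercept + slope * xi for xi in x]
    # Step 3: stratify on the score and average the within-stratum differences.
    order = sorted(range(len(x)), key=lambda i: score[i])
    n = len(order)
    total, weight = 0.0, 0
    for s in range(n_strata):
        idx = order[s * n // n_strata:(s + 1) * n // n_strata]
        treated = [y[i] for i in idx if t[i] == 1]
        control = [y[i] for i in idx if t[i] == 0]
        if treated and control:  # skip strata without both arms
            total += (mean(treated) - mean(control)) * len(idx)
            weight += len(idx)
    return total / weight


def naive_difference(t, y):
    """Unadjusted difference in mean outcome between treated and control."""
    return (mean([yi for yi, ti in zip(y, t) if ti == 1])
            - mean([yi for yi, ti in zip(y, t) if ti == 0]))


def simulate(n, effect, seed):
    """Hypothetical data: x raises both the chance of treatment and the outcome."""
    rng = random.Random(seed)
    x, t, y = [], [], []
    for _ in range(n):
        xi = rng.random()
        ti = 1 if rng.random() < 0.2 + 0.6 * xi else 0
        x.append(xi)
        t.append(ti)
        y.append(1.0 + 2.0 * xi + effect * ti + rng.gauss(0.0, 0.5))
    return x, t, y


x, t, y = simulate(n=4000, effect=1.0, seed=1)
print(f"naive difference:        {naive_difference(t, y):.2f}")
print(f"PGS-stratified estimate: {pgs_stratified_effect(x, t, y):.2f}")
```

Because the simulated confounder raises both the probability of treatment and the outcome, the unadjusted difference overstates the simulated effect of 1.0, whereas the stratified estimate should land much closer to it. With several covariates, the score would play the additional role of reducing them to a single dimension.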
In non-randomised studies, inferring causal effects requires appropriate methods for addressing confounding bias. Although it is common to adopt propensity score analysis for this purpose, prognostic score analysis has recently been proposed as an alternative strategy. Whilst both approaches were originally introduced to estimate causal effects for binary interventions, the theory of propensity scores has since been extended to the case of general treatment regimes. Indeed, many treatments are not assigned in a binary fashion, and require a certain extent of dosing. Hence, researchers may often be interested in estimating treatment effects across multiple exposures. To the best of our knowledge, prognostic score analysis has not yet been generalised to this case. In this article, we describe the theory of prognostic scores for causal inference with general treatment regimes. Our methods can be applied to compare multiple treatments using non-randomised data, a topic of great relevance in contemporary evaluations of clinical interventions. We propose estimators for the average treatment effects in different populations of interest, the validity of which is assessed through a series of simulations. Finally, we present an illustrative case in which we estimate the effect of the delay to aspirin administration on a composite outcome of death or dependence at 6 months in stroke patients.