Augmenting treatment arms with external data through propensity-score weighted power-priors with an application in expanded access

INTRODUCTION

There is an increasing regulatory interest in synthesizing evidence from current (randomized) clinical trials with other data sources, to better understand the safety and efficacy of new drugs and medical devices.1,26 Relevant data sources include historical control arms,2,7 natural history studies,3 single-arm trials,22 and other sources of non-trial data, such as expanded access or compassionate use programs.27,28 Ideally, the incorporation of non-trial data increases power, reduces sample size, and helps to generalize results obtained in trial populations to more ‘real-world’ populations.2 However, the combination of trial and external data introduces several sources of potential bias that need to be attenuated via modeling strategies.7,11

The variation between trial and external data can in general be attributed either to measured imbalances between data sources (e.g., in patient characteristics) or to imbalances due to unmeasured confounding and other factors (e.g., center effects). Imbalances in measured characteristics can be addressed by a variety of methods, such as covariate adjustment or propensity score methodology (e.g., stratification, matching, or weighting). Propensity scores are frequently used to address biases that arise due to confounding in non-randomized experimental settings, by modeling allocation to treatment or control based on a set of covariates.29 However, propensity scores may also be used to distinguish between trial and external data, and can thus provide a solution to the issue of confounding in the synthesis of clinical trial data and real-world data.23,30,31 To address unmeasured confounding, statistical methods such as (hierarchical) meta-analytical models15,20,32 and power-priors33–35 have been developed, both in frequentist and Bayesian settings.
These methods perform ‘dynamic borrowing’, aiming to synthesize more evidence when data sources are ‘comparable’ and to synthesize less evidence (or exclude it completely) as data sources differ increasingly. These synthesis methods were primarily developed to combine randomized controls with historical controls. In that context, Pocock suggested strict conditions relating to study design and patient characteristics to ensure that the historical data and the current trial are sufficiently comparable prior to performing a combined analysis.4 One of Pocock’s criteria is that the patient characteristics of the historical and randomized controls have similar distributions, which may not be realistic in the context of non-trial, real-world data.5

Ample recent scholarship has been devoted to developing methods that simultaneously address both sources of bias. In these ‘hybrid’ approaches, propensity score methods are integrated into dynamic borrowing methods.22,36 Multiply the number of standard propensity score methods (e.g., stratification, matching, weighting) by the number of available borrowing methods (such as the modified power prior, the meta-analytic predictive prior, and the commensurate prior), and one may quickly get lost in the statistical jungle. In this paper, we aim to combine both fields of