The European Medicines Agency (EMA) and the European Medicines Regulatory Network established a coordination centre to provide timely and reliable evidence on the use, safety and effectiveness of medicines for human use, including vaccines, from real world healthcare databases across the European Union (EU). This capability is called the Data Analysis and Real World Interrogation Network (DARWIN EU®).



Although new user cohorts are preferred, prevalent user designs can be desirable in some instances, particularly when a recently marketed medicine is to be compared with an existing and heavily used active comparator. In that setting, many new users of the recently launched medicine will be previous users of the active comparator, so new users of the new medicine with no prior use of the comparator are scarce and potentially not representative of the desired target population of all users of the new drug. Prevalent user cohort studies have been used in recent years to study the potential effects of medicines like ACE inhibitors or alpha-1-blockers against SARS-CoV-2 infection and/or COVID-19 severity [D Morales et al. Lancet Dig Health 2021; A Nishimura et al. Front Pharmacol 2022].

Study Type/s

Prevalent user active comparator cohort studies are classified as ‘complex’ analyses.

Study Design

Prevalent user cohorts.


At least two cohorts, including a cohort of prevalent users of at least one drug/medicinal product under investigation (target cohort) and a cohort of prevalent users of at least one drug/medicinal product as an active comparator (comparator cohort). Typically, prevalent user cohorts will be defined by the previous use of a target/comparator medicine during a specific time period or for a pre-specified duration.


Participants in each cohort will be followed from a specific index date different from the date of treatment initiation. Based on previous examples, this will typically be a calendar date, e.g. 1st of March 2022.
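
As a minimal sketch of the cohort definition above, the following selects prevalent users as people with an exposure to the drug in a look-back window before a calendar index date. The `Exposure` record, the `prevalent_users` helper, and the 180-day window are illustrative assumptions, not DARWIN EU definitions.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Exposure:
    """Hypothetical drug exposure record (not a DARWIN EU data model)."""
    person_id: int
    drug: str
    start: date
    end: date


def prevalent_users(exposures, drug, index_date, lookback_days=180):
    """Ids of people exposed to `drug` at some point in the window
    [index_date - lookback_days, index_date). The window length is an
    assumption chosen for illustration."""
    window_start = index_date - timedelta(days=lookback_days)
    return {
        e.person_id
        for e in exposures
        if e.drug == drug and e.start < index_date and e.end >= window_start
    }


INDEX_DATE = date(2022, 3, 1)  # calendar index date, as in the example above

exposures = [
    Exposure(1, "target", date(2021, 6, 1), date(2022, 4, 1)),       # ongoing at index
    Exposure(2, "target", date(2022, 3, 5), date(2022, 6, 1)),       # starts after index
    Exposure(3, "comparator", date(2020, 1, 1), date(2021, 1, 1)),   # ended before window
    Exposure(4, "comparator", date(2021, 10, 1), date(2021, 12, 1)), # inside window
]

target_cohort = prevalent_users(exposures, "target", INDEX_DATE)
comparator_cohort = prevalent_users(exposures, "comparator", INDEX_DATE)
```

A real protocol would also need to say how people who used both drugs in the window are handled; the sketch does not address this.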

Two possibilities of analyses will be offered:

  • In a ‘fixed’ follow-up analysis, follow-up will continue until death, loss to follow-up, or the end of a pre-specified time period (e.g., 3 years), whichever comes first, regardless of treatment duration
  • In an ‘on treatment’ analysis, follow-up will continue until treatment cessation, death, or loss to follow-up, whichever comes first
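
The two follow-up rules can be sketched as taking the earliest of the applicable censoring dates. The function name, its parameters, and the 3 × 365 day default are assumptions made for illustration.

```python
from datetime import date, timedelta


def follow_up_end(index_date, analysis, observation_end,
                  death=None, treatment_end=None, max_days=3 * 365):
    """Earliest applicable censoring date for one participant.

    `observation_end` stands in for loss to follow-up (hypothetical name).
    `analysis` is "fixed" or "on_treatment".
    """
    candidates = [observation_end]
    if death is not None:
        candidates.append(death)
    if analysis == "fixed":
        # Fixed window from the index date, ignoring treatment duration.
        candidates.append(index_date + timedelta(days=max_days))
    elif analysis == "on_treatment":
        # Censor at treatment cessation when it is known.
        if treatment_end is not None:
            candidates.append(treatment_end)
    else:
        raise ValueError(f"unknown analysis type: {analysis!r}")
    return min(candidates)


index = date(2022, 3, 1)
fixed_end = follow_up_end(index, "fixed", observation_end=date(2030, 1, 1),
                          treatment_end=date(2022, 9, 1))
on_treatment_end = follow_up_end(index, "on_treatment",
                                 observation_end=date(2030, 1, 1),
                                 treatment_end=date(2022, 9, 1))
```

Here the same participant is followed for the full fixed window in the first analysis but only until treatment cessation in the second.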


One or more study outcomes will be pre-specified, based on previous DARWIN EU algorithms or newly developed and validated ones.

In addition, a large set of negative control outcomes will be assessed. These are outcomes not known to have a causal association with the drug/s or medicinal product/s under study.


Details will be discussed during programming of pipelines, but prevalent user cohort analyses will include:

  • Large-scale characterisation of participants in the target and comparator cohorts, including all features available in the data before or on index date.
  • Large-scale propensity scores (LSPS) will be estimated as the probability of exposure (target cohort) conditional on all covariates available in the data with a prevalence >1%. LSPS will be estimated using Lasso regression. Unlike in new user cohorts, LSPS will be estimated using the information available on the index date, which for prevalent users is not the date of therapy initiation but a separate, previously specified date.
  • Incidence rate/s of each of the outcomes of interest in the target and comparator cohorts after LSPS matching, stratification, or inverse probability weighting
  • Diagnostic/s:
    • Covariate balance: a plot will be produced depicting the standardized mean difference/s between target and comparator cohorts for all available covariates before (x axis) and after propensity score matching/stratification/weighting (y axis) [see Figure 4 for an illustrative example]
    • Equipoise: plots of the distribution of the propensity score stratified by target vs comparator cohort [see Figure 4 for an illustrative example]
    • Power: analyses will not be conducted where there is insufficient data, based on a pre-specified threshold for the minimum detectable rate ratio (MDRR), e.g., analyses are not run when MDRR>5
    • Optional: in addition to the covariate balance and equipoise diagnostics, residual confounding/systematic error can be estimated, based on the number of negative control outcome/s significantly associated with the exposure of interest
  • Rate Ratio/s (RR) or Hazard Ratio/s (HR) and 95% confidence intervals will be estimated using Poisson or Cox models respectively, comparing the target vs comparator (reference) cohorts after LSPS matching, stratification, or inverse probability weighting
  • Optionally, calibrated RR or HR will be estimated after empirical calibration using negative control outcomes
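
To illustrate the covariate balance diagnostic listed above, here is a minimal sketch of the standardized mean difference (SMD) for a single binary covariate, using the usual pooled-variance formula for proportions. The function name and the toy data are assumptions, and the |SMD| < 0.1 threshold is a common convention rather than something stated in this document.

```python
from math import sqrt


def smd_binary(target, comparator):
    """Standardized mean difference for a 0/1 covariate.

    SMD = (p_t - p_c) / sqrt((p_t(1 - p_t) + p_c(1 - p_c)) / 2)
    """
    p_t = sum(target) / len(target)
    p_c = sum(comparator) / len(comparator)
    pooled_sd = sqrt((p_t * (1 - p_t) + p_c * (1 - p_c)) / 2)
    # Covariate is constant in both cohorts: no imbalance to report.
    if pooled_sd == 0:
        return 0.0
    return (p_t - p_c) / pooled_sd


# Toy data: covariate prevalence 60% vs 40% before adjustment,
# 50% vs 50% in a hypothetical matched sample.
before = smd_binary([1] * 60 + [0] * 40, [1] * 40 + [0] * 60)
after = smd_binary([1] * 50 + [0] * 50, [1] * 50 + [0] * 50)

# Assumed convention: |SMD| < 0.1 is treated as adequate balance.
balanced_before = abs(before) < 0.1
balanced_after = abs(after) < 0.1
```

In the balance plot described above, each covariate would contribute one point with the `before` value on the x axis and the `after` value on the y axis.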