Standardised Analytics

The DARWIN EU® Coordination Centre (CC) has been set up to maximise the value and efficiency of real-world evidence for regulators and other stakeholders in Europe. With this in mind, it is recognised that Standardised Analytics are needed to speed up evidence generation whilst preserving the quality, reproducibility, and transparency of the proposed research.

Standardised Analytics are enabled through the use of the OMOP Common Data Model (OMOP CDM). This has several advantages:

  1. There is no risk of divergent implementations of the study design, as would be the case when only a common protocol is shared and then translated into code by a programmer at each site. Differences in study outcomes therefore reflect the data rather than the study implementation.
  2. Standardised Analytics can be built for each Study Type and re-used for a specific study by defining study parameters. For example, a Drug Utilisation Analytical tool can be made with parameters for the Population and Drugs of Interest.
  3. Standardised Analytics will produce standardised results, which helps EMA and its committees to streamline their assessment of those results.
  4. Standardised Analytics can be tested and validated following agreed Quality Assurance Processes and test data sets in the OMOP CDM format.

The CC is developing a Catalogue of Standard Data Analyses to accommodate all the requested study designs. The studies that DARWIN EU® will deliver are grouped by their anticipated level of complexity: Off-the-shelf, Complex, Very Complex. The Off-The-Shelf and Complex Studies can be repeated periodically with a pre-specified regularity (e.g. yearly), called Routine Repeated Analyses.

The development of the Common Analytics is driven by the initial list of studies in the Establishment Phase of DARWIN EU. Below you can find the current version of the catalogue. There will be further discussion with stakeholders in 2023 and the catalogue will be maintained regularly.

Catalogue of Standard Data Analyses

The Catalogue of Standard Data Analyses lists the various analyses currently supported by the CC.

Off-the-shelf studies

These are mainly characterisation questions that can be executed with a generic protocol. This includes disease epidemiology, for example the estimation of the prevalence and incidence of health outcomes in defined time periods and population groups, or drug utilisation studies at the population or patient level.

Patient-level characterisation

Study Type/s

Patient-level characterisations are classified as ‘off the shelf’ studies.

Study Design

Cohort analysis.

Participant/s

Patient-level characterisation studies will include one or more cohort/s of people newly diagnosed with 1 or more pre-specified condition/s and with some amount of data visibility before diagnosis, and with no record of the same condition/s in the previous year (or in all previous history).

Additional eligibility criteria could apply as follows, to be incorporated as sensitivity analyses:

  • Additional restriction/s could apply based on socio-demographics, e.g., people aged 18 or older at the time of diagnosis
  • Additionally, people with a competing (differential) diagnosis could also be excluded (e.g., people with rheumatoid arthritis with a history of psoriatic arthritis could be excluded to minimise misclassification)

Follow-up

Participants will be followed up from their date of new diagnosis (index date) until the earliest of the following: loss to follow-up, end of data availability, a pre-specified time period (e.g. 1 year after index date) or death.
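
The follow-up rule above amounts to taking the earliest of several candidate dates. The following is a minimal sketch in Python, not the DARWIN EU® implementation: the function name and its parameters are hypothetical, and a single observation end date is assumed to stand in for both loss to follow-up and end of data availability.

```python
from datetime import date, timedelta
from typing import Optional


def follow_up_end(index_date: date,
                  observation_end: date,
                  death_date: Optional[date] = None,
                  max_days: Optional[int] = 365) -> date:
    """Return the earliest censoring date for one participant.

    observation_end: stands in for loss to follow-up / end of data
                     availability (illustrative simplification).
    max_days:        optional pre-specified follow-up window after index.
    """
    candidates = [observation_end]
    if death_date is not None:
        candidates.append(death_date)
    if max_days is not None:
        candidates.append(index_date + timedelta(days=max_days))
    return min(candidates)


# Hypothetical participant: death occurs before the 1-year window closes.
end = follow_up_end(date(2020, 3, 1), date(2022, 6, 30),
                    death_date=date(2020, 11, 15))
print(end)
```

Passing max_days=None would model an analysis with no fixed follow-up window.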

Analyses

Details will be discussed during programming of pipelines, but it is likely that patient-level characterisation will include:

  • Automated large-scale characterisation, including all recorded baseline characteristics available in the data before or on index date, based on code/s, and classified into conditions (medical history), medicine/s use, and procedure/s
  • Pre-specified patient-level characteristics on and/or before index date, based on pre-existing code lists or definitions (e.g., history of type 2 diabetes, or Charlson comorbidity index)
  • Pre-specified patient-level characteristics on and/or before index date, based on concepts and descendants where no previously validated algorithms are available
  • Incidence rate/s of pre-specified outcome/s within a pre-specified time period (e.g. 1 year)
  • Prognosis / progression to a pre-specified outcome within a pre-specified time, e.g., cumulative incidence of certain events or mortality within 1- or 5-years after diagnosis
  • Standard care description, including n (%) receiving each of a pre-specified list of medicine/s, device/s or procedure/s, and combinations within a pre-specified time window after diagnosis

Patient-level DUS analyses

Study Type/s

Patient-level DUS analyses are classified as ‘off the shelf’ studies. Patient-level DUS will offer the possibility to include population-level DUS analyses as part of the same analysis.

Study Design

New drug/s user cohort

Participants

Patient-level DUS analyses will include one or more cohort/s of incident drug users with at least 1 year of data visibility, and no use of that same drug/drug class in that previous year.

Additional eligibility criteria could apply as follows:

  • Source population could be restricted to a specific subpopulation with certain socio-demographic or clinical feature/s, e.g., people with a diagnosis of rheumatoid arthritis who then start to take a disease-modifying anti-rheumatic drug (DMARD)
  • Additional restriction/s could apply as per product label, indication, or study aim/s, e.g., people aged 18 or older at the time of therapy initiation

Follow-up

Participants will be followed up from the date of therapy initiation (index date) until the earliest of loss to follow-up, end of data availability, or death. Patients might be censored at the time they discontinue treatment or switch to an alternative therapy.

Outcome/s

The following outcome/s will be obtained, potentially stratified by pre-specified criteria (age bands, sex, calendar year or month), and other pre-specified criteria:

  • New drug user cohort/s patient-level characteristics on or before index date
  • Indication (where available)
  • Initial dose/strength (as prescribed/dispensed at therapy initiation, where available)
  • Cumulative use within a pre-specified time period (e.g. 1 year) based on number of prescriptions and dose/strength
  • Treatment duration
  • Count of repeated prescriptions during a pre-specified time period (e.g. 1 year)

Analyses

Patient-level DUS analytics will include:

  • Large-scale characterisation of patient-level features based on code/concept and descendants, including socio-demographics, comorbidity, and previous medicine/s use any time in history, and in the year, and/or in the month previous to index date
  • Frequency and % of indication/s, based on pre-specified list of diagnoses recorded before therapy initiation (where available)
  • Reporting of minimum, p25, median, p75, and maximum initially prescribed or dispensed dose/strength (where available)
  • Reporting of minimum, p25, median, p75, and maximum cumulative use within a pre-specified time period (e.g. 1 year)
  • Reporting of minimum, p25, median, p75, and maximum treatment duration
  • Reporting of minimum, p25, median, p75, and maximum number of repeated prescriptions of the index drug during a pre-specified time period (e.g. 1 year)
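
The five summary statistics listed above (minimum, p25, median, p75, maximum) can be sketched with the Python standard library. The percentile interpolation rule is an assumption here (the "inclusive" method of statistics.quantiles); the source does not specify one, and the example durations are invented for illustration.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return min, p25, median, p75 and max for a list of numbers.

    Requires at least two values; uses the 'inclusive' quartile method,
    which is an assumption of this sketch.
    """
    ordered = sorted(values)
    p25, median, p75 = quantiles(ordered, n=4, method="inclusive")
    return {"min": ordered[0], "p25": p25, "median": median,
            "p75": p75, "max": ordered[-1]}


# Hypothetical treatment durations in days for seven new users.
durations = [28, 30, 56, 84, 90, 112, 365]
summary = five_number_summary(durations)
print(summary)
```

The same function would apply unchanged to initial dose, cumulative use, or number of repeated prescriptions.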

Population-level DUS analyses

Study Type

Population-level DUS analyses are classified as ‘off the shelf’ studies.

Study Design

Population-level cohort.

Participants

Population-level analyses will include the entire source population with at least some time (typically 1 year) of data visibility available before start of study period.

Additional eligibility criteria will apply as follows:

  • Analyses of incidence of drug use will exclude prevalent users of the same drug/drug class on index date and/or in the previous (washout) year
  • The study population could be restricted to a specific subpopulation with certain socio-demographic characteristics, e.g., age 18 or older, or with a history of pre-specified clinical feature/s, e.g., people with a prior diagnosis of rheumatoid arthritis
  • In some cases, a minimum follow-up will be requested e.g., for treatment pattern analyses

 

Outcome/s

The following outcome/s will be obtained, potentially stratified by pre-specified criteria (age bands, sex, calendar year or month):

  • Population-based incidence rates of use of a drug/drug class over calendar time. Periods could be calendar days, weeks, months, quarters or years.
  • Population-based prevalence of use of a drug/drug class on a given time point (point prevalence) or within a given time period (period prevalence). Periods could be calendar days, weeks, months, quarters or years.

Follow-up

Follow-up will start on a pre-specified calendar time point defined as the index date, e.g., 1st January or the 1st of each month, and continue for a pre-specified time period, typically a week, month, quarter or year.

Analyses

Population-level DUS analyses use the same analytical pipeline as Population-level descriptive epidemiology studies (see separate subsection). Incidence rates will have the number of new users (after a pre-specified washout) in the numerator, and the person-years contributed by the total population (excluding prevalent users) in the denominator. Prevalence will be calculated as the number of users (prevalent or new) over the whole source population at a specific time point (i.e., point prevalence) or over a specific time window (i.e., period prevalence). Both may be stratified by socio-demographics (e.g., age bands or sex) and/or calendar period. Additional criteria (e.g. disease severity/duration) may need to be considered and integrated as pre-specified in future studies.
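
The incidence rate described above can be illustrated with a small sketch. Everything here is hypothetical (function names, record layout, the 365-day washout, the example people), and it makes one simplifying assumption: anyone with a recorded use before becoming eligible is treated as a prevalent user for the whole period.

```python
from datetime import date, timedelta


def contribution(period_start, period_end, obs_start, obs_end,
                 first_use=None, washout_days=365):
    """Return (days_at_risk, is_new_user) for one person in one period."""
    # Eligibility starts once the required prior observation is available.
    start = max(period_start, obs_start + timedelta(days=washout_days))
    end = min(period_end, obs_end)
    if start > end:
        return 0, False
    if first_use is not None and first_use < start:
        return 0, False  # prevalent user: excluded from the denominator
    if first_use is not None and first_use <= end:
        # Time at risk stops on the day of first use.
        return (first_use - start).days + 1, True
    return (end - start).days + 1, False


def incidence_rate(people, period_start, period_end, per=100_000):
    """Return (new users, days at risk, rate per `per` person-years)."""
    total_days, new_users = 0, 0
    for person in people:
        days, is_new = contribution(period_start, period_end, **person)
        total_days += days
        new_users += int(is_new)
    if total_days == 0:
        return new_users, total_days, None
    return new_users, total_days, new_users / (total_days / 365.25) * per


people = [
    {"obs_start": date(2015, 1, 1), "obs_end": date(2023, 1, 1)},
    {"obs_start": date(2015, 1, 1), "obs_end": date(2023, 1, 1),
     "first_use": date(2021, 7, 1)},   # new user during the period
    {"obs_start": date(2015, 1, 1), "obs_end": date(2023, 1, 1),
     "first_use": date(2020, 5, 1)},   # prevalent user, excluded
    {"obs_start": date(2020, 7, 1), "obs_end": date(2023, 1, 1)},  # late entry
]
n_new, days_at_risk, rate = incidence_rate(
    people, date(2021, 1, 1), date(2021, 12, 31), per=1_000)
print(n_new, days_at_risk, rate)
```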

Population-level descriptive epidemiology

Study Type/s

Population-level descriptive epidemiology studies are classified as ‘off the shelf’ studies.

Study Design

Population-level cohort

Participant/s

Population-level analyses will include the entire source population with at least some time (typically 1 year) of data visibility available before index date.

Additional eligibility criteria will apply as follows:

  • Analyses of disease incidence will exclude patients with a previous/prevalent history of the same disease on index date and/or in the previous (washout) year and/or in all previous history
  • The source population could be restricted to a specific subpopulation with certain socio-demographic or clinical feature/s, e.g., people aged 50+ on index date

Outcome/s

The following outcome/s will be obtained, potentially stratified by pre-specified criteria (age bands, sex, calendar year or month), and other pre-specified criteria:

  • Population-based incidence of a disease/condition (or group of diseases/conditions) on a given time point or over time (stratified by calendar period)
  • Population-based prevalence of disease (or group of diseases/conditions) on a given time point (e.g., a pre-specified date), and/or over time (e.g., stratified by month or year)

Follow-up

Follow-up will start on a pre-specified calendar time point defined as the index date, e.g., 1st January or the 1st of a given month, and continue for a pre-specified time period, typically a week, a month or a year.

Analyses

Incidence rates will have the number of newly diagnosed people in the numerator, and the person-time contributed by the total population (satisfying the study eligibility criteria) in the denominator. Prevalence will be calculated as the number of people with the diagnosis (prevalent or new) over the whole source population on a specific date (point prevalence) or over a window of time (period prevalence). Both can be stratified by socio-demographics (e.g., age bands or sex) and/or calendar period. Other criteria (e.g. disease severity/duration) may need to be considered and integrated in future studies.
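
The distinction between point and period prevalence can be sketched as follows. The record layout and function names are hypothetical, and the sketch assumes a chronic condition, so a person counts as a case from the diagnosis date onwards; other conditions would need a different case definition.

```python
from datetime import date


def point_prevalence(people, on_date):
    """Return (cases, population) among people observed on `on_date`."""
    observed = [p for p in people
                if p["obs_start"] <= on_date <= p["obs_end"]]
    cases = [p for p in observed
             if p["diagnosis"] is not None and p["diagnosis"] <= on_date]
    return len(cases), len(observed)


def period_prevalence(people, start, end):
    """Return (cases, population) among people observed at any time in [start, end]."""
    observed = [p for p in people
                if p["obs_start"] <= end and p["obs_end"] >= start]
    cases = [p for p in observed
             if p["diagnosis"] is not None
             and p["diagnosis"] <= min(end, p["obs_end"])]
    return len(cases), len(observed)


people = [
    {"obs_start": date(2019, 1, 1), "obs_end": date(2022, 12, 31),
     "diagnosis": date(2020, 6, 1)},
    {"obs_start": date(2019, 1, 1), "obs_end": date(2022, 12, 31),
     "diagnosis": None},
    {"obs_start": date(2019, 1, 1), "obs_end": date(2022, 12, 31),
     "diagnosis": date(2021, 9, 15)},
    {"obs_start": date(2019, 1, 1), "obs_end": date(2020, 12, 31),
     "diagnosis": None},
]
point = point_prevalence(people, date(2021, 1, 1))
period = period_prevalence(people, date(2021, 1, 1), date(2021, 12, 31))
print(point, period)
```

In this invented example, the person diagnosed in September 2021 counts towards the 2021 period prevalence but not towards the point prevalence on 1st January 2021.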

Complex

These are studies requiring development or customisation of specific study designs, protocols, analytics, phenotypes. This includes studies on the safety and effectiveness of medicines and vaccines.

Prevalent user active comparator cohort studies

Motivation

Although new user cohorts are preferred, prevalent user designs can be desirable in some instances, particularly when a recently marketed medicine is to be compared to an existing and heavily used active comparator, as many of the new users of the recently launched medicine will be previous users of the existing active comparator medicine. This leads to a situation where new users of a new medicine with no use of the active comparator are scarce, and potentially non-representative of the desired target population of all users of the new drug. Prevalent user cohort studies have been used in recent years to study the potential effects of medicines like ACE inhibitors or alpha-1-blockers against SARS-CoV-2 infection and/or COVID-19 severity [D Morales et al. Lancet Dig Health 2021; A Nishimura et al. Front Pharmacol 2022].

Study Type/s

Prevalent user active comparator cohort studies are classified as ‘complex’ analyses.

Study Design

Prevalent user cohorts.

Participant/s

At least two cohorts, including a cohort of prevalent users of at least one drug/medicinal product under investigation (target cohort) and a cohort of prevalent users of at least one drug/medicinal product as an active comparator (comparator cohort). Typically, prevalent user cohorts will be defined by the previous use of a target/comparator medicine during a specific time period or for a pre-specified duration.

Follow-up

Participants in each cohort will be followed from a specific index date different from the date of treatment initiation. Based on previous examples, this will typically be a calendar date, e.g. 1st of March 2022.

Two possibilities of analyses will be offered:

  • In a ‘fixed’ follow-up analysis, follow-up will continue until death, loss to follow-up or a pre-specified time period (e.g., 3 years) regardless of treatment duration
  • In an ‘on treatment’ analysis, follow-up will continue until treatment cessation, death, or loss to follow-up

Outcome/s

One or more study outcomes will be pre-specified, based on previous DARWIN EU algorithms or newly developed and validated ones.

In addition, a long list of negative control outcomes will be assessed, which are not known to have a causal association with the drug/s or medicinal product/s under study.

Analyses

Details will be discussed during programming of pipelines, but prevalent user cohort analyses will include:

  • Large-scale characterisation of participants in the target and comparator cohorts, including all features available in the data before or on index date.
  • Large-scale propensity scores (LSPS) will be estimated as the probability of exposure (target cohort) conditional on all covariates available in the data with a prevalence >1%. LSPS will be estimated using Lasso regression. Unlike in new user cohorts, LSPS will be estimated using the information available on the index date, which in the case of prevalent users is not the date of therapy initiation but a different, previously specified date.
  • Incidence rate/s of each of the outcomes of interest in the target and comparator cohorts after LSPS matching, stratification, or inverse probability weighting
  • Diagnostic/s:
    • Covariate balance: a plot will be produced depicting the standardized mean difference/s between target and comparator cohorts for all available covariates before (x axis) and after propensity score matching/stratification/weighting (y axis) [see Figure 4 for an illustrative example]
    • Equipoise: plots of the distribution of the propensity score stratified by target vs comparator cohort [see Figure 4 for an illustrative example]
    • Analyses will not be conducted where there is insufficient data, based on a pre-specified minimum detectable rate ratio (e.g., MDRR>5)
    • Optional: In addition to the two above, residual confounding/systematic error will be available for estimation, as based on the number of negative control outcome/s significantly associated with the exposure of interest
  • Rate Ratios or Hazard Ratio/s and 95% confidence intervals will be estimated using Poisson or Cox models respectively, comparing the target vs comparator (reference) cohorts after LSPS matching, stratification, or inverse probability weighting
  • Optionally, calibrated RR or HR will be estimated after empirical calibration using negative control outcomes
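
The covariate balance diagnostic listed above rests on the standardised mean difference (SMD). The following is a minimal sketch for a single binary covariate, using the usual pooled-variance formula for proportions; the function name and example data are hypothetical, and no balance threshold is taken from the source.

```python
from math import sqrt


def smd_binary(target, comparator):
    """Standardised mean difference for a 0/1 covariate.

    target, comparator: lists of 0/1 indicators, one per participant.
    """
    p1 = sum(target) / len(target)
    p0 = sum(comparator) / len(comparator)
    pooled_sd = sqrt((p1 * (1 - p1) + p0 * (1 - p0)) / 2)
    if pooled_sd == 0:
        return 0.0  # covariate is constant in both cohorts
    return (p1 - p0) / pooled_sd


# Hypothetical covariate: present in 30% of the target cohort and
# 20% of the comparator cohort before any propensity score adjustment.
target = [1] * 30 + [0] * 70
comparator = [1] * 20 + [0] * 80
smd_before = smd_binary(target, comparator)
print(round(smd_before, 3))
```

Computing the same quantity for every covariate before and after matching, stratification, or weighting would give the two axes of the balance plot described above.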

New user active comparator cohort

Study Type/s

New user active comparator cohort studies are classified as ‘complex’ analyses.

Study Design

New user cohorts.

Participant/s

At least two cohorts, including a cohort of new users of at least one drug/medicinal product under investigation (target cohort) and a cohort of new users of at least one drug/medicinal product as an active comparator (comparator cohort). Typically, new user cohorts exclude previous users of either cohort in the previous year as well as people with <1 year of data visibility before inclusion.

Follow-up

Participants in each cohort will be followed from therapy initiation date (index date). Two possibilities of analyses will be offered:

  • In a ‘fixed’ follow-up analysis, follow-up will continue until death, loss to follow-up or a pre-specified time period (e.g., 3 years) regardless of treatment duration
  • In an ‘on treatment’ analysis, follow-up will continue until treatment cessation, death, or loss to follow-up

Outcome/s

One or more study outcomes will be pre-specified, based on previous DARWIN EU algorithms or newly developed and validated ones.

In addition, a long list of negative control outcomes will be assessed, which are not known to have a causal association with the drug/s or medicinal product/s under study.

Analyses

Details will be discussed during programming of pipelines, but new user cohort analyses will include:

  • Large-scale characterisation of participants in the target and comparator cohorts, including all features available in the data before or on index date
  • Large-scale propensity scores (LSPS) will be estimated as the probability of exposure (target cohort) conditional on all covariates available in the data with a prevalence >1%. LSPS will be estimated using Lasso regression
  • Incidence rate/s of each of the outcomes of interest in the target and comparator cohorts after LSPS matching, stratification, or inverse probability weighting
  • Diagnostic/s:
    • Covariate balance
    • Equipoise: plots of the distribution of the propensity score stratified by target vs comparator cohort
    • Analyses will not be conducted where there is insufficient data, based on a pre-specified minimum detectable rate ratio (e.g., MDRR>5)
    • Optional: In addition to the two above, residual confounding/systematic error will be available for estimation, as based on the number of negative control outcomes significantly associated with the exposure of interest
  • Rate Ratios or Hazard Ratio/s and 95% confidence intervals will be estimated using Poisson or Cox models respectively, comparing the target vs comparator (reference) cohorts after LSPS matching, stratification, or inverse probability weighting
  • Optionally, calibrated RR or HR will be estimated after empirical calibration using negative control outcomes
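
As a point of reference for the rate ratio estimates above, a crude (unadjusted) rate ratio with a Wald 95% confidence interval on the log scale can be sketched as below. This corresponds to a Poisson model with exposure as the only term; it does not implement LSPS adjustment or empirical calibration, and the counts are invented.

```python
from math import exp, log, sqrt


def rate_ratio(events_target, py_target, events_comparator, py_comparator,
               z=1.96):
    """Crude rate ratio (target vs comparator) with a Wald confidence interval.

    Requires at least one event in each cohort.
    """
    rr = (events_target / py_target) / (events_comparator / py_comparator)
    se_log_rr = sqrt(1 / events_target + 1 / events_comparator)
    lower = exp(log(rr) - z * se_log_rr)
    upper = exp(log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts: 30 events in 10,000 person-years (target) vs
# 20 events in 10,000 person-years (comparator).
rr, lower, upper = rate_ratio(30, 10_000, 20, 10_000)
print(round(rr, 2), round(lower, 2), round(upper, 2))
```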

Self-controlled case risk interval

Study Type/s

Self-controlled case risk interval studies are classified as ‘complex’ analyses.

Study Design

Self-controlled case risk interval (SCRI).

Participant/s

Just like SCCS, SCRI studies include one or more cohort/s of people who suffer a specified safety event/group of event/s at least once in their record/s. Additional eligibility criteria could apply based on socio-demographics or clinical characteristics.

Follow-up

The SCRI design uses a pre-specified control interval relative to the exposure (typically vaccination) date as the control time. These control intervals can be before or after exposure, but must be defined a priori. We will therefore pre-define study-specific follow-up pre- and/or post-exposure control interval periods, and participants will be followed/observed for the pre-specified control interval, and immediately after/during exposure to a medicinal product. Similar to SCCS, the specified control interval (either pre- and/or post-exposure) periods will be considered as “baseline” or “unexposed”, whilst treatment episode/s are “exposed”.

Outcome/s

One or more study outcomes will be pre-specified, based on previous DARWIN EU algorithms or newly developed and validated ones. Ideally, outcomes should be acute in presentation and with a clear and accurate diagnosis date.

In addition, a long list of negative control outcomes will be assessed, which are not known to have a causal association with the drug/s or medicinal product/s under study.

Analyses

Details will be discussed during programming of pipelines, but SCRI will include:

  • Large-scale characterisation of SCRI participants at the time of diagnosis, including all recorded features available in the data before or on index date, based on SNOMED code/s
  • Pre-specified patient-level characteristics on and/or before diagnosis, based on pre-existing cohorts or definitions (e.g., history of type 2 diabetes, or Charlson comorbidity index).
  • Pre-specified patient-level characteristics on and/or before diagnosis, based on concepts and descendants where no previously validated algorithms are available
  • Incidence rate/s during pre-specified control interval and exposed time
  • Diagnostic/s:
    • Event-exposure independence: a histogram of the time between the event date and the end of observation for individuals censored and uncensored will be plotted to assess for potentially event-dependent observation time
    • Analyses will not be conducted where there is insufficient data, based on a pre-specified minimum detectable rate ratio (e.g., MDRR>5)
    • Optional: In addition to the two above, residual confounding/systematic error will be available for estimation, as based on the number of negative control outcomes significantly associated with the exposure of interest
  • Incidence rate ratios and 95% confidence intervals will be estimated using conditional Poisson regression models, comparing the exposed vs the control interval period.
  • Adjusted incidence rate ratios and 95% confidence intervals will be calculated after adjustment for age and seasonality
  • Optionally, calibrated incidence rate ratios will be estimated after empirical calibration of the adjusted incidence rate ratio based on the observed systematic error
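
The core SCRI comparison can be sketched by assigning each case's event to the risk or control interval relative to the exposure date. The window definitions (days 1 to 21 as risk, days 43 to 84 as control) and all names are hypothetical. When every case has the same interval lengths, the conditional Poisson estimate of the unadjusted incidence rate ratio reduces to the ratio of events per day computed here; adjustment for age and seasonality is not shown.

```python
from datetime import date, timedelta


def scri_irr(cases, risk_window=(1, 21), control_window=(43, 84)):
    """Return (events in risk interval, events in control interval, IRR).

    cases: list of (exposure_date, event_date) pairs, one per case.
    Windows are inclusive day offsets from the exposure date.
    """
    n_risk = n_control = 0
    for exposure_date, event_date in cases:
        day = (event_date - exposure_date).days
        if risk_window[0] <= day <= risk_window[1]:
            n_risk += 1
        elif control_window[0] <= day <= control_window[1]:
            n_control += 1
        # Events outside both windows do not contribute.
    risk_len = risk_window[1] - risk_window[0] + 1
    control_len = control_window[1] - control_window[0] + 1
    if n_control == 0:
        return n_risk, n_control, None
    return n_risk, n_control, (n_risk / risk_len) / (n_control / control_len)


vaccination = date(2021, 4, 1)
offsets = [5, 10, 20, 30, 50, 60]  # hypothetical days from exposure to event
cases = [(vaccination, vaccination + timedelta(days=d)) for d in offsets]
n_risk, n_control, irr = scri_irr(cases)
print(n_risk, n_control, irr)
```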

Self-controlled case series

Study Type/s

Self-controlled case series are classified as ‘complex’ (C3) analyses.

Study Design

Self-controlled case series (SCCS).

Participant/s

SCCS will include one or more cohort/s of people who suffer a specified safety event/group of events at least once in their record/s, hence their denomination as “case series”. Additional eligibility criteria could apply based on socio-demographics or clinical characteristics.

Follow-up

We will pre-define follow-up periods according to exposure history. Typically, participants in an SCCS will be followed for some time before (pre-exposure), during (exposed) and post-exposure to a medicinal product. Sometimes, a washout is imposed before the beginning of the exposure period. Pre- and post-exposure periods will be considered as “baseline” or “unexposed”, whilst treatment episode/s are “exposed”. The washout period will be disregarded and not accounted for in the analyses. See Figure 3 for an illustration.

In some analyses, only the first event will be considered for each participant to minimise biases, with follow-up censored after that first event.
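
The partition of follow-up described above can be sketched for one participant with a single exposure episode. Function and parameter names are hypothetical, and the sketch assumes the washout immediately precedes the start of exposure, as stated in the text.

```python
from datetime import date, timedelta


def sccs_person_time(obs_start, obs_end, exp_start, exp_end, washout_days=0):
    """Return days of baseline (unexposed), washout and exposed time.

    All date ranges are inclusive. Washout days are reported separately so
    that they can be disregarded in the analysis.
    """
    total = (obs_end - obs_start).days + 1
    # Exposed time, clipped to the observation period.
    exposed_start = max(exp_start, obs_start)
    exposed_end = min(exp_end, obs_end)
    exposed = max(0, (exposed_end - exposed_start).days + 1)
    # Washout: the days immediately before exposure starts.
    wash_start = max(exp_start - timedelta(days=washout_days), obs_start)
    wash_end = min(exp_start - timedelta(days=1), obs_end)
    washout = max(0, (wash_end - wash_start).days + 1)
    return {"baseline": total - exposed - washout,
            "washout": washout,
            "exposed": exposed}


# Hypothetical participant: observed throughout 2021, exposed for 30 days
# from 1st March, with a 14-day washout before exposure.
split = sccs_person_time(date(2021, 1, 1), date(2021, 12, 31),
                         date(2021, 3, 1), date(2021, 3, 30),
                         washout_days=14)
print(split)
```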

Outcome/s

One or more study outcomes will be pre-specified, based on previous DARWIN EU algorithms or newly developed and validated ones. Ideally, outcomes should be acute in presentation and with a clear and accurate diagnosis date.

In addition, a long list of negative control outcomes will be assessed, which are known to have no causal association with the drug/s or medicinal product/s under study.

Analyses

Details will be discussed during programming of pipelines, but SCCS will include:

  • Large-scale characterisation of SCCS participants at the time of diagnosis (index date), including all recorded features available in the data before or at that date, based on SNOMED code/s
  • Pre-specified patient-level characteristics before or at index date, based on pre-existing cohorts or definitions (e.g., history of type 2 diabetes, or Charlson comorbidity index).
  • Pre-specified patient-level characteristics before or at index date, based on concepts and descendants where no previously validated algorithms are available
  • Incidence rates during exposed and unexposed time
  • Diagnostic/s:
    • Event-exposure independence: a histogram of the time between the event date and the end of observation for individuals censored and uncensored will be plotted to assess potential for event-dependent observation time
    • Analyses will not be conducted where there is insufficient data, based on a pre-specified minimum detectable rate ratio (e.g., MDRR>5)
    • Optional: In addition to the two above, residual confounding/systematic error will be available for estimation, as based on the distribution of results from the negative control outcome analyses
  • Incidence rate ratios and 95% confidence intervals will be estimated using conditional Poisson regression models, comparing the exposed vs the baseline period.
  • Adjusted incidence rate ratios and 95% confidence intervals will be calculated after adjustment for age and seasonality
  • Optionally, calibrated incidence rate ratios will be estimated after empirical calibration of the adjusted incidence rate ratio based on the observed systematic error

Time series analyses and Difference-in-difference studies

Study Type

These are considered complex epidemiological studies as they require bespoke modelling of interventions like public health restrictions after the calculation of population- and/or patient-level disease epidemiology estimates. These analyses will take the form of interrupted time series to analyse the impact of an intervention (e.g., a public health restriction or change/s in law) on the occurrence of an outcome at the population-level (e.g., incidence of COVID-19 before vs after imposition of public health restrictions) and/or at the patient level (e.g., characteristics of people newly diagnosed before vs after change/s in law).

Study Design

Population-level cohort/s AND Patient-level characterisation.

Participant/s

Population-level analyses will include the entire source population with at least 1 year of data visibility available before start of study period. Additional eligibility criteria could apply as in population-level disease epidemiology (see above). Stratification will be used in the case of Difference-in-difference studies to obtain a “control” (unimpacted) counterfactual for comparison.

Patient-level analyses will be restricted to newly diagnosed subjects, using a 1-year washout, and with potential additional eligibility (see above).

Exposure/s

At least one public health measure, regulatory action (e.g., banning or change in use of a medicinal product) or other intervention will have been imposed at a known date in time. In the case of Difference-in-difference analyses, these should only affect a known subpopulation, with another subpopulation not affected by the intervention acting as the ‘unimpacted’ counterfactual.

The study period will be pre-specified, and divided into pre-exposure (i.e., unimpacted time) and post-exposure (i.e., impacted) calendar time.

In Difference-in-difference studies, the observed pre- vs post-exposure changes in the ‘Exposed’ will be compared to those in the ‘Unexposed’ subpopulation.

A lag period between the intervention and the start of the post-exposure study period could be considered to allow for the action to have an impact on the study outcome/s.

Outcome/s

The following will be estimated for a pre-specified study period, including at least 1 year before and after the intervention/s of interest:

  • Population-based incidence rates of a condition/group of condition/s over time
  • Population-based prevalence of a condition/group of conditions over time
  • For Difference-in-differences: population-based incidence and prevalence of a condition/group of condition/s over time, and in the exposed vs unexposed populations separately

At the patient level, two cohorts of newly diagnosed people will be studied, namely those diagnosed in the period before (unimpacted) vs after the intervention (impacted). The following outcome/s will be obtained and compared for both cohorts:

  • Patient-level characteristics amongst the newly diagnosed before vs after intervention
  • (Optional) Patient-level characteristics amongst the prevalent cases on a given date/s before vs after intervention
  • Prognosis / progression to a pre-specified outcome within a pre-specified time for those diagnosed before vs after the exposure of interest
  • Standard care description, including n (%) receiving each of a pre-specified list of medicine/s, common combinations among those diagnosed before vs after the exposure of interest
  • (Optional) Standard care description, including n (%) receiving each of a pre-specified list of medicine/s, common combinations among prevalent cases on specified dates before vs after the exposure of interest

Follow-up

For population-level analyses, follow-up will start on a pre-specified calendar time point, at least 1 year before the proposed intervention, and will continue for at least 1 year after it.

For patient-level analyses, follow-up will go from the date of diagnosis (newly diagnosed cases) or prespecified date (prevalent cases) until the earliest of loss to follow-up, end of data availability, or death.

Analyses

Incidence and prevalence rate/s of disease over time will be estimated as detailed in section 3.1. Once these are available, segmented regression methods will be used to estimate the impact of the proposed intervention/s on pre- vs post-intervention change/s in trends of population-level prevalence and/or incidence. Coefficients for the segmented regression will be reported to quantify the impact of the intervention/s on the incidence and prevalence of the condition/s, together with the Durbin-Watson statistic on the residuals as a diagnostic for autocorrelation. Where autocorrelation is detected, ARIMA/X models will be fitted instead of segmented regression if data permit.
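
Two building blocks of the segmented regression step can be sketched without any modelling library: the design matrix for a single intervention, and the Durbin-Watson statistic computed on model residuals. The parameterisation shown (intercept, time, level change, post-intervention slope change) is one common choice and is an assumption here, as are the function names; fitting the model itself is not shown.

```python
def segmented_design(n_periods, intervention_index):
    """Rows of [intercept, time, post-intervention flag, time since intervention]."""
    rows = []
    for t in range(n_periods):
        post = 1 if t >= intervention_index else 0
        rows.append([1, t, post, max(0, t - intervention_index)])
    return rows


def durbin_watson(residuals):
    """Durbin-Watson statistic: values near 2 suggest little first-order autocorrelation."""
    denominator = sum(e * e for e in residuals)
    if denominator == 0:
        raise ValueError("residuals are all zero")
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, len(residuals)))
    return numerator / denominator


design = segmented_design(n_periods=5, intervention_index=3)
dw_alternating = durbin_watson([1, -1, 1, -1])  # strongly negative autocorrelation
dw_constant = durbin_watson([1, 1, 1, 1])       # strongly positive autocorrelation
print(design, dw_alternating, dw_constant)
```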

For Difference-in-difference studies, parallel trends before the intervention must be confirmed as a prerequisite for this type of study. If confirmed, Difference-in-difference models will be used to subtract the pre- vs post-intervention change in the unexposed group from that in the exposed group whilst controlling for time-varying factors, thus estimating the causal effect of the intervention.
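
In its simplest two-group, two-period form, the Difference-in-difference estimate is a subtraction of two changes. The sketch below uses invented incidence rates and a hypothetical function name; it does not control for time-varying factors, which a regression-based model would handle.

```python
def difference_in_differences(pre_exposed, post_exposed,
                              pre_unexposed, post_unexposed):
    """Change in the exposed group minus change in the unexposed group."""
    return (post_exposed - pre_exposed) - (post_unexposed - pre_unexposed)


# Hypothetical incidence rates per 100,000 person-years.
did = difference_in_differences(pre_exposed=50.0, post_exposed=30.0,
                                pre_unexposed=48.0, post_unexposed=44.0)
print(did)
```

Here the exposed group falls by 20 and the unexposed group by 4, so the estimate attributes a change of -16 to the intervention, provided the parallel trends requirement holds.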

For patient-level analyses, standardised mean differences of each of the covariates for the comparison between new cases diagnosed or prevalent cases in the pre- vs post-intervention period will be obtained as a measure of the impact of the exposure on the profile of new cases.

RMM effectiveness 

Study Type

Trend analyses and RMM effectiveness are considered complex DUS as they require bespoke modelling of interventions like risk minimization measures (RMM) after the completion of population- and/or patient-level DUS. These will typically take the form of interrupted time series to analyse the impact of a regulatory action on the use of a medicine at the population-level (e.g., incidence of use before vs after RMM) and/or at the patient level (e.g., characteristics of new drug users before vs after RMM).

Study Design

Population-level cohort and New drug user cohort

Participant/s

Population-level analyses will include the entire source population with at least some time (typically 1 year) of data visibility available before the start of the study period. Additional eligibility criteria could apply as in population-level DUS (see above).

Patient-level analyses will be restricted to new or prevalent users of a specified list of medicine/s or medicinal product/s during a specified time point/period, using a washout, and with potential additional eligibility criteria considered (see above).

Exposure/s

At least one RMM will have been imposed at a known date in time. This RMM (or series of RMMs) will be the main study exposure/intervention for analysis. The study period will therefore be pre-specified, and divided into before (i.e., unimpacted time) and after (i.e., impacted) calendar time.

A lag period between the publication or communication of the RMM and the start of the post-exposure study period could be considered to allow for the RMM to have an impact on the study outcome/s.
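
The division of calendar time into unimpacted and impacted periods, with an optional lag, can be sketched as follows. The function name, the labels, and the choice to leave dates inside the lag window unassigned are assumptions made for this illustration.

```python
from datetime import date, timedelta
from typing import Optional


def classify_period(
    event_date: date, rmm_date: date, lag_days: int = 0
) -> Optional[str]:
    """Label a date as 'pre' (unimpacted) or 'post' (impacted).

    Dates falling in the lag window after the RMM are returned as None
    so that they can be excluded from both periods.
    """
    post_start = rmm_date + timedelta(days=lag_days)
    if event_date < rmm_date:
        return "pre"
    if event_date >= post_start:
        return "post"
    return None
```

With `lag_days=0` the lag window is empty, so every date is assigned to one of the two periods.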

Outcome/s

The following will be estimated for a pre-specified study period, including at least 1 year before and after the RMM exposure/s of interest:

  • Population-based incidence rates of use of a drug/drug class over time
  • Population-based prevalence of use of a drug/drug class over time

At the patient level, two cohorts of new or prevalent drug user/s will be studied, namely those who initiated or were users of the treatment of interest in the period before (unimpacted) vs after the RMM (impacted). The following outcome/s will be obtained and compared for both cohorts:

  • New drug user cohort/s patient-level characteristics on or before index date
  • (Optionally) prevalent drug user cohort/s patient-level characteristics on or before index date

Follow-up

For population-level analyses, follow-up will start on a pre-specified calendar time point, at least 1 year before the imposed RMM, and will continue for at least 1 year after it.

For patient-level analyses, follow-up will go from the date of therapy initiation (for new users) or a pre-specified date (for prevalent users) until the earliest of loss to follow-up, end of data availability, or death. Patients might be censored at the time they discontinue treatment or switch to an alternative therapy, or at the date of RMM.
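
The end of patient-level follow-up is the earliest of several candidate dates, which can be sketched as below. The function and parameter names are hypothetical, and treating treatment discontinuation or switching and the RMM date as optional censoring dates is an assumption based on the wording above.

```python
from datetime import date
from typing import Optional


def follow_up_end(
    end_of_data: date,
    loss_to_follow_up: Optional[date] = None,
    death: Optional[date] = None,
    treatment_stop: Optional[date] = None,  # discontinuation or switch (optional)
    rmm_date: Optional[date] = None,  # optional censoring at the RMM
) -> date:
    """Return the earliest available censoring date for one patient."""
    candidates = [
        d
        for d in (end_of_data, loss_to_follow_up, death, treatment_stop, rmm_date)
        if d is not None
    ]
    return min(candidates)
```

Only the end of data availability is required here, since it is the one date that always exists for a data source.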

Analyses

Incidence and prevalence rate/s of drug/s use over time will be estimated as detailed in section 2.1. Once these are available, segmented regression methods will be used to estimate the impact of the imposed RMM/s on population-based pre- vs post-intervention trends of drug/s use. Coefficients for the segmented regression indicating the difference in trend between the periods and the change immediately after the intervention (step change) will be reported as a formal test of the impact of the RMM/s on the incidence and prevalence of use, together with the Durbin-Watson statistic as a diagnostic for residual autocorrelation. In case of autocorrelation, ARIMA/X models will be fitted instead of segmented regression, if the data permit.

For patient-level analyses, standardised mean differences of each of the covariates for the comparison between new/prevalent drug users in the pre-RMM vs post-RMM period will be obtained as a measure of the impact of the RMM on the profile of new drug users. Additionally, measures of patient-level DUS will be provided, stratified by time of therapy initiation (pre- or post-RMM).

 

Very Complex Studies 

Studies which cannot rely only on electronic health care databases, or which would require complex and/or novel methodological work.

No very complex study designs have been defined at this moment.

 

Quality Assurance of Software Development

Software Quality Assurance, also called the Software Testing Life Cycle process, aims to assure that the software does what it is supposed to do according to the Software Requirements Specification (SRS).

The code quality assurance process within DARWIN EU® will follow the agile development method as shown in Figure 1.  

 

Figure 1. Agile software development method 

This method is applied continuously during the software lifecycle to monitor and improve quality. The process is monitored by the Quality Assessment Lead, who interacts closely with the Product Owner.

It contains the following steps: 

  • Plan: In the Plan Phase it is agreed what code will require testing and what part of this will be prioritized for the next sprint. 
  • Design: In the design phase the Software Requirements Specification (SRS) and Test Plan are created or updated (in following sprints). The SRS provides the benchmark for test planning since it specifies all the functional and non-functional requirements, the situations in which the software is expected to be applied, and by whom. The Test Plan is a living document that gets updated with development sprints.
  • Develop: In the develop step the quality measures are implemented in the code base or get updated according to the Test Plan. This includes the creation of the unit tests etc.
  • Test: This is the quality control phase of the process, where the software is tested according to the plan to ensure that it provides the correct results, has cross-platform compatibility, is stable, and is efficient. The outcomes of these tests are recorded and taken into account for the next sprint. Issues identified are flagged as bugs and are either rectified immediately in the case of significant errors (e.g., incorrect application of functionality providing false answers) or, if deemed enhancements (e.g., the software requires too much memory), fixed in the next development cycle of the package.
  • Deploy: In this phase a major or minor software release can be created or the code is kept in development status for further improvement. 

Quality assurance of the software development focuses on two main themes: 1. Does the software produce the correct results? This is tested using small-scale examples (unit tests) where the correct answer is known. 2. Does the software run on the required platforms without producing errors? This is examined by checking, e.g., the handling of incorrect parameters and platform differences (both database management systems and operating systems).
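
To make the two themes concrete, the sketch below shows what a known-answer unit test and an invalid-parameter check might look like. The `incidence_rate` function is a hypothetical stand-in written for this example and is not part of any DARWIN EU® package. The test functions use plain assertions so that they could be collected by a test runner such as pytest.

```python
def incidence_rate(events: int, person_years: float, per: int = 100_000) -> float:
    """Incidence rate per `per` person-years (hypothetical example function)."""
    if events < 0:
        raise ValueError("events must be non-negative")
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return events * per / person_years


def test_known_answer() -> None:
    # Theme 1: a small example where the correct result is known in advance.
    # 5 events over 1,000 person-years is 500 per 100,000 person-years.
    assert incidence_rate(5, 1_000) == 500.0


def test_rejects_invalid_parameters() -> None:
    # Theme 2: incorrect parameters should fail with a clear error
    # instead of silently returning a misleading number.
    for bad_events, bad_person_years in [(-1, 1_000), (5, 0), (5, -10)]:
        try:
            incidence_rate(bad_events, bad_person_years)
        except ValueError:
            continue
        raise AssertionError("invalid input was accepted")
```

Checks for platform differences would typically run the same tests against each supported database management system and operating system, which is outside the scope of this sketch.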