

ENCePP Guide on Methodological Standards in Pharmacoepidemiology


4.3. Specific aspects of study design



4.3.1. Positive and negative control exposures and outcomes

4.3.2. Use of an active comparator

4.3.3. Interrupted time series analyses and Difference-in-Differences method



4.3.1. Positive and negative control exposures and outcomes


The validity of causal associations may be tested by using control exposures or outcomes. A negative control outcome is a variable known not to be causally affected by the treatment of interest. Likewise, a negative control exposure is a variable known not to causally affect the outcome of interest. Conversely, a positive control outcome is a variable that is understood to be positively associated with the exposure of interest and a positive control exposure is one which is known to increase the risk of the outcome of interest.


Well-selected positive and negative controls support decision-making on whether the data at hand can correctly reproduce known associations and correctly demonstrate absence of association. Positive controls with negative findings and negative controls with positive findings may signal the presence of bias, as illustrated in Utilization of Positive and Negative Controls to Examine Comorbid Associations in Observational Database Studies (Med Care 2017;55(3):244-51). This general principle, with additional examples, is described in Negative Controls: A Tool for Detecting Confounding and Bias in Observational Studies (Epidemiology 2010;21(3):383-8) and Control Outcomes and Exposures for Improving Internal Validity of Nonrandomized Studies (Health Serv Res. 2015;50(5):1432-51). Negative controls have also been used to identify other sources of bias, including selection bias and measurement bias, in Brief Report: Negative Controls to Detect Selection Bias and Measurement Bias in Epidemiologic Studies (Epidemiology 2016;27(5):637-41) and in Negative control exposure studies in the presence of measurement error: implications for attempted effect estimate calibration (Int J Epidemiol. 2018;47(2):587-96). The use of negative and positive controls has therefore been recommended as a diagnostic test of whether the study design produces valid results. Practical considerations for their selection are provided in Chapter 18. Method Validity of The Book of OHDSI (2021).
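The diagnostic use of negative controls described above can be sketched as follows. The sketch checks, for a set of hypothetical negative-control effect estimates (on the log relative risk scale, where the true effect is assumed null), what fraction of 95% confidence intervals contain the null; coverage well below 95% suggests systematic error in the design. All numbers are illustrative.

```python
# Hypothetical effect estimates (log RR) and standard errors for negative
# control outcomes; by definition their true log RR is assumed to be 0.
negative_controls = [
    (0.05, 0.10), (0.31, 0.12), (0.28, 0.11),
    (-0.02, 0.09), (0.35, 0.13), (0.22, 0.10),
]

z = 1.96  # multiplier for a 95% confidence interval
covered = sum(1 for est, se in negative_controls
              if est - z * se <= 0.0 <= est + z * se)
coverage = covered / len(negative_controls)

# With an unbiased design, roughly 95% of negative-control CIs should
# contain the null; markedly lower coverage signals systematic error.
print(f"Null coverage among negative controls: {coverage:.0%}")
```

In this illustrative set only a third of the intervals cover the null, the kind of pattern that would prompt a closer look at confounding or measurement bias.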


Selecting drug-event combinations as reliable controls nevertheless poses important challenges: for negative controls it is difficult to establish proof of absence of an association, and selecting positive controls is still more problematic because not only the existence of an association but also an accurate estimate of its effect size is needed. This has led to attempts to establish libraries of controls that can be used to characterise the performance of different observational datasets in detecting various types of associations with a number of different study designs. Although the methods used to identify negative and positive controls may be questioned, as discussed in Evidence of Misclassification of Drug-Event Associations Classified as Gold Standard 'Negative Controls' by the Observational Medical Outcomes Partnership (OMOP) (Drug Saf. 2016;39(5):421-32), this approach may make it possible to separate random and systematic errors in epidemiological studies, providing a context for evaluating uncertainty surrounding effect estimates.


Beyond the detection of bias, positive and negative controls can be used to correct for unmeasured confounding, for example through empirical calibration of p-values or confidence intervals, as described in Interpreting observational studies: Why empirical calibration is needed to correct p-values (Stat Med. 2014;33(2):209-18), Robust empirical calibration of p-values using observational data (Stat Med. 2016;35(22):3883-8) and Empirical confidence interval calibration for population-level effect estimation studies in observational healthcare data (Proc Natl Acad Sci USA 2018;115(11):2571-7).
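A minimal sketch of the idea behind empirical p-value calibration: estimate an empirical null distribution from negative-control estimates and compute the p-value of a new estimate against that null rather than against the theoretical null. The published method additionally accounts for each estimate's sampling error via maximum likelihood; this simplified sketch fits a normal distribution to the point estimates only, and all numbers are hypothetical.

```python
import math

# Hypothetical negative-control estimates (log RR); under no systematic
# error these would scatter around 0, but here they are shifted upwards.
negative_control_estimates = [0.10, 0.25, 0.05, 0.30, 0.15, 0.20, 0.12, 0.22]

n = len(negative_control_estimates)
mu = sum(negative_control_estimates) / n
sd = math.sqrt(sum((x - mu) ** 2 for x in negative_control_estimates) / (n - 1))

def calibrated_p(log_rr: float) -> float:
    """Two-sided p-value of log_rr under the empirical null N(mu, sd)."""
    z = (log_rr - mu) / sd
    return math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))

# An estimate that looks significant against the theoretical null (log RR = 0)
# may or may not remain so once judged against the empirical null.
print(calibrated_p(0.40))
```

The empirical null is centred on the systematic error observed in the negative controls, so calibration widens the evidential bar by exactly the amount of bias the controls reveal.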


The empirical calibration approach has been used in both the case-based study design (Empirical assessment of case-based methods for identification of drugs associated with acute liver injury in the French National Healthcare System database (SNDS), Pharmacoepidemiol Drug Saf. 2021;30(3):320-33) and the cohort design (Risk of depression, suicide and psychosis with hydroxychloroquine treatment for rheumatoid arthritis: a multinational network cohort study, Rheumatology (Oxford) 2021;60:3222-34). While this method may reduce the number of false positive results, it may also reduce the ability to detect a true safety or efficacy signal, and it is computationally expensive, as suggested in Limitations of empirical calibration of p-values using observational data (Stat Med. 2016;35(22):3869-82) and Empirical confidence interval calibration for population-level effect estimation studies in observational healthcare data (Proc Natl Acad Sci USA 2018;115(11):2571-7).


An overview of key negative control techniques has been published by the Duke-Margolis Center for Health Policy, providing a brief description of key assumptions, strengths and limitations of using negative controls (Duke-Margolis/FDA workshop on Understanding the Use of Negative Controls to Assess the Validity of Non-Interventional Studies of Treatment Using Real-World Evidence, 8 March 2023).


4.3.2. Use of an active comparator


The main purpose of using an active comparator is to reduce confounding by indication and by disease severity. Its use is optimal in the context of the new user design (see Chapter 6.1.1), where patients with the same indication initiating different treatments are compared, as described in The active comparator, new user study design in pharmacoepidemiology: historical foundations and contemporary application (Curr Epidemiol Rep. 2015;2(4):221-8). Active comparators implicitly restrict comparisons to patients with an indication for treatment who are actually receiving treatment. Therefore, use of an active comparator reduces not only confounding by indication but also confounding by frailty and healthy user bias.
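Operationally, an active-comparator, new-user cohort can be assembled by keeping each patient's first-ever dispensing and assigning exposure according to which of the two drugs was initiated. The sketch below illustrates this on hypothetical dispensing records (drug names, patient identifiers and dates are all invented; a real implementation would also apply a washout period and eligibility criteria).

```python
from datetime import date

# Hypothetical dispensing records for two alternative treatments "A" and "B".
records = [
    {"patient": 1, "drug": "A", "date": date(2020, 3, 1)},
    {"patient": 1, "drug": "A", "date": date(2020, 6, 1)},
    {"patient": 2, "drug": "B", "date": date(2020, 2, 15)},
    {"patient": 3, "drug": "B", "date": date(2020, 5, 10)},
    {"patient": 3, "drug": "A", "date": date(2020, 1, 5)},
]

# Keep the earliest record per patient (their treatment initiation).
first_use = {}
for r in sorted(records, key=lambda r: r["date"]):
    first_use.setdefault(r["patient"], r)

# New users of A vs new users of B, with cohort entry at initiation.
cohort = {p: r["drug"] for p, r in first_use.items() if r["drug"] in ("A", "B")}
print(cohort)
```

Patient 3 initiated drug A before ever receiving drug B, so they enter the cohort as a new user of A; later switching or add-on use would be handled by the chosen exposure definition.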


Active-comparator design and new-user design in observational studies (Nat Rev Rheumatol. 2015;11:437-41) points out that active comparator studies give insight into how safe or effective a therapy is compared with a therapeutic alternative, which is usually the more meaningful research question. Ideally, an active comparator should be interchangeable with the therapy of interest and represent the counterfactual risk of a given event under the therapeutic alternative. This means that the active comparator should be indicated for the same disease and disease severity and have the same absolute or relative exclusion criteria. The active comparator represents the background risk in the diseased population and should be known to have no effect on the event(s) of interest or on competing events. If the effect of the active comparator is unknown, multiple comparators, including non-users, should be used.


Identification of an active comparator should be based on clinician input and the relevant treatment guidelines; acceptability of its use within the chosen data source should be verified, and the balance in patient characteristics should be reviewed, as described in Core concepts in pharmacoepidemiology: Confounding by indication and the role of active comparators (Pharmacoepidemiol Drug Saf. 2022;31(3):261-9). In situations where an acceptable active comparator is lacking, for example because of unavailability of a therapeutic alternative, extensive channelling or reimbursement restrictions, the validity of the planned study needs to be assessed. Alternative methods to reduce confounding by indication should be considered, such as use of inactive comparators or approaches that balance patients' characteristics, including propensity score methods and instrumental variable analysis.


4.3.3. Interrupted time series analyses and Difference-in-Differences method


Interrupted time series (ITS) studies are becoming the standard approach for evaluating the effectiveness of population-level interventions implemented at a specific point in time with clearly defined before-after periods, such as a policy effect date or a regulatory action date. ITS is a quasi-experimental design that evaluates the longitudinal effects of interventions through regression modelling: it establishes the expected pre-intervention trend for an outcome of interest, and the counterfactual scenario in the absence of the intervention, i.e. the continuation of that expected trend, serves as the comparator. The impact of the intervention is then evaluated by examining any change from this expected trend occurring after the intervention (Interrupted time series regression for the evaluation of public health interventions: a tutorial, Int J Epidemiol. 2017;46:348-55).
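A common way to operationalise ITS is segmented regression, modelling a baseline level and trend plus a level change and slope change at the intervention. The sketch below fits such a model by ordinary least squares to simulated monthly data with an intervention at month 12 that lowers the level by about 10 units; all data are simulated and the simple OLS fit ignores autocorrelation, which a real analysis would address.

```python
import numpy as np

t = np.arange(24)                       # 24 monthly time points
post = (t >= 12).astype(float)          # indicator: post-intervention period
t_since = np.where(t >= 12, t - 12, 0)  # time elapsed since the intervention

# Simulated outcome: baseline level 100, slope 0.5, level drop of 10 at
# the intervention, plus random noise.
rng = np.random.default_rng(0)
y = 100 + 0.5 * t - 10 * post + rng.normal(0, 1, 24)

# Segmented regression: intercept, pre-trend, level change, slope change.
X = np.column_stack([np.ones_like(t, dtype=float), t, post, t_since])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
level, trend, level_change, slope_change = beta
print(f"estimated level change at intervention: {level_change:.1f}")
```

The estimated level change should recover roughly the simulated drop of 10 units, while the slope-change term stays near zero because the simulated intervention shifted the level without altering the trend.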


ITS analysis requires that several assumptions are met and its implementation is technically sophisticated, as explained in Regression based quasi-experimental approach when randomisation is not an option: Interrupted time series analysis (BMJ. 2015; 350:h2750). The use of ITS regression in pharmacovigilance impact research is illustrated in Chapter 16.4.


When data on exposed and control populations are available, Difference-in-Differences (DiD) methods are sometimes preferable. These methods compare the outcome mean or trend for exposed and control groups before and after a certain time point (usually the start of a treatment or intervention), providing insight into the change in the outcome for the exposed population relative to the change in the control group. This can be a more robust approach to causal inference than ITS, because the exposed group is compared to a control group subject to the same time-varying factors. First, DiD takes the before-after difference within each group; then it subtracts the control group's difference from the exposed group's difference, thus removing the effect of shared time-varying factors and isolating the impact of the intervention.
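The two-step computation described above reduces to simple arithmetic on group means. A minimal sketch with hypothetical outcome means:

```python
# Hypothetical mean outcome by group and period; the intervention affects
# only the exposed group, while both groups share a secular trend.
means = {
    ("exposed", "pre"): 12.0, ("exposed", "post"): 18.0,
    ("control", "pre"): 11.0, ("control", "post"): 14.0,
}

# Step 1: before-after change within each group.
delta_exposed = means[("exposed", "post")] - means[("exposed", "pre")]   # 6.0
delta_control = means[("control", "post")] - means[("control", "pre")]  # 3.0

# Step 2: subtract the control group's change, which stands in for the
# shared time-varying factors, to isolate the intervention effect.
did = delta_exposed - delta_control
print(did)  # 3.0
```

Here the naive before-after comparison in the exposed group (6.0) overstates the effect; netting out the control group's change attributes only 3.0 units to the intervention, under the usual parallel-trends assumption.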


A basic introduction can be found in Impact evaluation using Difference-in-Differences (RAUSP Management Journal 2019;54:519-32), and further extensions, for example assessment of variation in treatment timing, in Difference-in-differences with variation in treatment timing (Journal of Econometrics 2021;225:254-77). A good overview applied to public health policy research is available in Designing Difference in Difference Studies: Best Practices for Public Health Policy Research (Annu Rev Public Health 2018;39:453-69). A recent review from the econometrics perspective discusses possible avenues when some core assumptions are violated and models with relaxed hypotheses are needed, and provides recommendations which can be applied to pharmacoepidemiology (What's Trending in Difference-in-Differences? A Synthesis of the Recent Econometrics Literature, J Econom. 2023;235(2):2218-44).

