The target trial approach and its emulation by an observational study were initially introduced in 1989 (The clinical trial as a paradigm for epidemiologic research. J Clin Epidemiol 1989;42(6):491-6) and later extended to pharmacoepidemiology as a conceptual framework helping researchers to identify and avoid potential biases (Using Big Data to Emulate a Target Trial When a Randomized Trial Is Not Available. Am J Epidemiol 2016;183(8):758-64). The underlying idea is to first imagine a hypothetical randomised trial (“target trial”) that would answer the research question, instead of starting to design a study around the limitations of the available observational data. In the first step, the target trial is described with regard to the eligibility criteria, the treatment strategies, the assignment procedure, the follow-up period, the outcome, the causal contrasts and the analysis plan. In the second step, the researcher specifies how the observational data are used to emulate the target trial, e.g. how time zero is defined, and the trade-offs needed to conduct the observational study, e.g. regarding eligibility criteria, interventions, confounders and outcomes. The explicit description of the target trial and the specification of how this trial is emulated with observational data lead to study designs and analytic approaches that prevent common biases such as immortal time bias or prevalent user bias. This explicit specification also facilitates a systematic methodological evaluation and comparison of observational studies (Specifying a target trial prevents immortal time bias and other self-inflicted injuries in observational analyses. J Clin Epidemiol. 2016;79: 70-5). How to estimate the effect of treatment duration on survival outcomes using observational data (BMJ 2018;360: k182) proposes methods for overcoming bias with this approach when quantifying the effect of treatment duration.

5.3.2. Methods to address selection bias

New user (incident user) designs restrict the study population to persons who are observed at the start of treatment. The new user design helps mitigate selection bias by preventing ‘depletion of susceptibles’ – an unwanted exclusion from a safety assessment of persons discontinuing treatments following early adverse reactions. It also helps alleviate healthy user bias for preventive treatments in some circumstances (see Healthy User and Related Biases in Observational Studies of Preventive Interventions: A Primer for Physicians. J Gen Intern Med 2011;26(5):546-50). The article Evaluating medication effects outside of clinical trials: new-user designs (Am J Epidemiol 2003;158(9):915–20) defines new-user designs and explains how they can be implemented in case-control studies. One should also be aware of the difference between a new user (which requires absence of prior use of a given drug/drug class during a prespecified washout period) and a treatment-naïve user (which requires absence of prior treatment for a given indication). A treatment-naïve status may not be ascertainable in left-truncated data.
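The washout-based definition of a new user can be illustrated with a minimal sketch. It assumes that dispensing dates for the drug or drug class and the start of each person's observable history are available; the function name, the parameter names and the 365-day default are hypothetical choices made for illustration, not recommendations.

```python
from datetime import date, timedelta


def is_new_user(index_date, prior_dispensings, observation_start, washout_days=365):
    """Return True if the person qualifies as a new user at index_date.

    Two conditions are checked (both are assumptions of this sketch):
    1. the whole washout window is observable (no left truncation inside it);
    2. no dispensing of the drug or drug class occurs inside the window.
    """
    window_start = index_date - timedelta(days=washout_days)
    if observation_start > window_start:
        # Prior use cannot be ruled out because part of the window is unobserved.
        return False
    return not any(window_start <= d < index_date for d in prior_dispensings)


# A person with a dispensing two years before the index date is a new user
# under a 365-day washout, but is not treatment-naive.
print(is_new_user(date(2020, 6, 1), [date(2018, 3, 1)], date(2018, 1, 1)))
```

The last call shows why the two concepts differ: the washout rule only looks back a fixed period, whereas treatment-naïve status would require the full treatment history for the indication.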

The active comparator new user design (see Chapter 5.3.4.2) would ideally compare two treatments that are marketed contemporaneously. However, a more common situation is where a recently marketed drug is compared with an older established alternative. For such situations, the article Prevalent new-user cohort designs for comparative drug effect studies by time-conditional propensity scores (Pharmacoepidemiol Drug Saf 2017;26(4):459-68) introduces a cohort design allowing identification of matched subjects using the comparator drug at the same point in the course of disease as the (newly marketed) drug of interest. The design utilises time-based and prescription-based exposure sets to compute time-dependent propensity scores of initiating the new drug.

The use of case-only study designs can also reduce selection bias if the statistical assumptions of the method are fulfilled (see Chapter 5.3.4.1).

Misclassification can occur in exposure, outcome or covariate variables. Outcome misclassification occurs when a non-case is classified as a case (false positive error) or a case is classified as a non-case (false negative error). Errors are quantified as estimates of positive predictive value, negative predictive value, sensitivity and specificity. Most database studies will be subject to outcome misclassification to some degree, although case adjudication against an established case definition or a reference standard can remove false positives, and false negatives can be mitigated if a broad search algorithm is used. The influence of misclassification on the point estimate should be quantified or, if this is not possible, its impact on the interpretation of the results should be discussed. Exposure misclassification may also occur, and one should avoid the epidemiologic ‘mantra’ that non-differential misclassification of exposure produces conservative estimates. This holds true, on average, for dichotomous exposures that have an effect, but does not necessarily apply to any given estimate (Proper interpretation of non-differential misclassification effects: expectations vs observations. Int J Epidemiol 2005;34(3):680-7).
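As a minimal sketch of how these four error measures are derived, the function below computes them from the cells of a validation table in which the database classification is compared with a reference standard. The function name and the example counts are hypothetical.

```python
def validation_metrics(true_pos, false_pos, false_neg, true_neg):
    """Compute accuracy measures of an outcome algorithm against a reference standard.

    true_pos:  algorithm-positive and reference-positive
    false_pos: algorithm-positive but reference-negative
    false_neg: algorithm-negative but reference-positive
    true_neg:  algorithm-negative and reference-negative
    """
    return {
        "sensitivity": true_pos / (true_pos + false_neg),
        "specificity": true_neg / (true_neg + false_pos),
        "ppv": true_pos / (true_pos + false_pos),
        "npv": true_neg / (true_neg + false_neg),
    }


# Illustrative (invented) validation counts.
print(validation_metrics(true_pos=80, false_pos=20, false_neg=20, true_neg=880))
```

Note that the predictive values depend on outcome prevalence in the validation sample, so they transfer to other populations less readily than sensitivity and specificity.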

Good practices for quantitative bias analysis (Int J Epidemiol 2014;43(6):1969-85) advocates explicit and quantitative assessment of misclassification bias, including guidance on which biases to assess in each situation, what level of sophistication to use, and how to present the results. When outcome status is misclassified, relative measures of association are unbiased if specificity of ascertainment is perfect and sensitivity is non-differential; when the outcome is rare, even small departures from perfect specificity can bias them.
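A simple quantitative bias analysis of outcome misclassification can be sketched as follows. The sketch assumes a cohort with a binary outcome, non-differential misclassification, and externally supplied values of sensitivity and specificity; it back-calculates the expected true number of cases from the observed number. Function names and numbers are illustrative only, and the sketch does not propagate uncertainty in the bias parameters.

```python
def corrected_cases(observed_cases, n, sensitivity, specificity):
    """Back-calculate the expected number of true cases among n persons.

    Uses observed = Se * true + (1 - Sp) * (n - true), solved for `true`.
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    return (observed_cases - (1.0 - specificity) * n) / denom


def corrected_risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp, sensitivity, specificity):
    """Risk ratio after correcting both exposure groups with the same Se and Sp."""
    true_exp = corrected_cases(cases_exp, n_exp, sensitivity, specificity)
    true_unexp = corrected_cases(cases_unexp, n_unexp, sensitivity, specificity)
    return (true_exp / n_exp) / (true_unexp / n_unexp)


# Invented example: true risks of 2% and 1% (risk ratio 2) in 10,000 persons each,
# observed through an algorithm with Se = 0.80 and Sp = 0.99.
observed_rr = (258 / 10000) / (179 / 10000)
print(observed_rr, corrected_risk_ratio(258, 10000, 179, 10000, 0.80, 0.99))
```

In this invented example the observed risk ratio is pulled towards the null by false positives even though specificity is 99%, which illustrates how strongly imperfect specificity can matter for a rare outcome.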

In Use of the Positive Predictive Value to Correct for Disease Misclassification in Epidemiologic Studies (Am J Epidemiol 1993;138(11):1007-15), Brenner and Gefeller propose a method based on estimates of the positive predictive value which requires validation of a sample of patients with the outcome only, while assuming that sensitivity is non-differential. A web application allows correction of risk ratio or cumulative incidence point estimates and confidence intervals for bias due to outcome misclassification based on Brenner and Gefeller's methodology. The article Basic methods for sensitivity analysis of biases (Int J Epidemiol 1996;25(6):1107-16) provides different examples of methods for examining the sensitivity of study results to biases, with a focus on methods that can be implemented without computer programming.
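The idea behind a PPV-based correction of the point estimate can be sketched in a few lines. If sensitivity is the same in both exposure groups, it cancels from the ratio of true risks, so the observed risk ratio only needs to be multiplied by the ratio of the exposure-specific positive predictive values. The function name is hypothetical, and the sketch does not reproduce any confidence interval correction.

```python
def ppv_corrected_risk_ratio(observed_rr, ppv_exposed, ppv_unexposed):
    """Correct an observed risk ratio using exposure-specific PPVs.

    Assumes non-differential sensitivity, so that
    true_risk = observed_risk * PPV / Se in each group and Se cancels.
    """
    for ppv in (ppv_exposed, ppv_unexposed):
        if not 0.0 < ppv <= 1.0:
            raise ValueError("PPV must lie in (0, 1]")
    return observed_rr * ppv_exposed / ppv_unexposed


# Invented example: a lower PPV among the exposed shrinks the corrected estimate.
print(ppv_corrected_risk_ratio(observed_rr=1.5, ppv_exposed=0.8, ppv_unexposed=0.9))
```

When the two PPVs are equal the observed risk ratio is returned unchanged, which is why validating cases separately by exposure group is the informative step.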

*5.3.4.1. Case-only designs*

Case-only designs reduce confounding by using the exposure history of each case as its own control and thereby eliminate confounding by characteristics that are constant over time, such as sex, socio-economic factors, genetics and chronic diseases. A review of case-only designs is available in Use of self-controlled designs in pharmacoepidemiology (J Intern Med 2014;275(6):581-9).

A simple form of a case-only design is the symmetry analysis (initially described as prescription sequence symmetry analysis), introduced as a screening tool in Evidence of depression provoked by cardiovascular medication: a prescription sequence symmetry analysis (Epidemiology 1996;7(5):478-84).
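The core quantity of a symmetry analysis, the crude sequence ratio, can be sketched as below. The sketch assumes that, for each person who started both drugs, the dates of the first dispensing of the index drug and of the marker drug (the drug used as a proxy for the adverse event) are known. The function name and the 365-day window are hypothetical, and no adjustment for underlying prescribing trends is made.

```python
from datetime import date


def crude_sequence_ratio(first_dates, max_gap_days=365):
    """Ratio of persons starting the index drug first to those starting the marker drug first.

    first_dates: iterable of (first_index_drug_date, first_marker_drug_date).
    Persons starting both drugs on the same day, or further apart than
    max_gap_days, are ignored.
    """
    index_first = 0
    marker_first = 0
    for index_date, marker_date in first_dates:
        gap = (marker_date - index_date).days
        if gap == 0 or abs(gap) > max_gap_days:
            continue
        if gap > 0:
            index_first += 1
        else:
            marker_first += 1
    if marker_first == 0:
        raise ValueError("no persons with the marker drug first")
    return index_first / marker_first


pairs = [
    (date(2020, 1, 1), date(2020, 3, 1)),   # index drug first
    (date(2020, 1, 1), date(2020, 2, 1)),   # index drug first
    (date(2020, 5, 1), date(2020, 4, 1)),   # marker drug first
    (date(2020, 1, 1), date(2020, 1, 1)),   # same day: ignored
]
print(crude_sequence_ratio(pairs))
```

A ratio above 1 indicates that the marker drug is started more often after the index drug than before it; a full analysis would also account for time trends in the use of both drugs.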

The case-crossover design compares the odds of exposure in a time period prior to an outcome with that in an earlier reference time period, or set of time periods, to examine the effect of transient exposures on acute events (see The Case-Crossover Design: A Method for Studying Transient Effects on the Risk of Acute Events, Am J Epidemiol 1991;133(2):144-53). The case-time-control designs are a modification of case-crossover designs which use exposure history data from a traditional control group to estimate and adjust for the bias from temporal changes in prescribing (The case-time-control design, Epidemiology 1995;6(3):248-53). However, if not well matched, the case-time-control group may reintroduce selection bias (Confounding and exposure trends in case-crossover and case-time-control designs, Epidemiology 1996;7(3):231-9). Methods have been suggested to overcome the exposure-trend bias while controlling for time-invariant confounders (see Future cases as present controls to adjust for exposure trend bias in case-only studies, Epidemiology 2011;22(4):568-74). Persistent User Bias in Case-Crossover Studies in Pharmacoepidemiology (Am J Epidemiol 2016;184(10):761-9) demonstrates that case-crossover studies of drugs that may be used indefinitely are biased upward. This bias is alleviated, but not removed completely, by using a control group.
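For the simplest case-crossover layout, with one hazard window and one reference window per case, the comparison reduces to a matched-pair odds ratio based on the discordant cases only. The sketch below assumes this 1:1 layout; the function name and counts are hypothetical, and no adjustment for exposure time trends is made.

```python
def case_crossover_odds_ratio(windows):
    """Matched-pair odds ratio from per-case exposure status.

    windows: iterable of (exposed_in_hazard_window, exposed_in_reference_window).
    Only discordant cases contribute: cases exposed in both windows or in
    neither carry no information about the within-person contrast.
    """
    hazard_only = sum(1 for hazard, reference in windows if hazard and not reference)
    reference_only = sum(1 for hazard, reference in windows if reference and not hazard)
    if reference_only == 0:
        raise ValueError("no cases exposed in the reference window only")
    return hazard_only / reference_only


# Invented data: 30 cases exposed only before the event, 10 only in the reference window.
cases = [(True, False)] * 30 + [(False, True)] * 10 + [(True, True)] * 5 + [(False, False)] * 55
print(case_crossover_odds_ratio(cases))
```

Because persistent users are mostly concordant, they drop out of this ratio, which is one way to see why the design suits transient exposures better than chronic ones.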

In the self-controlled case series (SCCS) design, the observation period following each exposure for each case is divided into risk period(s) (e.g. number of days immediately following each exposure) and a control period (observed time outside this risk period). Incidence rates within the risk period after exposure are compared with incidence rates within the control period. The Tutorial in biostatistics: the self-controlled case series method (Stat Med 2006;25(10):1768-97) explains how to fit SCCS models using standard statistical packages. The bias introduced by inaccurate specification of the risk window is discussed and a data-based approach for identifying the optimal risk windows is proposed in Identifying optimal risk windows for self-controlled case series studies of vaccine safety (Stat Med 2011;30(7):742-52). The SCCS also assumes that the event itself does not affect the chance of being exposed. The pseudo-likelihood method developed to address this possible issue is described in Case series analysis for censored, perturbed, or curtailed post-event exposures (Biostatistics 2009;10(1):3-16). Use of the self-controlled case-series method in vaccine safety studies: review and recommendations for best practice (Epidemiol Infect 2011;139(12):1805-17) assesses how the SCCS method has been used across 40 vaccine studies, highlights good practice and gives guidance on how the method should be used and reported. Using several methods of analysis is recommended, as it can reinforce conclusions or shed light on possible sources of bias when these differ for different study designs.
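The comparison of rates within cases can be sketched for the simplest SCCS layout: one risk period and one control period per case, with no age or calendar-time effects. Under these assumptions the conditional likelihood for each case is binomial, and the relative incidence can be found by solving the score equation numerically. The function name, the data layout and the bisection bounds are hypothetical; standard statistical packages should be preferred for real analyses.

```python
import math


def sccs_relative_incidence(cases, lower=1e-6, upper=1e6, tol=1e-10):
    """Conditional maximum-likelihood estimate of the relative incidence.

    cases: list of (events_in_risk, events_in_control, risk_time, control_time).
    Solves sum_i [x_i - n_i * rho*r_i / (rho*r_i + c_i)] = 0 for rho by bisection
    on the log scale; the score is decreasing in rho.
    """
    def score(log_rho):
        rho = math.exp(log_rho)
        total = 0.0
        for x_risk, x_ctrl, t_risk, t_ctrl in cases:
            p_risk = rho * t_risk / (rho * t_risk + t_ctrl)
            total += x_risk - (x_risk + x_ctrl) * p_risk
        return total

    lo, hi = math.log(lower), math.log(upper)
    if score(lo) <= 0 or score(hi) >= 0:
        raise ValueError("estimate lies outside the search bounds")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.exp((lo + hi) / 2.0)


# Invented data: 100 cases, each observed for 30 risk days and 335 control days;
# 20 events fall in the risk period and 80 in the control period.
data = [(1, 0, 30, 335)] * 20 + [(0, 1, 30, 335)] * 80
print(sccs_relative_incidence(data))
```

When every case has the same window lengths, as in this invented example, the estimate coincides with the simple ratio of the pooled incidence rates in the risk and control periods.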

When should case-only designs be used for safety monitoring of medical products? (Pharmacoepidemiol Drug Saf 2012;21(Suppl. 1):50-61) compares the SCCS and case-crossover methods as to their use, strengths and major difference (directionality). It concludes that case-only analyses of intermittent users complement the cohort analyses of prolonged users because their different biases compensate for one another. It also provides recommendations on when case-only designs should and should not be used for drug safety monitoring. Empirical performance of the self-controlled case series design: lessons for developing a risk identification and analysis system (Drug Saf 2013;36(Suppl. 1):S83-S93) evaluates the performance of the SCCS design using 399 drug-health outcome pairs in 5 observational databases and 6 simulated datasets. Four outcomes and five design choices were assessed. Within-person study designs had lower precision and greater susceptibility to bias because of trends in exposure than cohort and nested case-control designs (J Clin Epidemiol 2012;65(4):384-93) compares cohort, case-control, case-crossover and SCCS designs to explore the association between thiazolidinediones and the risks of heart failure and fracture, and between anticonvulsants and the risk of fracture. Bias was removed when follow-up was sampled both before and after the outcome, or when a case-time-control design was used.

*5.3.4.2. Use of an active comparator*

The main purpose of using an active comparator is to reduce confounding by indication or by severity. Its use is optimal in the context of the new user design, whereby comparison is between patients with the same indication initiating different treatments (The active comparator, new user study design in pharmacoepidemiology: historical foundations and contemporary application, Curr Epidemiol Rep 2015;2(4):221-8). An active comparator should be chosen to represent the counterfactual risk of a given outcome with a different treatment, i.e. it should have a known and favourable safety profile with respect to the events of interest and ideally represent the background risk in the diseased (for example, safety of antiepileptics in pregnancy in relation to risk of congenital malformations could be compared against that of lamotrigine, which is not known to be teratogenic). The paper Using Big Data to Emulate a Target Trial When a Randomized Trial Is Not Available (Am J Epidemiol 2016;183(8):758-64) applies counterfactual theory to the comparison of treatment strategies, which helps avoid common methodological pitfalls. The C-Word: Scientific Euphemisms Do Not Improve Causal Inference From Observational Data (Am J Public Health 2018;108(5):616-19) highlights the need to be explicit about the causal objective of a study, in order to facilitate the emulation of a particular target trial and support the choice of confounding adjustment variables.

With newly marketed medicines, no active comparator with ideal comparability of patients’ characteristics may be available because prescribing newly marketed medicines may be driven to a greater extent by patients’ prognostic characteristics (early users may be either sicker or healthier than all patients with the indication) and by reimbursement considerations than prescribing of established medicines. This is described for comparative effectiveness studies in Assessing the comparative effectiveness of newly marketed medications: methodological challenges and implications for drug development (Clin Pharmacol Ther 2011;90(6):777-90) and in Newly marketed medications present unique challenges for nonrandomized comparative effectiveness analyses (J Comp Eff Res 2012;1(2):109-11). Other challenges include treatment effect heterogeneity as patient characteristics of users evolve over time, and low precision owing to slow drug uptake.

*5.3.4.3. Disease risk scores*

An approach to controlling for a large number of confounding variables is to summarise them in a single multivariable confounder score. Stratification by a multivariate confounder score (Am J Epidemiol 1976;104(6):609-20) shows how control for confounding may be based on stratification by the score. An example is a disease risk score (DRS) that estimates the probability or rate of disease occurrence conditional on being unexposed. The association between exposure and disease is then estimated with adjustment for the disease risk score in place of the individual covariates.

DRSs are, however, difficult to estimate if outcomes are rare. Use of disease risk scores in pharmacoepidemiologic studies (Stat Methods Med Res 2009;18(1):67-80) includes a detailed description of their construction and use, a summary of simulation studies comparing their performance to traditional models, a comparison of their utility with that of propensity scores, and some further topics for future research. Disease risk score as a confounder summary method: systematic review and recommendations (Pharmacoepidemiol Drug Saf 2013;22(2):122-29) examines trends in the use and application of DRS as a confounder summary method and shows that large variation exists, with differences in terminology and methods used.

In Role of disease risk scores in comparative effectiveness research with emerging therapies (Pharmacoepidemiol Drug Saf 2012;21 Suppl 2:138–47), it is argued that DRS may have a place when studying drugs that are recently introduced to the market. In such situations, as characteristics of users change rapidly, exposure propensity scores may prove highly unstable. DRSs based mostly on biological associations would be more stable. However, DRS models are still sensitive to misspecification as discussed in Adjusting for Confounding in Early Postlaunch Settings: Going Beyond Logistic Regression Models (Epidemiology 2016;27(1):133-42).

*5.3.4.4. Propensity scores*

Databases used in pharmacoepidemiological studies often include records of prescribed medications and encounters with medical care providers, from which one can construct surrogate measures for both drug exposure and covariates that are potential confounders. It is often possible to track day-by-day changes in these variables. However, while this information can be critical for study success, its volume can pose challenges for statistical analysis.

A propensity score (PS) is analogous to the disease risk score in that it combines a large number of possible confounders into a single variable (the score). The exposure propensity score (EPS) is the conditional probability of exposure to a treatment given observed covariates. In a cohort study, matching or stratifying treated and comparison subjects on EPS tends to balance all of the observed covariates. However, unlike random assignment of treatments, the propensity score may not balance unobserved covariates. Invited Commentary: Propensity Scores (Am J Epidemiol 1999;150(4):327–33) reviews the uses and limitations of propensity scores and provides a brief outline of the associated statistical theory. The authors present results of adjustment by matching or stratification on the propensity score.

High-dimensional Propensity Score Adjustment in Studies of Treatment Effects Using Healthcare Claims Data (Epidemiology 2009;20(4):512-22) discusses the high-dimensional propensity score (hd-PS) model approach. It attempts to empirically identify large numbers of potential confounders in healthcare databases and, by doing so, to extract more information on confounders and proxies. Covariate selection in high-dimensional propensity score analyses of treatment effects in small samples (Am J Epidemiol 2011;173(12):1404-13) evaluates the relative performance of hd-PS in smaller samples. Confounding adjustment via a semi-automated high-dimensional propensity score algorithm: an application to electronic medical records (Pharmacoepidemiol Drug Saf 2011;20(8):849-57) evaluates the use of hd-PS in a primary care electronic medical record database. In addition, the article Using high-dimensional propensity scores to automate confounding control in a distributed medical product safety surveillance system (Pharmacoepidemiol Drug Saf 2012;21(S1):41-9) summarises the application of this method for automating confounding control in sequential cohort studies as applied to safety monitoring systems using healthcare databases and also discusses the strengths and limitations of hd-PS.

Most cohort studies match patients 1:1 on the propensity score. Increasing the matching ratio may increase precision but also bias. One-to-many propensity score matching in cohort studies (Pharmacoepidemiol Drug Saf 2012;21(S2):69-80) tests several methods for 1:n propensity score matching in simulation and empirical studies and recommends using a variable ratio that increases precision at a small cost of bias. Matching by propensity score in cohort studies with three treatment groups (Epidemiology 2013;24(3):401-9) develops and tests a 1:1:1 propensity score matching approach offering a way to compare three treatment options.
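A minimal sketch of 1:1 matching on the propensity score is given below. It assumes that propensity scores have already been estimated and uses greedy nearest-neighbour matching without replacement within a caliper; this is only one of several possible algorithms. The function name, the caliper value and the choice of the raw probability scale for the caliper are hypothetical.

```python
def greedy_match(treated, comparators, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    treated, comparators: dicts mapping a person identifier to a propensity score.
    Returns a list of (treated_id, comparator_id) pairs; treated persons with no
    comparator within the caliper remain unmatched and are excluded.
    """
    available = dict(comparators)
    pairs = []
    # Processing treated persons in score order makes the result reproducible.
    for treated_id, treated_ps in sorted(treated.items(), key=lambda item: item[1]):
        if not available:
            break
        nearest = min(available, key=lambda c: abs(available[c] - treated_ps))
        if abs(available[nearest] - treated_ps) <= caliper:
            pairs.append((treated_id, nearest))
            del available[nearest]
    return pairs


treated_ps = {"t1": 0.30, "t2": 0.62, "t3": 0.95}
comparator_ps = {"c1": 0.31, "c2": 0.60, "c3": 0.10}
print(greedy_match(treated_ps, comparator_ps))
```

In this invented example the treated person with a score of 0.95 has no comparator within the caliper and is dropped, which changes the population to which the estimate applies. Allowing more than one comparator per treated person would retain more information at the cost of less close matches.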

The use of several measures of balance for developing an optimal propensity score model is described in Measuring balance and model selection in propensity score methods (Pharmacoepidemiol Drug Saf 2011;20(11):1115-29) and further evaluated in Propensity score balance measures in pharmacoepidemiology: a simulation study (Pharmacoepidemiol Drug Saf 2014;23(8):802-11). In most situations, the standardised difference performs best and is easy to calculate (see Balance measures for propensity score methods: a clinical example on beta-agonist use and the risk of myocardial infarction (Pharmacoepidemiol Drug Saf 2011;20(11):1130-7) and Reporting of covariate selection and balance assessment in propensity score analysis is suboptimal: a systematic review (J Clin Epidemiol 2015;68(2):112-21)). Metrics for covariate balance in cohort studies of causal effects (Stat Med 2014;33:1685-99) shows in a simulation study that the c-statistic of the PS model after matching and the general weighted difference perform as well as the standardised difference and are preferred when an overall summary measure of balance is requested. Treatment effects in the presence of unmeasured confounding: dealing with observations in the tails of the propensity score distribution--a simulation study (Am J Epidemiol 2010;172(7):843-54) demonstrates how ‘trimming’ of the propensity score eliminates subjects who are treated contrary to prediction and their exposed/unexposed counterparts, thereby reducing bias by unmeasured confounders.
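The standardised difference for a single covariate can be computed as sketched below: the difference in group means divided by the square root of the average of the two group variances. The function name and the example values are hypothetical, and the sketch covers continuous covariates only.

```python
import math
from statistics import mean, variance


def standardised_difference(treated_values, comparator_values):
    """Standardised difference in means for one continuous covariate.

    d = (mean_treated - mean_comparator) / sqrt((var_treated + var_comparator) / 2)
    Unlike a p-value, d does not depend on sample size.
    """
    pooled_sd = math.sqrt((variance(treated_values) + variance(comparator_values)) / 2.0)
    if pooled_sd == 0:
        raise ValueError("covariate has no variation in either group")
    return (mean(treated_values) - mean(comparator_values)) / pooled_sd


# Invented ages in a treated group and a comparison group.
print(standardised_difference([61, 64, 70, 72, 68], [55, 63, 66, 71, 60]))
```

The same function can be applied before and after matching or weighting, covariate by covariate, to check whether the propensity score model has improved balance.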

Performance of propensity score calibration--a simulation study (Am J Epidemiol 2007;165(10):1110-8) introduces ‘propensity score calibration’ (PSC). This technique combines propensity score matching methods with measurement error regression models to address confounding by variables unobserved in the main study. This is done by using additional covariate measurements observed in a validation study, which is often a subset of the main study.

Although in most situations propensity score models, with the possible exception of hd-PS, do not have any advantages over conventional multivariate modelling in terms of adjustment for identified confounders, several other benefits may be derived. Propensity score methods may help to gain insight into determinants of treatment including age, frailty and comorbidity and to identify individuals treated against expectation. A statistical advantage of PS analyses is that if exposure is not infrequent it is possible to adjust for a large number of covariates even if outcomes are rare, a situation often encountered in drug safety research. Furthermore, assessment of the PS distribution may reveal non-positivity. An important limitation of PS is that it is not directly amenable to case-control studies. A critical assessment of propensity scores is provided in Propensity scores: from naive enthusiasm to intuitive understanding (Stat Methods Med Res 2012;21(3):273-93). Semiautomated and machine-learning based approaches to propensity score methods are currently being developed (Automated data-adaptive analytics for electronic healthcare data to study causal treatment effects, Clin Epidemiol 2018;10:771-88).

*5.3.4.5. Instrumental variables*

Instrumental variable (IV) analysis is an approach to address uncontrolled confounding in comparative studies. An introduction to instrumental variables for epidemiologists (Int J Epidemiol 2000;29(4):722-9) presents the approach, illustrated by an application of IV methods to non-parametric adjustment for non-compliance in randomised trials. The author mentions a number of caveats but concludes that IV corrections can be valuable in many situations. IV analysis in comparative safety and effectiveness research is reviewed in Instrumental variable methods in comparative safety and effectiveness research (Pharmacoepidemiol Drug Saf 2010;19(6):537-54). A review of IV analysis for observational comparative effectiveness studies suggested that, in the large majority of studies in which IV analysis was applied, at least one of the assumptions could be violated (Potential bias of instrumental variable analyses for observational comparative effectiveness research, Ann Intern Med 2014;161(2):131-8).

A proposal for reporting instrumental variable analyses has been suggested in Commentary: how to report instrumental variable analyses (suggestions welcome) (Epidemiology 2013;24(3):370-4). In particular the type of treatment effect (average treatment effect/homogeneity condition or local average treatment effect/monotonicity condition) and the testing of critical assumptions for valid IV analyses should be reported. In support of these guidelines, the standardized difference has been proposed to falsify the assumption that confounders are not related to the instrumental variable (Quantitative falsification of instrumental variables assumption using balance measures, Epidemiology 2014;25(5):770-2).

The complexity of the issues associated with confounding by indication, channelling and selective prescribing is explored in Evaluating short-term drug effects using a physician-specific prescribing preference as an instrumental variable (Epidemiology 2006;17(3):268-75). A conventional, adjusted multivariable analysis showed a higher risk of gastrointestinal toxicity for selective COX-2-inhibitors than for traditional NSAIDs, which was at odds with results from clinical trials. However, a physician-level instrumental variable approach (a time-varying estimate of a physician’s relative preference for a given drug, where at least two therapeutic alternatives exist) yielded evidence of a protective effect due to COX-2 exposure, particularly for shorter term drug exposures. Despite the potential benefits of physician-level IVs, their performance can vary across databases and strongly depends on the definition of IV used, as discussed in Evaluating different physician's prescribing preference based instrumental variables in two primary care databases: a study of inhaled long-acting beta2-agonist use and the risk of myocardial infarction (Pharmacoepidemiol Drug Saf 2016;25 Suppl 1:132-41).

Instrumental variable methods in comparative safety and effectiveness research (Pharmacoepidemiol Drug Saf 2010;19(6):537–54) provides practical guidance on IV analyses in pharmacoepidemiology. Instrumental variable methods for causal inference (Stat Med 2014;33(13):2297-340) is a tutorial, including statistical code for performing IV analysis.

An important limitation of IV analysis is that weak instruments (small association between IV and exposure) lead to decreased statistical efficiency and biased IV estimates, as detailed in Instrumental variables: application and limitations (Epidemiology 2006;17:260-7). For example, in the above-mentioned study on non-selective NSAIDs and COX-2-inhibitors, the confidence intervals for IV estimates were in the order of five times wider than with conventional analysis. Performance of instrumental variable methods in cohort and nested case-control studies: a simulation study (Pharmacoepidemiol Drug Saf 2014;23(2):165-77) demonstrated that a stronger IV-exposure association is needed in nested case-control studies compared to cohort studies in order to achieve the same bias reduction. Increasing the number of controls reduces this bias from IV analysis with relatively weak instruments.
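The role of instrument strength can be seen in the simplest IV estimator, the Wald (ratio) estimator for a binary instrument, sketched below on the risk difference scale. The function name and the data are hypothetical, and the sketch gives a point estimate only; it does not check any of the IV assumptions.

```python
from statistics import mean


def wald_iv_estimate(instrument, exposure, outcome):
    """Wald estimator for a binary instrument.

    Returns (iv_estimate, first_stage), where
    first_stage = E[X | Z=1] - E[X | Z=0] measures instrument strength and
    iv_estimate = (E[Y | Z=1] - E[Y | Z=0]) / first_stage.
    """
    x1 = mean([x for z, x in zip(instrument, exposure) if z == 1])
    x0 = mean([x for z, x in zip(instrument, exposure) if z == 0])
    y1 = mean([y for z, y in zip(instrument, outcome) if z == 1])
    y0 = mean([y for z, y in zip(instrument, outcome) if z == 0])
    first_stage = x1 - x0
    if first_stage == 0:
        raise ValueError("instrument is unrelated to exposure")
    return (y1 - y0) / first_stage, first_stage


# Invented data: Z = physician preference, X = drug received, Y = outcome.
z = [1, 1, 1, 1, 0, 0, 0, 0]
x = [1, 1, 1, 0, 1, 0, 0, 0]
y = [1, 1, 0, 0, 1, 0, 0, 0]
print(wald_iv_estimate(z, x, y))
```

Because the first-stage difference sits in the denominator, a weak instrument inflates both the random error in the numerator and any bias from small violations of the IV assumptions.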

Selecting on treatment: a pervasive form of bias in instrumental variable analyses (Am J Epidemiol 2015;181(3):191-7) warns against bias in IV analysis by including only a subset of possible treatment options.

*5.3.4.6. Prior event rate ratios*

Another method proposed to control for unmeasured confounding is the Prior Event Rate Ratio (PERR) adjustment method, in which the effect of exposure is estimated using the ratio of rate ratios (RRs) from periods before and after initiation of a drug exposure, as discussed in Replicated studies of two randomized trials of angiotensin converting enzyme inhibitors: further empiric validation of the ‘prior event rate ratio’ to adjust for unmeasured confounding by indication (Pharmacoepidemiol Drug Saf 2008;17(7):671-85). For example, when a new drug is launched, direct estimation of the drug's effect observed in the period after launch is potentially confounded. Differences in event rates in the period before the launch between future users and future non-users may provide a measure of the amount of confounding present. By dividing the effect estimate from the period after launch by the effect obtained in the period before launch, the confounding in the second period can be adjusted for. This method requires that confounding effects are constant over time, that there is no confounder-by-treatment interaction, and that outcomes are non-lethal events.
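The arithmetic of the PERR adjustment can be sketched as follows, using crude incidence rate ratios in each period. Function names and event counts are hypothetical; published applications typically rely on regression-based estimates and also report confidence intervals, which this sketch omits.

```python
def rate_ratio(events_exposed, person_time_exposed, events_unexposed, person_time_unexposed):
    """Crude incidence rate ratio between (future) users and non-users."""
    return (events_exposed / person_time_exposed) / (events_unexposed / person_time_unexposed)


def perr_adjusted_ratio(rate_ratio_after, rate_ratio_before):
    """PERR-adjusted estimate: post-initiation rate ratio divided by the prior rate ratio."""
    return rate_ratio_after / rate_ratio_before


# Invented counts per 1,000 person-years in each group and period.
rr_before = rate_ratio(30, 1000, 20, 1000)   # before initiation: reflects confounding only
rr_after = rate_ratio(36, 1000, 20, 1000)    # after initiation: drug effect plus confounding
print(perr_adjusted_ratio(rr_after, rr_before))
```

In this invented example, future users already had a higher event rate before starting treatment, so part of the post-initiation rate ratio is attributed to confounding rather than to the drug.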

Performance of prior event rate ratio adjustment method in pharmacoepidemiology: a simulation study (Pharmacoepidemiol Drug Saf 2015;24(5):468-77) suggests that the PERR adjustment method can help to reduce bias as a result of unmeasured confounding in certain situations but that theoretical justification of assumptions should be provided.

*5.3.4.7. Handling time-dependent confounding in the analysis*

Methods for dealing with time-dependent confounding (Stat Med. 2013;32(9):1584-618) provides an overview of how time-dependent confounding can be handled in the analysis of a study. It provides an in-depth discussion of marginal structural models and g-computation.

Beyond G-estimation and Marginal Structural Models (MSMs), described below, traditional and efficient approaches to deal with time-dependent variables should be considered in the design of the study, such as nested case-control studies with assessment of time-varying exposure windows.

G-estimation is a method for estimating the joint effects of time-varying treatments using ideas from instrumental variables methods. G-estimation of Causal Effects: Isolated Systolic Hypertension and Cardiovascular Death in the Framingham Heart Study (Am J Epidemiol 1998;148(4):390-401) demonstrates how the G-estimation procedure allows for appropriate adjustment of the effect of a time-varying exposure in the presence of time-dependent confounders that are themselves influenced by the exposure.

The use of Marginal Structural Models can be an alternative to G-estimation. Marginal Structural Models and Causal Inference in Epidemiology (Epidemiology 2000;11(5):550-60) introduces a class of causal models that allow for improved adjustment for confounding in situations of time-dependent confounding.
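MSMs are commonly fitted by weighting each person-period by the inverse of the probability of the treatment history actually received. The sketch below computes stabilised weights for one person, assuming that the treatment probabilities at each visit have already been predicted from two fitted models: one conditioning on treatment history and baseline covariates (numerator) and one additionally conditioning on time-varying confounders (denominator). The function name and the data layout are hypothetical, and the weighted outcome model itself is not shown.

```python
def stabilised_weights(visits):
    """Cumulative stabilised inverse-probability-of-treatment weights for one person.

    visits: time-ordered list of (treated, p_numerator, p_denominator), where both
    probabilities are predicted probabilities of being treated at that visit.
    Returns the weight to apply at each visit.
    """
    weights = []
    cumulative = 1.0
    for treated, p_num, p_den in visits:
        # Probability of the treatment actually received at this visit.
        numerator = p_num if treated else 1.0 - p_num
        denominator = p_den if treated else 1.0 - p_den
        if denominator <= 0.0:
            raise ValueError("positivity violation: received treatment had probability 0")
        cumulative *= numerator / denominator
        weights.append(cumulative)
    return weights


# Invented predictions for one person over two visits.
print(stabilised_weights([(1, 0.50, 0.25), (0, 0.50, 0.20)]))
```

Very large weights signal near-violations of positivity and deserve inspection before the weighted model is fitted.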

MSMs have two major advantages over G-estimation. First, although G-estimation is useful for survival time outcomes, continuous measured outcomes and Poisson count outcomes, logistic G-estimation cannot be conveniently used to estimate the effect of treatment on dichotomous outcomes unless the outcome is rare. Second, MSMs resemble standard models, whereas G-estimation does not (see Marginal Structural Models to Estimate the Causal Effect of Zidovudine on the Survival of HIV-Positive Men. Epidemiology 2000;11(5):561-70).

Effect of highly active antiretroviral therapy on time to acquired immunodeficiency syndrome or death using marginal structural models (Am J Epidemiol 2003;158(7):687-94) provides a clear example in which standard Cox analysis failed to detect a clinically meaningful net benefit of treatment because it does not appropriately adjust for time-dependent covariates that are simultaneously confounders and intermediate variables. This net benefit was shown using a marginal structural survival model. In Time-dependent propensity score and collider-stratification bias: an example of beta2-agonist use and the risk of coronary heart disease (Eur J Epidemiol 2013;28(4):291-9), various methods to control for time-dependent confounding are compared in an empirical study on the association between inhaled beta-2-agonists and the risk of coronary heart disease. MSMs resulted in slightly reduced associations compared to standard Cox-regression.

*5.3.4.8. The trend-in-trend design*

The Trend-in-trend Research Design for Causal Inference (Epidemiology 2017;28: 529-36) presents a semi-ecological design, whereby trends in exposure and in outcome rates are compared in subsets of the population that have different rates of uptake for the drug in question. These subsets are identified through PS modelling. There is a formal framework for transforming the observed trends into an effect estimate. Simulation and empirical studies showed the design to be less statistically efficient than a cohort study, but more resistant to confounding. The trend-in-trend method may be useful in settings where there is a strong time trend in exposure, such as a newly approved drug.

One may test the validity of putative causal associations by using control exposures or outcomes. Well-chosen positive and negative controls help convince the investigator that the data at hand correctly detect existing associations and correctly show a lack of association when none is expected. A positive control that shows no association, or a negative control that shows one, may signal the presence of bias. This was illustrated in a study demonstrating healthy adherer bias by showing that adherence to statins was associated with decreased risks of biologically implausible outcomes (Statin adherence and risk of accidents: a cautionary tale, Circulation 2009;119(15):2051-7). The general principle, with additional examples, is described in Control Outcomes and Exposures for Improving Internal Validity of Nonrandomized Studies (Health Serv Res 2015;50(5):1432-51).

Selecting drug-event combinations as reliable controls poses a challenge: it is difficult to prove the absence of an association for negative controls, and it is still more problematic to select positive controls because it is desirable to establish not only an association but also an accurate estimate of the effect size. This has led to attempts to establish libraries of controls that can be used to characterise the performance of different observational datasets in detecting various types of association using a number of different study designs. Although this kind of control may be questioned according to Evidence of Misclassification of Drug-Event Associations Classified as Gold Standard 'Negative Controls' by the Observational Medical Outcomes Partnership (OMOP) (Drug Saf 2016;39(5):421-32), the approach of calibrating the performance of epidemiological methods prior to performing a study holds the promise of providing a trustworthy framework for interpretation of the results, as shown by Interpreting observational studies: Why empirical calibration is needed to correct p-values (Stat Med 2014;33(2):209-18), Robust empirical calibration of p-values using observational data (Stat Med 2016;35(22):3883-8) and Empirical confidence interval calibration for population-level effect estimation studies in observational healthcare data (Proc Natl Acad Sci USA 2018;115 (11): 571-7).
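The idea behind empirical calibration can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the procedure of the cited papers, which use a more elaborate fitting approach: here the empirical null distribution is approximated by the sample mean and standard deviation of the negative-control log relative risks, ignoring their individual sampling errors. All numbers and function names are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def traditional_p_value(log_rr, se):
    """Two-sided p-value against the theoretical null (log RR = 0)."""
    z = log_rr / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def fit_empirical_null(negative_control_log_rrs):
    """Simplified empirical null: mean and SD of negative-control estimates.

    Assumption: the sampling error of each negative control is ignored.
    """
    return mean(negative_control_log_rrs), stdev(negative_control_log_rrs)


def calibrated_p_value(log_rr, se, null_mean, null_sd):
    """Two-sided p-value against the empirical null distribution."""
    z = (log_rr - null_mean) / math.sqrt(null_sd ** 2 + se ** 2)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Hypothetical negative controls: true RR is 1, yet the estimates are
# shifted upwards, suggesting residual systematic error in this design.
negative_controls = [0.10, 0.25, 0.30, 0.15, 0.40, 0.20, 0.35, 0.05]
null_mean, null_sd = fit_empirical_null(negative_controls)

# Hypothetical estimate for the drug-event pair of interest.
p_traditional = traditional_p_value(log_rr=0.30, se=0.05)
p_calibrated = calibrated_p_value(0.30, 0.05, null_mean, null_sd)
print(p_traditional, p_calibrated)
```

In this made-up example the estimate of interest lies well within the spread of the negative controls, so it is no longer statistically significant after calibration although its traditional p-value is very small.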

Triangulation is not a separate methodological approach, but rather a framework, formally described in Triangulation in aetiological epidemiology (Int J Epidemiol 2016;45(6):1866-86). Triangulation is defined as “the practice of obtaining more reliable answers to research questions through integrating results from several different approaches, where each approach has different key sources of potential bias that are unrelated to each other.” In some ways, the paper formalises approaches already used in many nonrandomised pharmacoepidemiologic studies, including control exposures and outcomes, sensitivity analyses, and comparison of results from different populations and different study designs – all within the same study and while explicitly specifying the direction of bias in each approach. Triangulation was used (without using the explicit term) in Associations of maternal antidepressant use during the first trimester of pregnancy with preterm birth, small for gestational age, autism spectrum disorder, and attention-deficit/hyperactivity disorder in offspring (JAMA 2017;317(15):1553-62), whereby, within the same study, the authors used negative controls (paternal exposure to antidepressants) and assessed the association using a different study design and study population (sibling design).

5.3.7. Interrupted time series analysis

Interrupted time series (ITS) studies are becoming the standard approach for evaluating the effectiveness of population-level interventions that are implemented at a specific point in time and have clear before-after periods (e.g. the effective date of a policy or the date of a regulatory action). The ITS analysis uses the pre-intervention data to establish the expected trend for an outcome of interest. This trend, projected beyond the intervention date, represents the counterfactual scenario in the absence of the intervention and serves as the comparator: the impact of the intervention is evaluated by examining any change from the expected trend in the post-intervention period (Interrupted time series regression for the evaluation of public health interventions: a tutorial. Int J Epidemiol 2017;46(1):348-55). ITS is a quasi-experimental design and has been described as the “next best” approach for dealing with interventions in the absence of randomisation. ITS analysis requires several assumptions and its implementation is technically sophisticated, as explained in Regression based quasi-experimental approach when randomisation is not an option: Interrupted time series analysis (BMJ 2015;350:h2750). The use of ITS regression in impact research is illustrated in Annex 2 ‘Guidance on methods for pharmacovigilance impact research’ of this Guide.
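A common way to implement an ITS analysis is segmented regression, in which the outcome is modelled as a baseline level and trend plus a change in level and a change in trend after the intervention. The Python sketch below fits this model by ordinary least squares on a simulated, noise-free monthly series. The data and function names are hypothetical, and the sketch assumes no autocorrelation, seasonality or over-dispersion, all of which would need to be addressed in a real analysis.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_segmented_regression(y, intervention_index):
    """OLS fit of y_t = b0 + b1*t + b2*post_t + b3*(t - t0)*post_t.

    post_t is 1 from the intervention time t0 onwards and 0 before it.
    """
    rows = []
    for t in range(len(y)):
        post = 1.0 if t >= intervention_index else 0.0
        rows.append([1.0, float(t), post, post * (t - intervention_index)])
    k = 4
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    b0, b1, b2, b3 = solve_linear_system(xtx, xty)
    return {
        "baseline_level": b0,
        "baseline_trend": b1,
        "level_change": b2,
        "trend_change": b3,
    }


# Simulated series: 12 months before and 12 months after an intervention
# that lowers the level by 3 units and the monthly trend by 0.2 units.
t0 = 12
series = [
    10.0 + 0.5 * t + (-3.0 - 0.2 * (t - t0) if t >= t0 else 0.0)
    for t in range(24)
]
fit = fit_segmented_regression(series, intervention_index=t0)
print(fit)
```

The counterfactual at any post-intervention time t is the projected baseline, `baseline_level + baseline_trend * t`, and the estimated impact at that time is `level_change + trend_change * (t - t0)`.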