Comparative effectiveness research (CER) is designed to inform healthcare decisions for the prevention, diagnosis and treatment of a given health condition. CER therefore compares the potential benefits and harms of therapeutic strategies available in routine practice. The compared interventions may be similar treatments, such as competing medicines within the same class or with different mechanisms of action, or different therapeutic approaches, such as surgical procedures and drug therapies. The comparison may focus only on the relative medical benefits and risks of the different options, or it may weigh both their costs and their benefits. The methods of comparative effectiveness research (Annu Rev Public Health 2012;33:425-45) defines the key elements of CER as a) a head-to-head comparison of active treatments, b) study populations typical of day-to-day clinical practice, and c) a focus on evidence to inform health care tailored to the characteristics of individual patients.
The term ‘Relative effectiveness assessment (REA)’ is also used when comparing multiple technologies or a new technology against standard of care, while ‘rapid’ REA refers to performing an assessment within a limited timeframe in the case of a new marketing authorisation or a new indication granted for an approved medicine (see What is a rapid review? A methodological exploration of rapid reviews in Health Technology Assessments. Int J Evid Based Healthc. 2012;10(4):397-410).
CER may use a variety of data sources and methods. Methods to generate evidence for CER are divided below into four categories according to the data source: clinical trials, observational data, synthesis of published RCTs and cross-design synthesis.
CER based on clinical trials
Randomised clinical trials (RCTs) are considered the gold standard for demonstrating the efficacy of medicinal products but they rarely measure the benefits, risks or comparative effectiveness of an intervention when used in routine clinical practice. Moreover, relatively few RCTs are designed with an alternative therapeutic strategy as a comparator, which limits the utility of the resulting data in establishing recommendations for treatment choices. For these reasons, other methodologies such as pragmatic trials and large simple trials may complement traditional confirmatory RCTs in CER. These trials are discussed elsewhere in this Guide.
In order to facilitate comparison of CER results between clinical trials, the COMET (Core Outcome Measures in Effectiveness Trials) Initiative aims to develop agreed minimum standardized sets of outcomes (‘core outcome sets’, COS) to be assessed and reported in effectiveness trials of a specific condition. Choosing Important Health Outcomes for Comparative Effectiveness Research: An Updated Review and User Survey (PLoS One 2016;11(1):e0146444) provides an updated review of studies that have addressed the development of COS for measurement and reporting in clinical trials. Regulatory disease guidelines also establish outcomes of clinical interest to assess whether a new therapeutic intervention works. Use of the same endpoints across RCTs thus facilitates comparisons.
CER using observational data
Use of observational data in CER
Use of observational evidence is generally not appropriate to replace RCT information for efficacy, except in specific circumstances. When and How Can Real World Data Analyses Substitute for Randomized Controlled Trials? (Clin Pharmacol Ther. 2017;102(6):924-33) suggests that real-world evidence (RWE) may be preferred over RCTs when studying a highly promising treatment for a disease with no other available treatments, where ethical considerations may preclude randomising patients to placebo, particularly if the disease is likely to result in severely compromised quality of life or mortality. In these cases, RWE could support product regulation by providing evidence on the safety and effectiveness of the therapy against the typical disease progression observed in the absence of treatment. This comparator disease trajectory may be assessed from historical controls that were diagnosed prior to the availability of the new treatment, or from other sources. When Can We Rely on Real‐World Evidence to Evaluate New Medical Treatments? (Clin Pharmacol Ther. 2021;doi:10.1002/cpt.2253) recommends that decisions regarding use of RWE in the evaluation of new treatments should depend on the specific research question, the characteristics of the potential study settings and the characteristics of the settings where study results would be applied, and should take into account three dimensions in which RWE studies might differ from traditional clinical trials: use of real-world data (RWD), delivery of real-world treatment and real-world treatment assignment.
Outside of some specific circumstances, observational data and clinical trials are considered complementary to generate optimal evidence. For example, clinical trials may include historical controls from observational studies, or identify eligible study participants from disease registries. In defense of Pharmacoepidemiology-Embracing the Yin and Yang of Drug Research (N Engl J Med 2007;357(22):2219-21) shows that the strengths and weaknesses of RCTs and observational studies may make both designs necessary in the study of drug effects. Hybrid approaches for CER allow clinical trials to be enriched with observational data, for example:
Use of historical controls to fully or partially replace concurrent controls (see A roadmap to using historical controls in clinical trials – by Drug Information Association Adaptive Design Scientific Working Group (DIA-ADSWG), Orphanet J Rare Dis. 2020;15:69)
Use of prior information derived from relevant empirical data on a relative treatment effect (see Prior Elicitation for Use in Clinical Trial Design and Analysis: A Literature Review, Int J Environ Res Public Health 2021;18(4):1833)
Single arm studies, where the entire control group can be external (see Methods for external control groups for single arm trials or long‐term uncontrolled extensions to randomized clinical trials, Pharmacoepidemiol Drug Saf. 2020; 29(11):1382–92).
Methods for CER using observational data
Causal inference methods applicable to observational studies described in Chapter 5.2.3 of this Guide are generally applicable to CER, e.g. propensity score methods, instrumental variables, prior event rate ratios, G-estimation or marginal structural models.
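As a minimal illustration of one of these methods, the sketch below applies inverse probability of treatment weighting, with the propensity score estimated as the proportion treated within strata of a single binary confounder. The record layout, the function names and the toy data are assumptions made for illustration only; a real analysis would fit a propensity model on many covariates and check positivity and covariate balance.

```python
from collections import defaultdict


def crude_risk_difference(records):
    """Unadjusted risk difference (treated minus untreated)."""
    events = {0: 0, 1: 0}
    totals = {0: 0, 1: 0}
    for _, treated, outcome in records:
        events[treated] += outcome
        totals[treated] += 1
    return events[1] / totals[1] - events[0] / totals[0]


def iptw_risk_difference(records):
    """Risk difference after inverse probability of treatment weighting.

    records: iterable of (confounder, treated, outcome), treated/outcome in {0, 1}.
    The propensity score is the observed proportion treated in each confounder
    stratum (an illustrative stand-in for a fitted propensity model).
    """
    records = list(records)
    n_by_stratum = defaultdict(int)
    treated_by_stratum = defaultdict(int)
    for confounder, treated, _ in records:
        n_by_stratum[confounder] += 1
        treated_by_stratum[confounder] += treated

    sums = {0: [0.0, 0.0], 1: [0.0, 0.0]}  # arm -> [weighted events, total weight]
    for confounder, treated, outcome in records:
        ps = treated_by_stratum[confounder] / n_by_stratum[confounder]
        if ps in (0.0, 1.0):
            raise ValueError(f"positivity violated in stratum {confounder!r}")
        weight = 1.0 / ps if treated == 1 else 1.0 / (1.0 - ps)
        sums[treated][0] += weight * outcome
        sums[treated][1] += weight

    return sums[1][0] / sums[1][1] - sums[0][0] / sums[0][1]


# Hypothetical toy data: the high-risk stratum is treated more often,
# and treatment has no effect within either stratum.
toy = (
    [(0, 1, 0)] * 2 + [(0, 0, 0)] * 8      # low-risk stratum, no events
    + [(1, 1, 1)] * 4 + [(1, 1, 0)] * 4    # high-risk stratum, treated
    + [(1, 0, 1)] * 1 + [(1, 0, 0)] * 1    # high-risk stratum, untreated
)
```

In this toy data set the crude comparison suggests a harmful effect because sicker patients are treated more often, whereas the weighted comparison recovers the absence of effect built into the data.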
More specifically, the Agency for Healthcare Research and Quality (AHRQ)’s Developing a Protocol for Observational Comparative Effectiveness Research: A User’s Guide (2013) identifies minimal standards and best practices for observational CER. It provides principles on a wide range of topics for designing research and developing protocols, with relevant questions to be addressed and checklists of key elements to be considered. The RWE Navigator website discusses methods using observational, real-world data, with a focus on effectiveness research, such as the source of real-world data, study designs, approaches to summarising and synthesising the evidence, modelling of effectiveness and methods to adjust for bias and governance aspects. It also presents a glossary of terms and case studies.
A roadmap to using historical controls in clinical trials – by Drug Information Association Adaptive Design Scientific Working Group (DIA-ADSWG) (Orphanet J Rare Dis. 2020;15:69) describes methods to minimise disadvantages of using historical controls in clinical trials, i.e. frequentist methods (e.g. propensity score methods and meta-analytical approach) or Bayesian methods (e.g. power prior method, adaptive designs and the meta-analytic combined [MAC] and meta-analytic predictive [MAP] approaches for meta-analysis). It also provides recommendations on approaches to apply historical controls when they are needed while maximising scientific validity to the extent feasible.
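To make the power prior idea concrete, the following sketch shows a fixed-weight power prior for a binary outcome in the control arm, using the conjugate Beta-binomial model. The function names, the Beta(1, 1) initial prior and the example counts are illustrative assumptions and are not taken from the cited roadmap.

```python
def power_prior_posterior(x_cur, n_cur, x_hist, n_hist, alpha, a0=1.0, b0=1.0):
    """Beta posterior for a control event rate under a fixed-weight power prior.

    The historical likelihood is raised to the power alpha (0 = ignore the
    historical controls, 1 = pool them fully with the concurrent controls).
    Returns the (a, b) parameters of the Beta posterior.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    a = a0 + alpha * x_hist + x_cur
    b = b0 + alpha * (n_hist - x_hist) + (n_cur - x_cur)
    return a, b


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical counts: 10/50 events in concurrent controls, 60/200 in historical controls.
for alpha in (0.0, 0.5, 1.0):
    a, b = power_prior_posterior(10, 50, 60, 200, alpha)
    print(alpha, round(beta_mean(a, b), 3))
```

With a fixed alpha, the historical controls contribute roughly alpha times their sample size to the posterior. Choosing alpha in advance, or letting the data determine the amount of borrowing, is one of the design decisions the roadmap discusses.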
In the context of hybrid studies, key methodological issues to be considered when combining RWD and RCT data include:
Differences between the RWD and RCT in terms of data quality and applicability
Differences between available RWD sources (e.g., due to heterogeneity in studied populations, differences in study design, etc.)
Risk of bias (particularly for the RWD)
Generalisability (especially for RCT findings beyond the overall treatment effect).
The target trial emulation approach was developed as a conceptual framework to help researchers identify and avoid potential biases in observational studies. This approach is further described below (see the section on secondary use of data for CER) and in Chapter 4.4.2.
Methods for systematic reviews and meta-analyses of observational studies are presented in Chapter 9 and Annex 1 of this ENCePP Guide. They are also addressed in the Cochrane Handbook for Systematic Reviews of Interventions and the Methods Guide for Effectiveness and Comparative Effectiveness Reviews, presented in the section of this chapter on CER based on evidence synthesis of published RCTs.
Assessment of observational studies used in CER
Given the potential for bias and confounding in CER based on observational non-randomised studies, the results of such studies need to be adequately assessed. The Good ReseArch for Comparative Effectiveness (GRACE) initiative (IQVIA, 2016) provides guidance to enhance the quality of observational CER and a checklist to facilitate its use for decision support. The GRACE principles provide guidance on evaluating the quality of observational CER studies, to help decision-makers recognise high-quality studies and researchers design and conduct them. How well can we assess the validity of non-randomised studies of medications? A systematic review of assessment tools (BMJ Open 2021;11:e043961) examined whether assessment tools for non-randomised studies address critical elements that influence the validity of findings from non-randomised studies for comparative safety and effectiveness of medications. It concludes that major design-specific sources of bias (e.g., lack of new-user design, lack of active comparator design, time-related bias, depletion of susceptibles, reverse causation) and statistical assessment of internal and external validity are not sufficiently addressed in most of the tools evaluated, although these critical elements should be integrated to systematically investigate the validity of non-randomised studies on comparative safety and effectiveness of medications. The article also provides a glossary of terms, a description of the characteristics of the tools and a description of the methodological challenges they address.
Comparison of results of observational studies and RCTs
Even if observational studies are not appropriate to replace RCTs for many CER topics, comparison of their results for the same research question is currently a domain of interest. The underlying assumption is that if observational studies consistently match the results of published trials and predict the results of ongoing trials, this will increase the confidence in the validity of future RWD analyses performed in the absence of randomised trial evidence. In a review of five interventions, Randomized, controlled trials, observational studies, and the hierarchy of research designs (N Engl J Med 2000;342(25):1887-92) found that the results of well-designed observational studies (with either a cohort or case-control design) did not systematically overestimate the magnitude of treatment effects. Interim results from the first 10 emulations reported in Emulating Randomized Clinical Trials With Nonrandomized Real-World Evidence Studies: First Results From the RCT DUPLICATE Initiative (Circulation 2021;143(10):1002-13) found that differences between the RCT and corresponding RWE study populations remained, but the RWE emulations achieved a hazard ratio estimate that was within the 95% CI from the corresponding RCT in 8 of 10 studies. Selection of active comparator therapies with similar indications and use patterns enhanced the validity of real-world evidence. Final results of this project are discussed in the presentation Lessons Learned from Trial Replication analyses: Findings from the Duplicate Demonstration Project (Duke-Margolis Workshop 2022). Emulation differences versus biases when calibrating RWE findings against RCTs (Clin Pharmacol Ther. 2020;107(4):735-7) provides guidance on how to investigate and interpret differences in treatment effect estimates from the two study types.
A reason for discrepancies between results of observational studies and RCTs may be the use of prevalent drug users in the former. Evaluating medication effects outside of clinical trials: new-user designs (Am J Epidemiol 2003;158(9):915-20) explains the biases introduced by use of prevalent drug users and how a new-user (or incident user) design eliminates these biases by restricting analyses to persons under observation at the start of the current course of treatment. The Incident User Design in Comparative Effectiveness Research (Pharmacoepidemiol Drug Saf. 2013; 22(1):1–6) reviews published CER case studies in which investigators had used the incident user design, discusses its strengths (reduced bias) and weaknesses (reduced precision of comparative effectiveness estimates) and provides recommendations to investigators considering using this design.
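The cohort restriction behind a new-user design can be sketched as follows. The 365-day washout window, the data layout and the function name are assumptions chosen for illustration; the sketch also assumes that dispensings are only recorded while a patient is under observation.

```python
from datetime import date, timedelta


def select_new_users(enrolment_start, dispensings, washout_days=365):
    """Return {patient_id: index_date} for incident (new) users.

    enrolment_start: {patient_id: date observation began}
    dispensings: iterable of (patient_id, dispensing_date) for the study drug
    A patient qualifies when the first observed dispensing is preceded by at
    least `washout_days` of observation without any dispensing of the drug.
    """
    first_dispensing = {}
    for patient_id, dispensed_on in dispensings:
        earliest = first_dispensing.get(patient_id)
        if earliest is None or dispensed_on < earliest:
            first_dispensing[patient_id] = dispensed_on

    washout = timedelta(days=washout_days)
    return {
        patient_id: index_date
        for patient_id, index_date in first_dispensing.items()
        if patient_id in enrolment_start
        and index_date - enrolment_start[patient_id] >= washout
    }


# Hypothetical example: patient "B" is excluded because only one month of
# drug-free observation precedes the first dispensing (possible prevalent user).
enrolment = {"A": date(2018, 1, 1), "B": date(2020, 1, 1)}
fills = [("A", date(2020, 3, 1)), ("A", date(2020, 4, 1)), ("B", date(2020, 2, 1))]
new_users = select_new_users(enrolment, fills)
```

Follow-up would then start at each patient's index date, which is what aligns the observational cohort with the start of treatment in a trial. The loss of patients such as "B" illustrates the reduced precision noted above.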
To disentangle differences between observational studies and RCTs, it may be helpful to map the risk of bias as well as the applicability of each study with regard to the research question. The use of the estimand framework of the ICH E9 (R1) addendum may help ensure that (or evaluate whether) observational studies and RCTs are addressing the same research question. It can, however, be difficult to narrow down the definitions and analyses across all RWD sources to attain a homogeneous estimand definition and interpretation.
CER based on evidence synthesis of published RCTs
The Cochrane Handbook for Systematic Reviews of Interventions (version 6.2, 2022) describes in detail the process of preparing and maintaining systematic reviews on the effects of healthcare interventions. Although its scope is focused on Cochrane reviews, it has a much wider applicability. It includes guidance on the standard methods applicable to every review (planning a review, searching and selecting studies, data collection, risk of bias assessment, statistical analysis, GRADE and interpreting results), as well as more specialised topics. The Grading of Recommendations Assessment, Development, and Evaluation (GRADE) working group offers a structured process for rating quality of evidence and grading strength of recommendations in systematic reviews, health technology assessment and clinical practice guidelines. The Methods Guide for Effectiveness and Comparative Effectiveness Reviews (AHRQ, 2018) provides a series of chapters aimed at providing resources supporting comparative effectiveness reviews. They are focused on the US Effective Health Care (EHC) programme and may therefore have limitations as regards their generalisability.
A pairwise meta-analysis of RCT results is used when the primary aim is to estimate the relative effect of two interventions. Network meta-analysis for indirect treatment comparisons (Statist Med. 2002;21:2313–24) introduced methods for assessing the relative effectiveness of two treatments when they have not been compared directly in a randomised trial but have each been compared to other treatments. Overview of evidence synthesis and network meta-analysis – RWE Navigator explains the methods, provides best practices and gives access to published articles on this topic. A prominent issue that has been overlooked by some systematic literature reviews and network meta-analyses is that the RCTs included in a network meta-analysis are usually not comparable with each other, even when they all used placebo as the comparator. Different screening and inclusion/exclusion criteria often create different patient groups, and these differences are rarely discussed in indirect comparisons. Before indirect comparisons are performed, researchers should therefore check the similarities and differences between the RCTs.
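The simplest form of indirect comparison, often referred to as the adjusted indirect comparison or Bucher method, can be sketched as below. It is shown here as a simpler stand-in for a full network meta-analysis, not as the method of the cited article; the function name and input values are illustrative assumptions. The estimate is only meaningful if the trials of A versus C and B versus C are sufficiently similar, which is the check recommended above.

```python
import math


def indirect_comparison(log_hr_ac, se_ac, log_hr_bc, se_bc, z=1.96):
    """Adjusted indirect comparison of A versus B through a common comparator C.

    Inputs are log hazard ratios (A vs C, B vs C) and their standard errors,
    taken from independent trials. Returns (HR_AB, lower, upper) with a
    normal-approximation confidence interval.
    """
    log_hr_ab = log_hr_ac - log_hr_bc
    se_ab = math.sqrt(se_ac ** 2 + se_bc ** 2)  # variances add for independent trials
    return (
        math.exp(log_hr_ab),
        math.exp(log_hr_ab - z * se_ab),
        math.exp(log_hr_ab + z * se_ab),
    )


# Hypothetical inputs: both drugs show HR = 0.80 against placebo, each with SE 0.10.
hr_ab, lower, upper = indirect_comparison(math.log(0.8), 0.10, math.log(0.8), 0.10)
```

Because the standard errors combine, the indirect estimate is less precise than either of the direct comparisons that feed it.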
CER based on cross-design synthesis
Decision making should ideally be based on all available evidence, including both randomised and non-randomised studies and on both individual patient data and published aggregated data. Clinical trials are highly suitable to investigate efficacy but less practical to study long-term outcomes or rare diseases. On the other hand, observational data offer important insights about treatment populations, long-term outcomes (e.g., safety), patient-reported outcomes, prescription patterns, active comparators, etc. Combining evidence from these two sources could therefore be helpful to reach certain effectiveness/safety conclusions earlier or to address more complex questions. Several methods have been proposed but are still experimental. The article Framework for the synthesis of non-randomised studies and randomised controlled trials: a guidance on conducting a systematic review and meta-analysis for healthcare decision-making (BMJ Evid Based Med. 2022;27(2):109-19) used a 7-step mixed methods approach to develop guidance for researchers and healthcare decision-makers on when and how to best combine evidence from non-randomised studies and RCTs to improve transparency and build confidence in the resulting summary effect estimates. It provides recommendations on the most appropriate statistical approaches based on analytical scenarios in healthcare decision making and also highlights potential challenges for the implementation of this approach.
The Methodological Guidelines for Rapid Relative Effectiveness Assessment of Pharmaceuticals (EUnetHTA, 2013) cover a broad spectrum of issues on REA. They address methodological challenges that are encountered by health technology assessors while performing rapid REA and provide and discuss practical recommendations on definitions to be used and how to extract, assess and present relevant information in assessment reports. Specific topics covered include the choice of comparators, strengths and limitations of various data sources and methods, internal and external validity of studies, the selection and assessment of endpoints and the evaluation of relative safety.
Secondary use of data for CER
Electronic healthcare records, patient registries and other data sources are increasingly used in clinical effectiveness studies as they capture real clinical encounters and may document reasons for treatment decisions that are relevant for the general patient population. As they are primarily designed for clinical care and not research, information on relevant covariates and in particular on confounding factors may not be available or adequately measured. These aspects are presented in other chapters of this Guide (see Chapter 5, Methods to address bias and confounding; Chapter 7, Secondary use of data, and other chapters for secondary use of data in other contexts) but they need to be specifically considered in the context of CER. For example, A roadmap to using historical controls in clinical trials – by Drug Information Association Adaptive Design Scientific Working Group (DIA-ADSWG) (Orphanet J Rare Dis. 2020;15:69) describes the main sources of RWD to be used as historical controls, with an Appendix providing guidance on factors to be evaluated in the assessment of the relevance of RWD sources and resultant analysis.
A model based on counterfactual theory for CER using large administrative healthcare databases has been suggested, in which causal inference from observational studies based on large administrative health databases is viewed as an emulation of a randomised trial. This target trial emulation approach is described in Chapter 4.4.2. It consists of first designing a hypothetical ideal randomised trial (“target trial”) that would answer the research question. A second step identifies how best to emulate the design elements of the target trial using the available observational data source, and which analytic approaches to apply, given the trade-offs in an observational setting. This approach aims to prevent some common biases, such as immortal time bias or prevalent user bias, while also identifying situations where adequate emulation may not be possible using the data at hand. An example is the study Comparative Effectiveness of BNT162b2 and mRNA-1273 Vaccines in U.S. Veterans (N Engl J Med. 2022;386(2):105-15), which used a target trial emulation design where recipients of each vaccine were matched in a 1:1 ratio according to their baseline risk factors. This design cannot be applied where baseline measurements are not collected at treatment start, which may be the case in some patient registries. The target trial emulation approach may also be informed by the estimand framework. However, from a practical point of view, specifying estimand definitions may be more challenging in observational data since variable definitions and measurement methods tend to be less standardised.
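The 1:1 matching step mentioned in the example can be illustrated with a small sketch of greedy exact matching on baseline characteristics. This is not the algorithm of the cited study; the function name, the record fields and the matching key are hypothetical, and applied studies typically match on many more variables measured at time zero.

```python
from collections import defaultdict


def match_one_to_one(group_a, group_b, key):
    """Greedy 1:1 exact matching of group_a records to group_b records.

    key: function returning the tuple of baseline characteristics to match on.
    Each group_b record is used at most once; unmatched records are dropped.
    Returns a list of (record_a, record_b) pairs.
    """
    pool = defaultdict(list)
    for record in group_b:
        pool[key(record)].append(record)

    pairs = []
    for record in group_a:
        candidates = pool.get(key(record))
        if candidates:
            pairs.append((record, candidates.pop(0)))
    return pairs


def baseline_key(record):
    """Hypothetical matching key built from baseline covariates."""
    return (record["age_band"], record["sex"])
```

Given two lists of patient records, `match_one_to_one(recipients_a, recipients_b, baseline_key)` returns the matched pairs; patients without an exact counterpart are excluded, which trades sample size for comparability at baseline.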
Data quality
Data quality is essential to ensure the rigor of CER, and secondary use of data requires special attention. Comparative effectiveness research using electronic health records data: Ensure data quality (SAGE research methods, 2020) discusses challenges and shares experiences encountered during the process of transforming electronic health record data into a research-quality dataset in the context of CER. This aspect and other quality issues are also discussed in Chapter 12 on Quality management.
In order to address missing information, some CER studies have attempted to integrate information from health databases with information collected ad hoc from study subjects. Enhancing electronic health record measurement of depression severity and suicide ideation: a Distributed Ambulatory Research in Therapeutics Network (DARTNet) study (J Am Board Fam Med. 2012;25(5):582-93) shows the value of linking direct measurements and pharmacy claims data to data from electronic healthcare records. Assessing medication exposures and outcomes in the frail elderly: assessing research challenges in nursing home pharmacotherapy (Med Care 2010;48(6 Suppl):S23-31) describes how merging longitudinal electronic clinical and functional data from nursing home sources with Medicare and Medicaid claims data can support unique study designs in CER but poses many challenging design and analytic issues.