ENCePP Guide on Methodological Standards in Pharmacoepidemiology


9.1. General considerations


A growing number of pharmacoepidemiological studies use data from networks of databases, often from different countries. Pooling data across different databases affords insight into the generalisability of the results and may improve precision. Some of these networks are based on long-term contracts with selected partners and are very well structured, such as Sentinel, the Vaccine Safety Datalink (VSD), the Canadian Network for Observational Drug Effect Studies (CNODES) and the recently set-up Data Analysis and Real World Interrogation Network (DARWIN EU®). Others are collaborations based on open science principles, such as the Observational Health Data Sciences and Informatics (OHDSI) program.


In Europe, collaborations for multi-database studies have been strongly encouraged as part of the drug safety research funded by the European Commission (EC) and by public-private partnerships such as the Innovative Medicines Initiative (IMI). This funding supported the groundwork necessary to overcome the hurdles of data sharing across countries for specific projects (e.g. PROTECT, ADVANCE, EMIF, EHDEN, ConcePTION) and specific post-authorisation studies. The European Commission is currently establishing a European Health Data Space (EHDS) and major breakthroughs in this field are expected, with the Joint Action Towards the European Health Data Space (TEHDAS) developing joint European principles for the secondary use of health data.


The 2009 H1N1 influenza pandemic (see Safety monitoring of Influenza A/H1N1 pandemic vaccines in EudraVigilance, Vaccine 2011;29(26):4378-87) and the 2020 COVID-19 pandemic showed the value of an operational infrastructure to rapidly and effectively monitor the safety of therapeutics and vaccines. In this context, EMA established contracts with academic and private partners to support the readiness of research networks to perform observational research. Three dedicated projects started in 2020: ACCESS (vACcine Covid-19 monitoring readinESS), CONSIGN (COVID-19 infectiOn aNd medicineS In preGNancy) and E-CORE (Evidence for COVID-19 Observational Research Europe). Other initiatives have emerged to address specific COVID-19 related research questions, such as the CVD-COVID-UK consortium (see Linked electronic health records for research on a nationwide cohort of more than 54 million people in England: data resource, BMJ. 2021;373:n826), which provides secure access to linked health data from primary and secondary care, registered deaths, COVID-19 laboratory data, vaccination data and cardiovascular specialist audits. Similarly, linked data have been made available in trusted research environments in Scotland and Wales.


EMA funded several studies to address research questions on the monitoring of COVID-19 vaccines using federated analytics with a common data model (CDM), which resulted in publications on background rates of adverse events of special interest (Characterising the background incidence rates of adverse events of special interest for covid-19 vaccines in eight countries: multinational network cohort study, BMJ. 2021;373:n1435; Background rates of 41 adverse events of special interest for COVID-19 vaccines in 10 European healthcare databases - an ACCESS cohort study, Vaccine. 2023;41(1):251-262); thrombosis and risk of coagulopathy post-COVID-19 (Venous or arterial thrombosis and deaths among COVID-19 cases: a European network cohort study, Lancet Infectious Diseases 2022;22(8):1142-52); comparative risk of thrombosis and thrombocytopenia following COVID-19 vaccines (Comparative risk of thrombosis with thrombocytopenia syndrome or thromboembolic events associated with different covid-19 vaccines: international network cohort study from five European countries and the US, BMJ. 2022;379:e071594); and myocarditis (Myocarditis and pericarditis associated with SARS-CoV-2 vaccines: A population-based descriptive cohort and a nested self-controlled risk interval study using electronic health care data from four European countries, Front Pharmacol. 2022;13:1038043).


In this Chapter, the term networking is used to reflect collaboration between researchers for sharing expertise and resources. The ENCePP Database of Research Resources, which provides an inventory of research centres and data sources collaborating on specific pharmacoepidemiology and pharmacovigilance studies in Europe, may facilitate such networking by allowing the identification of research centres and data sources by country, study, type of research, and other relevant fields.


The use of research networks in medicines safety and utilisation, and in disease epidemiology, is well established, with a significant body of practical experience. Their use in effectiveness research is now increasing (see Assessing strength of evidence for regulatory decision making in licensing: What proof do we need for observational studies of effectiveness?, Pharmacoepidemiol Drug Saf. 2020;29(10):1336-40).


From a methodological point of view, a multi-database design has many advantages over a single-database design:

  • It increases the size of the study population. This especially facilitates research on rare events, on medicines used in specialised settings (see Ability of primary care health databases to assess medicinal products discussed by the European Union Pharmacovigilance Risk Assessment Committee, Clin Pharmacol Ther. 2020;107(4):957-65), or when the interest is in subgroup effects.

  • It exploits the heterogeneity of treatment options across countries, which allows studying the effect of different medicines used for the same indication, or specific patterns of utilisation.

  • It exploits differences in outcome/event rates across countries/regions.

  • It provides additional knowledge on the generalisability of results and on the consistency of associations, for instance whether a safety issue can be identified in several countries. Inconsistencies may be caused by different biases or by truly different effects in the databases; investigating them may reveal the causes of differential effects.

  • It involves experts from various countries addressing case definitions, terminologies, coding in databases, and research practices. This provides opportunities to increase consistency of results of observational studies.

  • For primary data collection from multiple data sources, it shortens the time needed for obtaining the desired sample size and therefore accelerates the investigation of safety issues or other outcomes.

The articles Approaches for combining primary care electronic health record data from multiple sources: a systematic review of observational studies (BMJ Open 2020;10(10): e037405) and Different strategies to execute multi-database studies for medicines surveillance in real world setting: a reflection on the European model (Clin Pharmacol Ther. 2020;108(2):228-35) describe key characteristics of studies using multiple data sources and different models applied for combining data or results from multiple databases. A common characteristic of all models is the fact that data partners maintain physical and operational control over electronic data in their existing environment, and therefore, the data extraction is always performed locally. Differences, however, exist in the following areas: use of a common protocol; use of a CDM; and where and how the data analysis is conducted.
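The shared characteristic described above (extraction stays local, only results travel) can be illustrated with a minimal sketch. It assumes a common protocol expressed as one function that every data partner runs on its own records, returning only aggregate counts to a coordinating centre. The record layout, the function names and the choice of a simple pooled incidence rate are all hypothetical and are not taken from any of the cited networks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteAggregate:
    """Aggregate results that leave a data partner (no patient-level data)."""
    site: str
    events: int
    person_years: float


def run_locally(site, records):
    """Common analysis step, executed inside the data partner's environment.

    `records` is a hypothetical list of dicts with keys
    "event" (bool) and "follow_up_years" (float).
    """
    events = sum(1 for r in records if r["event"])
    person_years = sum(r["follow_up_years"] for r in records)
    return SiteAggregate(site, events, person_years)


def pool(aggregates):
    """Coordinating centre: pooled incidence rate per 1,000 person-years."""
    total_events = sum(a.events for a in aggregates)
    total_person_years = sum(a.person_years for a in aggregates)
    return 1000 * total_events / total_person_years


# Illustrative, made-up records held by two data partners.
site_a = [
    {"event": True, "follow_up_years": 2.0},
    {"event": False, "follow_up_years": 3.0},
]
site_b = [
    {"event": False, "follow_up_years": 4.0},
    {"event": True, "follow_up_years": 1.0},
]

aggregates = [run_locally("A", site_a), run_locally("B", site_b)]
pooled_rate = pool(aggregates)
print(aggregates)
print(pooled_rate)
```

In this sketch the analysis is split between the sites (counting) and the centre (pooling). Other models differ in exactly that split, for example by sharing analytical datasets instead of counts, or by meta-analysing site-specific estimates.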


Use of a CDM implies that local formats are translated into a predefined, common data structure, which allows a similar data extraction and analysis script to be launched across several databases. Sometimes the CDM also imposes a common terminology, as is the case for the OMOP CDM. The CDM can be systematically applied to the entire database (generalised CDM) or to the subset of data needed for a specific study (study-specific CDM). When transforming a database into a CDM, comparison of source and target data across all variables and dimensions is strongly recommended as part of the quality control of the process, to make sure that the transformation faithfully represents the source data in terms of both completeness and accuracy. A number of tools exist for checking the resulting data, including the OHDSI DataQualityDashboard, which runs thousands of checks for conformance, completeness, and plausibility, based on the harmonised framework for data quality assessment developed by Kahn et al. (EGEMS 2016;4(1):1244).
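As an illustration of the transformation and source-to-target comparison described above, the following minimal sketch maps local records into a common structure and reports how much of the source survived. The field names, the code map and the two checks are hypothetical simplifications; they do not reproduce the OMOP CDM tables or the checks implemented in the OHDSI DataQualityDashboard.

```python
from collections import Counter

# Hypothetical map from local codes to a common terminology (illustrative only).
LOCAL_TO_COMMON = {
    "ICD10_I21": "myocardial_infarction",
    "READ_G30": "myocardial_infarction",
    "ICD10_I63": "ischaemic_stroke",
}


def to_cdm(source_rows):
    """Translate local rows into the common structure; keep unmapped rows aside."""
    target, unmapped = [], []
    for row in source_rows:
        concept = LOCAL_TO_COMMON.get(row["local_code"])
        if concept is None:
            unmapped.append(row)
        else:
            target.append(
                {"person_id": row["patient"], "concept": concept, "date": row["date"]}
            )
    return target, unmapped


def compare_source_target(source_rows, target_rows, unmapped_rows):
    """Simple completeness checks comparing source and target data."""
    source_persons = {r["patient"] for r in source_rows}
    target_persons = {r["person_id"] for r in target_rows}
    return {
        "source_rows": len(source_rows),
        "target_rows": len(target_rows),
        "completeness": len(target_rows) / len(source_rows) if source_rows else 1.0,
        "persons_lost": sorted(source_persons - target_persons),
        "unmapped_codes": Counter(r["local_code"] for r in unmapped_rows),
    }


# Illustrative, made-up source data in a local format.
source = [
    {"patient": 1, "local_code": "ICD10_I21", "date": "2021-03-01"},
    {"patient": 2, "local_code": "READ_G30", "date": "2021-04-12"},
    {"patient": 2, "local_code": "ICD10_I63", "date": "2021-06-30"},
    {"patient": 3, "local_code": "LOCAL_XYZ", "date": "2021-07-15"},
]

target, unmapped = to_cdm(source)
report = compare_source_target(source, target, unmapped)
print(report)
```

A report of this kind makes losses visible before any analysis is run: here one row is unmapped, so one person would silently disappear from the converted data unless the code map is extended.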

In the European Union, study-specific CDMs have generated results for several projects, and several databases have been converted to a generalised CDM version that exists alongside the native version. This conversion was accelerated by the observational research needed to respond to the COVID-19 pandemic. Examples of the application of generalised CDMs are studies conducted in the OHDSI community, such as Association of angiotensin converting enzyme (ACE) inhibitors and angiotensin 2 receptor blockers (ARB) on COVID-19 incidence and complications, or the ConcePTION study From Inception to ConcePTION: Genesis of a Network to Support Better Monitoring and Communication of Medication Safety During Pregnancy and Breastfeeding (Clin Pharmacol Ther. 2022;111(1):321-31). More recently, DARWIN EU® has galvanised the use of the OMOP CDM for regulatory purposes, with the completion of the first studies and the planned commissioning of many additional studies in the coming years (see list of completed DARWIN EU® studies).

