ENCePP Guide on Methodological Standards in Pharmacoepidemiology


4.6. Research networks


4.6.1. General considerations


Pooling data across different databases to gain statistical power and increase the generalisability of results is becoming increasingly necessary. In Europe, collaborations for multi-database studies have been strongly encouraged in recent years by drug safety research funded by the European Commission (EC) and by public-private partnerships such as the Innovative Medicines Initiative (IMI). This funding supported the groundwork necessary to overcome the hurdles of data sharing across countries. A growing number of studies use data from networks of databases, often from different countries.


In the US, the HMO Research Network (HMORN), Observational Health Data Sciences and Informatics (OHDSI) and the Sentinel Initiative are examples of consortia involving health maintenance organisations that have formal, recognised research capabilities. Networking implies collaboration between investigators in sharing expertise and resources. The ENCePP Database of Research Resources may facilitate such networking by providing an inventory of research centres and data sources that can collaborate on specific pharmacoepidemiology and pharmacovigilance studies in Europe. It allows the identification of centres and data sets by country, type of research and other relevant fields.


From a methodological point of view, research networks have many advantages:

  • The potential for pooling data or results maximises the amount of information gathered for a specific issue addressed in different databases.
  • Research networks increase the size of study populations and shorten the time needed to obtain the desired sample size. Hence, they can facilitate research on rare events and speed up the investigation of drug safety issues.
  • The heterogeneity of treatment options across countries makes it possible to study the effect of individual drugs.
  • Research networks may provide additional knowledge on whether a drug safety issue exists in several countries and thereby reveal causes of differential drug effects, on the generalisability of results, on the consistency of information and on the impact of biases on estimates.
  • Involvement of experts from various countries addressing case definitions, terminologies, coding in databases and research practices provides opportunities to increase consistency of results of observational studies.
  • Sharing of data sources facilitates harmonisation of data elaboration and transparency in analyses and benchmarking of data management.

Different models have been applied for combining data or results from multiple databases. A common characteristic of all models is that data partners maintain physical and operational control over electronic data in their existing environment. The models differ, however, in whether a common protocol or a common data model is applied across all databases to extract, analyse and combine the data. A common data model (CDM) approach provides a similar representation of each database, which allows standardisation of administrative and clinical information and facilitates a combined analysis across several databases. The CDM can be systematically applied to all data of a database (generalised CDM) or to the subset of data needed for a specific study (study-specific CDM).


4.6.2. Models of studies using multiple data sources


i) Local data extraction and analysis, separate protocols

The traditional way to combine data from multiple data sources is when data extraction and analysis are performed independently at each centre based on separate protocols. This is usually followed by meta-analysis of the different estimates obtained (see Chapter 5.7).
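As an illustration of the meta-analysis step, the following minimal sketch pools centre-level estimates with a fixed-effect inverse-variance method. It assumes each centre reports a log ratio (for example a log hazard ratio) and its standard error; the function name and the numbers are hypothetical and are not taken from any study cited in this Guide.

```python
import math


def pool_fixed_effect(estimates):
    """Inverse-variance fixed-effect pooling of centre-level log ratios.

    `estimates` is a list of (log_ratio, standard_error) pairs, one per centre.
    Returns (pooled ratio, lower 95% limit, upper 95% limit).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled_log = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    z = 1.959964  # normal quantile for a two-sided 95% interval
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - z * pooled_se),
        math.exp(pooled_log + z * pooled_se),
    )


# Hypothetical centre-level results: (log ratio, standard error).
site_estimates = [
    (math.log(1.20), 0.10),
    (math.log(1.50), 0.20),
    (math.log(0.90), 0.25),
]

ratio, lower, upper = pool_fixed_effect(site_estimates)
print(f"Pooled ratio {ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

A fixed-effect model assumes that all centres estimate the same underlying effect. When separate protocols are used this assumption may be questionable, and a random-effects model may be more appropriate.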


ii) Local data extraction and analysis, common protocol

In this option, data are extracted and analysed locally on the basis of a common protocol. Definitions of exposure, outcomes and covariates, analytical programmes and reporting formats are standardised according to the common protocol, and the results of each analysis are shared in an aggregated format and pooled through meta-analysis. This approach allows assessment of database/population characteristics and their impact on estimates while reducing the variability of results caused by differences in design. Examples of research networks that use the common protocol approach are the PROTECT project (as described in Improving Consistency and Understanding of Discrepancies of Findings from Pharmacoepidemiological Studies: the IMI PROTECT Project, Pharmacoepidemiol Drug Saf 2016;25(S1):1–165) and the Canadian Network for Observational Drug Effect Studies (CNODES).


This approach requires very detailed common protocols and data specifications that reduce variability in interpretations by researchers.


Multi-centre, multi-database studies with common protocols: lessons learnt from the IMI PROTECT project (Pharmacoepidemiol Drug Saf 2016;25(S1):156-165) states that a priori pooling of data from several databases may disguise heterogeneity that could provide useful information on the safety issue under investigation. Parallel analysis of databases, on the other hand, allows the reasons for heterogeneity to be explored through extensive sensitivity analyses. This approach eventually increases consistency in findings from observational drug effect studies or reveals causes of differential drug effects.
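Heterogeneity between database-specific estimates can be quantified before any pooling is attempted. The sketch below computes Cochran's Q, the I² statistic and the DerSimonian-Laird between-database variance, and then a random-effects pooled estimate. It is an illustrative assumption about how such a check could be coded, not the method used in PROTECT; the function names and input values are hypothetical.

```python
import math


def heterogeneity(estimates):
    """Return (Cochran's Q, I^2, DerSimonian-Laird tau^2).

    `estimates` is a list of (log_ratio, standard_error) pairs, one per database.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    q = sum(w * (b - pooled) ** 2 for w, (b, _) in zip(weights, estimates))
    df = len(estimates) - 1
    # I^2: share of total variation attributed to between-database differences.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    c = total - sum(w ** 2 for w in weights) / total
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    return q, i2, tau2


def pool_random_effects(estimates):
    """Random-effects pooled log ratio and its standard error."""
    _, _, tau2 = heterogeneity(estimates)
    weights = [1.0 / (se ** 2 + tau2) for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total)


# Hypothetical database-specific log ratios and standard errors.
db_estimates = [(0.18, 0.10), (0.41, 0.20), (-0.11, 0.25)]
q, i2, tau2 = heterogeneity(db_estimates)
pooled, se = pool_random_effects(db_estimates)
print(f"Q={q:.2f}, I2={i2:.0%}, tau2={tau2:.3f}, pooled ratio={math.exp(pooled):.2f}")
```

A high I² would suggest examining database-specific results and sensitivity analyses rather than relying on a single pooled figure.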


iii) Local data extraction and central analysis, common protocol


For some studies, it has been possible to centrally analyse patient-level data extracted on the basis of a common protocol, such as in Selective serotonin reuptake inhibitors during pregnancy and risk of persistent pulmonary hypertension in the newborn: population based cohort study from the five Nordic Countries (BMJ 2012;344:d8012). If databases are very similar in structure and content, as is the case for some Nordic registries, a CDM might not be required for data extraction. Central analysis removes an additional source of variability linked to statistical programming and analysis.


iv) Local data extraction and central analysis, study-specific common data model


Data can also be extracted from local databases using a study-specific, database-tailored extraction into a CDM and pre-processed locally. The resulting data can be transmitted to a central data warehouse as patient-level data or aggregated data for further analysis. Examples of research networks that used this approach by employing a study-specific CDM with transmission of anonymised patient-level data (allowing a detailed characterisation of each database) are EU-ADR (as explained in Combining electronic healthcare databases in Europe to allow for large-scale drug safety monitoring: the EU-ADR Project, Pharmacoepidemiol Drug Saf 2011;20(1):1-11), SOS, ARITMO, SAFEGUARD, GRIP and ADVANCE.
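The local extraction and pre-processing step can be sketched as follows. Local records are mapped to a small study-specific event table, and either the mapped rows or aggregated counts are prepared for transmission. The table layout, the field names, the concept label and the code mapping are hypothetical illustrations and do not reproduce the CDM of EU-ADR or any other network.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CdmEvent:
    """One row of a hypothetical study-specific CDM event table."""
    person_id: str
    event_date: str  # ISO 8601 date
    concept: str     # study-level concept label


# Illustrative mapping from (coding system, local code) to a study concept.
LOCAL_CODE_TO_CONCEPT = {
    ("ICD10", "K92.2"): "GI_BLEED",
    ("ICD9CM", "578.9"): "GI_BLEED",
}


def to_cdm(local_rows):
    """Map local records to CDM events, dropping codes outside the study."""
    events = []
    for row in local_rows:
        concept = LOCAL_CODE_TO_CONCEPT.get((row["coding_system"], row["code"]))
        if concept is not None:
            events.append(CdmEvent(row["patient"], row["date"], concept))
    return events


def aggregate(events):
    """Count distinct persons per concept (the aggregated-data option)."""
    persons = {(e.concept, e.person_id) for e in events}
    return Counter(concept for concept, _ in persons)


# Hypothetical local extract, using pseudonymised patient identifiers.
local_rows = [
    {"patient": "p1", "date": "2010-03-02", "coding_system": "ICD10", "code": "K92.2"},
    {"patient": "p1", "date": "2011-07-15", "coding_system": "ICD10", "code": "K92.2"},
    {"patient": "p2", "date": "2010-05-20", "coding_system": "ICD9CM", "code": "578.9"},
    {"patient": "p3", "date": "2012-01-09", "coding_system": "ICD10", "code": "J45.9"},
]

events = to_cdm(local_rows)
print(aggregate(events))
```

Keeping the code mapping in one explicit table per database makes the database-tailored part of the extraction easy to review and compare across data partners.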


An approach to expedite the analysis of heterogeneity, called the component strategy, was initially developed in the EMIF project and could also be compatible with the generalised common data model (see Identifying Cases of Type 2 Diabetes in Heterogeneous Data Sources: Strategy from the EMIF Project. PLoS ONE. 2016;11(8):e0160648).


v) Local data extraction and central analysis, generalised common data model


Two examples of research networks which use a generalised CDM are the Sentinel Initiative (as described in The U.S. Food and Drug Administration's Mini-Sentinel Program, Pharmacoepidemiol Drug Saf 2012;21(S1):1–303) and OHDSI. The main advantage of a generalised CDM is that it can be used for virtually any study involving the database. OHDSI is based on the Observational Medical Outcomes Partnership (OMOP) CDM, which is used by many organisations and has been tested for its suitability for safety studies (see for example Validation of a common data model for active safety surveillance research. J Am Med Inform Assoc. 2012;19(1):54–60). OMOP also developed an open source repository for the analytical tools created within the project.


In A Comparative Assessment of Observational Medical Outcomes Partnership and Mini-Sentinel Common Data Models and Analytics: Implications for Active Drug Safety Surveillance (Drug Saf. 2015;38(8):749-65), it is suggested that slight conceptual differences between the Sentinel and the OMOP models do not significantly affect the identification of known safety associations. Differences in risk estimates can be attributed primarily to the choice and implementation of the analytic approach.

4.6.3. Challenges of different models


The models described above raise many challenges:


Related to the scientific content

Related to the organisation of the network

  • Differences in culture and experience between academia, public institutions and private partners.

  • Differences in the type and quality of information contained within each mapped database.

  • Different ethical and governance requirements in each country regarding processing of anonymised or pseudo-anonymised healthcare data.

  • Choice of data sharing model and access rights of partners.

  • Issues linked to intellectual property and authorship.

  • Sustainability and funding mechanisms.

Each model has strengths and weaknesses in facing the above challenges (see Data Extraction and Management in Networks of Observational Health Care Databases for Scientific Research: A Comparison of EU-ADR, OMOP, Mini-Sentinel and MATRICE Strategies, EGEMS. 2016 Feb). Experience has shown that many of these difficulties can be overcome by full involvement of, and good communication between, partners, and by a project agreement between network members that defines roles and responsibilities and addresses issues of intellectual property and authorship.


