Pooling data across different databases to gain statistical power and increase the generalisability of results is becoming increasingly necessary. In Europe, collaborations for multi-database studies have been strongly encouraged in recent years by drug safety research funded by the European Commission (EC) and by public-private partnerships such as the Innovative Medicines Initiative (IMI). This funding supported the groundwork necessary to overcome the hurdles of data sharing across countries. A growing number of studies use data from networks of databases, often from different countries.
In the US, the HMO Research Network (HMORN), the OHDSI and the Sentinel initiative are examples of consortia involving health maintenance organisations that have formal, recognised research capabilities. Networking implies collaboration between investigators in sharing expertise and resources. The ENCePP Database of Research Resources may facilitate such networking by providing an inventory of research centres and data sources that can collaborate on specific pharmacoepidemiology and pharmacovigilance studies in Europe. It allows the identification of centres and data sets by country, type of research and other relevant fields.
From a methodological point of view, research networks have many advantages:
Different models have been applied for combining data or results from multiple databases. A common characteristic of all models is the fact that data partners maintain physical and operational control over electronic data in their existing environment. Differences however exist on whether a common protocol or a common data model is applied across all databases to extract, analyse and combine the data. A common data model (CDM) approach provides a similar representation of the database that allows standardisation of administrative and clinical information and facilitates a combined analysis across several databases. The CDM can be systematically applied on all data of a database (generalised CDM) or on the subset of data needed for a specific study (study-specific CDM).
The traditional way to combine data from multiple data sources is when data extraction and analysis are performed independently at each centre based on separate protocols. This is usually followed by meta-analysis of the different estimates obtained (see Chapter 5.7).
In this option, data are extracted and analysed locally on the basis of a common protocol. Definitions of exposure, outcomes and covariates, analytical programmes and reporting formats are standardised, and the results of each analysis are shared in an aggregated format and pooled through meta-analysis. This approach allows assessment of database/population characteristics and their impact on estimates but reduces variability of results determined by differences in design. Examples of research networks that use the common protocol approach are the PROTECT project (as described in Improving Consistency and Understanding of Discrepancies of Findings from Pharmacoepidemiological Studies: the IMI PROTECT Project, Pharmacoepidemiol Drug Saf 2016;25(S1):1–165) and the Canadian Network for Observational Drug Effect Studies (CNODES).
This approach requires very detailed common protocols and data specifications that reduce variability in interpretations by researchers.
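The pooling step described above can be sketched as a fixed-effect, inverse-variance meta-analysis of the aggregated estimates returned by each database. This is a minimal illustration only: the database names and the log hazard ratios with their standard errors are invented for the example, and the function name `pool_fixed_effect` is hypothetical. A real study would follow the meta-analytic model specified in its protocol, which may well be a random-effects model.

```python
import math


def pool_fixed_effect(estimates):
    """Pool (log_effect, standard_error) pairs with inverse-variance weights.

    Returns the pooled log effect and its standard error.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical aggregated results shared by three data partners:
# (log hazard ratio, standard error). No patient-level data leave the sites.
site_results = {
    "database_A": (math.log(1.20), 0.10),
    "database_B": (math.log(1.35), 0.15),
    "database_C": (math.log(1.10), 0.08),
}

pooled_log_hr, pooled_se = pool_fixed_effect(list(site_results.values()))
lower = math.exp(pooled_log_hr - 1.96 * pooled_se)
upper = math.exp(pooled_log_hr + 1.96 * pooled_se)
print(f"Pooled HR {math.exp(pooled_log_hr):.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Only the site-level estimate and its standard error are needed centrally, which is what allows each data partner to keep control of its own records.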
Multi-centre, multi-database studies with common protocols: lessons learnt from the IMI PROTECT project (Pharmacoepidemiol Drug Saf 2016;25(S1):156-165) states that a priori pooling of data from several databases may disguise heterogeneity that could provide useful information on the safety issue under investigation. Parallel analysis of databases, on the other hand, allows the reasons for heterogeneity to be explored through extensive sensitivity analyses. This approach eventually increases consistency in findings from observational drug effect studies or reveals causes of differential drug effects.
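One common way to quantify heterogeneity between database-specific estimates before deciding whether pooling is appropriate is Cochran's Q with the derived I² statistic. The sketch below is an illustration under assumptions, not a method prescribed by the cited paper: the input estimates are invented, and the function name `heterogeneity` is hypothetical.

```python
def heterogeneity(estimates):
    """Return Cochran's Q, its degrees of freedom, and I² (as a proportion).

    `estimates` is a list of (log_effect, standard_error) pairs,
    one per database.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    # Q is the weighted sum of squared deviations from the pooled estimate.
    q = sum(w * (b - pooled) ** 2 for w, (b, _) in zip(weights, estimates))
    df = len(estimates) - 1
    # I² is the share of total variation attributed to between-database
    # differences rather than chance; it is truncated at zero.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, df, i2


# Hypothetical log hazard ratios and standard errors from four databases.
q, df, i2 = heterogeneity([(0.18, 0.10), (0.30, 0.15), (0.10, 0.08), (0.55, 0.12)])
print(f"Q = {q:.2f} on {df} df, I² = {100 * i2:.0f}%")
```

A high I² would argue for examining the database-specific results side by side, for example through sensitivity analyses, rather than reporting a single pooled estimate.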
For some studies, it has been possible to centrally analyse patient-level data extracted on the basis of a common protocol, such as in Selective serotonin reuptake inhibitors during pregnancy and risk of persistent pulmonary hypertension in the newborn: population based cohort study from the five Nordic Countries (BMJ 2012;344:d8012). If databases are very similar in structure and content, as is the case for some Nordic registries, a CDM might not be required for data extraction. Central analysis removes an additional source of variability linked to statistical programming and analysis.
Data can also be extracted from local databases using a study-specific, database-tailored extraction into a CDM and pre-processed locally. The resulting data can be transmitted to a central data warehouse as patient-level data or aggregated data for further analysis. Examples of research networks that used this approach by employing a study-specific CDM with transmission of anonymised patient-level data (allowing a detailed characterisation of each database) are EU-ADR (as explained in Combining electronic healthcare databases in Europe to allow for large-scale drug safety monitoring: the EU-ADR Project, Pharmacoepidemiol Drug Saf 2011;20(1):1-11), SOS, ARITMO, SAFEGUARD, GRIP and ADVANCE.
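The local extraction into a study-specific CDM can be pictured as follows. This is a simplified sketch under stated assumptions: the record layouts, field names (`pid`, `dt`, `dx`, `patient`, `contact_date`, `code`), the `CdmEvent` structure and the one-entry code maps are all hypothetical, and the two diagnosis codes are illustrative only. A real study would rely on validated code lists agreed in the protocol.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CdmEvent:
    """One row of a hypothetical study-specific common data model."""
    person_id: str
    event_date: str  # ISO 8601 date string
    concept: str     # harmonised study concept


# Illustrative mapping from local coding systems to one study concept.
CODE_MAPS = {
    "ICD-10": {"I21": "acute_myocardial_infarction"},
    "ICPC-2": {"K75": "acute_myocardial_infarction"},
}


def to_cdm(record, coding_system, id_field, date_field, code_field):
    """Map one local record to a CdmEvent, or None if the code is not of interest."""
    concept = CODE_MAPS[coding_system].get(record[code_field])
    if concept is None:
        return None
    return CdmEvent(str(record[id_field]), record[date_field], concept)


def aggregate(events):
    """Count events per concept, the aggregated form a site could transmit."""
    return Counter(event.concept for event in events)


# Two databases with different layouts and coding systems.
hospital_rows = [{"pid": 1, "dt": "2015-03-02", "dx": "I21"},
                 {"pid": 2, "dt": "2015-04-11", "dx": "J45"}]
gp_rows = [{"patient": "a7", "contact_date": "2015-05-20", "code": "K75"}]

hospital_events = [e for r in hospital_rows
                   if (e := to_cdm(r, "ICD-10", "pid", "dt", "dx"))]
gp_events = [e for r in gp_rows
             if (e := to_cdm(r, "ICPC-2", "patient", "contact_date", "code"))]

print(aggregate(hospital_events + gp_events))
```

Whether the mapped rows themselves (anonymised patient-level data) or only the counts are transmitted is the design choice that distinguishes the networks listed above.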
An approach to expedite the analysis of heterogeneity, called the component strategy, was initially developed in the EMIF project and could also be compatible with the generalised common data model (see Identifying Cases of Type 2 Diabetes in Heterogeneous Data Sources: Strategy from the EMIF Project. PLoS ONE. 2016;11(8):e0160648).
Two examples of research networks which use a generalised CDM are the Sentinel Initiative (as described in The U.S. Food and Drug Administration's Mini-Sentinel Program, Pharmacoepidemiol Drug Saf 2012;21(S1):1–303) and OHDSI. The main advantage of a generalised CDM is that it can be used for virtually any study involving the database. OHDSI is based on the Observational Medical Outcomes Partnership (OMOP) CDM, which is used by many organisations and has been tested for its suitability for safety studies (see for example Validation of a common data model for active safety surveillance research. J Am Med Inform Assoc. 2012;19(1):54–60). OMOP also developed an open-source repository for the analytical tools created within the project.
In A Comparative Assessment of Observational Medical Outcomes Partnership and Mini-Sentinel Common Data Models and Analytics: Implications for Active Drug Safety Surveillance (Drug Saf. 2015;38(8):749-65), it is suggested that slight conceptual differences between the Sentinel and the OMOP models do not significantly impact the identification of known safety associations. Differences in risk estimates can be primarily attributed to the choice and implementation of the analytic approach.
The models described above present many challenges:
Related to the scientific content
Differences in the underlying health care systems and mechanisms of data generation and collection
Mapping of differing disease coding systems (e.g., the International Classification of Diseases, 10th Revision (ICD-10), Read codes, the International Classification of Primary Care (ICPC-2)) and narrative medical information in different languages.
Validation of study variables and access to source documents for validation.
Related to the organisation of the network
Differences in culture and experience between academia, public institutions and private partners.
Differences in the type and quality of information contained within each mapped database.
Different ethical and governance requirements in each country regarding processing of anonymised or pseudo-anonymised healthcare data.
Choice of data sharing model and access rights of partners.
Issues linked to intellectual property and authorship.
Sustainability and funding mechanisms.
Each model has strengths and weaknesses in facing the above challenges (Data Extraction and Management in Networks of Observational Health Care Databases for Scientific Research: A Comparison of EU-ADR, OMOP, Mini-Sentinel and MATRICE Strategies (EGEMS. 2016 Feb)). Experience has shown that many of these difficulties can be overcome by full involvement and good communication between partners, and a project agreement between network members defining roles and responsibilities and addressing issues of intellectual property and authorship.