In Europe, collaborations for multi-database studies have been strongly encouraged in recent years by the drug safety research funded by the European Commission (EC) and by public-private partnerships such as the Innovative Medicines Initiative. This funding supported the groundwork needed to overcome the hurdles of sharing data across countries. In the US, the HMO Research Network, the Vaccine Safety Datalink (VSD) and Sentinel are examples of consortia involving health maintenance organisations that have formal, recognised research capabilities.
Networking implies collaboration between investigators in sharing expertise and resources. The ENCePP Database of Research Resources may facilitate such networking by providing an inventory of research centres and data sources that might collaborate on specific pharmacoepidemiology and pharmacovigilance studies in Europe. It allows the identification of centres and data sets by country, type of research and other relevant fields. In addition, an important component of collaboration among researchers is the potential for pooling of raw data and meta-analyses to maximise the information gathered for an issue that is addressed in different databases.
From a methodological point of view, data networks have many advantages:
By increasing the size of study populations, networks may shorten the time needed for obtaining the desired sample size. Hence, networks can facilitate research on rare events and accelerate investigation of drug safety issues.
Heterogeneity of treatment options across countries allows the effects of individual drugs to be studied.
Multi-database studies may provide additional knowledge on whether a drug safety issue exists in several countries (and thereby reveal causes of differential drug effects), on the generalisability of results, on the consistency of information and on the impact of biases on estimates.
Involvement of experts from various countries addressing case definitions, terminologies, coding in databases and research practices provides opportunities to increase consistency of results of observational studies.
Sharing of data sources facilitates harmonisation of data elaboration and transparency in analyses and benchmarking of data management.
Different models have been applied for combining data from various databases ranging from a very disparate to a more integrated approach:
Meta-analysis of results of individual studies with potentially different designs, e.g. Variability in risk of gastrointestinal complications with individual NSAIDs: results of a collaborative meta-analysis (BMJ 1996;312:1563-6), which compared the relative risks of serious gastrointestinal complications reported with individual NSAIDs by conducting a systematic review of twelve hospital and community based case-control and cohort studies, and found a relation between use of the drugs and admission to hospital for haemorrhage or perforation. Annex 1 of this Guide provides guidance on meta-analyses of completed pharmacoepidemiological studies of safety outcomes.
Combining results from common protocol studies conducted in different databases, allowing assessment of database/population characteristics and of choices of study design and analysis as determinants of variability of results (e.g. the Pharmacoepidemiological Research on Outcomes of Therapeutics by a European Consortium (PROTECT) project and the Canadian Network for Observational Drug Effect Studies (CNODES)).
Distributed data approach in which data partners maintain physical and operational control over electronic data in their existing environment (e.g. the Sentinel project itself and the extension for vaccines PRISM). A common data model allows standardisation of administrative and clinical information across data partners, execution of standardised programs and sharing of the output of these programs in a summary form. Methods are available to allow multivariate adjusted analyses in multiple databases without violating patient privacy (see Multivariate-adjusted pharmacoepidemiologic analyses of confidential information pooled from multiple healthcare utilisation databases. Pharmacoepidemiol Drug Saf 2010;19:848-57).
Pooling of aggregated data (person-time based or person-level based) extracted locally from databases or electronic health records using a common data model and common software, and transmitted electronically to a central data warehouse for further analysis (see Combining electronic healthcare databases in Europe to allow for large-scale drug safety monitoring: the EU-ADR Project. Pharmacoepidemiol Drug Saf 2011;20(1):1-11).
Collaborative cross-national pharmacoepidemiological network, such as the one developed by the five Nordic countries with similar healthcare systems and databases and which covers the entire population of 25 million inhabitants (The Nordic countries as a cohort for pharmacoepidemiological research. Basic Clin Pharmacol & Toxicol 2010;106:86–94). This network has been used for analytical pharmacoepidemiological studies linking drug exposure to other health registries (for example in Selective serotonin reuptake inhibitors during pregnancy and risk of persistent pulmonary hypertension in the newborn: population based cohort study from the five Nordic Countries. BMJ 2012;344:d8012).
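The distributed data approach described above can be illustrated with a minimal Python sketch. It is not the Sentinel implementation: the record layout, the function names (local_summary, combine, rate_ratio) and the idea of returning only events and person-time per exposure group are assumptions made for illustration. The point it shows is that person-level records stay with each data partner and only aggregated output is shared with the coordinating centre.

```python
from dataclasses import dataclass


@dataclass
class Summary:
    """Aggregated output that a data partner is willing to share."""
    events: int
    person_years: float


def local_summary(records):
    """Hypothetical standardised program run inside one partner's environment.

    `records` is an iterable of (exposed, person_years, event) tuples, an
    assumed common-data-model layout. Only per-group totals are returned.
    """
    out = {True: Summary(0, 0.0), False: Summary(0, 0.0)}
    for exposed, person_years, event in records:
        out[exposed].events += int(event)
        out[exposed].person_years += person_years
    return out


def combine(summaries):
    """Coordinating centre: add up the summaries received from all partners."""
    total = {True: Summary(0, 0.0), False: Summary(0, 0.0)}
    for summary in summaries:
        for group in (True, False):
            total[group].events += summary[group].events
            total[group].person_years += summary[group].person_years
    return total


def rate_ratio(total):
    """Crude incidence rate ratio (exposed vs. unexposed), no adjustment."""
    rate_exposed = total[True].events / total[True].person_years
    rate_unexposed = total[False].events / total[False].person_years
    return rate_exposed / rate_unexposed


if __name__ == "__main__":
    # Invented toy records, not real data.
    partner_a = [(True, 2.0, True), (True, 3.0, False),
                 (False, 5.0, False), (False, 5.0, True)]
    partner_b = [(True, 5.0, True), (False, 10.0, True)]
    shared = [local_summary(partner_a), local_summary(partner_b)]
    print(rate_ratio(combine(shared)))
```

The sketch produces a crude estimate only. Adjusted analyses across partners need additional methods, such as those in the Pharmacoepidemiol Drug Saf 2010 reference cited above.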
These different models have different strengths and weaknesses and present different challenges. These may include:
Differences in the underlying health care systems and mechanisms of data generation and collection.
Differences in culture and experience between academia, public institutions and private partners.
Different ethical and governance requirements in each country regarding processing of anonymised or pseudo-anonymised healthcare data.
Mapping of differing disease coding systems (for example, the International Classification of Disease, 10th Revision (ICD-10), Read codes in the United Kingdom and the International Classification of Primary Care (ICPC-2)) and languages of narrative medical information.
Choice of data sharing model and access rights of partners.
Validation of diagnoses and access to source documents for validation.
Issues linked to intellectual property and authorship.
Sustainability and funding mechanisms.
Experience has shown that many of these difficulties can be overcome by full involvement and good communication between partners, and a project agreement between network members defining roles and responsibilities and addressing issues of intellectual property and authorship. Technical solutions also exist for data sharing and mapping of terminologies, such as those adopted in the EMIF project.
Multi-centre, multi-database studies with common protocols: lessons learnt from the IMI PROTECT project (Pharmacoepidemiol Drug Saf 2016;25(S1):156-165) concludes that conducting multi-database studies requires very detailed common protocols and data specifications that reduce variability in interpretations by researchers. Whilst a priori pooling of data from several databases may disguise heterogeneity that could provide useful information on the safety issue under investigation, parallel analysis of databases allows the reasons for heterogeneity to be explored through extensive sensitivity analyses. This approach eventually increases consistency in findings from observational drug effect studies or reveals causes of differential drug effects.
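One common way to examine heterogeneity between database-specific results, before or instead of pooling, is to compute Cochran's Q and the I² statistic alongside a fixed-effect inverse-variance estimate. The Python sketch below is illustrative and is not taken from the PROTECT publication: the function name pooled_with_heterogeneity is hypothetical, and the inputs are assumed to be log relative risks with their standard errors, one pair per database.

```python
import math


def pooled_with_heterogeneity(log_estimates, std_errors):
    """Fixed-effect inverse-variance pooling with Cochran's Q and I².

    Returns (pooled log estimate, its standard error, Q, I² as a fraction).
    """
    # Each database is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)

    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_estimates))
    degrees_of_freedom = len(log_estimates) - 1

    # I²: share of total variation attributed to between-database differences.
    i_squared = max(0.0, (q - degrees_of_freedom) / q) if q > 0 else 0.0
    return pooled, pooled_se, q, i_squared


if __name__ == "__main__":
    # Invented values for three hypothetical databases, not real results.
    log_rr = [math.log(1.2), math.log(1.5), math.log(2.4)]
    se = [0.10, 0.15, 0.20]
    pooled, pooled_se, q, i_squared = pooled_with_heterogeneity(log_rr, se)
    print(math.exp(pooled), q, i_squared)
```

A high I² would suggest that a single pooled estimate hides meaningful differences between databases, which is the situation in which parallel analyses and sensitivity analyses are most informative.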
Many pharmacoepidemiology research networks in the EU have been established under EC grant agreements. The coming years should demonstrate whether and how the expertise and infrastructures involved could be maintained and used in the conduct of post-authorisation studies.