ENCePP Guide on Methodological Standards in Pharmacoepidemiology

Validation

In healthcare databases, the correct assessment of drug exposure, outcomes and covariates is crucial to the validity of research. The role of automated record linkage in the postmarketing surveillance of drug safety: a critique (Clin Pharmacol Ther 1989;46:371-86) evaluates the validity of research conducted in automated databases against a standard set of criteria, including the validity of exposure, outcome and confounder data. It points out that diagnoses identified from codes in electronic record systems require validation. The validation of electronic information on drug exposure, outcome and covariate definitions should also be documented in the technical handbook of every database, ideally with estimates of sensitivity, specificity, and positive and negative predictive values. Examples can be found in Validity of diagnostic coding within the General Practice Research Database: a systematic review (Br J Gen Pract 2010;60:e128-36), the book Pharmacoepidemiology (B. Strom, S.E. Kimmel, S. Hennessy. 5th Edition, Wiley, 2012) and Mini-Sentinel's systematic reviews of validated methods for identifying health outcomes using administrative and claims data: methods and lessons learned.
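As an illustration of the four validity estimates named above, the following minimal sketch computes them from a 2x2 validation table in which a database definition is compared with a reference standard such as chart review. The function name and the counts are hypothetical and are not taken from any of the cited studies.

```python
def validation_metrics(tp, fp, fn, tn):
    """Validity estimates from a 2x2 table (database definition vs. reference standard).

    tp: flagged by the database definition and confirmed by the reference
    fp: flagged by the database definition but not confirmed
    fn: missed by the database definition but present per the reference
    tn: not flagged and absent per the reference
    """
    return {
        "sensitivity": tp / (tp + fn),  # proportion of true cases that are captured
        "specificity": tn / (tn + fp),  # proportion of non-cases correctly not flagged
        "ppv": tp / (tp + fp),          # proportion of flagged records that are true cases
        "npv": tn / (tn + fn),          # proportion of unflagged records that are true non-cases
    }


# Hypothetical validation sample of 1,000 records (illustrative numbers only).
metrics = validation_metrics(tp=90, fp=10, fn=30, tn=870)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that the predictive values depend on the prevalence of the outcome in the validation sample, so they may not transfer to a population with a different prevalence.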


The completeness and validity of all variables used as exposures, outcomes, potential confounders and effect modifiers should be considered. Assumptions built into case definitions or other algorithms may need to be confirmed. For databases routinely used in research, documented validation of key variables may already have been carried out by the data provider or other researchers. Any extrapolation from a previous validation should, however, consider the effect of differences in variables or analyses and of subsequent changes to health care, procedures and coding. A full understanding of both the health care system and the procedures that generated the data is required. This is particularly important for studies that rely on accurate timing of exposure, outcome and covariate recording, such as the self-controlled case series.

External validation against chart review or physician/patient questionnaires is possible with some resources. However, questionnaires cannot always be considered a ‘gold standard’. Review of records against a case definition by experts may also be possible. False positives are more easily measured than false negatives (unless the outcome is extremely common in the study population), and specificity of an outcome is more important than sensitivity when considering bias in relative risk estimates (see A review of uses of health care utilization databases for epidemiologic research on therapeutics. J Clin Epidemiol 2005;58(4):323-37).

Alternatively, internal logic checks can test for completeness and accuracy of variables. For example, one can investigate whether an outcome was followed by (or preceded by) appropriate exposure or procedures.
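The statement that specificity matters more than sensitivity for relative risk estimates can be checked with simple arithmetic. The sketch below assumes that outcome misclassification is non-differential (the same sensitivity and specificity in exposed and unexposed groups). The risks and the function names are hypothetical and chosen for illustration only.

```python
def observed_risk(true_risk, sensitivity, specificity):
    """Expected proportion classified as cases, given the true risk.

    True cases are detected with probability `sensitivity`;
    non-cases are wrongly flagged with probability (1 - specificity).
    """
    return sensitivity * true_risk + (1 - specificity) * (1 - true_risk)


def observed_rr(risk_exposed, risk_unexposed, sensitivity, specificity):
    """Expected relative risk under non-differential outcome misclassification."""
    return (observed_risk(risk_exposed, sensitivity, specificity)
            / observed_risk(risk_unexposed, sensitivity, specificity))


# Hypothetical true risks: 2% in the exposed, 1% in the unexposed (true RR = 2).
true_rr = 0.02 / 0.01

# Perfect specificity, poor sensitivity: cases are lost in both groups in the
# same proportion, so the ratio is preserved.
rr_low_sensitivity = observed_rr(0.02, 0.01, sensitivity=0.6, specificity=1.0)

# Perfect sensitivity, slightly imperfect specificity: false positives are added
# to both groups and pull the ratio towards 1.
rr_low_specificity = observed_rr(0.02, 0.01, sensitivity=1.0, specificity=0.98)

print(f"true RR: {true_rr:.2f}")
print(f"RR with sensitivity 0.60, specificity 1.00: {rr_low_sensitivity:.2f}")
print(f"RR with sensitivity 1.00, specificity 0.98: {rr_low_specificity:.2f}")
```

Under these assumptions, a 40% loss of true cases leaves the relative risk at about 2, whereas a 2% false-positive rate reduces it to roughly 1.33, because false positives among the many non-cases outnumber true cases when the outcome is rare.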

Concordance between datasets, such as a comparison of cancer or death registries with clinical or administrative records, can be used to validate individual records or overall incidence or prevalence rates.
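A record-level concordance check of this kind can be sketched as follows. The example assumes that both sources share a patient identifier and, purely for illustration, treats the registry as the reference; in practice neither source may be a gold standard. The identifiers and the function name are hypothetical.

```python
def concordance(registry_ids, admin_ids):
    """Compare case identifiers recorded in a registry and in administrative data."""
    registry, admin = set(registry_ids), set(admin_ids)
    both = registry & admin
    return {
        "both": len(both),
        "registry_only": len(registry - admin),
        "admin_only": len(admin - registry),
        # Share of registry cases also found in the administrative data.
        "admin_completeness": len(both) / len(registry) if registry else float("nan"),
        # Share of administrative cases confirmed by the registry.
        "admin_ppv": len(both) / len(admin) if admin else float("nan"),
    }


# Hypothetical patient identifiers (illustrative only).
registry_cases = ["p01", "p02", "p03", "p04", "p05"]
admin_cases = ["p02", "p03", "p04", "p05", "p06", "p07"]

result = concordance(registry_cases, admin_cases)
for name, value in result.items():
    print(f"{name}: {value}")
```

Records found in only one source are natural candidates for manual review, since they may reflect coding differences, linkage errors or true gaps in either dataset.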

