ENCePP Guide on Methodological Standards in Pharmacoepidemiology



3.1. Primary data collection

Primary data collection has an important role in pharmacoepidemiology. Case-control studies using hospital or community-based primary data collection have allowed the evaluation of drug-disease associations for rare, complex conditions that require very large source populations and in-depth case assessment by clinical experts. Classic examples are Appetite-Suppressant Drugs and the Risk of Primary Pulmonary Hypertension (N Engl J Med 1996;335:609-16), The design of a study of the drug etiology of agranulocytosis and aplastic anemia (Eur J Clin Pharmacol 1983;24:833-6) and Medication Use and the Risk of Stevens–Johnson Syndrome or Toxic Epidermal Necrolysis (N Engl J Med 1995;333:1600-8).
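As an illustration of how a drug-disease association is typically quantified in a case-control study, the sketch below computes an odds ratio with a Woolf (log-based) confidence interval from a 2x2 table. It is a minimal example under stated assumptions, not a method taken from the studies cited above: the function name, argument names and counts are hypothetical, and the calculation is unadjusted (no matching or confounder control).

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Crude odds ratio and Woolf confidence interval from a 2x2 table.

    z=1.96 gives an approximate 95% interval. All names are illustrative.
    """
    a, b = exposed_cases, unexposed_cases
    c, d = exposed_controls, unexposed_controls
    if min(a, b, c, d) <= 0:
        # The Woolf interval is undefined when any cell is zero.
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 20 of 100 cases and 10 of 100 controls were exposed.
or_, lower, upper = odds_ratio_ci(20, 80, 10, 90)
print(f"OR = {or_:.2f}, CI = ({lower:.2f}, {upper:.2f})")
```

For the rare conditions discussed here, cell counts are often small, in which case exact or conditional methods are usually preferred over this large-sample approximation.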


For some conditions, case-control surveillance networks have been developed and used for selected studies and for signal generation and clarification, e.g. Signal generation and clarification: use of case-control data (Pharmacoepidemiol Drug Saf 2001;10:197-203).


General guidance on the conduct of prospective pharmacoepidemiology studies can be found in the ISPE Good Pharmacoepidemiology Practices (GPP) and the IEA Good Epidemiology Practice (GEP). The GPP is especially useful for its recommendations on aspects rarely covered by guidelines, such as data quality issues and archiving. Both guidelines address the importance of patient data protection and the ethical principles of research using patient healthcare and personal data.


Patient registries are sometimes requested by regulators at the time of authorisation of a medicinal product in order to determine clinical effectiveness and monitor safety. A registry should be considered a structure within which studies can be performed, i.e. a data source where entry is defined either by diagnosis of a disease (disease registry) or prescription of a drug (exposure registry). AHRQ has published a comprehensive document on ‘good registry practices’ entitled Registries for Evaluating Patient Outcomes: A User's Guide, 3rd Edition, which guides the planning, design, implementation, analysis and interpretation of a registry, and the evaluation of its quality. One section also covers the linking of registries to other data sources. PARENT Joint Action is an EU initiative that aims to rationalise the development and governance of patient registries, enabling their secondary use for public health and research purposes. It is developing methodological and governance guidelines and a Registry of Registries to facilitate cross-border use.

The FDA’s Guidance for Industry: Establishing Pregnancy Exposure Registries advises on good practice for designing a pregnancy registry, with a description of research methods and elements to be addressed. The Systematic overview of data sources for drug safety in pregnancy research provides an inventory of pregnancy exposure registries and alternative data sources on the safety of prenatal drug exposure, and discusses their strengths and limitations.

For paediatric populations, detailed information on neonatal age (e.g. in days, not just in years), pharmacokinetic differences and organ maturation needs to be considered. The CHMP Guideline on Conduct of Pharmacovigilance for Medicines Used by the Paediatric Population provides further relevant information.


Surveys are increasingly used in pharmacoepidemiology, especially in the areas of disease epidemiology and risk minimisation evaluation. They require a sampling strategy that allows for external validity and maximises response rates. Useful textbooks on these aspects are Survey Sampling (L. Kish, Wiley, 1995) and Survey Methodology (R.M. Groves, F.J. Fowler, M.P. Couper, J.M. Lepkowski, E. Singer, R. Tourangeau, 2nd Edition, Wiley, 2009). Questionnaires used in surveys should be validated using accepted measures, including construct, criterion and content validity, inter-rater and test-retest reliability, sensitivity and responsiveness. Although primarily focused on quality of life research, the book Quality of Life: the assessment, analysis and interpretation of patient-reported outcomes (P.M. Fayers, D. Machin, 2nd Edition, Wiley, 2007) offers a comprehensive review of the theory and practice of developing, testing and analysing questionnaires in different settings. Health Measurement Scales: a practical guide to their development and use (D.L. Streiner, G.R. Norman, 4th Edition, Oxford University Press, 2008) is a very helpful guide for those involved in measuring subjective states and learning style in patients and healthcare providers.
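Inter-rater reliability of a categorical questionnaire item is commonly summarised with Cohen's kappa, which corrects the observed agreement between two raters for the agreement expected by chance. The sketch below is one possible illustration, not a procedure prescribed by the textbooks above; the function name and the example ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items (nominal categories)."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal category frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single, identical category: kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of four questionnaire responses by two assessors.
ratings_a = ["yes", "yes", "no", "no"]
ratings_b = ["yes", "no", "no", "no"]
print(cohens_kappa(ratings_a, ratings_b))
```

The same function can be applied to test-retest data by treating the two administrations of the questionnaire as the two "raters"; for ordinal or continuous scales, a weighted kappa or an intraclass correlation coefficient is generally more appropriate.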

Randomised clinical trials (RCTs) are a form of primary data collection. There are numerous textbooks and publications on methodological and operational aspects of clinical trials; they are not covered here. An essential guideline on clinical trials is the European Medicines Agency (EMA) Note for Guidance on Good Clinical Practice, which specifies obligations for the conduct of clinical trials to ensure that the data generated in the trial are valid.

