ENCePP Guide on Methodological Standards in Pharmacoepidemiology


3. Development of the study protocol

The study protocol is the core document of a study and should be drafted as one of the first steps in any research project, once the research question has been clearly defined. The final version must describe everything done in the study precisely enough for the study to be reproduced. It should be amended and updated as needed, and each amendment should be justified.


For PASS as described in the Guideline on good pharmacovigilance practices (GVP) Module VIII - Post-authorisation safety studies, the Commission Implementing Regulation (EU) No 520/2012 provides legal definitions of the start of data collection (the date from which information on the first study subject is first recorded in the study dataset, or, in the case of secondary use of data, the date from which data extraction starts) and of the end of data collection (the date from which the analytical dataset is completely available). These dates provide timelines for the commencement of the study and the submission of the final study report to the competent authorities. The Regulation also provides the format of protocols, abstracts and final study reports for imposed PASS. Based on these formats, the European Medicines Agency (EMA) published detailed templates for the protocol and final study report, which it recommends for all PASS, including meta-analyses and systematic reviews.

The ISPE Guidelines for Good Pharmacoepidemiology Practices (GPP) provides guidance on what is expected from a pharmacoepidemiology study protocol and on the different aspects to be covered. It states that the protocol should include a description of the data quality and integrity, including abstraction of original documents, extent of source data verification, and validation of endpoints. The FDA’s Best Practices for Conducting and Reporting Pharmacoepidemiologic Safety Studies Using Electronic Health Care Data Sets describes the elements that should be addressed in the protocols of such studies, including the choice of data sources and study population, the study design and the analyses. The ENCePP Checklist for Study Protocols seeks to stimulate researchers to consider important epidemiological aspects when designing a pharmacoepidemiological study and writing a study protocol.

The Agency for Healthcare Research and Quality (AHRQ) published Developing a Protocol for Observational Comparative Effectiveness Research: A User’s Guide, which includes best practice principles and checklists on a wide range of topics that are also applicable to observational studies outside the scope of comparative effectiveness research. Appendix 1 of the CIOMS International Ethical Guidelines for Health-related Research Involving Humans (Geneva: 2016) provides a list of 48 items to be included in a protocol or associated documents for health-related research involving humans.


It should be kept in mind that different regulators and stakeholders (e.g., data owners) might require a protocol structure different from the one recommended by the guidance documents mentioned above, so several versions of the protocol might be required for the same study.


The protocol should cover at least the following aspects:

  • The research question the study is designed to answer, which might be purely descriptive, exploratory or explanatory (hypothesis driven). The protocol should include a background description that explains the origin (scientific, regulatory, etc.) and current knowledge on the research question. It will also explain the context of the research question, including what data are currently available and how these data can or cannot contribute to answering the question. The context will also be defined in terms of what information sources can be used to generate appropriate data and how the proposed study methodology will be shaped around these.
  • The main study objective and possible secondary objectives, which are operational definitions of the research question. In defining secondary objectives, consideration could be given to time and cost, which may impose constraints and choices, for example in terms of sample size, duration of follow-up or data collection.


  • The source and study population to be used to answer the research question. The protocol should describe whether this population is already identified, and whether data are already available (allowing a secondary data collection from a database) or whether it needs to be recruited de novo. The limits of the desired population will be defined, including inclusion/exclusion criteria, timelines (such as index dates for inclusion in the study) and any exposure criteria and events defining cases and exposed study groups.


  • Exposure of interest that needs to be pre-specified and defined, including duration and intensity of exposure, source of data and methods of ascertainment.


  • Outcomes of interest that need to be pre-specified and defined, including data sources, operational definitions and methods of ascertainment such as data elements in field studies or appropriate codes in database studies.



  • The covariates and potential confounders that need to be pre-specified and defined, including how they will be measured.


  • The statistical plan for the analysis of the resulting data, including statistical methods and software, adjustment strategies, and how the results are going to be presented.


  • The identification and minimisation of potential biases.


  • Major assumptions, critical uncertainties and challenges in the design, conduct and interpretation of the results of the study given the research question and the data used.


  • Ethical considerations, as described in the section on governance of the current document.
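
As an illustration only, the aspects listed above can be treated as a completeness checklist for a protocol draft. The sketch below is not part of the ENCePP Guide or of any official template: the section keys in `REQUIRED_ASPECTS`, the function `missing_aspects` and the example draft are hypothetical names chosen to paraphrase the list, and a real protocol would follow the structure required by the relevant template.

```python
# Hypothetical section keys paraphrasing the aspects listed above;
# they are illustrative and do not come from any official template.
REQUIRED_ASPECTS = [
    "research_question",
    "objectives",
    "source_and_study_population",
    "exposure",
    "outcomes",
    "covariates_and_confounders",
    "statistical_plan",
    "bias_identification_and_minimisation",
    "assumptions_and_uncertainties",
    "ethical_considerations",
]


def missing_aspects(protocol: dict) -> list:
    """Return the required aspects that are absent or left empty in a draft."""
    return [
        aspect
        for aspect in REQUIRED_ASPECTS
        if not str(protocol.get(aspect) or "").strip()
    ]


# Invented example draft with only two sections filled in.
draft = {
    "research_question": "Is drug X associated with outcome Y in adults?",
    "objectives": "Estimate the incidence rate ratio of Y in users of X.",
    "exposure": "   ",  # whitespace only, so it is reported as missing
}

gaps = missing_aspects(draft)
print(gaps)
```

A check of this kind only flags sections that are absent or empty; it says nothing about whether a section is scientifically adequate, which still requires review against the guidance documents cited above.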

The various data collection forms, including the Case Report Form (CRF) or descriptions of the data elements, may be appended to the protocol to provide an exact representation of how the data will be collected. The study protocol could include a section specifying how the CRF will be piloted, tested and finalised. Amendments to final CRFs should be justified. For field studies, physician or patient forms would be included, depending on the data collection methodology. Other forms may be included as needed, such as patient information or patient-oriented summaries.


