

ENCePP Guide on Methodological Standards in Pharmacoepidemiology


11.1. General considerations

There is a considerable body of literature explaining statistical methods for observational studies but little addressing the statistical analysis plan (SAP). A SAP is a document authored prior to the start of the observational study that presents significant details about how the data will be coded and analysed. While the study protocol will have specified the questions to be addressed by the study and will contain an overview of the statistical methods, the SAP is the document in which the statistics to be calculated, and expected tabular and graphical presentations of the results of the study, are fully described.

General principles for a SAP, and the justification for needing one, are discussed in Design of Observational Studies (P.R. Rosenbaum, Springer Series in Statistics, 2020).

The following objectives of a SAP apply to most studies, including observational studies:

  • Transparency as to how the analysis will proceed by specifying in advance the methodology that will be applied. A SAP should always be completed prior to the start of data analysis. Revisions after the start of the analysis may be possible provided the changes are noted and justified in a revised SAP.

  • Communication to the team involved in the study, especially statisticians. A SAP also promotes good planning and efficiency for other stakeholders, such as reviewers and the target audience of the study. Readers of observational research might dismiss important findings if the underlying analyses were not pre-specified.

  • Replication so that in the future, for similar studies, the same analytical steps can be performed. The SAP should be sufficiently detailed so that it can be followed and reproduced by any statistician. Thus, it should provide clear and complete templates for each analysis.

A study is generally designed with the objective of addressing a set of research questions. A main component of the study is an initial raw dataset including a set of variables that do not usually provide a direct answer to the questions. The SAP details the statistical calculations that will be performed on these observed data and the patterns of results that will in turn be interpreted.


Pre-specification of statistical and epidemiological analyses can be challenging for data that are not collected specifically to answer the research questions. This is often the case in observational studies where secondary data are used (see Chapter 7.2 Secondary use of data). However, thoughtful specification of the way missing values will be handled or the use of a small part of the data as a pilot set to guide analysis can be useful techniques to overcome such problems. Handling of missing data is further discussed in Chapter 5.3.
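As an illustration of the pilot-set idea mentioned above, the following is a minimal sketch of how a small, reproducible pilot subset could be set aside before the main analysis. The function name `split_pilot`, the 10% fraction and the seed value are hypothetical choices made for this example only; they are not prescribed by this Guide, and a real SAP would state and justify its own values.

```python
import random


def split_pilot(record_ids, pilot_fraction=0.1, seed=20240101):
    """Split record identifiers into a pilot set and a main analysis set.

    Illustrative assumptions: the fraction and the seed would be fixed in
    the SAP before any outcome data are examined.
    """
    if not 0 < pilot_fraction < 1:
        raise ValueError("pilot_fraction must be strictly between 0 and 1")
    # Sort first so the split does not depend on the input order.
    ids = sorted(record_ids)
    # A dedicated generator with a fixed seed makes the split reproducible.
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_pilot = max(1, round(len(ids) * pilot_fraction))
    return ids[:n_pilot], ids[n_pilot:]


# Example: 1000 hypothetical patient identifiers.
pilot_ids, main_ids = split_pilot(range(1000))
```

Recording the seed and the fraction in the SAP allows any statistician to regenerate the same pilot set. Whether pilot records are later excluded from the main analysis is a design decision that would also need to be pre-specified.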


Specific to observational studies, strong emphasis should be given to the measures applied to control and, where possible, quantify bias. The article Avoiding bias in observational studies: part 8 in a series of articles on evaluation of scientific publications (Dtsch Arztebl Int. 2009;106(41):664-8) explains how the main methodological problems leading to bias can be avoided by careful planning. Factors that may bias the results of observational studies are described in Chapter 5.1.


A feature common to most studies is that some analyses that were not pre-specified will be performed in response to observations in the data, to help interpretation of the results. It is important to distinguish between such data-driven analyses and pre-specified analyses. Post-hoc modifications to the analytical strategy should be duly noted and justified. The SAP, including any revisions, provides a record of this process.


