12.1. General considerations
12.2. Timing of the statistical analysis plan
12.3. Information in the statistical analysis plan
Compared to the protocol that includes a section outlining the analyses, the SAP is a more technical, stand-alone document describing in detail the planned analyses, population definitions and methodology.
Given the influence of statistical decisions on study conclusions, a well-documented and transparent statistical plan is essential. Developing a SAP forces researchers to think about which data to collect, in which format. This may then guide decisions on e.g., measurement instruments and timing of (repeated) measurements.
Further guidance on general principles and justification for the need for a SAP are provided in Design of Observational Studies (P.R. Rosenbaum, Springer Series in Statistics, 2020).
The following objectives of a SAP apply to most studies, including observational studies:
Transparency as to how the analysis will proceed, by specifying in advance the methodology that will be applied. A SAP should always be completed prior to start of data analysis. Revisions after the start of the analysis might be possible, provided these changes are noted and justified in a revised SAP.
Communication to the study team, especially statisticians, involved in the study. It promotes good planning and efficiency for other stakeholders such as reviewers and the target audience of the study. Readers of observational research might dismiss important findings if they were not pre-specified.
Reproducibility so that in the future, for similar studies, the same analytical steps can be performed. The SAP should be sufficiently detailed so that it can be followed and reproduced by any statistician. Thus, it should provide clear and complete templates for each analysis.
Validity of study outcomes: the SAP enables the researcher to separate the pre-planned analyses that address the research question from data-driven analyses performed to understand and interpret the data.
Pre-specification of statistical and epidemiological analyses can be challenging for data that are not collected specifically to answer the research question. This is often the case in observational studies where secondary use of data is frequent (see Chapter 8.2). Nevertheless, The Value of Statistical Analysis Plans in Observational Research: Defining High-Quality Research From the Start (JAMA 2012;308(8):773-4) provides arguments for producing a SAP for observational research, which is more vulnerable to issues of reproducibility. A main component of an observational study is an initial raw dataset including a set of variables that do not usually provide a direct answer to the research question. The SAP details the statistical calculations that will be performed on these observed data and the patterns of results that will in turn be interpreted.
Specific to observational studies, strong emphasis needs to be given to measures applied to control and possibly quantify bias. Avoiding bias in observational studies: part 8 in a series of articles on evaluation of scientific publications (Dtsch Arztebl Int. 2009;106(41):664-8) explains how these methodological issues can be avoided by careful planning. Factors that may bias the results of observational studies are described in Chapter 6.1. In this context, thoughtful specification of the way missing values will be handled and the use of a small part of the data as a pilot set to guide the analysis can be useful approaches. Handling of missing data is discussed in Chapter 6.3.
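The pilot-set idea mentioned above can be made reproducible by fixing the split rule before any data are examined. The following is a minimal sketch only: the function name `split_pilot`, the 10% fraction and the seed value are hypothetical choices for illustration, not recommendations of this guide.

```python
import random


def split_pilot(subject_ids, pilot_fraction=0.10, seed=20240101):
    """Split subject identifiers into a pilot set and a main analysis set.

    The fraction and seed are hypothetical values; in practice they would be
    recorded in the SAP before the data are accessed.
    """
    if not 0 < pilot_fraction < 1:
        raise ValueError("pilot_fraction must be strictly between 0 and 1")
    ids = sorted(set(subject_ids))   # stable order, duplicates removed
    if not ids:
        raise ValueError("no subject identifiers supplied")
    rng = random.Random(seed)        # local generator, so the split is repeatable
    n_pilot = max(1, round(len(ids) * pilot_fraction))
    pilot = set(rng.sample(ids, n_pilot))
    main = [i for i in ids if i not in pilot]
    return sorted(pilot), main


# Example with made-up identifiers 1..100
pilot_ids, main_ids = split_pilot(range(1, 101))
```

Because the seed is fixed, rerunning the split returns the same pilot subjects. Whether pilot subjects are later excluded from the main analysis is a design decision that would need to be stated in the SAP.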
In some studies, analyses that are not pre-specified will be performed in response to observations in the data, in order to support interpretation of the results. It is important to distinguish between such data-driven analyses and pre-specified analyses. Post-hoc modifications to the analytical strategy should be duly noted and justified in the revision history of the SAP.
The SAP is crucial for guiding data analysis and therefore, it is useful to formulate it at an early stage. In particular, the SAP should be developed before any informal inspection of aspects of the data or results that might influence opinions regarding the study hypotheses. Ideally, the SAP will be developed as soon as the protocol is finalised.
The SAP may be submitted to regulatory authorities as part of a submission package, e.g., as an appendix to the protocol and/or study report. The SAP is stored in the study master file, and is used during audits to check whether statistical analyses are performed as planned. The role of the SAP is explained in the International Council for Harmonisation (ICH) E9 guideline 'Statistical principles for clinical trials', and these recommendations may also apply to observational studies.
A particular concern in retrospective studies is that decisions about the analysis should be made blinded to any knowledge of the results. This should be a consideration in the study design, particularly when feasibility assessments are to be performed to inform the design phase. Such feasibility assessments should be independent of the study results (see Chapter 2).
In all cases, a SAP should be completed before the data have been unblinded to the statistician. This contributes to transparency of the study process and confirms that the set of analyses has not been influenced by the data. Making alterations to a planned statistical analysis after seeing the data increases the risk of bias, inflates the probability of type I errors, and might jeopardise the validity and acceptance of the study findings.
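The inflation of type I error can be illustrated with a simplified calculation. The sketch below assumes that the analyses are independent tests, each performed at the same significance level with all null hypotheses true; data-driven re-analyses of the same dataset are generally correlated, so the numbers are indicative only. The function name `familywise_error_rate` is introduced here for illustration.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive among n_tests independent
    tests, each performed at level alpha, when every null hypothesis is true."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be strictly between 0 and 1")
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return 1 - (1 - alpha) ** n_tests


# Chance of at least one spurious finding as the number of analyses grows
for k in (1, 5, 10, 20):
    print(k, round(familywise_error_rate(0.05, k), 3))
```

Under these assumptions, ten analyses at the 5% level give a chance of about 40% of at least one false positive, which is why the set of analyses should be fixed before the data are seen.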
A SAP is usually structured to reflect the protocol but will provide more granularity regarding the statistical methodology and population definition. Ideally, it includes and addresses the following elements in detail:
Administrative details: who wrote the SAP, its version number, when it was approved, and who signed it.
Testable hypotheses to answer the study objectives (see Chapter 2). Primary and secondary objectives must be defined in order to avoid 'data dredging', and they must correspond to the research question. A hypothesis is the product of deductive reasoning, going from general premises to the specific results one would expect if those general premises were indeed true. This usually involves a set of possible relationships between a set of variables. It should be clearly stated how each outcome will be measured. Negative findings may be as important as positive findings. Another reason to choose the primary outcome carefully is to minimise multiplicity effects. These occur when multiple statistical tests are needed to assess the primary outcome, which increases the likelihood of false positives.
Definitions of study variables. Outcome variables based on historical data may involve complex transformations to approximate clinical variables not explicitly measured in the dataset used. These transformations should be discriminated from those made to improve the fit of a statistical model. In either case, the rationale should be provided. In the latter case, this will include which tests of fit will be used and under what conditions a transformation will be used. Next to the outcomes, the other variables used in the study also need to be further formalised. The formatting (e.g., categorisation, dichotomisation), modifications or derivations need to be described, with a special attention given to time-dependent variables (e.g., age, BMI).
Study design (see Chapter 4) and sample size considerations. It should be noted that in observational studies using data that already exist and where no additional data are collected, sample size is not preclusive, and the ethical injunction against 'underpowered' studies has no obvious force, provided the results, in particular 'absence of effect' and 'insufficient evidence', are properly presented and interpreted. The anticipated overall number of study subjects, as well as the minimum number per stratum for stratified analyses, can be provided as an indicative guide.
Methods for dealing with missing data.
Methods for dealing with outliers.
Procedures for dealing with protocol variations, non-compliance, and withdrawals.
Methods for estimating points and intervals.
Rules for calculating composite or derived variables, including data-driven definitions and any additional details required to minimise ambiguity.
Baseline and covariate data used.
Definition of study period (study entry/index date, follow-up period, study exit).
Inclusion of randomisation factors (if applicable).
Methods for dealing with data from several locations/sources.
Methods for dealing with treatment interactions.
Methods for multiple comparisons and subgroup analysis.
Computer systems and statistical software packages used to analyse data.
Statistical principles including confidence intervals and level of significance. The level of statistical significance to be employed, as well as whether one-tailed or two-tailed tests will be used, should be specified. Observational studies may be subject to repeated testing of accumulating data, which requires adjustment of significance levels to limit the inflation of type I errors (false positive findings). When false positives are a greater concern, a stricter significance level (and correspondingly a higher confidence level, giving wider confidence intervals) should be considered. Any planned adjustment of the significance level to control for type I error that can arise from comparisons across multiple subgroups or analysis of multiple predictors or outcomes (secondary analyses) should be presented. However, different objectives of the study may require a lower or higher strength of evidence: for instance, policy recommendations regarding drug licensing may require a lower chance of false positive decisions than a decision on whether further investigation is needed for a product safety issue. It should be noted that statistical packages often employ default settings, for instance a significance level of 5% or 95% confidence intervals.
Sensitivity analyses. Sensitivity analyses make it possible to study the effect of potential violations of assumptions, and/or the dependence of results on specific observations (subjects), and are used to support the conclusions of the main analysis. Analyses that merely explore the data are considered exploratory analyses and should be described as such.
Decision criteria. If decisions are drawn from the study results, a section of the SAP should explain the different outcomes that might be selected for each decision, which statistics influence the decision-making process, and which values of the statistics will be considered to support each outcome.
Tables and figures for presentation of the study results. Skeleton tables should include a title, and row labels and column entries should be clearly spelled out, with only the figures/numbers in the cells lacking. The analysis will then produce the contents of the cells in a targeted manner, that is, hardly any other numbers will need to be generated. The same principle applies to graphs.
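As an illustration of the items on multiple comparisons and adjustment of the significance level listed above, the sketch below implements Holm's step-down adjustment of p-values. Holm's procedure is only one common option and is used here as an example; this guide does not prescribe a particular method, and the function name `holm_adjust` and the example p-values are hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p-values, in the same order as the input.

    Each adjusted value can be compared directly with the overall
    significance level pre-specified in the SAP (e.g. 0.05).
    """
    if any(not 0 <= p <= 1 for p in p_values):
        raise ValueError("p-values must lie between 0 and 1")
    m = len(p_values)
    # Indices of the p-values, from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted value never falls below an earlier one
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Made-up p-values for three pre-specified subgroup comparisons
example = holm_adjust([0.01, 0.04, 0.03])
```

With these made-up inputs, only the first comparison remains below 0.05 after adjustment. Stating the adjustment method and the family of tests it applies to in the SAP avoids choosing between procedures after the results are known.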
Consideration of the estimand framework is recommended to help inform choices regarding study design and data analysis and to clarify how to interpret study findings. Tell me what you want, what you really really want: estimands in observational pharmacoepidemiologic comparative effectiveness and safety studies (Pharmacoepidemiol Drug Saf. 2023 Mar 22) discusses how defining an estimand is instrumental to the process of designing and analysing pharmacoepidemiological comparative effectiveness or safety studies. It applies the ICH Addendum on Estimands and Sensitivity Analysis in Clinical Trials to the Guideline on Statistical Principles for Clinical Trials (2019) to three case studies and shows how defining an estimand ensures that the study targets a treatment effect that aligns with the treatment decision the study aims to inform.
For further reading on how to draft a SAP tailored to observational studies, see DEBATE-statistical analysis plans for observational studies (BMC Med Res Methodol. 2019;19(1):233); Guide to the statistical analysis plan (Pediatric Anesthesia 2019;29:237-42), which provides an exhaustive list of SAP items applicable to both prospective and retrospective observational studies; and The value of statistical analysis plans in observational research: defining high-quality research from the start (JAMA. 2012;308(8):773-4). A good example of a SAP where the main components are included can be found in Necrotizing soft tissue infections - a multicentre, prospective observational study (INFECT): protocol and statistical analysis plan (Acta Anaesthesiol Scand. 2018;62(2):272-79). Modern Epidemiology, 4th ed. (T. Lash, T.J. VanderWeele, S. Haneuse, K. Rothman. Wolters Kluwer, 2020) summarises the phases in a statistical analysis that should all be thought out and described beforehand.