6.2. Timing of the statistical analysis plan
6.3. Decision criteria
6.4. Statistical analysis plan structure
There is a considerable body of literature explaining statistical methods for observational studies but very little addressing the statistical analysis plan. A clear guide to the general principles, and to the need for a plan, is given in Design of Observational Studies (P.R. Rosenbaum, Springer Series in Statistics, 2010, Chapter 18), which also gives useful advice on how to test complex hypotheses in a way that minimises the chances of drawing incorrect conclusions.
Planning analyses for randomised clinical trials is covered in a number of publications. These often give checklists of the component parts of an analysis plan, and much of this applies equally to non-randomised designs. A good reference in this respect is the International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH): ICH E9 ‘Statistical Principles for Clinical Trials’ and its addendum on estimands and sensitivity analysis in clinical trials (ICH E9(R1)).
While specific guidance on the statistical analysis plan for epidemiological studies is sparse, the following principles will apply to most studies.
A study is generally designed with the objective of addressing a set of research questions. However, the initial product of a study is a set of numerical and categorical observations that do not usually provide a direct answer to the questions that the study is designed to address. The statistical analysis plan details the mathematical transformations that will be performed on the observed data in the study and the patterns of results that will be interpreted as supporting answers to the questions. An important part of the statistical analysis plan will explain how problems in the data, for example missing or partial values, will be handled in these calculations.
The statistical analysis plan should be sufficiently detailed so that it can be followed and reproduced by any competent analyst. Thus, it should provide clear and complete templates for each analysis.
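As an illustration of the level of detail intended (a hypothetical sketch; the variable names and the use of a 30-day window are illustrative only), an outcome definition such as ‘death within 30 days of a myocardial infarction’ can be written as an unambiguous rule that any analyst could reproduce:

```python
from datetime import date
from typing import Optional

# Hypothetical sketch of a fully specified outcome definition.
# Field names and the 30-day window are illustrative, not prescriptive.
def fatal_mi(mi_date: Optional[date], death_date: Optional[date],
             window_days: int = 30) -> bool:
    """'Fatal myocardial infarction': death within `window_days`
    days of a recorded myocardial infarction."""
    if mi_date is None or death_date is None:
        return False  # how missing dates are handled is itself pre-specified
    return 0 <= (death_date - mi_date).days <= window_days
```

Expressing definitions at this level of precision removes ambiguity about edge cases (e.g. deaths on the same day as the event, or missing dates) before any results are seen.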
Pre-specification of statistical and epidemiological analyses can be challenging for data that are not collected specifically to answer the study questions. This is often the case in observational studies, where secondary data are used. However, thoughtful specification of the way missing values will be handled, or the use of a small part of the data as a pilot set to guide analysis, can help overcome such problems. A feature common to most studies is that some analyses that were not pre-specified will be performed in response to observations in the data, to help interpret the results. It is important to distinguish the findings of such data-driven analyses from those of pre-specified analyses. Post-hoc modifications to the analysis strategy should be noted and explained, and the statistical analysis plan provides the reference point against which such modifications can be identified.
In studies using observational data, strong emphasis will be given to measures taken to control and quantify levels of bias. Thus, part of the analysis plan will be devoted to converting scientific understanding of the causal relationships between the exposures and outcomes that are the primary focus of the study, and of the other variables available in the dataset, into a credible mathematical model. It is also advisable to include appropriate negative controls in the analysis, i.e. (exposure, outcome) pairs that are strongly believed not to be causally related and for which a similar model is considered reasonable, as these may indicate uncontrolled confounding.
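As a minimal sketch of a negative-control check (the 2x2 counts below are invented for illustration), a crude odds ratio with a Woolf confidence interval can be computed for an (exposure, outcome) pair believed not to be causally related; an interval clearly excluding 1 would suggest residual confounding:

```python
import math

def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Crude odds ratio and Woolf (log-based) confidence interval
    from a 2x2 table: a = exposed with outcome, b = exposed without,
    c = unexposed with outcome, d = unexposed without."""
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se)
    upper = math.exp(math.log(or_) + z * se)
    return or_, lower, upper

# Hypothetical negative-control pair: no causal relation is expected,
# so a CI clearly excluding 1 would point to uncontrolled confounding.
or_nc, lo, hi = odds_ratio_ci(40, 960, 50, 1200)
```

In practice the negative control would be analysed with the same model (including confounder adjustment) as the primary exposure-outcome pair; the crude calculation above only illustrates the principle.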
A particular concern in retrospective studies is that decisions about the analysis should be made blinded to any knowledge of the results. This should be a consideration in the study design, particularly when feasibility studies are to be performed to inform the design phase. Feasibility studies should be independent of the main study results.
The study protocol will have specified the questions to be addressed in the study and will contain a generic description of the study type and the statistical techniques. However, the statistical analysis plan is likely to be the document in which the statistics to be calculated and tabular and graphical presentations are fully described. Since the decision criteria for the study are specified in terms of the observed values of these detailed statistics, it is worth formulating the statistical analysis plan at an early stage and, in particular, before any informal inspection of aspects of the data or results that might influence opinions regarding the study hypotheses. Ideally the statistical analysis plan will be developed as soon as the protocol is finalised.
If decisions are to be made based on the results of the study, a section of the statistical analysis plan should explain the different outcomes that might be selected for each decision, which statistics influence the decision-making process and which values of the statistics will be considered to support each outcome. Often the statistical analysis will employ standard routines incorporated in statistical packages whose outputs act as implicit decision criteria, for instance p-values or confidence intervals. However, different applications of the study may require lower or higher strength of evidence: for instance, policy recommendations regarding drug licensing may require a lower chance of false positive decisions than the conventional level that might suffice when deciding whether further investigation of a product safety issue is needed. Hence, consideration of decision-making criteria with explicit reference to the type of decision to be made is beneficial.
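The point can be sketched as follows (the test statistic and thresholds below are invented for illustration): the same evidence may meet a screening criterion while failing a stricter criterion intended for a higher-stakes decision:

```python
import math

def two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard-normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical: one test statistic judged against two pre-specified
# thresholds: a screening criterion (alpha = 0.05) used to decide whether
# a product safety issue needs further investigation, and a stricter
# criterion (alpha = 0.01) for a decision such as a licensing
# recommendation, where fewer false positives are acceptable.
z_statistic = 2.2
p = two_sided_p(z_statistic)
needs_further_investigation = p < 0.05  # met at the screening threshold
supports_stricter_criterion = p < 0.01  # not met at the stricter threshold
```

Pre-specifying which threshold applies to which decision, rather than relying on a package's default output, makes the decision criteria explicit and auditable.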
The statistical and epidemiological analysis plan is usually structured to reflect the protocol and will address, where relevant, the following points:
1. A description of the study data sources, linkage methods, and study design including intended study population, inclusion and exclusion criteria and study period with discussion of strengths and weaknesses.
2. Formal definitions of exposure including transformations to determine duration and quantity of exposure.
3. Definition of follow-up and censoring if applicable.
4. Formal definitions of any outcomes, for example ‘fatal myocardial infarction’, which might be defined as ‘death within 30 days of a myocardial infarction’. Outcome variables based on historical data may involve complex transformations to approximate clinical variables not explicitly measured in the dataset used. These transformations should be distinguished from those made to improve the fit of a statistical model. In either case the rationale should be given; in the latter case this includes which tests of fit will be used and under what conditions a transformation will be applied.
5. Formal definitions for other variables – e.g. thresholds for abnormal levels of blood parameters. When values of variables for a subject vary with time, care should be given to explaining how the values will be determined at each time point and recorded in the dataset for use in a statistical model.
6. The effect measures and statistical methods used to address each primary and secondary objective.
7. Blinding of evaluators to exposure status, in order to avoid subjective judgements influencing the assessment of outcomes.
8. Methods of dealing with confounding and of assessing bias, such as:
Which confounders will be considered and how they will be defined
Adjustment for confounders in statistical models
Restriction in analysis
Matching, including propensity-score matching
Self-controlled study designs
Statistical approach for any selection of a subset of confounders
Methods for assessing the level of confounding adjustment achieved
Sensitivity analysis for residual confounding
How negative controls will be selected for the model
9. Handling of missing data, including:
How missing data will be reported;
Methods of imputation;
Sensitivity analyses for handling missing data;
How censored data will be treated and rationale
10. Fit of the model, if considered (e.g. for a predictive model), including:
Criteria for assessing fit;
Alternative models in the event of clear lack of fit.
11. Interim analyses – if considered:
Criteria, circumstances and possible drawbacks for performing an interim analysis and possible actions (including stopping rules) that can be taken on the basis of such an analysis
12. How the achieved patient population will be characterised:
Description of target population;
Description of the analysis population if different, e.g. after propensity score matching or in instrumental variable analyses.
13. Treatment of multiplicity issues not elsewhere covered.
14. Sample size considerations should be presented, making explicit the data source from which the expected variation of relevant quantities and the clinically relevant differences are derived. It should be noted that in observational studies based on data that already exist, where no additional data can be collected, the sample size is determined by the available data rather than chosen by the investigator; the ethical injunction against ‘underpowered’ studies therefore has no obvious force, provided the results, in particular ‘absence of effect’ and ‘insufficient evidence’, are properly presented and interpreted.
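By way of illustration (the proportions, significance level and power below are invented), the conventional normal-approximation sample size for comparing two independent proportions makes the assumed inputs explicit:

```python
import math

def n_per_group(p1: float, p2: float,
                z_alpha: float = 1.959964,  # two-sided alpha = 0.05
                z_beta: float = 0.841621    # power = 0.80
                ) -> int:
    """Approximate sample size per group for comparing two independent
    proportions (normal approximation, equal group sizes)."""
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return math.ceil(n)

# Hypothetical inputs: detect an increase in event risk from 10% to 15%.
n = n_per_group(0.10, 0.15)
```

Writing the calculation out forces the analyst to document where the assumed proportions come from, which is exactly the transparency the plan should provide.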
Missing data occur when no value is stored for a variable in a given observation. Missing data are a common occurrence and can have a significant effect on the conclusions that can be drawn from the data. Three patterns of missingness are usually distinguished: missing completely at random, missing at random and missing not at random.
The book Statistical Analysis with Missing Data (Little RJA, Rubin DB. 2nd ed., Wiley 2002) describes many aspects of the handling of missing data. The section ‘Handling of missing values’ in Rothman’s Modern Epidemiology, 3rd ed. (K. Rothman, S. Greenland, T. Lash. Lippincott Williams & Wilkins, 2008) is a summary of the state of the art, focused on practical issues for epidemiologists. Ways of dealing with such data include complete subject analysis (subjects with missing values are deleted from the analyses) and imputation methods (missing data are predicted based on the observed values and the pattern of missingness). A method commonly used in epidemiology is to create a category of the variable, or an indicator, for the missing values. This practice can be invalid even if the data are missing completely at random and should be avoided (see Indicator and Stratification Methods for Missing Explanatory Variables in Multiple Linear Regression, J Am Stat Assoc 1996;91(433):222-30).
A concise review of methods to handle missing data is provided in the section ‘Missing data’ of the Encyclopedia of Epidemiologic Methods (Gail MH, Benichou J, Editors. Wiley 2000). Identifying the pattern of missing data is important as some methods for handling missing data assume a defined pattern of missingness. Biased results may be obtained if it is incorrectly assumed that data are missing at random. In general, it is desirable to show that conclusions drawn from the data are not sensitive to the particular strategy used to handle missing values. To investigate this, it may be helpful to repeat the analysis with a variety of approaches.
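A minimal sketch of such a sensitivity check (the values are invented) compares a summary statistic under complete-case analysis and under single mean imputation; the point estimate happens to coincide here, but the imputed analysis understates variability:

```python
# Hypothetical sensitivity check: compare results under two of the
# missing-data strategies mentioned above. Values are invented.
observed = [2.0, 4.0, None, 6.0, None, 8.0]

# Strategy 1: complete-case analysis (drop records with missing values).
complete = [x for x in observed if x is not None]
mean_complete_case = sum(complete) / len(complete)

# Strategy 2: single mean imputation (replace missing values by the
# observed mean). This preserves the mean but shrinks the variance.
imputed = [x if x is not None else mean_complete_case for x in observed]
mean_imputed = sum(imputed) / len(imputed)

def sample_var(xs, m):
    """Unbiased sample variance around a given mean."""
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

var_complete = sample_var(complete, mean_complete_case)
var_imputed = sample_var(imputed, mean_imputed)  # artificially reduced
```

If conclusions change materially between strategies, the results are sensitive to the handling of missing data and this should be reported.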
Other useful references on handling of missing data include the books Multiple Imputation for Nonresponse in Surveys (Rubin DB, Wiley, 2004) and Analysis of Incomplete Multivariate Data (Schafer JL, Chapman & Hall/CRC, 1997), and the articles Using the outcome for imputation of missing predictor values was preferred (J Clin Epidemiol 2006;59(10):1092-101), Recovery of information from multiple imputation: a simulation study (Emerg Themes Epidemiol 2012;9(1):3) and Evaluation of two-fold fully conditional specification multiple imputation for longitudinal electronic health record data (Stat Med 2014;33(21):3725-37).
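For orientation, the combination of results across m imputed datasets described in the multiple-imputation literature cited above (Rubin's rules) can be sketched as follows; the estimates and variances below are invented. The pooled estimate is the mean of the m estimates, and the total variance adds the between-imputation variance inflated by a factor (1 + 1/m):

```python
import math

def pool_rubin(estimates, variances):
    """Combine point estimates and their variances from m imputed
    datasets using Rubin's rules; returns (estimate, standard error)."""
    m = len(estimates)
    q_bar = sum(estimates) / m                    # pooled point estimate
    w = sum(variances) / m                        # within-imputation variance
    b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)  # between
    t = w + (1 + 1 / m) * b                       # total variance
    return q_bar, math.sqrt(t)

# Hypothetical log hazard ratios and variances from m = 5 imputations.
est, se = pool_rubin([0.50, 0.55, 0.48, 0.52, 0.45],
                     [0.040, 0.042, 0.039, 0.041, 0.040])
```

The inflation term (1 + 1/m) reflects the extra uncertainty introduced by imputing rather than observing the missing values, which is precisely what the missing-indicator shortcut criticised above fails to capture.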