

ENCePP Guide on Methodological Standards in Pharmacoepidemiology


10.3.2. Identification of genetic variants

Identification of genetic variation associated with important drug or therapy-related outcomes can follow two main approaches.


The first is the candidate gene approach, in which dozens to thousands of genetic variants within one or several genes are genotyped, covering both coding and noncoding sequence. These variants include a common form of variation known as single nucleotide polymorphisms (SNPs). Generally they are chosen on the grounds of biological plausibility, which may have been established in previous studies, or of knowledge of functional genes known to be involved in pharmacokinetic and pharmacodynamic pathways or related to the disease or intermediate phenotype. Methodological and statistical issues in pharmacogenomics (J Pharm Pharmacol 2010;62(2):161-6) discusses the pros and cons of the candidate gene approach and the genome-wide scan approach (see below), and A tutorial on statistical methods for population association studies (Nat Rev Genet 2006;7(10):781-91) outlines key methods that can be used. The advantages of the candidate gene approach are that resources can be directed to several important genetic polymorphisms and that the a priori chance of finding relevant drug-gene interactions is higher. This approach, however, requires a priori information about the likelihood of the polymorphism, gene, or gene product interacting with a drug or drug pathway. Moving towards individualized medicine with pharmacogenomics (Nature 2004;429:464-8) explains that missing or incomplete information on genes from previous studies may result in failure to identify every important genetic determinant in the genome.


The second approach, known as genome-wide, is hypothesis-generating or hypothesis-agnostic and identifies genetic variants across the whole genome. Important genetic determinants are identified by comparing the frequency of genetic or SNP markers between drug responders and non-responders, or between those with and without drug toxicity. In this approach, no previous information or specific gene/variant hypothesis is needed. Because of linkage disequilibrium, whereby certain genetic determinants tend to be co-inherited, a genetic association identified through a genome-wide approach may not be a truly biologically functional polymorphism. It may instead be a linkage-related marker of another genetic determinant that is the true biologically relevant one. Thus, this approach is considered discovery in nature. It may detect SNPs in genes that were not previously considered as candidate genes, or even SNPs outside of genes. Nonetheless, failure to cover all relevant genetic risk factors can still be a problem, though less so than with the candidate gene approach. It is therefore important to conduct replication and validation studies (in vivo and in vitro) to ascertain the generalisability of findings to populations of patients, to characterise the mechanistic basis of the effect of these genes on drug action, and to identify the true biological genetic determinants. This approach is useful for studying complex diseases where multiple genetic variations contribute to disease risk, but it is applicable to both disease and treatment outcomes.
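The two ideas in this paragraph, comparing marker frequencies between groups and linkage disequilibrium, can be illustrated with a minimal sketch. It is not a method prescribed by this guide: the function names, the choice of an allelic Pearson chi-square test on a 2x2 table, the use of r² as the linkage disequilibrium measure, and all counts and frequencies are assumptions made for illustration only.

```python
import math


def allelic_association(resp_alt, resp_ref, nonresp_alt, nonresp_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 table of
    allele counts: responders vs non-responders, alternate vs reference."""
    row_resp = resp_alt + resp_ref
    row_non = nonresp_alt + nonresp_ref
    col_alt = resp_alt + nonresp_alt
    col_ref = resp_ref + nonresp_ref
    if 0 in (row_resp, row_non, col_alt, col_ref):
        raise ValueError("every row and column total must be non-zero")
    n = row_resp + row_non
    cross = resp_alt * nonresp_ref - resp_ref * nonresp_alt
    chi2 = n * cross ** 2 / (row_resp * row_non * col_alt * col_ref)
    # Upper tail of a chi-square distribution with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def ld_r_squared(freq_ab, freq_a, freq_b):
    """Squared correlation (r^2) between two biallelic loci, given the
    frequency of the A-B haplotype and of alleles A and B."""
    denom = freq_a * (1 - freq_a) * freq_b * (1 - freq_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    d = freq_ab - freq_a * freq_b  # linkage disequilibrium coefficient D
    return d * d / denom


if __name__ == "__main__":
    # Hypothetical allele counts for a single SNP (illustration only).
    chi2, p = allelic_association(60, 140, 30, 170)
    print(f"chi-square = {chi2:.2f}, p = {p:.2g}")
    # Hypothetical haplotype and allele frequencies (illustration only).
    print(f"r^2 = {ld_r_squared(0.40, 0.50, 0.50):.2f}")
```

A high r² between an associated marker and a neighbouring variant is one reason why the associated marker cannot be assumed to be the functional variant. In a genome-wide study a test like this is repeated for every marker, so the resulting p-values need correction for multiple testing.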


Various genome-wide approaches are currently available, including genome and exome sequencing and the use of chips that type hundreds of thousands to millions of SNPs (e.g. exome chip). Finally, statistical power is usually sufficient to detect only common variants with a large effect, and therefore large sample sizes should be considered, e.g. through pooling of biobanks.
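The link between the number of markers tested and the required sample size can be sketched as follows. This is a rough illustration under stated assumptions, not a recommended power calculation: it assumes a Bonferroni correction for one million independent tests, a two-sided comparison of allele frequencies between two equally sized groups using the standard normal-approximation formula, and hypothetical allele frequencies. The function names are invented for this example.

```python
import math
from statistics import NormalDist


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error
    rate at alpha across n_tests independent tests."""
    return alpha / n_tests


def alleles_per_group(p1, p2, alpha, power=0.80):
    """Approximate number of alleles needed in each group to detect a
    difference between allele frequencies p1 and p2 (two-sided test,
    normal approximation, equal group sizes)."""
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


if __name__ == "__main__":
    # Assumed number of independent tests (illustration only).
    threshold = bonferroni_threshold(0.05, 1_000_000)
    print(f"per-test threshold: {threshold:.1e}")
    # Hypothetical allele frequencies: a common variant with a large
    # difference, and a rarer variant with a small difference.
    for p1, p2 in [(0.30, 0.20), (0.05, 0.04)]:
        single = alleles_per_group(p1, p2, 0.05)
        genome_wide = alleles_per_group(p1, p2, threshold)
        print(f"{p1} vs {p2}: {single} alleles per group for one test, "
              f"{genome_wide} at the genome-wide threshold")
```

Under these assumptions, the required sample size grows both when the significance threshold is tightened and when the frequency difference shrinks, which is consistent with the need for large pooled datasets.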


