Abstract:
Although the importance of selecting cases and controls from the same population has been recognized for decades, the recent advent of genome‐wide association studies has heightened awareness of this issue. Because these studies typically deal with large samples, small differences in allele frequencies between cases and controls can easily reach statistical significance. When, unbeknownst to a researcher, cases and controls have different substructures, the number of false‐positive findings is inflated. Three purely statistical approaches to assessing the ancestral comparability of case and control samples have recently been developed: genomic control, structured association, and multivariate reduction analyses. The widespread use of high‐throughput technology has allowed the quick and accurate genotyping of the large number of markers required by these methods. Group 13 dealt with four population stratification issues: single‐nucleotide polymorphism marker selection, association testing, nonstandard methods, and linkage disequilibrium calculations in stratified or mixed ethnicity samples. We demonstrated that there are continuous axes of ethnic variation in both data sets of Genetic Analysis Workshop 16. Furthermore, ignoring this structure created P‐value inflation for a variety of phenotypes. Including the components from principal‐components analysis (or the coordinates from multidimensional scaling) as covariates in a logistic regression can control this inflation. One can also weight for local ancestry estimates and allow the use of related individuals. Problems arise in the presence of extremely high association or unusually strong linkage disequilibrium (e.g., in chromosomal inversions). Our group also reported a method for performing an association test that controls for substructure when genome‐wide markers are not available to explicitly compute stratification. Genet. Epidemiol. 33 (Suppl. 1):S88–S92, 2009. © 2009 Wiley‐Liss, Inc.
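The P‐value inflation described above is commonly summarized by the genomic‐control inflation factor, the ratio of the observed median test statistic to its expected value under the null. The following is a minimal sketch of that standard calculation, not the procedure used by any Group 13 contribution; the function names and the input statistics are hypothetical, and the statistics are assumed to be 1‐degree‐of‐freedom chi‐square association tests.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(chi2_stats):
    """Genomic-control lambda: observed median statistic over its null median."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def genomic_control_adjust(chi2_stats):
    """Divide each statistic by lambda, flooring lambda at 1 so that
    deflated statistics are never made more liberal."""
    lam = max(1.0, inflation_factor(chi2_stats))
    return [s / lam for s in chi2_stats]


# Hypothetical 1-df association statistics, for illustration only.
stats = [0.2, 0.5, 0.91, 1.3, 6.0]
lam = inflation_factor(stats)
adjusted = genomic_control_adjust(stats)
```

A value of `lam` well above 1 suggests inflation consistent with unaccounted substructure. Dividing by a single factor assumes the inflation is uniform across markers, which is one reason covariate‐based corrections such as principal components are often preferred.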
Topic:
Genetic Associations and Epidemiology