Cancer Epidemiology Biomarkers & Prevention Vol. 14, 1579-1582, June 2005
© 2005 American Association for Cancer Research


Letter

Robustness of Case-Control Studies to Population Stratification

Gary A. Heiman

Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, New York

Prakash Gorroochurn

Division of Statistical Genetics, Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, New York; University of Mauritius, Reduit, Mauritius

Susan E. Hodge and David A. Greenberg

Division of Statistical Genetics, Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, New York; Clinical-Genetic Epidemiology Unit, New York State Psychiatric Institute, New York, New York

To the Editors: Using computer simulations, Khlat et al. (1) quantified type I error increase caused by population stratification. They argued that under "realistic scenarios" (where subpopulations account for ≤10% of the study population and allelic frequency differences are ≤0.2), the inflation of type I error is of limited concern.

Results from both computer simulations (2) and theoretical analyses (3) suggest a more nuanced and complex view of population stratification. Our results are consistent with some of those by Khlat et al. (1). First, a large inflation in type I error can occur when the two subpopulations are equal sized but have moderate marker allele and disease frequency differences. Second, the confounding risk ratio provides a poor measure of the increase in type I error rate under population stratification, as we showed in ref. (3). However, our computer simulations also show that having a mixture of unequal-sized subpopulations (e.g., 10% versus 90%) does not necessarily lead to a reduction in the inflated type I error rate; neither does an increase in subpopulation number, in contrast to Wacholder et al. (4). This is particularly true when marker allele frequency, disease prevalence, and population size are not independent (3). For example, founder effects and bottlenecks can create dependence between subpopulation size and disease prevalence and/or marker allele frequency. This may lead to subpopulations with substantially elevated marker allele frequency and disease prevalence rates relative to the majority population (5).
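The mechanism described above can be illustrated with a rough calculation. The following is a minimal sketch, not the simulation code used in refs. (2) or (3): it assumes a marker with no effect on disease within any subpopulation, a two-sided allele-frequency z-test at the nominal 5% level, alleles treated as independent draws, and a normal approximation. The function names and all parameter values are hypothetical and chosen only for illustration.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def case_control_allele_freqs(weights, prevalences, freqs):
    """Expected marker allele frequency among cases and among controls.

    weights[k]     : fraction of the study population in subpopulation k
    prevalences[k] : disease prevalence in subpopulation k
    freqs[k]       : marker allele frequency in subpopulation k
    The marker is assumed unrelated to disease within each subpopulation.
    """
    case_mass = [w * d for w, d in zip(weights, prevalences)]
    ctrl_mass = [w * (1.0 - d) for w, d in zip(weights, prevalences)]
    p_case = sum(m * p for m, p in zip(case_mass, freqs)) / sum(case_mass)
    p_ctrl = sum(m * p for m, p in zip(ctrl_mass, freqs)) / sum(ctrl_mass)
    return p_case, p_ctrl


def approx_type1_error(weights, prevalences, freqs,
                       n_cases, n_controls, z_crit=1.959964):
    """Approximate rejection probability of a two-sided allele-frequency z-test.

    Because the marker has no true effect, every rejection is a false positive.
    Assumes allele frequencies strictly between 0 and 1 (nonzero standard error).
    """
    p_case, p_ctrl = case_control_allele_freqs(weights, prevalences, freqs)
    # Each person contributes two alleles (assumption: treated as independent).
    se = math.sqrt(p_case * (1.0 - p_case) / (2 * n_cases)
                   + p_ctrl * (1.0 - p_ctrl) / (2 * n_controls))
    shift = (p_case - p_ctrl) / se  # spurious mean shift of the test statistic
    return normal_cdf(-z_crit - shift) + 1.0 - normal_cdf(z_crit - shift)


# Hypothetical scenario: a 10% subpopulation with higher prevalence and a
# marker allele frequency 0.2 above that of the 90% majority.
print(approx_type1_error([0.1, 0.9], [0.05, 0.01], [0.4, 0.2], 500, 500))
# Same marker difference, but equal prevalence: no confounding by subpopulation.
print(approx_type1_error([0.1, 0.9], [0.01, 0.01], [0.4, 0.2], 500, 500))
```

Under these assumptions, the rejection probability stays at the nominal level whenever either the prevalence or the marker allele frequency is the same across subpopulations, and departs from it only when both differ, even if one subpopulation is small.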

Consequently, a marker difference of 0.2 is not the maximum difference that one can expect, as suggested by Khlat et al. (1). Indeed, Khlat et al. showed that when the marker difference is only 0.2, the type I error can reach 19% (i.e., on average, one of every five positive findings is false), which we argue is a substantial increase in the type I error rate over the nominal rate of 5%. This would be even higher with larger marker differences. Our results indicate that when the population variables are not independent, the type I error rate does not approach the nominal rate of 0.05 even with a large number of subpopulations (3). Moreover, even when the population variables are independent, the type I error rate may not converge to 0.05 until a very large number of subpopulations is reached.

Thus, the issue of population stratification is more complex than has been believed and is of more than "limited" concern. Allele frequency differences and possible interdependence of population variables suggest that population stratification cannot be dismissed out of hand.


    References

  1. Khlat M, Cazes MH, Genin E, Guiguet M. Robustness of case-control studies of genetic factors to population stratification: magnitude of bias and type I error. Cancer Epidemiol Biomarkers Prev 2004;13:1660–4.
  2. Heiman GA, Hodge SE, Gorroochurn P, Zhang J, Greenberg DA. Effect of population stratification on case-control association studies. I. Elevation in false positive rates and comparison to confounding risk ratios (a simulation study). Hum Hered 2004;58:30–9.
  3. Gorroochurn P, Hodge SE, Heiman G, Greenberg DA. Effect of population stratification on case-control association studies. II. False-positive rates and their limiting behavior as number of subpopulations increases. Hum Hered 2004;58:40–8.
  4. Wacholder S, Rothman N, Caporaso N. Population stratification in epidemiologic studies of common genetic variants and cancer: quantification of bias. J Natl Cancer Inst 2000;92:1151–8.
  5. Holgate P. A mathematical study of founder principle of evolutionary genetics. J Appl Prob 1966;3:115–28.

 
Myriam Khlat and Marie-Hélène Cazes

Institut National d'Etudes Démographiques, Paris, France

In Response: In our study of the robustness of case-control studies to population stratification, we conclude that the bias and type I error resulting from population stratification are likely to be "limited in methodologically sound case-control studies of moderate size, except in quite unrealistic scenarios." Heiman et al. argue that their own analyses (1, 2) lead to a more nuanced view of the stratification bias in case-control studies. We fully agree with their assertion that "relatively small CRR values can actually represent highly inflated type I error," and this is exactly the reason why we have focused on type I error in relation to the population variables and sample size. We feel that the conclusions of our study do complement and strengthen those of Wacholder et al. (3), and also that our findings are very much in accord with the findings of Heiman et al. The main point of divergence concerns the degree of variability of allelic frequencies across population subgroups, as Heiman et al. pinpoint the special situations of very small subpopulations exhibiting substantially elevated marker allele frequency and disease rates relative to the majority population. More generally, they argue that, in populations containing a mixture of subpopulations, the disease prevalence can depend on the size of the subpopulations, and that the contention that "the greater the number of distinct population subgroups, the smaller the bias" does not hold in that case. This is indeed a very interesting and important point, and we do agree that, due to founder effects and genetic drift related to inbreeding, the allelic frequencies in some small subpopulations may shift far away from those of the majority population. And yet, as already pointed out by Wacholder et al. (3), "only ethnic groups that maintain their individual identities are likely to remain endogamous and retain any important differences in genotype frequencies."
It can therefore be argued that, whenever such a situation arises, it concerns distinguishable ethnic groups and can be handled by matching or statistical adjustment. In that case, the allelic differences that have to be considered when investigating the stratification bias are those that remain after accounting for ethnicity, and those "residual" differences are likely to be moderate. Although we find the point made by Heiman et al. very relevant, we believe that it does not call our conclusions into question, and we maintain that, in carefully matched, moderate-size studies, the type I error associated with population stratification remains very limited in most realistic scenarios.
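The effect of statistical adjustment can be sketched with expected counts. The example below is only an illustration under stated assumptions: subpopulation membership is observed without error, the marker has no effect on disease within any subpopulation, and the Mantel-Haenszel estimator stands in for "statistical adjustment" (the text does not name a specific method). The function names and parameter values are hypothetical.

```python
def expected_allele_tables(weights, prevalences, freqs, n_cases, n_controls):
    """Expected 2x2 allele-count tables, one per subpopulation.

    Each table is (a, b, c, d):
      a = case alleles carrying the marker,    b = case alleles without it,
      c = control alleles carrying the marker, d = control alleles without it.
    """
    case_mass = [w * d for w, d in zip(weights, prevalences)]
    ctrl_mass = [w * (1.0 - d) for w, d in zip(weights, prevalences)]
    total_case, total_ctrl = sum(case_mass), sum(ctrl_mass)
    tables = []
    for cm, km, p in zip(case_mass, ctrl_mass, freqs):
        case_alleles = 2.0 * n_cases * cm / total_case
        ctrl_alleles = 2.0 * n_controls * km / total_ctrl
        tables.append((case_alleles * p, case_alleles * (1.0 - p),
                       ctrl_alleles * p, ctrl_alleles * (1.0 - p)))
    return tables


def crude_odds_ratio(tables):
    """Odds ratio from the pooled table, ignoring subpopulation."""
    a = sum(t[0] for t in tables)
    b = sum(t[1] for t in tables)
    c = sum(t[2] for t in tables)
    d = sum(t[3] for t in tables)
    return (a * d) / (b * c)


def mantel_haenszel_odds_ratio(tables):
    """Mantel-Haenszel summary odds ratio across subpopulation strata."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den


# Hypothetical 10% / 90% mixture with differing prevalence and allele frequency.
tables = expected_allele_tables([0.1, 0.9], [0.05, 0.01], [0.4, 0.2], 500, 500)
print(crude_odds_ratio(tables))            # spuriously above 1
print(mantel_haenszel_odds_ratio(tables))  # equals 1 up to rounding error
```

With perfectly recorded strata the adjusted odds ratio returns to 1; any bias that survives adjustment in practice would have to come from differences not captured by the recorded grouping, which is the "residual" case discussed above.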


    References 

  1. Heiman GA, Hodge SE, Gorroochurn P, Zhang J, Greenberg DA. Effect of population stratification on case-control association studies. I. Elevation in false positive rates and comparison to confounding risk ratios (a simulation study). Hum Hered 2004;58:30–9.
  2. Gorroochurn P, Hodge SE, Heiman G, Greenberg DA. Effect of population stratification on case-control association studies. II. False-positive rates and their limiting behavior as number of subpopulations increases. Hum Hered 2004;58:40–8.
  3. Wacholder S, Rothman N, Caporaso N. Population stratification in epidemiologic studies of common genetic variants and cancer: quantification of bias. J Natl Cancer Inst 2000;92:1151–8.



