1 Division of General Internal Medicine, Departments of 2 Medicine and 3 Biopharmaceutical Sciences, 4 Center for Human Genetics, and 5 Comprehensive Cancer Center, University of California San Francisco, San Francisco, California and 6 Northern California Cancer Center, Fremont, California
Requests for reprints: Elad Ziv, Division of General Internal Medicine, University of California San Francisco, Box 1732, San Francisco, CA 94143. Phone: 415-353-7981; Fax: 415-353-7932. E-mail: elad.ziv{at}ucsf.edu
| Abstract |
|---|
Methods: We used 44 ancestry informative markers to estimate individuals' genetic ancestry in 563 Latina participants. To test whether ancestry is a predictor of hormone therapy use, parity, and body mass index (BMI), we used multivariate logistic regression models to estimate odds ratios (OR) and 95% confidence intervals (95% CI) associated with a 25% increase in Indigenous American ancestry, adjusting for age, education, and the participant's and grandparents' place of birth.
Results: Hormone therapy use was significantly less common among women with higher Indigenous American ancestry (OR, 0.78; 95% CI, 0.63-0.96). Higher Indigenous American ancestry was also significantly associated with overweight (BMI, 25-29.9 versus <25) and obesity (BMI, ≥30 versus <25), but only among foreign-born Latina women (OR, 3.44; 95% CI, 1.97-5.99 and OR, 1.95; 95% CI, 1.24-3.06, respectively).
Conclusion: Some breast cancer risk factors are associated with genetic ancestry among Latinas in the San Francisco Bay Area. Therefore, case-control genetic association studies for breast cancer should directly measure genetic ancestry to avoid potential confounding. (Cancer Epidemiol Biomarkers Prev 2006;15(10):1878-85)
| Introduction |
|---|
35% lower than the rates of Caucasian women (2). Latinos are known to be an admixed population with genetic ancestry from Europeans, Indigenous Americans, and Africans (3-7). The proportions of these ancestral contributions vary depending on the country and region of origin of individuals (8). In the San Francisco Bay Area, most Latinas are of Mexican or Central American descent. These women are, in turn, known to be of mostly European and Indigenous American ancestry. In genetic association studies of cases (individuals with the disease of interest) and controls (individuals without the disease), admixture may lead to false-positive or false-negative results if cases and controls differ in their genetic ancestry (7, 9, 10). The degree to which such confounding would occur depends on whether genetic ancestry is associated with the disease under study (10, 11). If genetic ancestry is associated with disease, because of either genetic or environmental differences between ancestral groups in the admixed population, the likelihood of both false-positive and false-negative results will be increased in case-control studies (11). It is important to note that the association between genetic ancestry and a trait may be due to non-genetic risk factors. For example, if certain environmental risk factors are more common in a population with one ancestry, then case-control association studies of genetic variants would still be confounded.
There has been considerable controversy about the degree to which population stratification may affect case-control studies of cancer (12-14), but there is little data on how stratification may affect cancer studies among Latinos. Because they are genetically admixed and have significantly lower breast cancer incidence rates than Whites or African Americans, Latinas offer an opportunity to address this question. We used a series of ancestry informative markers to estimate the genetic ancestry of 241 Latina breast cancer cases and 333 age-matched Latina population controls to determine the distribution of genetic ancestry and evidence for substructure and recent admixture among Latinas and to test the association of genetic ancestry with breast cancer risk factors. We reasoned that if risk factors for breast cancer are associated with genetic ancestry, then genetic association studies in this population are likely to be confounded by ancestry.
| Materials and Methods |
|---|
Cases
Of 4,842 cases identified through the cancer registry, 618 (13%) could not be contacted (168 deceased, 71 physician refusal, 379 moved or lost). A brief telephone screening interview that assessed self-identified race/ethnicity and study eligibility was completed by 90% of cases. Of 357 cases who self-identified as Hispanic or Latina, 324 (91%) completed the in-person interview, and 241 (68%) provided a blood sample.
Controls
From the pool of eligible women identified through random-digit dialing (81% response to household enumeration), 1,479 were selected as controls. Of these, 103 (7%) could not be contacted (9 deceased, 94 moved or lost), and of the remaining controls, 93% completed the screening interview. Of 479 controls who self-identified as Hispanic or Latina, 421 (88%) completed the interview, and 333 (70%) provided a blood sample.
The analysis was based on 563 Latinas, after excluding 10 women who reported being born in or having all four grandparents from Spain or the Philippines and one woman who did not give any information about her own and her grandparents' place of birth.
Among foreign-born women, 175 were from Mexico, 91 from Central America (44 from El Salvador, 27 from Nicaragua, 12 from Guatemala, 4 from Costa Rica, 3 from Panama, and 1 from Honduras), 30 from South America (8 from Colombia, 8 from Peru, 4 from Argentina, 4 from Ecuador/Galapagos, 3 from Chile, 1 from Bolivia, 1 from Brazil, and 1 from Uruguay), and 10 from the Caribbean (5 from Puerto Rico, 4 from Cuba, 1 from the Dominican Republic). U.S.-born women were categorized further based on their report of grandparents' place of birth, including Mexico (n = 100), Central America (n = 2), and Caribbean (n = 8), and those who reported grandparents from more than one region were classified as "mixed origin" (n = 116). Women with all four grandparents born in the United States were grouped separately (n = 31).
Institutional review boards at all participating institutions approved the study and all of the participants gave written informed consent.
Data Collection
An extensive structured questionnaire was given in participants' homes by bilingual and bicultural professional interviewers in English or Spanish to collect information on parents' and grandparents' country of birth, residential history, family history of breast cancer, menstrual and reproductive history, hormone therapy use, and other lifestyle factors. Body mass index (BMI) was calculated as weight (kg) divided by height (m) squared as measured by the interviewer at the time of the interview.
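The BMI definition above, together with the cut points used in the regression models (<25, 25-29.9, and 30 or higher), can be sketched in a few lines of Python. The function names `bmi` and `bmi_category` are hypothetical and are not taken from the study's analysis code, which was run in Stata.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in meters squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Assign a BMI value to one of the three groups compared in the models."""
    if value < 25:
        return "normal (<25)"
    if value < 30:
        return "overweight (25-29.9)"
    return "obese (>=30)"


if __name__ == "__main__":
    # Illustrative measurements only, not study data.
    value = bmi(70.0, 1.60)
    print(round(value, 1), bmi_category(value))
```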
Marker Selection
Forty-four markers were preselected to be informative for either Indigenous American-European ancestry differences, European-African ancestry differences, or Indigenous American-African ancestry differences. Markers were identified either based on previous reports in the literature or differences in allele frequency in existing databases as previously described (6). For each marker, the allele frequencies in ancestral populations were confirmed in samples from European populations (n = 243): Ireland, England, Germany, and Spain; Indigenous American populations from the western United States, Mexico, and Central America (n = 148): Maya, Pima, Cheyenne, and Pueblo; and sub-Saharan African populations (n = 481): Nigeria, Central African Republic, and Sierra Leone. The mean allele frequency difference for all 44 markers is 0.30 between the European and Indigenous American populations, 0.42 between the European and African populations, and 0.42 between the African and Indigenous American populations.
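As an illustration of the summary statistic quoted above, the following Python sketch averages the per-marker allele frequency difference between two ancestral populations. It assumes the difference is taken as an absolute value per marker, which the text does not state explicitly. The frequencies are made-up placeholders, not the study's marker data, and `mean_delta` is a hypothetical helper name.

```python
from typing import Sequence


def mean_delta(freqs_a: Sequence[float], freqs_b: Sequence[float]) -> float:
    """Mean absolute allele frequency difference across markers.

    freqs_a and freqs_b hold the frequency of the same reference allele
    at each marker in two ancestral populations.
    """
    if len(freqs_a) != len(freqs_b) or not freqs_a:
        raise ValueError("need two non-empty frequency lists of equal length")
    return sum(abs(a - b) for a, b in zip(freqs_a, freqs_b)) / len(freqs_a)


if __name__ == "__main__":
    # Placeholder frequencies for three markers (assumed values).
    european = [0.80, 0.10, 0.55]
    indigenous_american = [0.30, 0.45, 0.50]
    print(mean_delta(european, indigenous_american))
```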
Genotyping
Genotyping was done using single base extension, and detection of specific alleles was done by fluorescence polarization. A complete list of primers and conditions used for these primers is available in Salari et al. (6).
Statistical Analysis
We used χ² tests to test for deviations from Hardy-Weinberg equilibrium. We also examined the proportion of heterozygotes in the expected minus the observed data as a means of detecting the direction of deviation from Hardy-Weinberg equilibrium because population substructure leads to excess homozygosity (17). We also tested for allelic association among all pairs of markers on different chromosomes using ΔAB, as described by Weir et al. (18), which tests for correlation among genotypes. Allelic association among markers that are physically unlinked implies the presence of population substructure (i.e., non-random mating) and/or recent admixture (19). Population substructure and recent admixture without substructure would both lead to variation in individual ancestry and the possibility of association between ancestry and the phenotype of interest.
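The Hardy-Weinberg check described above can be sketched as follows. This is a minimal Python illustration for one biallelic marker, assuming a 1-degree-of-freedom chi-square test on the three genotype counts. The function name and the example counts are hypothetical. The third returned value is the expected minus the observed heterozygote proportion, so a positive value indicates excess homozygosity.

```python
import math
from typing import Tuple


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float, float]:
    """Chi-square test for Hardy-Weinberg equilibrium at one biallelic marker.

    Returns (chi-square statistic, P value, expected minus observed
    heterozygote proportion).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("marker is monomorphic; the test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    het_deficit = expected[1] / n - n_ab / n
    return chi2, p_value, het_deficit


if __name__ == "__main__":
    # Made-up counts with fewer heterozygotes than expected.
    print(hwe_chi_square(40, 20, 40))
```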
We estimated individual genetic ancestry by a maximum likelihood approach (20, 21). Briefly, for each genotype, an expression of the likelihood of origin from each of three populations is derived, based on the allele frequencies in the ancestral populations. The sum of the log-likelihoods for all genotypes for an individual is maximized over the range of possible values of ancestry. A program for maximum likelihood estimation of ancestry in JAVA is available from the authors upon request. We also used structure, a population genetics program that implements a Bayesian approach to infer individual ancestry (22), as a complementary method of analysis. For structure, we inputted the ancestral population data as part of the genotype file but did not use population labels. Thus, the inference about population membership for both ancestral populations and for the cases and controls was based on the structure inference alone. To compare ancestry among subgroups of Latinas, we used rank sum tests. We compared the results of the structure analysis and the maximum likelihood analysis and found very strong correlations between these two approaches for % European ancestry (r = 0.95), Indigenous American ancestry (r = 0.95), and African ancestry (r = 0.94). Given the high correlation, we present results from the maximum likelihood analysis only.
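To make the maximum likelihood idea concrete, here is a minimal Python sketch. It is not the authors' JAVA program. It assumes a standard admixture likelihood in which an individual's expected allele frequency at each marker is the ancestry-weighted mean of the three ancestral frequencies, with genotypes treated as two independent allele draws. The grid search, the function names, and the example frequencies are all assumptions made for illustration.

```python
import math
from typing import Sequence, Tuple

# Reference-allele frequency in (European, Indigenous American, African).
Freqs = Tuple[float, float, float]
Ancestry = Tuple[float, float, float]


def log_likelihood(genotypes: Sequence[int], freqs: Sequence[Freqs],
                   ancestry: Ancestry) -> float:
    """Sum of per-marker log-likelihoods for one individual.

    Each genotype is the count (0, 1, or 2) of the reference allele.
    """
    total = 0.0
    for g, marker_freqs in zip(genotypes, freqs):
        f = sum(a * p for a, p in zip(ancestry, marker_freqs))
        f = min(max(f, 1e-6), 1 - 1e-6)  # guard against log(0)
        total += (math.log(math.comb(2, g))
                  + g * math.log(f) + (2 - g) * math.log(1 - f))
    return total


def estimate_ancestry(genotypes: Sequence[int], freqs: Sequence[Freqs],
                      steps: int = 100) -> Ancestry:
    """Grid search over ancestry proportions that sum to 1."""
    best: Ancestry = (1.0, 0.0, 0.0)
    best_ll = -math.inf
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            candidate = (i / steps, j / steps, (steps - i - j) / steps)
            ll = log_likelihood(genotypes, freqs, candidate)
            if ll > best_ll:
                best, best_ll = candidate, ll
    return best


if __name__ == "__main__":
    # Made-up frequencies for six markers and one individual's genotypes.
    freqs = [(0.9, 0.1, 0.1), (0.9, 0.1, 0.1), (0.1, 0.9, 0.1),
             (0.1, 0.9, 0.1), (0.1, 0.1, 0.9), (0.1, 0.1, 0.9)]
    print(estimate_ancestry([2, 1, 1, 2, 0, 0], freqs))
```

A grid search is used only because it is easy to verify; with 44 markers and a step of 0.01 it evaluates about 5,000 candidate ancestry vectors per individual, which is inexpensive.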
We used ANOVA to test the association between ancestry and various breast cancer risk factors, including BMI, parity (number of full-term pregnancies), age at first full-term pregnancy, history of hormone therapy use (yes/no), age at menarche, and age at menopause (surgical or natural). Because the purpose of the current study is to examine the association between risk factors and genetic ancestry, we included both breast cancer cases and controls in the ANOVA, adjusting for case/control status. We used logistic regression models to further explore associations of ancestry with overweight (BMI 25-29.9 versus <25), obesity (BMI ≥30 versus <25), parity (<2 versus ≥2), and history of hormone therapy use (yes versus no), adjusting for age, education (some high school or less, high school graduation, some college, and college graduation or higher), and grandparents' country of birth (Mexico, Central America, South America, Caribbean, United States, or mixed). In addition, logistic regression models that included women born outside of the United States adjusted for age at migration to the United States, and models that included all women adjusted for place of birth (foreign born versus U.S. born).
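The odds ratios in this report are expressed per 25% increase in Indigenous American ancestry. As a hedged sketch of how such a rescaling works, suppose ancestry enters the logistic model as a proportion between 0 and 1, so that the fitted coefficient is the log odds per 100% change. The odds ratio for a 25% increase is then exp(0.25 × coefficient), with the confidence limits rescaled the same way. The function name and the example coefficient below are hypothetical and are not estimates from this study.

```python
import math
from typing import Tuple


def odds_ratio_per_increment(beta: float, se: float, increment: float = 0.25,
                             z: float = 1.96) -> Tuple[float, float, float]:
    """Rescale a logistic regression coefficient to an odds ratio.

    beta and se are the log-odds coefficient and its standard error per
    unit of the predictor (here, ancestry as a proportion from 0 to 1).
    Returns (odds ratio, lower 95% limit, upper 95% limit) for the increment.
    """
    point = math.exp(increment * beta)
    lower = math.exp(increment * (beta - z * se))
    upper = math.exp(increment * (beta + z * se))
    return point, lower, upper


if __name__ == "__main__":
    # Assumed coefficient and standard error, for illustration only.
    print(odds_ratio_per_increment(beta=-1.0, se=0.4))
```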
We also examined whether the association between individual ancestry informative markers and breast cancer risk is modified by adjustment for individual ancestry. We tested the association between each marker and breast cancer risk using logistic regression models and entering each marker into the model using an additive model. We then tested the association between each marker and breast cancer risk, adjusting for individual ancestry, using both African and Native American ancestry in the model. (Entering all three ancestral components into the model is not possible because knowledge of two of the ancestral components perfectly determines the third component). For each marker, we determined the negative log of the P value as a summary estimate of the strength of the association before and after adjustment for ancestry. We carried out the same analysis for BMI using logistic regression models, which compared women in the normal weight group (BMI < 25) with obese women (BMI > 30).
All statistical tests were two sided and done using Stata (version 8.0).
| Results |
|---|
2 full-term pregnancies; P = 0.04), and high BMI (BMI ≥25; P = 0.01) and a nonsignificant association with no history of hormone therapy use (P = 0.2). In analysis of cases only, we found associations in the same direction: significant associations with high BMI (P = 0.01) and no history of hormone therapy use (P = 0.02) and nonsignificant associations with low education (P = 0.1) and high parity (P = 0.1). We found no evidence for a statistical interaction among genetic ancestry, case/control status, and any of these risk factors, although tests for interaction in this data set are likely to be underpowered.
Because Indigenous American and European ancestry are inversely related (see Fig. 1), the risk factors associated positively with Indigenous American ancestry were inversely associated with European ancestry. We found no significant associations between African ancestry and breast cancer risk factors.
We tested the association between each ancestry informative marker and breast cancer risk before and after adjustment for ancestry. To compare the association before and after adjustment, we plotted the negative log P values for association with each marker. Figure 2 represents the results of this analysis, with each dot representing one marker's negative log P for association before (on the x-axis) and after (on the y-axis) adjustment. A negative log P > 1.3 is equivalent to P < 0.05. Only 2 of the 44 markers were significantly associated with breast cancer risk (P < 0.05). After adjustment, only one of these markers remained significant. Adding ancestry to the models had little effect on the overall distribution of associations with these markers (Fig. 2A).
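The threshold quoted above follows from taking the logarithm to base 10, since -log10(0.05) is about 1.30. A short Python sketch of the transformation, with a hypothetical helper that counts markers passing the threshold; the P values in the example are placeholders, not study results.

```python
import math
from typing import Iterable


def neg_log10_p(p_value: float) -> float:
    """Negative base-10 logarithm of a P value."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("P value must be in (0, 1]")
    return -math.log10(p_value)


def count_significant(p_values: Iterable[float], alpha: float = 0.05) -> int:
    """Number of markers whose -log10(P) exceeds -log10(alpha)."""
    threshold = neg_log10_p(alpha)
    return sum(1 for p in p_values if neg_log10_p(p) > threshold)


if __name__ == "__main__":
    print(round(neg_log10_p(0.05), 3))
    # Placeholder P values before and after a hypothetical adjustment.
    print(count_significant([0.01, 0.04, 0.20]), count_significant([0.03, 0.09, 0.25]))
```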
30) before and after adjustment for ancestry. Five of the 44 markers were significantly associated with obesity before adjustment. However, after adjustment, only 3 of 44 markers remained significant. Furthermore, the overall distribution of associations seems substantially more attenuated after adjustment for ancestry in the models (Fig. 2B).
| Discussion |
|---|
This analysis also identified extensive diversity in genetic ancestry within immigrants from each region. Such diversity creates the possibility of false-positive and false-negative association results in case-control studies. However, for false-positive and false-negative results to occur, there also needs to be an association between breast cancer risk and ancestry (either due to genetic or environmental differences between ancestral groups in the admixed population).
We found significant associations between several risk factors for breast cancer and genetic ancestry among Latinas. Hormone therapy use was lower among Latinas with higher Indigenous American ancestry or lower European ancestry. Even among Latinas born in the United States, and after adjustment for education and grandparents' country of birth, there remained an association between higher Indigenous American ancestry and lower hormone therapy use. Thus, there are other factors, possibly related to either cultural factors or differential access to care, that are associated with lower likelihood of hormone therapy use among U.S.-born Latinas with higher Indigenous American ancestry. We also found an association between parity and higher Indigenous American ancestry. However, this association was largely accounted for by differences in education and grandparents' country of birth.
Clearly, the associations of less hormone therapy use and higher parity with Indigenous American ancestry are due to non-genetic differences among Latinas of different ancestral backgrounds. However, these associations may also confound genetic association studies. For example, Latinas with higher Indigenous American ancestry who are less likely to have used hormone therapy and more likely to have had higher parity may be at lower risk of breast cancer. Thus, any allele that is at higher frequency in the European ancestral population may be associated with breast cancer due to a confounding effect from differences in non-genetic factors.
Adjusting for known non-genetic risk factors should eliminate confounding by such factors in genetic association studies. However, because not all risk factors are known for breast cancer, adjusting for the currently known risk factors may not completely eliminate confounding. Among Latinas, unknown risk factors that correlate with country of birth, socioeconomic status, and acculturation may be important determinants of breast cancer risk. For example, Latinas who are primarily Spanish speaking are at lower risk of breast cancer compared with Latinas who are English speaking, even after adjustment for all known risk factors (16). Thus, other unknown environmental risk factors for breast cancer may exist in this population, and these factors may also be associated with cultural practices and genetic ancestry. Therefore, our results imply that genetic association studies in unrelated individuals from Latina populations should include measurement of genetic ancestry to avoid confounding.
A stronger case for confounding in genetic association studies among Latinas can be made if markers at candidate genes are associated with breast cancer, and it can be shown that these associations are attenuated after adjustment for genetic ancestry. The present results show a substantial effect of admixture on the association between ancestry informative markers and obesity; the magnitude of association was diminished for most markers. Furthermore, because our ancestry estimates are not free of error, which tends to diminish the effect of the adjustment, it is possible that we would have seen an even greater effect of adjustment had we used more markers for a more precise estimate of ancestry. Our study did not find any substantial effect of adjustment for ancestry on the association between breast cancer and these ancestry informative markers. It is possible, however, that more modest effects of adjustment for ancestry would have been detected with a larger sample size.
Higher BMI has generally been associated with increased risk of breast cancer in postmenopausal women. This analysis identified an association between BMI and Indigenous American ancestry among foreign-born Latinas, with a higher proportion of Indigenous American ancestry found among overweight and obese women. This association remained significant when adjusting for country of origin. However, there may be non-genetic explanations for this association; unmeasured differences in dietary habits and physical activity associated with culture, education, and socioeconomic status may underlie these differences in part. In addition, our observation that the association with BMI and ancestry differs by place of birth suggests that even if this is a genetic effect, it may be modified by environmental factors.
Several studies of cancer have shown that genetic ancestry could be a source of confounding. Kittles et al. showed that among African-American prostate cancer cases and controls, the distribution of ancestry informative markers was significantly different, suggesting that case-control studies of prostate cancer among African Americans may be confounded by genetic ancestry (27). Similarly, Freedman et al. showed confounding in a case-control study of prostate cancer among African Americans (28). Barnholtz-Sloan et al. showed significant confounding by genetic ancestry, even after accounting for self-reported race/ethnicity in a study of lung cancer (29). We have previously shown that studies of asthma susceptibility (7) and severity (6) are potentially confounded by genetic ancestry among Latinos.
Because the populations we studied were mainly of European and Indigenous American ancestry, our study has good statistical power to assess associations with differences between European and Indigenous American ancestry. Although we found no significant associations between African ancestry and breast cancer risk factors, the population in this study had relatively little African ancestry, which limited our ability to draw conclusions about associations with African ancestry.
This study was also limited by the use of only 44 markers to assess genetic ancestry. Measurement of genetic ancestry with markers is always associated with a certain amount of random error, due to the limited number of markers used and the imperfect information from each marker (11, 30). The greater the number of markers and the more informative each marker is for ancestry, the lower the error is. Simulation studies of a three-population model with a range of ancestry informative markers similar to the one used in this study suggest that the correlation coefficient between the ancestry estimate and the true ancestry is 0.8 (31). Because the error in the estimate of ancestry is random with respect to the associations we tested, it should generally bias our results towards the null hypothesis. Thus, the associations we observed in this study are likely to represent conservative estimates.
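The bias towards the null from random error in the ancestry estimate can be illustrated with a small simulation. The Python sketch below is a simplified illustration under stated assumptions, not a reproduction of the cited simulation studies: it uses a continuous outcome and a linear model rather than the logistic models of this study, treats true ancestry as a standardized normal score, and adds independent normal error chosen so that the correlation between estimated and true ancestry is 0.8. Under these assumptions the fitted slope shrinks by roughly the squared correlation (0.64).

```python
import random


def attenuated_slope(n: int = 20000, true_slope: float = 1.0,
                     target_r: float = 0.8, seed: int = 1) -> float:
    """Least-squares slope of an outcome on an error-prone ancestry score.

    The noise standard deviation is chosen so that the correlation between
    the error-prone score and the true score equals target_r.
    """
    rng = random.Random(seed)
    noise_sd = (1.0 / target_r ** 2 - 1.0) ** 0.5
    xs, ys = [], []
    for _ in range(n):
        truth = rng.gauss(0.0, 1.0)
        xs.append(truth + rng.gauss(0.0, noise_sd))          # measured ancestry
        ys.append(true_slope * truth + rng.gauss(0.0, 1.0))  # outcome
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


if __name__ == "__main__":
    # The estimated slope falls below the true slope of 1.0.
    print(attenuated_slope())
```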
The present analysis of genetic ancestry and breast cancer risk factors combined cases and controls and adjusted for case/control status. However, if there were interactions among case/control status, genetic ancestry, and any of these risk factors, an adjusted analysis would be flawed. We found no significant interactions among any of these risk factors, genetic ancestry, and case/control status.
This study included Latina women living in the San Francisco Bay Area. Other regions of the United States have different distributions of Latinos from Mexico, Central America, South America, and the Caribbean. Therefore, the distributions of ancestry and the association with breast cancer risk factors we identified are likely to vary in different regions in the United States. In addition, the associations we identified are likely due to various degrees of acculturation, which may differ in different regions of the United States.
We used grandparents' country of birth to assess whether genetic ancestry would provide additional information. A further question, not directly assessed in this study, is whether self-report of grandparents' or parents' ethnicity (e.g., European, Indigenous, and African) is a reliable estimate of genetic ancestry. Williams et al. reported that among Pima Indians, self-reported ancestry based on grandparents' origin performs similarly to genetic ancestry based on 12 markers. We did not directly ask women about their parents' and grandparents' Indigenous American, European, or African ancestry. The degree to which this would be well known depends on how recent admixture has been in these populations. We are not aware of any empirical data comparing self-reported ethnicity/population ancestry with genetic assessment among Latinos.
The chance of a false-positive result due to differences in the genetic ancestry of cases and controls increases as the sample size increases (32). Because the goal of genetic association studies for complex traits, including breast cancer, is often to identify relatively modest effects, the sample sizes planned for such studies are often in the thousands (33). Even if there are subtle associations between ancestry and the phenotype of interest, the chance of false positives may be magnified by the large sample sizes required for such studies.
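The dependence of false positives on sample size can be sketched numerically. The Python example below assumes a two-sided z-test comparing allele frequencies between equal numbers of cases and controls, with a small fixed spurious frequency difference attributable to ancestry differences alone. The allele frequency, the size of the spurious difference, and the function name are all assumptions chosen for illustration.

```python
from statistics import NormalDist


def false_positive_rate(n_per_group: int, freq: float = 0.3,
                        spurious_diff: float = 0.01,
                        alpha: float = 0.05) -> float:
    """Probability of rejecting the null when the only allele frequency
    difference between cases and controls is due to stratification.

    Uses a normal approximation; each group contributes 2 * n_per_group alleles.
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    # Standard error of the difference between two allele frequencies.
    se = (2 * freq * (1 - freq) / (2 * n_per_group)) ** 0.5
    shift = spurious_diff / se
    return norm.cdf(-z_crit - shift) + 1 - norm.cdf(z_crit - shift)


if __name__ == "__main__":
    for n in (500, 5000, 50000):
        print(n, round(false_positive_rate(n), 3))
```

With no spurious difference the rejection rate stays at the nominal 5% regardless of sample size; with any fixed nonzero difference it rises as the number of cases and controls grows.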
In summary, we identified substantial variation in individual ancestry among Latina women living in the San Francisco Bay Area and associations between genetic ancestry and hormone therapy use and BMI. These results suggest that population stratification may affect the results of genetic association studies for breast cancer among Latinas. Therefore, such studies should collect information about genetic ancestry to assess and adjust for differences in ancestry between cases and controls.
| Footnotes |
|---|
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Note: A preliminary version of some of the results in this article was presented at the American Society of Human Genetics Meeting, Salt Lake City, Utah, 2005.
Received 2/8/06; revised 7/28/06; accepted 8/14/06.