
Laboratory of Population Genetics, National Cancer Institute, NIH, Bethesda, Maryland 20892 [D. J. K., J. P. S.], and Department of Epidemiology, Johns Hopkins University School of Public Health, Baltimore, Maryland [T. H. B.]
| Abstract |
|---|
Segregation analysis was used to evaluate the likelihood of various genetic and nongenetic models. Sporadic, environmental, and general Mendelian genetic models fit the family data poorly and were rejected. A Mendelian recessive model fit better than dominant and codominant models, although none of these could be rejected. Cumulative incidence curves predicted by the recessive and codominant models fit observed incidence among first-degree relatives well. The assumption of Mendelian transmission of a major recessive gene(s) is compatible with the data.
The recessive model predicts that 4% of women would carry the high-risk genotype, with 85% of them developing breast cancer by age 70. There was significant heterogeneity between these families and the 114 BRCA1/2 mutation-positive families from the same study population, implying that this apparent recessive effect is not because of undetected BRCA1/2 mutations. The study adds support for a major autosomal recessive component to breast cancer susceptibility.
| Introduction |
|---|
3 cases do not carry a BRCA1 or BRCA2 mutation. Three United States studies of high-risk families support these findings (10, 11, 12); between 19% and 39% of these families were unlinked to either of these two genes. The authors of one study (11) concluded that "at least one more major gene for inherited (breast) cancer remains to be found." Some linkage studies looking for such a gene (13, 14) were based on a rare dominant gene model (15) that was also used as the basis for linkage studies for BRCA2. Whereas this model may be applicable to additional breast cancer genes, it may be possible to refine estimates of the penetrance and allele frequency of such genes by performing a new segregation analysis for breast cancer that excludes BRCA1 and BRCA2 families. One segregation analysis of families of known BRCA1 and BRCA2 noncarriers published recently (16) tested Mendelian and polygenic models, and found a Mendelian recessive gene best described breast cancer patterns among 858 study families with no detected BRCA1/2 mutation.

We performed segregation analysis on families without BRCA1/2 mutations to look for statistical evidence of an additional major gene(s) that influences age-of-onset of breast cancer. A total of 231 Jewish families were studied that contain at least one affected family member who tested negative for mutations in BRCA1 and BRCA2. We also tested whether age-specific risks, penetrance, and allele frequency of the hypothetical gene(s) appear distinct from BRCA1 and BRCA2 mutation carrier families from the same study population. This analysis of observational data cannot prove the existence of another causal genetic factor but can provide statistical evidence to support or reject additional genetic control of breast cancer and serve as the basis for linkage studies to help locate such genes.

| Materials and Methods |
|---|
80% of mutations in the Ashkenazim (18). Neither personal nor family history of breast cancer was used as a criterion for entry into the study. Only the 5108 (96%) volunteers who gave consent for future use of their information at the initial ascertainment were eligible for this study. After using personal information to determine that members of 4873 distinct families had participated, all of the personal identifiers were removed from the blood samples and questionnaires, rendering them anonymous and preventing identification or recontact of volunteers and their relatives. Volunteers from 255 families had a history of breast cancer but did not carry a founder mutation.
Approximately two-thirds of Jewish breast-ovarian cancer families are predicted to carry a BRCA1 mutation (19). Additionally, in both the CASH and WAS populations, after excluding BRCA1/2 mutation carriers, a family history of ovarian cancer does not increase the odds of breast cancer (20, 21). Thus, families of WAS breast cancer cases (n = 24) with any reported ovarian cancer were excluded, to reduce the likelihood of including families segregating undetected BRCA1 mutations. None of the families studied reported a case of male breast cancer.
Segregation analysis was applied to the remaining families of 231 unrelated volunteers with breast cancer who do not carry the three BRCA1/2 founder mutations. The probands in these families are referred to as noncarriers. Family histories of cancer obtained from volunteers included information on parents, sisters, and daughters over age 20. Of 637 female first-degree relatives, 602 (95%) had complete data available (including current age or age at death, cancer history, type of cancer, and age at diagnosis). These included 78 of 82 relatives who were reported as having had breast cancer (Table 1).
A formal test of heterogeneity compared the families of 231 noncarriers to 114 eligible families of WAS probands who do carry a BRCA1/2 founder mutation. Mutation carrier pedigrees were constructed and checked using the method described above for noncarriers, although families with ovarian cases were included to retain as much information on carriers as possible. Of 268 eligible first-degree female carrier relatives over 20 years of age, complete information was available on 258 (96%), including 59 of 60 breast cancer cases (98%; Table 1).
In addition, risk factor data from women who volunteered for WAS, and had neither breast cancer, a founder mutation, nor family history of ovarian cancer (n = 3,193) were used for comparison with information from the 231 affected probands.
This study was approved by the Johns Hopkins School of Public Health Institutional Review Board and the Office of Human Subjects Research at the NIH.
Statistical Methods
Preliminary Analysis.
Initially all of the female WAS volunteers with no founder mutation and no ovarian cancer in their family (n = 3193) were analyzed to examine whether known risk factors for breast cancer were associated with the disease. Age-adjusted multivariate logistic models predicted odds of breast cancer associated with oral contraceptive use, hormone replacement therapy, early age at first birth, parity, age at menarche, and menopausal status. Increased odds of disease associated with family history were also calculated in noncarriers, adjusting for the same covariates.
Segregation Analysis.
Patterns of disease distribution within families (dominant Mendelian, recessive Mendelian, and so forth) were modeled using modified logistic regression (22). In addition to Mendelian models, sporadic, environmental, and polygenic models were tested. The models that best fit the study families were compared with a general, unrestricted model using the LRT. Models can include one, two, or three "types" of subjects, represented as type AA, AB, and BB. In the genetic models, these correspond to the genotypes of a hypothetical biallelic gene. Transmission parameters in the model (τAA, τAB, and τBB) represent the probability that a parent of a given type transmits a factor "A" to a child. In Mendelian models these τs are fixed at values of 1.0, 0.5, and 0.0, respectively. The parameter γ (penetrance) estimates the proportion of the population developing disease if they lived indefinitely (23). The parameters β and α are used to describe the mean age of onset and its variance.
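The role of the transmission parameters can be illustrated with a minimal sketch. Writing them as τAA, τAB, and τBB, and assuming the two parents transmit independently, the distribution of a child's type follows directly from the parental types. The function and variable names below are illustrative only and are not part of the S.A.G.E. software.

```python
from itertools import product

# Mendelian transmission probabilities: chance that a parent of each
# type passes factor "A" to a child (values stated in the text).
MENDELIAN_TAU = {"AA": 1.0, "AB": 0.5, "BB": 0.0}


def offspring_type_probs(mother, father, tau=MENDELIAN_TAU):
    """Return P(child type) given both parental types and a tau table.

    Assumes the two parents transmit independently of each other.
    """
    t_m, t_f = tau[mother], tau[father]
    return {
        "AA": t_m * t_f,                          # "A" from both parents
        "AB": t_m * (1 - t_f) + (1 - t_m) * t_f,  # "A" from exactly one parent
        "BB": (1 - t_m) * (1 - t_f),              # "A" from neither parent
    }


if __name__ == "__main__":
    # Tabulate the offspring distribution for every mating type.
    for mother, father in product(MENDELIAN_TAU, repeat=2):
        print(mother, father, offspring_type_probs(mother, father))
```

Passing a different `tau` table (for example, one in which the AB value is freely estimated rather than fixed at 0.5) gives the offspring distribution under a non-Mendelian transmission model.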
The 231 noncarrier families were selected because the volunteer (proband) is a breast cancer survivor. To correct for this ascertainment criterion, the ln-likelihood of the models given the observed phenotype and age of onset in the volunteer was subtracted from the ln-likelihood of the model fit to the entire pedigree.
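Written out, the correction described above amounts to conditioning the pedigree likelihood on the proband's own phenotype. The subscripted labels are notation introduced here for illustration:

$$
\ln L_{\text{corrected}} \;=\; \ln L(\text{pedigree}) \;-\; \ln L(\text{proband}),
\qquad\text{equivalently}\qquad
L_{\text{corrected}} \;=\; \frac{L(\text{pedigree})}{L(\text{proband})}.
$$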
Each model was tested independently with the REGTL subroutine of the S.A.G.E. software package (24). A second software tool, REGTLHUNT (25), was used to check whether the best-fitting models were at global and not local maxima.
Hypothesis testing was performed comparing the -2ln (likelihood) and Akaike Information Criterion (AIC) scores of nested and non-nested models, respectively. The best-fitting models were used to calculate genotype-specific cumulative incidence curves (26), which were compared with observed Kaplan-Meier risks of breast cancer in the first-degree relatives of probands.
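The two comparisons can be sketched as follows, using the standard definitions: the likelihood ratio statistic is the difference in -2ln(likelihood) between a restricted model and the general model, referred to a chi-square distribution, and AIC is -2ln(likelihood) plus twice the number of estimated parameters. This is a standard-library sketch, not the S.A.G.E. implementation; the function names and the two -2ln(likelihood) values in the usage example are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with df degrees of freedom.

    Uses the power series for the regularized lower incomplete gamma
    function, which is adequate for the moderate statistics seen here.
    """
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= z / (a + n)
        total += term
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def lrt(neg2lnl_restricted, neg2lnl_general, extra_params):
    """Likelihood ratio test of a restricted model nested in a general one."""
    stat = neg2lnl_restricted - neg2lnl_general
    return stat, chi2_sf(stat, extra_params)


def aic(neg2lnl, n_params):
    """Akaike Information Criterion; lower values indicate more parsimony."""
    return neg2lnl + 2 * n_params


if __name__ == "__main__":
    # Hypothetical -2ln(likelihood) values chosen to give a statistic of 12.9.
    stat, p = lrt(neg2lnl_restricted=972.9, neg2lnl_general=960.0, extra_params=6)
    print(f"LRT chi-square = {stat:.2f}, P = {p:.3f}")
```

For a statistic of 12.90 on 6 df, as reported in the Results for the arbitrary Mendelian model, this calculation gives a P value of roughly 0.045, close to the published 0.044.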
Three tests of heterogeneity were performed. In these tests, the study families are divided into subgroups, and a separate set of models is fit for each stratum. If there are significant differences between the subgroups, the additional parameters used to fit multiple sets of models should improve the fit. Significant improvement of fit is measured using a χ² statistic, calculated by subtracting the sum of the -2ln (likelihoods) of the subgroup models from the -2ln (likelihood) of the entire-group model, with degrees of freedom equal to the additional number of parameters used to estimate the subgroup models. A significant result (P < 0.05) indicates rejection of the hypothesis that the fit of the subgroup models is equivalent to the fit of the model for the entire group. We may not have power to detect heterogeneity; power estimates for segregation analyses are not readily calculated.
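In symbols, with G subgroups, the heterogeneity statistic described above can be written as follows. The subscripts and the parameter counts k are notation introduced here for illustration:

$$
\chi^2_{\text{het}} \;=\; \bigl[-2\ln L_{\text{all}}\bigr] \;-\; \sum_{g=1}^{G}\bigl[-2\ln L_{g}\bigr],
\qquad
df \;=\; \sum_{g=1}^{G} k_g \;-\; k_{\text{all}},
$$

where $L_{\text{all}}$ is the maximized likelihood of the model fit to all families together, $L_g$ is that of the model fit to subgroup $g$, and $k$ denotes the number of parameters estimated in each fit.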
The first test of heterogeneity looked for differences between noncarrier families stratified by the menopausal status of the proband (pre-, peri-, or postmenopausal) at the time of diagnosis. The second test of heterogeneity compared the 231 noncarrier families with the families of the 114 BRCA1/2 mutation carriers, to determine whether findings in the noncarrier pedigrees were distinct from BRCA1/2 families in the WAS population. The third test of heterogeneity attempted to test for a cohort effect by stratifying families based on whether they contained any women born before 1910.
To check whether inference of a Mendelian genetic model is correct, we estimated the parameter τAB under an otherwise Mendelian model. If τAB approaches the Mendelian value of 0.50, the original genetic model is less likely to be spurious (27).
| Results |
|---|
Among women with no founder mutation (noncarriers) and no personal or family history of ovarian cancer (n = 3193), 231 had survived breast cancer for an average of 8 years (Table 2). The mean age of onset for cases was 53 (SD = 11.1). The parents, siblings, half-siblings, and children over age 20 of the 231 cases were eligible for analysis. Because 73% of the probands had incomplete data for at least one grandmother, grandmothers were excluded. Overlapping histories of 42 families that contained multiple WAS volunteers provided an opportunity to estimate the reliability of the family history reported by noncarrier cases. Among relatives described by multiple volunteers, 57 of 58 breast cancer cases (98%) were also reported by another relative. Of 167 female first-degree relatives reported by multiple volunteers, 74% of the reports of current age or age at death matched exactly, 18% were within 1 year of each other, and 4% were within 5 years of one another. Similarly, 54% of the reported ages at breast cancer diagnosis matched exactly, 27% of overlapping reports were within 1 year of each other, and 10% were between 1 and 5 years apart.
Using multiple logistic regression to adjust for the nongenetic covariates in Table 3, a positive family history of breast cancer remained a risk factor for disease in noncarrier women. Odds of disease increased 50–60% (95% CI, 1.2–2.1) with each affected first-degree relative a volunteer had. Stratifying noncarriers by these covariates provided no evidence of interaction between any of these variables and the observed effect of family history. ORs for the nongenetic risk factors were generally similar in magnitude to findings from other large epidemiological studies. As in the CASH study (20), having a first- or second-degree relative with ovarian cancer was not related to noncarrier breast cancer risk (OR, 1.2; 95% CI, 0.8–1.9; P = 0.36). The overall comparability of WAS to CASH (20) with respect to familial aggregation outside BRCA1/2 families suggests that the WAS noncarrier families may be representative of a more general population for the segregation analysis.
and were fit, fixing male values to indicate a very low penetrance and high mean age of onset among men. The sporadic model was rejected (LRT χ² = 17.93; 7 df; P = 0.012), but this analysis was unable to discriminate between environmental and genetic models, none of which were rejected. The most parsimonious β-specific model described a codominant Mendelian gene(s) decreasing the mean age of cancer onset with each risk allele inherited. A polygenic model could not be fit to the data. Limitations concerning the polygenic models are discussed below.
In many of the β-specific models, estimates of the α parameter were not stable; the same model, fit several times, would converge with equal likelihood at different values of α. This suggested that variations in age-of-onset of breast cancer might be better described through differences in the α parameter. A second set of genotype-specific α models was fit. Sex-specific βs and γs were estimated, fixing male values of β and γ. The S.A.G.E. software does not permit estimation of a codominant model with genotype-specific α. Because the codominant Mendelian model was the most parsimonious of the β models, it was included for comparison. This necessitated fitting a general model that included type-specific values for both α and β.
With the exception of the models for arbitrary and dominant genes, α-dependent models fit the observed families slightly better than β-models (Table 4). The sporadic (LRT χ² = 28.57; 9 df; P = 0.001) and environmental (LRT χ² = 12.65; 5 df; P = 0.03) models were both rejected. Although these differences were not dramatic, the analysis did discriminate between Mendelian models. A Mendelian gene with arbitrary effects assigned to each genotype was narrowly rejected (LRT χ² = 12.90; 6 df; P = 0.044). Mendelian transmission of major codominant (β), dominant (α), and recessive (α) genes could not be rejected. Of these, the recessive model had the best fit and was the most parsimonious (AIC = 978.68) among all of the models tested.
models supported the existence of an additional major gene(s), the LRT statistics alone do not favor a single model. Fig. 1
τAB (with τAA and τBB still fixed at Mendelian values). In the recessive model, τAB was estimated at 0.51, quite close to the expected Mendelian value of 0.5. In the codominant model, τAB was estimated at 0.75, suggesting that parents with one risk allele were more likely to have children with higher cancer risks than expected under strict Mendelian inheritance. When all three of the transmission parameters (τAA, τAB, and τBB) were estimated, values were 1.00, 0.51, and 0.09, respectively, in the recessive model, and 0.75, 0.55, and 0.16 in the codominant model. The close fit of the recessive model to the observed data, and the adherence of its estimated τs to Mendelian transmission values, suggest that a major recessive gene is the best explanation of familial aggregation in these families. Additional statistical support for this conclusion comes from the arbitrary α model, which converged at the same values as the recessive model, while estimating an additional parameter. It should be noted that we attempted to fit polygenic models, but were unable to achieve convergence when modeling only residual familial factors. It was determined that the study families contained insufficient numbers of affected mother-daughter pairs (excluding the probands, n = 10) to calculate the residual familial correlation needed to model a polygenic effect.
The recessive model predicts a common mutated allele (qA = 0.20; 95% CI, 0.04–0.36), resulting in 4% of the population being high-risk ("AA") homozygotes (95% CI, 0.2–13%). Fig. 2 compares the predicted cumulative incidence of breast cancer among high-risk homozygotes and those with at least one wild-type allele. By age 50, 28% of high-risk AA carriers would develop breast cancer, compared with 1.3% of women carrying a wild-type allele; at age 70, the risks are 85% and 8%, respectively. Among homozygous carriers who develop breast cancer, the mean age-of-onset would be 55, compared with 72 among those with a wild-type allele. Among homozygous carriers, 70% of breast cancer would occur between the ages of 45 and 65, whereas 75% of cases in low-risk women would occur after age 65. Under this model, 36% of cases diagnosed under age 40, and 16% of those diagnosed by age 80 would carry the high-risk genotype.
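The step from allele frequency to homozygote frequency can be checked with a short calculation. The sketch below assumes Hardy-Weinberg proportions, an assumption made here that is consistent with the quoted figures but is not stated explicitly in the text; the function name is illustrative.

```python
def homozygote_frequency(q):
    """Expected frequency of AA homozygotes under Hardy-Weinberg proportions."""
    return q * q


# Allele frequency estimate and 95% confidence limits quoted in the text.
q_hat, q_low, q_high = 0.20, 0.04, 0.36

for label, q in (("estimate", q_hat), ("lower limit", q_low), ("upper limit", q_high)):
    print(f"{label}: q = {q:.2f} -> AA frequency = {homozygote_frequency(q):.2%}")
```

Squaring the point estimate and the two confidence limits gives 4%, about 0.2%, and about 13%, matching the homozygote frequency and interval reported above.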
0.5 (P = 0.0001). Probands related to women with p(AA) > 0.50 are similar to probands without a high-risk relative. The two proband groups had the same mean age (61) and similar mean ages of breast cancer onset (probands who are relatives of high-risk women = 52; probands who are relatives of low-risk women = 49; P = 0.11).
Tests of Heterogeneity and Analysis of Birth Year.
In the 114 WAS families where the probands carried a BRCA1 or BRCA2 founder mutation, the dominant model was the most parsimonious. We rejected the hypothesis that the pattern of familial aggregation in BRCA1/2 carrier families is similar to that in families of noncarrier cases across all of the models (P < 0.00002). The familial clustering of breast cancer in noncarriers described by the recessive α model appears distinct from the pattern observed in the WAS families containing BRCA1/2 founder mutation carriers.
Stratifying noncarrier families into those where probands were pre-, peri-, or postmenopausal at the time of diagnosis provided no evidence (P = 0.18) of genetic heterogeneity based on the time of the proband's breast cancer diagnosis. We also tested for heterogeneity based on the birth years of all of the women in the family. Assuming that a cohort effect would be most pronounced in women who were at highest risk of breast cancer after 1960, families were separated into those with no women born before 1910 (all reaching highest risk after 1960; n = 83), and those containing women born both before and after 1910 (n = 148), and the models were reanalyzed. In families containing women born before 1910, the recessive model fit the data best, whereas in families containing only recently born women, the environmental model fit best. However, the differences between the two groups were not statistically significant (P = 0.84), so this cannot be considered as evidence of heterogeneity.
| Discussion |
|---|
The cumulative incidence of breast cancer predicted by the recessive and codominant models both matched the observed Kaplan-Meier estimates of risk seen among first-degree relatives. However, when transmission parameters were estimated, only the recessive model closely adhered to Mendelian transmission. Four percent of women would carry the high-risk genotype (95% CI, 0.2–15%), which would result in a 16% population attributable risk, exceeding estimates of the 5–10% of breast cancer generally attributed to hereditary breast cancer. A simulation study of BRCA1/2 carriers (28) found that the majority of hereditary cases of breast cancer have no family history of the disease and that only 2% of hereditary cases attributable to a dominant gene have more than two affected relatives, which suggests that a significant proportion of hereditary breast cancer occurs outside multiplex families. If so, additional susceptibility genes may have greater population attributable risks than previously thought.
One shortcoming of this study is that only carriers of the Jewish BRCA1/2 founder mutations were excluded, and families segregating undetected BRCA1/2 mutations could confound this analysis. If the founder mutations represent 80% of mutations in Ashkenazim (17) and we assume a 60% lifetime penetrance of breast cancer among carriers (16), we would expect a maximum of 12 of the 231 affected probands to be carriers. Our exclusion of families with ovarian cancer might reduce this number somewhat, but other probands may represent "sporadic" cases of breast cancer from families segregating a founder mutation among the other affected members. However, the models based on these families showed significant heterogeneity compared with BRCA1/2 carrier families, suggesting that a factor distinct from the known genes exists. A segregation analysis of BRCA1/2 noncarriers by Cui et al. (16) provides empirical data supporting this; testing a two-locus model they found evidence of additional undetected BRCA1/2 mutations against a background of a second, highly penetrant common recessive allele.
To evaluate the findings of this study, additional segregation analyses could be undertaken in other populations showing familial aggregation of breast cancer outside of BRCA1/2 families, such as the CASH study population (20), and a group of Scandinavian breast and breast-ovarian families studied in Sweden (29). Although self-reports of family history of breast cancer have been deemed reliable (30, 31, 32), additional analyses would ideally be based on families where histories of breast cancer could be verified. Recruiting families through a population-based series of incident cases would also be preferable, to reduce volunteer and survival bias. A sufficient number of families should also be studied to permit estimation of a polygenic model.
Finally, collecting data from relatives as well as probands on factors such as age at first birth would allow direct adjustment for risk factors whose distribution may vary from generation to generation. This could help avoid misinterpreting generational differences in breast cancer risk factors as a genetic effect. Although there was no strong evidence of such a problem in this study, the adjustment and stratification by birth year alone is difficult to interpret, given the collinearity of birth year with age and age at onset.
Several other segregation analyses (16, 23, 33, 34, 35, 36, 37) have provided various degrees of evidence for recessive inheritance of susceptibility to breast cancer. Two of these studies (16, 37) were specifically designed to look for evidence of genes other than BRCA1 and BRCA2. Cui et al. (16) compared recessive, dominant, and codominant models, mixed Mendelian-polygenic models, and two-locus models of independent dominant and recessive genes that would model the BRCA genes alongside other loci. Of these, models of a highly penetrant recessive gene (virtually 100% by age 70) with allele frequency of 7% (95% CI, 0.05–0.10) were the most parsimonious, and remained robust against backgrounds of polygenic and residual dominant inheritance. The study did not publish individual hypothesis tests for these models, nor did it evaluate sporadic or environmental models. A British study of 1484 families ascertained through a proband affected by age 55 tested models including simultaneous effects for BRCA1, BRCA2, and a third gene (37), and supported recessive, polygenic, and mixed recessive-polygenic models. The recessive models predicted a gene with high allele frequency (24%; 95% CI, 12.4–41.8) and penetrance of 42% by age 70. This method may have precluded detection of any additional dominant genes, because additional dominant effects were likely to be subsumed by the BRCA1/2 models.
The parameter estimates presented here could be used as the basis for linkage studies to try to localize this putative recessive gene. Data on existing collections of breast cancer families already genotyped for markers could be used to test for linkage under the proposed recessive model. Collections of affected twins and other sib-pairs may be among the most useful to detect a recessive gene (38). Another, nonparametric method, comparative genomic hybridization, may be useful to discover recessive tumor suppressor genes. Comparative genomic hybridization examines tumor tissue in affected relatives for concordant loss of heterozygosity (14, 39). Candidate regions are identified on the genome where early somatic deletions of genetic material have repeatedly occurred (14, 40).
| Acknowledgments |
|---|
| Footnotes |
|---|
1 This work was supported by the National Cancer Institute at the NIH.
2 To whom requests for reprints should be addressed, at NIH/National Cancer Institute/Laboratory of Population Genetics, 41/D702, 41 Library Drive, Bethesda, MD 20892. Phone: (301) 435-8955; Fax: (301) 435-8963; E-mail: struewij{at}mail.nih.gov
3 The abbreviations used are: WAS, Washington Ashkenazi Study; CASH, Cancer and Steroid Hormone; LRT, likelihood ratio test; df, degrees of freedom; p(AA), putative recessive genotype; AIC, Akaike Information Criterion.
Received 12/30/02; revised 4/17/03; accepted 6/17/03.