Background: Multiple genome-wide and candidate gene association studies have been conducted in search of common risk variants for breast cancer. Recent large meta-analyses, consolidating evidence from these studies, have been consistent in highlighting the caspase-8 (CASP8) gene as important in this regard. To define a risk haplotype and map the CASP8 gene region with respect to underlying susceptibility variant/s, we screened four genes in the CASP8 region on 2q33-q34 for breast cancer risk.
Methods: Two independent data sets from the United Kingdom and the United States, including 3,888 breast cancer cases and controls, were genotyped for 45 tagging single nucleotide polymorphisms (tSNP) in the expanded CASP8 region. SNP and haplotype association tests were carried out using Monte Carlo-based methods.
Results: We identified a three-SNP haplotype across rs3834129, rs6723097, and rs3817578 that was significantly associated with breast cancer (P < 5 × 10⁻⁶), with a dominant risk ratio and 95% CI of 1.28 (1.21–1.35) and frequency of 0.29 in controls. Evidence for this risk haplotype was extremely consistent across the two study sites and also consistent with previous data.
Conclusion: This three-SNP risk haplotype represents the best characterization so far of the chromosome upon which the susceptibility variant resides.
Impact: Characterization of the risk haplotype provides a strong foundation for resequencing efforts to identify the underlying risk variant, which may prove useful for individual-level risk prediction, and provide novel insights into breast carcinogenesis. Cancer Epidemiol Biomarkers Prev; 21(1); 176–81. ©2011 AACR.
The caspase-8 (CASP8) gene is one of only 3 genes identified as possessing common variants with strong and noteworthy associations with breast cancer risk based on cumulative evidence from candidate gene and genome-wide association studies (1, 2). Similarly, pooled analyses and meta-analyses focusing on CASP8 specifically have indicated strong associations with breast cancer (3–5). In particular, these refer to the highly significant association between the minor allele at D302H in exon 12 (rs1045485) and decreased risk. Some data suggest that another variant, a 6-bp deletion in the CASP8 promoter (−652 6N del, rs3834129), is associated with breast cancer, although the evidence for this variant is much less consistent (6–8). There is no known functional effect of rs1045485 (9), and it is very rare in Asian populations. The del allele of rs3834129 has been suggested to remove an Sp1 transcription factor-binding site, although functional data on the effects of this change in lymphocytes are conflicting (6, 9). Evidence thus far suggests that other variant/s in linkage disequilibrium (LD) with rs1045485 and/or rs3834129 are likely to be the critical variant/s.
CASP8 resides at chromosome 2q33. Two other genes, caspase-10 (CASP10) and amyotrophic lateral sclerosis 2 (juvenile) chromosome region candidate 12 (ALS2CR12), lie directly adjacent to CASP8. Like CASP8, CASP10 is an initiator of apoptosis, and the CASP10 V410I variant has been reported to be associated with breast cancer (10). Another gene, called “CASP8 and FADD-like Apoptosis Regulator” (CFLAR) lies centromeric to CASP10. It is a member of the same gene family as CASP8 and CASP10, but acts as a negative regulator of apoptosis (11). Given their physical proximity to CASP8 and functional relevance (CASP10/CFLAR), the critical variants could reasonably lie in any of these 4 genes.
We previously genotyped 14 tagging-SNPs (tSNPs) in CASP8 on 2,450 breast cancer case and control subjects from the Sheffield Breast Cancer Study (SBCS) and identified a 4-SNP risk haplotype (1-1-2-1 across SNPs rs7608692, rs1861269, rs6723097, rs3817578; P = 8.0 × 10⁻⁵), with a per allele OR (95% CI) of 1.30 (1.12–1.49; ref. 12). This haplotype was substantially more significant than any individual SNP, and was consistent with previous findings [i.e., the common (aspartate) allele at rs1045485, and the ins allele of rs3834129, are associated with the increased risk haplotype]. The aim of the current study was to consider the broader 4 gene region and refine the risk haplotype upon which the susceptibility variant/s lie. Here, we have studied 3,888 breast cancer cases and controls from 2 collaborating sites, genotyped for 45 tSNPs across the 4 genes in the CASP8 region at chromosome 2q33-q34.
Materials and Methods
Case and control subjects
A joint resource of 3,888 breast cancer cases and controls was genotyped: SBCS (n = 2,049) and Utah Breast Cancer Study (UBCS; n = 1,839). The SBCS set consisted of 1,015 histopathologically confirmed breast cancer patients recruited from the surgical outpatient clinics of the Royal Hallamshire Hospital, Sheffield, United Kingdom between 1998 and 2005. Case subjects were a mixture of incident and prevalent cases (median time to diagnosis 2.3 years), with median age at diagnosis (range) 59 (28–92) years. Fifteen percent of SBCS cases had at least 1 first-degree relative with breast cancer. Control SBCS subjects (n = 1,034) were healthy women attending the Sheffield Mammography Screening Service between 2000 and 2004. In the United Kingdom, women are invited for routine mammography screening every 3 years between the ages of 50 and 70 years, and the average uptake in Sheffield is >80%. Women whose mammograms showed no evidence of breast lesions were eligible as controls for this study and median age at recruitment (range) was 57 (45–78) years. Eleven percent of SBCS controls had at least 1 first-degree relative with breast cancer (13). The UBCS set consisted of 905 breast cancer cases identified using the Utah Population Database (UPDB) and confirmed and ascertained through the Utah Cancer Registry; median age at diagnosis (range) was 56 (21–92) years and 41% of UBCS cases had at least 1 first-degree relative with breast cancer. Controls (n = 934) were birth cohort- and sex-matched cancer-free individuals, and 2% of these had at least 1 first-degree relative with breast cancer. Using genealogy from the UPDB, it was established that 208 cases and 564 controls were singletons and the remaining individuals were members of 31 extended pedigrees, although most relationships were distant (average kinship coefficient = 0.017, i.e., approximately 6 meioses distant, or second cousins). All cases and controls were of North European ancestry.
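The correspondence between the reported average kinship coefficient and the stated degree of relationship can be checked with a short calculation. The sketch below assumes two non-inbred relatives whose only common ancestors are a single couple; the function name is illustrative and is not taken from the study's software.

```python
def kinship_via_shared_couple(meioses: int) -> float:
    """Kinship coefficient for two non-inbred relatives whose only
    common ancestors are one couple, separated by `meioses` meioses.

    Each common ancestor contributes (1/2) ** (meioses + 1), and a
    couple contributes twice that amount.
    """
    if meioses < 2:
        raise ValueError("relatives linked through a couple are at least 2 meioses apart")
    return 2 * 0.5 ** (meioses + 1)


# Second cousins are 6 meioses apart: 3 up to the shared
# great-grandparents and 3 back down.
print(kinship_via_shared_couple(6))  # 0.015625
```

The value 1/64 (about 0.016) is close to the reported average of 0.017, in line with the description of most UBCS relationships as roughly second-cousin distance.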
We excluded individuals with genotype call rates <80%, resulting in a total sample of 1,882 cases and 1,896 controls included in the genetic analyses.
For replication of significant single SNPs, we used a replication cohort comprising cases with a strong family history of breast cancer and control subjects from Manchester, United Kingdom. The cases comprised 713 subjects fulfilling the NICE criteria for BRCA1 and BRCA2 screening (>20% risk of mutation), but negative for BRCA1 or BRCA2 mutations as determined by DNA sequence analysis of coding regions, and multiplex ligation-dependent probe amplification to detect deletions and duplications. The 236 control subjects had no cancer and no immediate family history of breast cancer. All cases and controls were unrelated white British women.
Selection of tSNPs and genotyping
We identified 60 tSNPs in the 4 genes for genotyping using LDselect (14), based on an analysis including all known SNPs with data available for the CEPH Utah individuals from HapMap and NIEHS (15). Genotyping was carried out using the Applied Biosystems SNPlex multiplex system (55 SNPs) or 5′ nuclease PCR (TaqMan; 3 SNPs). Genotype data for the remaining 2 SNPs, rs3834129 and rs6723097, was already available for these subjects (12). Genotyping quality was assessed by examination of duplicate concordance and call rates for each SNP and a test for compliance with Hardy–Weinberg equilibrium (HWE) in controls. SNPs were removed if, in either SBCS or UBCS, their duplicate concordance rate was <98% (n = 2), more than 1 plate failed (n = 7), or HWE P < 0.005 (n = 1). We also removed SNPs that were monomorphic (n = 5). This resulted in a final set of 45 tSNPs for analysis (Supplementary Table S1).
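The HWE quality-control step can be illustrated with a one-degree-of-freedom chi-square goodness-of-fit test on control genotype counts. This is a minimal sketch: the text does not state which HWE test was used, so the chi-square form, the function name, and the example counts are assumptions. Only the P < 0.005 exclusion threshold comes from the text.

```python
import math

HWE_P_THRESHOLD = 0.005  # exclusion threshold stated in the text


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Return (chi2, p_value) for a 1-df chi-square test of HWE.

    Arguments are genotype counts (two homozygote classes and the
    heterozygote class) observed in controls.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p in (0.0, 1.0):
        raise ValueError("monomorphic SNP: HWE test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical control counts with a deficit of heterozygotes.
chi2, p_value = hwe_chi_square(300, 400, 300)
print(chi2, p_value < HWE_P_THRESHOLD)
```

An exact test is often preferred when genotype counts are small; the chi-square version is shown only because it needs nothing beyond the standard library.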
To account for familial relatedness in the UBCS subjects, all analyses were carried out using the meta-association options in the Genie (single SNPs) and hapConstructor (haplotypes) software packages, which use Monte Carlo testing to derive empirical estimates of significance and 95% CI (16, 17). OR, CI, and significance tests for individual SNPs were derived based on allele dose, dominant, and recessive models. HapConstructor is a data-mining algorithm that builds multi-SNP haplotypes based on association evidence. Starting with evidence from single SNPs, the process adds or removes SNPs using a forward-backward algorithm. Each step includes tests for dominant, additive, and recessive models for each haplotype. The process continues, provided predefined significance thresholds are met with each step. The results from the data mining are the haplotype, genetic model, risk ratio, and P value. The significance thresholds used for the haplotype construction process were 0.05, 0.005, 0.0005, and 0.0001 for haplotypes of 1 to 4 SNPs, respectively, and 0.00005 thereafter. Haplotypes were estimated via the expectation-maximization algorithm, and any missing genotypes were internally imputed. All P values were estimated using between 100,000 and 500,000 simulations. Observations more extreme than all simulated data sets were designated P < 1/simulations.
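The empirical P value convention and the stepwise thresholds described above can be sketched as follows. This is not the Genie or hapConstructor implementation: those packages account for familial relatedness, whereas this sketch shuffles case and control labels, which is only appropriate for unrelated individuals. The statistic, function names, and genotype data are hypothetical; the thresholds and the P < 1/simulations rule are taken from the text.

```python
import random

# Thresholds quoted in the text for haplotypes of 1 to 4 SNPs;
# 0.00005 applies to longer haplotypes.
THRESHOLDS = {1: 0.05, 2: 0.005, 3: 0.0005, 4: 0.0001}
DEFAULT_THRESHOLD = 0.00005


def threshold_for(n_snps):
    """Significance threshold a haplotype of n_snps SNPs must meet."""
    return THRESHOLDS.get(n_snps, DEFAULT_THRESHOLD)


def mean_dose_difference(cases, controls):
    """Absolute difference in mean minor-allele dose (0, 1, or 2)."""
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))


def permutation_null(cases, controls, n_sims, seed=0):
    """Null distribution of the statistic under label shuffling."""
    rng = random.Random(seed)
    pooled = list(cases) + list(controls)
    n_cases = len(cases)
    null = []
    for _ in range(n_sims):
        rng.shuffle(pooled)
        null.append(mean_dose_difference(pooled[:n_cases], pooled[n_cases:]))
    return null


def empirical_p(observed, simulated):
    """Return (relation, value): ('=', p), or ('<', 1/n_sims) when the
    observation is more extreme than every simulated data set."""
    n_sims = len(simulated)
    n_extreme = sum(1 for s in simulated if s >= observed)
    if n_extreme == 0:
        return "<", 1.0 / n_sims
    return "=", n_extreme / n_sims


# Hypothetical allele doses for a single SNP.
cases = [0, 1, 1, 2, 2, 1, 0, 2]
controls = [0, 0, 1, 0, 1, 0, 1, 0]
observed = mean_dose_difference(cases, controls)
relation, p = empirical_p(observed, permutation_null(cases, controls, n_sims=1000))
print(f"P {relation} {p}; single-SNP step passes: {p <= threshold_for(1)}")
```

With 100,000 to 500,000 simulations, the smallest bound this convention can report lies between 1 × 10⁻⁵ and 2 × 10⁻⁶.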
Results
Single SNP analyses
Table 1 and Fig. 1 illustrate the individual tSNP results based on the Cochran–Mantel–Haenszel test for trend (allele dose risk model). For comparison, Supplementary Table S2 and Supplementary Fig. S1 show the most significant evidence for each tSNP (based on dominant or recessive models). Three of the 45 tSNPs showed at least nominally significant association with breast cancer (Ptrend < 0.05) in single SNP meta-association analyses across the 2 sites (rs3769821, rs6723097, rs700635; Table 1 and Fig. 1). As illustrated in Fig. 1, the most significant single SNPs cluster in CASP8 and ALS2CR12, with rs3769821 in CASP8 being the most significant, with OR per-allele (95% CI) of 1.17 (1.05–1.30; P = 0.0032; Table 1) and ORdom of 1.28 (1.21–1.38; P = 5.3 × 10⁻⁴; Supplementary Table S2). For confirmation, we genotyped the 3 most significant tSNPs in a cohort of unrelated familial cases negative for mutations in BRCA1 or BRCA2 from Manchester, United Kingdom and local controls. Nominally significant results were replicated for rs3769821 (Ptrend = 0.042) and rs6723097 (Ptrend = 0.014).
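A stratified test for trend of the kind named above can be sketched for unrelated individuals as follows, with each study site treated as one stratum and genotypes scored by allele dose. This is an illustrative sketch, not the Genie implementation: it uses the asymptotic chi-square reference distribution rather than Monte Carlo simulation, and the function name and genotype counts are hypothetical.

```python
import math


def stratified_trend_test(strata, scores=(0, 1, 2)):
    """Mantel-extension (stratified) test for trend with 1 df.

    strata: iterable of (case_counts, control_counts); each is a
    3-tuple of genotype counts ordered by minor-allele dose.
    Returns (chi2, p_value).
    """
    observed = expected = variance = 0.0
    for case_counts, control_counts in strata:
        n_cases = sum(case_counts)
        n_controls = sum(control_counts)
        n_total = n_cases + n_controls
        totals = [a + b for a, b in zip(case_counts, control_counts)]
        sum_x = sum(x * n for x, n in zip(scores, totals))
        sum_x2 = sum(x * x * n for x, n in zip(scores, totals))
        # Score total among cases and its null mean and variance.
        observed += sum(x * n for x, n in zip(scores, case_counts))
        expected += n_cases * sum_x / n_total
        variance += (n_cases * n_controls * (n_total * sum_x2 - sum_x ** 2)
                     / (n_total ** 2 * (n_total - 1)))
    if variance == 0:
        raise ValueError("no genotype variation in any stratum")
    chi2 = (observed - expected) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square, 1 df
    return chi2, p_value


# Hypothetical genotype counts (dose 0, 1, 2) for two sites.
site_a = ((30, 50, 20), (50, 40, 10))
site_b = ((45, 40, 15), (55, 35, 10))
print(stratified_trend_test([site_a, site_b]))
```

Summing the observed-minus-expected score and its variance across strata is what lets evidence from the two sites combine without pooling their genotype frequencies.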
HapConstructor analyses for the genes CASP8 and ALS2CR12 identified highly significant risk haplotypes (P < 5 × 10⁻⁶ and P < 1 × 10⁻⁵, respectively). No significant haplotypes (P < 0.001) were identified in the downstream genes CFLAR or CASP10.
The most significant risk haplotype with greatest effect size in CASP8 was a 6-SNP haplotype 1-2-1-1-1-1 across rs3834129, rs6723097, rs3817578, rs7571586, rs36043647, and rs35010052 (P < 5 × 10⁻⁶; ORdom = 1.29, 95% CI: 1.22–1.33). The first 5 SNPs reside in CASP8; the last SNP, rs35010052, lies in the approximately 700-bp region between CASP8 and ALS2CR12. The frequency of this haplotype was 0.27 in controls and 0.30 in cases. Risk estimates from the 2 separate sites were ORdom = 1.31, 95% CI: 1.09–1.57 (P = 0.004) and 1.27, 95% CI: 1.22–1.33 (P = 0.0002) for SBCS and UBCS, respectively. The association strength of this 6-SNP haplotype was driven by a 3-SNP subhaplotype, which gave a slightly lower risk estimate but attained the same significance in the meta-analysis (1-2-1 at rs3834129, rs6723097, and rs3817578; P < 5 × 10⁻⁶; ORdom = 1.28, 95% CI: 1.21–1.35; freqcontrol = 0.29, freqcases = 0.32). Single site results for this 3-SNP subhaplotype were ORdom = 1.29, 95% CI: 1.07–1.55 (P = 0.008) and 1.27, 95% CI: 1.23–1.31 (P = 0.0002) for SBCS and UBCS, respectively. These 3 SNPs are not observed to have substantial levels of LD between them in the population (r² = 0.54, 0.03, and 0.05 for rs3834129-rs6723097, rs3834129-rs3817578, and rs6723097-rs3817578, respectively).
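The pairwise LD measure quoted above can be computed from two-SNP haplotype frequencies. The sketch below is a minimal illustration that assumes phased or estimated haplotype frequencies are already available; the function name, the 1/2 allele coding as tuple keys, and the example frequencies are hypothetical and are not the study's data.

```python
def ld_r_squared(hap_freqs):
    """Pairwise LD (r squared) between two biallelic SNPs.

    hap_freqs maps (allele_at_snp_a, allele_at_snp_b) to a haplotype
    frequency, with alleles coded 1 or 2; frequencies must sum to 1.
    """
    if abs(sum(hap_freqs.values()) - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    f11 = hap_freqs.get((1, 1), 0.0)
    p_a = f11 + hap_freqs.get((1, 2), 0.0)  # allele 1 at first SNP
    p_b = f11 + hap_freqs.get((2, 1), 0.0)  # allele 1 at second SNP
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("at least one SNP is monomorphic")
    d = f11 - p_a * p_b  # disequilibrium coefficient D
    return d * d / denom


# Hypothetical frequencies: alleles partly, but not fully, correlated.
example = {(1, 1): 0.4, (1, 2): 0.1, (2, 1): 0.1, (2, 2): 0.4}
print(ld_r_squared(example))
```

Values near 0 indicate that the two SNPs carry largely independent information, which is why a haplotype across weakly correlated SNPs can capture a signal that no single SNP tags well.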
The findings for CASP8 are consistent in direction with previous single SNP results for rs3834129 (6) and rs1045485 (3–5), that is, the rs3834129 ins allele and rs1045485 aspartate allele are on or in LD with the risk haplotypes. Similarly, they are consistent with the risk haplotype that we defined in our previous single site SBCS analyses (1-1-2-1 across rs7608692, rs1861269, rs6723097, rs3817578; Table 2; ref. 12).
The most significant risk haplotype and highest effect size in ALS2CR12 was a 3-SNP haplotype 2-2-1 across rs1035140, rs1035142, and rs10185177 (P < 1 × 10⁻⁵; ORdom = 1.26, 95% CI: 1.17–1.35; freqcontrol = 0.27, freqcases = 0.30). Single site results were ORdom = 1.29, 95% CI: 1.08–1.54 (P = 0.005) and 1.21, 95% CI: 1.12–1.31 (P = 0.003) for SBCS and UBCS, respectively. The first SNP on this haplotype, rs1035140, lies between CASP8 and ALS2CR12. Due to the strong association between the minor alleles at rs1035142 and rs700635 on this risk haplotype, the 4-SNP and 3-SNP haplotypes created by either adding allele 2 at rs700635 or substituting rs1035142 with rs700635, gave very similar results (both P < 1 × 10⁻⁵; ORdom = 1.25, 95% CI: 1.20–1.30; freqcontrol = 0.27, freqcases = 0.30).
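A dominant-model odds ratio compares the odds of disease in carriers of at least one copy of the risk haplotype with the odds in noncarriers. The sketch below computes it from a 2 × 2 table with a Wald 95% CI. This is a swapped-in textbook interval, not the study's method: the reported CIs were derived empirically by Monte Carlo testing, and the function name and counts here are hypothetical.

```python
import math


def dominant_odds_ratio(case_carriers, case_noncarriers,
                        control_carriers, control_noncarriers, z=1.96):
    """Return (odds_ratio, lower, upper) with a Wald confidence interval.

    Carriers have one or two copies of the risk haplotype.
    """
    cells = (case_carriers, case_noncarriers,
             control_carriers, control_noncarriers)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (case_carriers * control_noncarriers) / (
        case_noncarriers * control_carriers)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical carrier counts, not taken from the study.
print(dominant_odds_ratio(20, 10, 10, 20))
```

Because haplotype phase is usually estimated rather than observed, carrier counts in practice are expected values from the haplotype estimation step rather than exact integers.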
We carried out a hapConstructor analysis across the CASP8-ALS2CR12 2-gene region. This analysis of 27 tSNPs converged to the same results identified from the CASP8-only analysis, involving haplotype 1-2-1 at pivotal SNPs rs3834129, rs6723097, and rs3817578. This confirmed the expectation that the risk haplotypes identified in CASP8 and ALS2CR12 are due to the same underlying variant/s and suggests that only 1 risk haplotype exists at this region.
Discussion
Previously, we defined a 4-SNP risk haplotype in CASP8 based on an analysis of 14 tSNPs and data from a single site (SBCS; ref. 12). Here, we provide a more thorough interrogation of a broader region using 45 tSNPs across 4 genes on chromosome 2q33 genotyped in 2 independent data sets (SBCS and UBCS). Two single SNP results were additionally replicated in a third set of familial cases without BRCA1 or BRCA2 mutations. We have identified a common 3-SNP risk haplotype in CASP8 that drives the association evidence at both sites and results in a highly significant meta-association finding (P < 5 × 10⁻⁶; ORdom = 1.28; freqcontrol = 0.29). Evidence from multigene analyses indicates haplotypes that are centered in CASP8, and continues to confirm the involvement of CASP8 in risk of breast cancer. The consistency of results across 2 independent sites lends robustness to the finding and credibility to the risk haplotype.
A recent genome-wide association study identified that the CASP8 region is associated with melanoma risk, and there are reports that it is also associated with other cancers including chronic lymphocytic leukemia and pancreatic cancer (18–20). These observations suggest that this region may be of broader interest for cancer in general. The risk haplotype provides a strong foundation for resequencing efforts by refining the haplotype upon which the susceptibility variant likely resides. Identification of the critical underlying risk variant may prove useful for individual-level risk prediction, aid in deciphering the role of CASP8 in risk of other cancers (6, 18–20), and provide novel insights into breast carcinogenesis.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Data collection for the UBCS was made possible by the Utah Population Database (UPDB) and the Utah Cancer Registry (UCR). Partial support for all datasets within the UPDB was provided by the University of Utah Huntsman Cancer Institute (HCI) and the HCI Cancer Center Support grant, P30 CA42014 from the NCI. The UCR is funded by contract HHSN261201000026C from the NCI SEER program with additional support from the Utah State Department of Health and the University of Utah. The genotyping and analysis were supported by funding from the Susan G. Komen Foundation (BCTR0706911) and the Avon Foundation (02-2009-080), the Breast Cancer Campaign (2004 Nov 49), Cancer Research UK (C9528/A11292) and Yorkshire Cancer Research (S295, S299, and core funding). L.A.C-Albright acknowledges support from the Huntsman Cancer Foundation. H. McBurney, A. Latif, W.G. Newman, and D.G. Evans were all supported by the Manchester NIHR Biomedical Research Centre.
The authors thank Steve Backus, Kim Nguyen, Jathine Wong, Thomas Naranjo, and Jim Farnham (UBCS) and Sue Higham, Gordon MacPherson, Helen Cramp, Dan Connley, and Ian Brock (SBCS). The authors also thank all the women who took part in these studies.
Note: Supplementary data for this article are available at Cancer Epidemiology, Biomarkers & Prevention Online (http://cebp.aacrjournals.org/).
- Received August 31, 2011.
- Revision received October 31, 2011.
- Accepted October 31, 2011.
- ©2011 American Association for Cancer Research.