1 Department of Oncology and 2 Genetic Epidemiology Unit, Cancer Research UK, University of Cambridge, Strangeways Research Laboratory, Cambridge, United Kingdom
Requests for reprints: Karen A. Pooley, Cancer Research UK Department of Oncology, University of Cambridge, Strangeways Research Laboratory, Worts Causeway, Cambridge CB1 8NR, United Kingdom. Phone: 44-1223-741168; Fax: 44-1223-740147. E-mail: karen.pooley{at}srl.cam.ac.uk
| Abstract |
|---|
| Introduction |
|---|
The physiological actions of progesterone are mediated by the progesterone receptor (PGR) and many studies on the gene, located at 11q23, and translated protein have investigated their possible roles in tumorigenesis. The progesterone ligand binds to its steroid hormone receptor (4) and dimerizes. This complex works as a transcription factor, controlling the expression of downstream genes involved in mammary cell growth and differentiation. In addition, synthetic progestins, as well as the steroid hormones themselves, have been seen to lead to increased transcription of downstream oncogenic targets, such as c-myc and c-fos (5, 6).
Loss of heterozygosity at 11q22-qter has been frequently seen in cervical, ovarian, and breast cancers and has been associated with higher-grade tumors and a more aggressive disease course (7-9). This indicates the existence of a tumor suppressor gene within this region, for which PGR is a good candidate. Similar correlations have been seen between tumor invasiveness and low levels of hormone (10, 11) or higher levels of receptor (12-14). The majority of breast cancers stain positively for both PGR and estrogen receptor; receptor positivity is predictive of response to tamoxifen and overall survival. More recently, microarray studies have found PGR-negative tissue to have high levels of transcripts of genes associated with cell proliferation (15).
The PGR gene is transcribed from two alternative promoters and translated into two different zinc-finger proteins, PR-A and PR-B. These differ by a 165-amino-acid NH2-terminal region present only in PR-B and known as the "B upstream segment" (16). PR-B is a potent transcriptional activator and contributes to the proliferative effects of estrogen, whereas PR-A, the shorter isoform, is necessary to oppose the effects of both PR-B and the estrogen receptor (17, 18).
The promoter region polymorphism +331 G>A has been reported to increase expression of the PR-B isoform and has been postulated to predispose women to breast cancer through increasing PR-B-dependent stimulation of mammary cell proliferation (19), although a more recent study failed to find any association between this polymorphism and breast cancer risk in postmenopausal women (20). Studies on endometrial cancer alone (21) and in combination with clear cell ovarian cancers (22) have also been inconclusive. Other work has associated the rare allele of this polymorphism with an increased likelihood of multiple failed embryo implantations during in vitro fertilization treatment (23).
The other commonly studied PGR polymorphic variants are PROGINS and V660L. These polymorphisms are in perfect linkage disequilibrium with one another ($R_p^2$ = 1.0), so their effects cannot be distinguished using genetic epidemiology. The PROGINS polymorphism consists of a 306 bp Alu insertion in the G intron of the PGR gene, which always occurs with the L allele of the V660L polymorphism. The insert-carrying allele exhibits higher mRNA stability and is transcribed to a more stable and transcriptionally active protein (24). The PROGINS insertion allele has been reported as inversely correlated with risk of breast cancer (25, 26), ovarian cancer (27), and endometriosis (28) in some populations, whereas in other studies, no association has been reported (29-31).
The V660L polymorphism results from a G>T substitution in exon 4 of the PGR gene. In the published studies undertaken to date, no significant association has been found between this polymorphism and risk of breast cancer (32-34). One study of ovarian cancer (35) also failed to find an association, whereas another has shown an association of the L allele with an increased risk (36). V660L has also been reported, along with S344T and H770L, to be associated with an increased likelihood of repeated miscarriage (37), suggesting that the resultant PGR protein does not function optimally. However, none of these effects can be attributed to one polymorphism or the other as they are in perfect linkage disequilibrium.
To evaluate whether there are common breast cancer susceptibility alleles in PGR, we have conducted a large case-control association study. We have used a comprehensive SNP tagging approach to identify and test SNPs that can evaluate the effect of all common SNPs in PGR.
| Materials and Methods |
|---|
To maximize efficiency, we used a two-stage study design (39-41) in which SNPs that are evidently not associated with breast cancer risk are dropped at the end of set 1. The staged approach substantially reduces genotyping costs without significantly affecting statistical power; a comparison is shown in Supplementary Table S2. We carried out genotyping on an initial subset (set 1) of the first 2,345 enrolled cases with invasive cancer and 2,284 EPIC controls. The geographical and ethnic background of cases and controls was very similar, with over 98% being of Anglo-Saxon ancestry. The cases were aged 25 to 73 years at diagnosis (mean, 50.2; SD, 7.8). The controls were aged 44 to 81 years at blood collection and 3 to 5 years after enrollment (mean, 65.2; SD, 7.6). It has been possible to determine menopausal status from the questionnaire data for 2,034 cases (87%) and, of these, 1,292 were premenopausal and 742 were postmenopausal at diagnosis.
SNPs that exhibited a difference in genotype distribution between cases and controls that reached a predefined threshold of P < 0.1, using either a 2 degree of freedom (df) heterogeneity test (Phet) or a trend test (Ptrend), were further evaluated in a second subset of 2,302 cases from SEARCH and 2,280 controls from EPIC-Norfolk (set 2). All selection criteria were as for set 1. The set 2 cases at diagnosis were aged 23 to 70 years (mean, 53.3; SD, 9.3) and the controls were aged 43 to 81 years (mean, 62.3; SD, 8.6). Menopausal status has been determined for 1,930 set 2 cases (84%) and, of these, 1,563 were premenopausal and 367 were postmenopausal at diagnosis.
As there was no evidence for heterogeneity between set 1 and set 2, it was possible to combine the data for the two series.
Haplotype Block Definition and Tagging
Our principal hypothesis was that there are one or more SNPs in PGR that are associated with an increased or decreased risk of breast cancer. Thus, the aim of the SNP tagging approach was to identify a set of SNPs (stSNP) that efficiently tags all the known SNPs and is also expected to tag any unknown SNPs in the gene. The best measure of the extent to which one SNP tags another SNP is the pairwise correlation coefficient $R_p^2$ because the loss in power incurred by using a marker SNP in place of a true causal SNP is directly related to this value. We aimed to define a set of tagging SNPs such that all known common SNPs (minor allele frequency >0.05) had an estimated $R_p^2$ of >0.8 with at least one tagging SNP. However, some SNPs are poorly correlated with other single SNPs but may be efficiently tagged by multiple SNPs, thus reducing the number of tagging SNPs needed. As an alternative, we aimed for the correlation between each SNP and a group of tagging SNPs ($R_s^2$) to be >0.8. We used the University of Washington NIEHS Environmental Genome Project SNPs Program PDR90 resequencing data to identify tagging SNPs.5 Two hundred forty-five SNPs were identified in the PGR gene, of which 81 were biallelic SNPs or insertion/deletion polymorphisms of <7 bp, with a minor allele frequency >5%.
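As a rough illustration of the pairwise measure, the squared correlation between two biallelic SNPs can be computed from their two-locus haplotype frequencies. The sketch below is not the TagSNPs implementation; the function names, the haplotype labels, and the example frequencies are hypothetical, and only the 0.8 threshold is taken from the text.

```python
def pairwise_r2(hap_freqs):
    """Squared correlation between two biallelic SNPs.

    hap_freqs maps the four two-locus haplotypes "AB", "Ab", "aB", "ab"
    (hypothetical labels) to frequencies that sum to 1.
    """
    p_a = hap_freqs["AB"] + hap_freqs["Ab"]  # allele A frequency at SNP 1
    p_b = hap_freqs["AB"] + hap_freqs["aB"]  # allele B frequency at SNP 2
    d = hap_freqs["AB"] - p_a * p_b          # linkage disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def is_tagged(hap_freqs, threshold=0.8):
    """True when one SNP tags the other above the chosen r^2 threshold."""
    return pairwise_r2(hap_freqs) > threshold


# Made-up frequencies, for illustration only.
perfect_ld = {"AB": 0.3, "Ab": 0.0, "aB": 0.0, "ab": 0.7}
no_ld = {"AB": 0.25, "Ab": 0.25, "aB": 0.25, "ab": 0.25}
```

With these made-up inputs, `perfect_ld` describes two SNPs that always occur together and `no_ld` describes two independent SNPs. The multi-marker measure used for groups of tagging SNPs needs haplotype estimation and is not covered by this sketch.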
The Graphical Overview of Linkage Disequilibrium package (42) was used to create a graphical summary of pairwise linkage disequilibrium patterns for the 81 eligible variants and, hence, to identify haplotype blocks (Fig. 1A).6 Tagging SNPs were selected using the TagSNPs program (43).7 This program uses the partition-ligation expectation-maximization algorithm to estimate haplotype frequencies based on the full set of 81 SNPs. An $R_s^2$ value was obtained between every measured SNP and every possible set of stSNPs, where $R_s^2$ is the expected squared correlation between an observed genotype at the SNP and the genotype predicted on the basis of only the set of stSNPs. The optimal set of stSNPs was taken to be the smallest set that gave a minimum $R_s^2$ of >0.8.
5% conferring a relative risk of at least 1.4, or a recessive allele with frequency
10% conferring a relative risk of at least 2.
Taqman Genotyping
Genotyping was done by 5' nuclease assays (Taqman) using the ABI PRISM 7900HT Sequence Detection System according to instructions of the manufacturer. Primers and probes were supplied directly by Applied Biosystems (Warrington, United Kingdom) as either Assays-by-Design or Assays-on-Demand (PGR-07, PGR-09, and PGR-11 only), the details of which, along with the reaction conditions, are shown in Supplementary Table S3. All assays were carried out in 384-well plate format, with each plate including negative controls (with no DNA) and positive controls duplicated on a separate quality control plate. Assays for which fewer than 98% of the duplicated samples gave identical genotypes were discarded. Failed genotypes were not repeated.
Statistical Methods
Deviation of genotype frequencies in controls from the Hardy-Weinberg equilibrium was assessed by a $\chi^2$ test with 1 df. The primary tests of association were univariate analyses for each of the stSNPs. Genotype frequencies in cases and controls were compared using a 2 df $\chi^2$ test for heterogeneity (Phet) and a 1 df Cochran-Armitage $\chi^2$ test for trend in risk by allele dose (Ptrend). Genotype-specific risks were estimated as odds ratios (OR) using standard cross-product ratios, with confidence intervals (CI) calculated using the variance of the log (OR), estimated by the standard Taylor expansion.
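The univariate tests described here can be sketched for a single SNP as follows. This is a minimal illustration, not the analysis code used in the study: the function names and the genotype counts are hypothetical, the trend test uses allele-dose scores 0, 1, 2, and the closed-form P values hold only for 1 and 2 df.

```python
import math


def heterogeneity_test(cases, controls):
    """2 df Pearson chi-squared test on a 2 x 3 genotype table."""
    col_totals = [a + b for a, b in zip(cases, controls)]
    n = sum(col_totals)
    chi2 = 0.0
    for row in (cases, controls):
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, math.exp(-chi2 / 2.0)  # chi-squared tail probability, 2 df


def trend_test(cases, controls, scores=(0, 1, 2)):
    """1 df Cochran-Armitage test for trend in risk by allele dose."""
    col_totals = [a + b for a, b in zip(cases, controls)]
    n = sum(col_totals)
    r = sum(cases)
    sx_r = sum(x * c for x, c in zip(scores, cases))
    sx_n = sum(x * c for x, c in zip(scores, col_totals))
    sxx_n = sum(x * x * c for x, c in zip(scores, col_totals))
    chi2 = n * (n * sx_r - r * sx_n) ** 2 / (
        r * (n - r) * (n * sxx_n - sx_n ** 2)
    )
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))  # tail probability, 1 df


def odds_ratio(case_exp, control_exp, case_ref, control_ref, z=1.96):
    """Cross-product OR with a CI from the variance of log(OR)."""
    estimate = (case_exp * control_ref) / (control_exp * case_ref)
    se = math.sqrt(1 / case_exp + 1 / control_exp + 1 / case_ref + 1 / control_ref)
    lower = math.exp(math.log(estimate) - z * se)
    upper = math.exp(math.log(estimate) + z * se)
    return estimate, lower, upper


# Made-up counts ordered as (common homozygote, heterozygote, rare homozygote).
example_cases = [50, 30, 20]
example_controls = [70, 20, 10]
```

For a heterozygote OR, `odds_ratio` would be called with the heterozygote counts as the exposed group and the common homozygote counts as the reference group.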
Likelihood ratio tests to compare models of recessive, codominant, and dominant modes of SNP action were done using binary logistic regression to assess the log likelihood of each model compared with a general model.
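One way to picture these model comparisons: when genotype is the only covariate, the maximized log likelihood of a logistic model that treats genotype groups as categories equals a sum of binomial log likelihoods at the observed group proportions. The sketch below uses that shortcut to compare the dominant and recessive models with the general (3-genotype) model. It is an assumption-laden illustration rather than the study's code: the function names and counts are hypothetical, and the codominant (per-allele) model is omitted because it needs an iterative fit.

```python
import math


def grouped_loglik(groups):
    """Maximized log likelihood when each group has its own case probability.

    groups is a list of (cases, controls) pairs.
    """
    loglik = 0.0
    for cases, controls in groups:
        total = cases + controls
        for count in (cases, controls):
            if count > 0:
                loglik += count * math.log(count / total)
    return loglik


def lrt_against_general(cases, controls, model):
    """1 df likelihood ratio test of a restricted model vs the general model.

    cases and controls are counts ordered as
    (common homozygote, heterozygote, rare homozygote).
    """
    general = list(zip(cases, controls))
    if model == "dominant":      # heterozygotes pooled with rare homozygotes
        restricted = [general[0],
                      (cases[1] + cases[2], controls[1] + controls[2])]
    elif model == "recessive":   # heterozygotes pooled with common homozygotes
        restricted = [(cases[0] + cases[1], controls[0] + controls[1]),
                      general[2]]
    else:
        raise ValueError("model must be 'dominant' or 'recessive'")
    stat = 2.0 * (grouped_loglik(general) - grouped_loglik(restricted))
    stat = max(stat, 0.0)        # guard against tiny negative rounding error
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Made-up counts in which both carrier genotypes have the same odds of disease.
example_cases = [100, 60, 30]
example_controls = [100, 30, 15]
```

A small statistic means the restricted model fits about as well as the general model; a large one means the restriction is rejected.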
Tests for interaction between genotype and menopausal status were carried out in a case only design using a $\chi^2$ test with 2 df. Under the assumption that genotype is not related to exposure (menopausal status), this provides a more powerful test of interaction than a full case-control analysis.
We compared the common haplotype frequencies (>0.05) in cases and controls using the haploscore program (44), implemented in S-plus. Haploscore computes score statistics (and hence significance levels) to test for associations between individual haplotypes and disease status, along with a global score test of association.
For the V660L polymorphism, we pooled our results with those from other published studies for the same SNP. A Mantel-Haenszel test was used to evaluate the difference in genotype frequencies between cases and controls, stratified by study. Genotype-specific ORs were estimated using logistic regression, with an appropriate test for heterogeneity between studies.
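The stratified pooling can be sketched with the standard Mantel-Haenszel estimator for 2 x 2 tables. This simplification is an assumption: it compares carriers with noncarriers in each study, whereas the analysis described above compares genotype frequencies, and it applies no continuity correction. The function name and the counts are hypothetical and are not data from any of the pooled studies.

```python
import math


def mantel_haenszel(strata):
    """Pooled OR and 1 df Mantel-Haenszel chi-squared test across studies.

    strata is a list of (a, b, c, d) tuples, one per study:
    a = carrier cases, b = carrier controls,
    c = noncarrier cases, d = noncarrier controls.
    """
    numerator = denominator = 0.0
    observed = expected = variance = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
        observed += a
        expected += (a + b) * (a + c) / n
        variance += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    chi2 = (observed - expected) ** 2 / variance  # no continuity correction
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return numerator / denominator, chi2, p_value


# Two made-up studies with the same within-study odds ratio.
example_strata = [(20, 10, 10, 20), (20, 10, 10, 20)]
```

Stratifying by study keeps differences in genotype frequency between study populations from being mistaken for a case-control difference.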
PupaSNP Finder
PupaSNP (putative phenotypic alterations caused by SNPs) is a web-based tool used as a means of identifying potential phenotypic effects of SNPs at the level of transcription (45).8 The program uses submitted gene sequences or chromosomal coordinates to retrieve a list of SNPs that could affect conserved regions, such as intron/exon boundaries, exon splicing enhancers, and transcription factor binding sites. The SNP location data is based on the Ensembl genome browser map.9
| Results |
|---|
An additional SNP in the promoter region, +331G>A (PGR-06), was also selected for analysis. Although its rare allele frequency in the NIEHS PDR90 sample population was lower than our 5% threshold, this polymorphism was included because there were previous reports of its association with an increased risk of both endometrial and breast cancer (19, 21). We also typed two additional SNPs that had been selected based on their genomic positions and rare allele frequencies before the availability of the NIEHS Environmental Genome Project resequencing data. These two SNPs were not present in the NIEHS data. Thus, a total of 10 SNPs were investigated in set 1.
Genotyping Set 1
The results of the set 1 genotyping are summarized in Table 1. There was no evidence for deviation of the genotype frequencies from Hardy-Weinberg equilibrium in controls, apart from PGR-04, where there was some evidence of an excess of rare homozygotes (P for Hardy-Weinberg equilibrium = 0.03). Re-evaluation of the genotyping raw data showed nothing abnormal about the assay or genotype calls, and this seems likely to have been a chance finding. Seven of the SNPs (PGR-03, PGR-04, PGR-05, PGR-07, PGR-10, PGR-11, and PGR-12) exhibited possible evidence for an association at P < 0.1 using either Phet or Ptrend and, therefore, fitted our criteria for further evaluation.
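A Hardy-Weinberg check of the kind reported for the controls can be sketched as a 1 df chi-squared comparison of observed genotype counts with those expected from the allele frequency. The function name and the counts below are hypothetical; they are not the study's genotype data.

```python
import math


def hardy_weinberg_test(n_common_hom, n_het, n_rare_hom):
    """1 df chi-squared test of genotype counts against Hardy-Weinberg proportions."""
    n = n_common_hom + n_het + n_rare_hom
    p = (2 * n_common_hom + n_het) / (2 * n)  # common allele frequency
    q = 1.0 - p
    observed = (n_common_hom, n_het, n_rare_hom)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


# Made-up control counts: the first set is in Hardy-Weinberg proportions,
# the second has an excess of both homozygote classes.
in_equilibrium = (360, 480, 160)
excess_homozygotes = (400, 400, 200)
```

With many SNPs tested, an occasional nominally significant deviation is expected by chance, which is why the raw genotype calls were re-examined before interpreting the PGR-04 result.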
$\chi^2_{11}$ = 13.5) using the global score test of haploscore (44). We used the haploscore and the TagSNPs (43) programs to determine the haplotype arrangements of all 10 SNPs in our set 1 subjects (Fig. 2). Two SNPs, PGR-05 and PGR-07, were found to be in perfect linkage disequilibrium ($R_p^2$ = 1.0) with one another in the East Anglian population sample despite having tagged different haplotypes in the NIEHS Environmental Genome Project sample set. Thus, the redundant PGR-07 SNP was omitted from further investigation. We also selected PGR-06 (+331G>A) for evaluation in set 2 as it has been associated with breast cancer in other studies.
A Meta-analysis of V660L
We did a combined analysis of the genotype frequencies associated with V660L using our own data and those from three published studies (refs. 32, 33, 36; Table 4). Since De Vivo et al. (33) analyzed carriers of the L allele as a single genotype, a full analysis of genotype-specific risks was only possible in the other three studies. The pattern of risks in the SEARCH, Spurdle et al. (32), and Pearce et al. (36) studies seemed somewhat different in that the SEARCH and Spurdle et al. studies showed evidence of a positive association between the L allele and breast cancer, whereas the Pearce et al. study did not. However, there was no significant evidence of heterogeneity in the estimated ORs between studies (homogeneity test $\chi^2$ = 4.81, 4 df). The combined analysis based on these three studies provided evidence for increased risks of breast cancer associated with the VL and LL genotypes (OR for VL versus VV, 1.08; 95% CI, 1.00-1.16; OR for LL versus VV, 1.17; 95% CI, 0.93-1.47; Phet = 0.17, Ptrend = 0.05). The ORs were indicative of a codominant (allele dosage) model with an estimated OR per L allele carried of 1.08 (95% CI, 1.01-1.15). If the De Vivo et al. (33) study is also included, the estimated OR associated with VL and LL genotypes combined was 1.09 (95% CI, 1.02-1.17), P = 0.009.
| Discussion |
|---|
Because of the danger of false positives due to multiple testing, we consider it unwise to attempt interaction and subgroup analyses until the main genetic effect is fully established (46) and so we have generally avoided this. It will be interesting to see if there are stronger subgroup associations in future studies.
In addition to V660L, we found some weak evidence that PGR-05, an intronic polymorphism just upstream of V660L, is also associated with breast cancer risk. This polymorphism has a suggestive dominant protective effect in our SEARCH breast cancer cases (heterozygote risk OR, 0.90; 95% CI, 0.83-0.98; rare homozygote risk OR, 0.95; 95% CI, 0.81-1.12), but a larger sample size is needed to confirm this and, consequently, this polymorphism is worthy of further investigation in other populations. PGR-10 also showed some weak evidence of an association, but this may be explained by the fact that the rare PGR-10 allele is also present on the 660L haplotype. The P value for the test of difference in haplotype frequency between cases and controls, calculated using haploscore, was not significant.
The promoter region SNP, +331G>A (PGR-06), did not exhibit any significant differences in genotype distribution in set 1 or set 2 individually or when combined. A previous, smaller report had indicated that the rare allele was associated with a reduced risk in premenopausal women and an increased risk in the postmenopausal group (19). However, our data indicated no significant difference in genotype distribution between premenopausal and postmenopausal cases (Supplementary Table S2).
We have attempted a comprehensive SNP tagging study of the PGR gene. How certain can we be that we have evaluated all the common PGR SNPs and haplotypes? A cross-comparison of all 81 suitable SNPs identified in the NIEHS Environmental Genome Project analysis with our set of stSNPs showed 56 to be tagged on a pairwise basis with $R_p^2$ > 0.80, and a further 24 tagged by a multivariate $R_s^2$ > 0.79. The remaining SNP was the singleton SNP that failed assay design. This SNP, which has a minor allele frequency of 0.12 and is a nonsynonymous C>T polymorphism 1.8 kb upstream of the 5' untranslated region, is unlikely to be functional and, given that no other SNPs define the same haplotype, it is improbable that a real association has been missed by not typing it. By further analysis of the Environmental Genome Project PDR90 data set, we could exclude 28 subjects who clearly carried African-specific alleles,10 and there remained 73 suitable SNPs. The same total haplotype and SNP tagging was found in this "PDR62" data set with our chosen stSNPs. Our confidence in the adequacy of the tagging is reinforced by the fact that the two SNPs originally examined in a previous study, and here genotyped in addition to the tagging set, were both in perfect linkage disequilibrium with a member of the tagging set in our set 1 genotyping: PGR-07 and PGR-08 with PGR-05 and PGR-03, respectively. Thus, they contributed no additional information. Using the Environmental Genome Project data, the gene was treated as a single block of linkage disequilibrium for the purposes of SNP selection (Fig. 1); however, different data sets and different SNP search methods may lead to different block structures. For example, Pearce et al. (36) identified SNPs over a wider genomic region and treated their data as four linkage disequilibrium blocks.
What is the maximum estimated disease risk associated with any of the common SNPs we have excluded from association with breast cancer? For all the SNPs studied, the maximum upper 95% CI for any OR was 1.21 for a heterozygote and 1.88 for a rare homozygote (Tables 2-4). Based on these upper confidence limits, the allele frequencies of the tagging SNPs and assuming an $R_p^2$ of 0.8, the maximum OR associated with any SNP is unlikely to be >1.3 in heterozygotes and 2.8 in homozygotes.
What could explain the association we are seeing with the V660L polymorphism? It is possible that the rare L allele could affect splicing. The PupaSNP web tool (45) indicates that the presence of the rare allele of V660L may lead to the loss of a cis-acting, DNA-binding SF2-type splicing enhancer site, defining the intron/exon boundary.11 This could lead to abnormal RNA splicing and exon skipping. In vitro functional assays will be necessary to determine the exact mode of action of variants at this residue.
Alternatively, the effect could be steric. Investigation into the conserved domain structure of the PGR protein shows this V660L polymorphism to be in the hinge region between the central zinc finger DNA-binding domain and the HOLI ligand-binding domain of the three-dimensional structure. It is possible that this nonsynonymous change from valine to leucine, a structurally similar residue differing from the former by an extra methyl group, will cause sufficient steric interference to upset the tertiary structure between the progesterone-binding domain and the DNA-binding region. A subtle change in this structure may affect the manner in which the homodimerized hormone/receptor complex binds and controls transcription from the response elements of certain downstream genes involved in mammary cell growth.
Another explanation is that the effects we are seeing with V660L are due to another polymorphism in strong linkage disequilibrium and hence carried on the same haplotypes. The PROGINS Alu insertion is in perfect linkage disequilibrium with the leucine allele of V660L. Its associated increased expression of the PR-B isoform, which is more transcriptionally active, may lead to PR-B-dependent stimulation of mammary cell growth. Further analysis of the NIEHS individual genotype data identified 16 SNPs to be in perfect linkage disequilibrium with V660L ($R_p^2$ = 1). Of these, however, only one, the nonsynonymous polymorphism S344T, is a likely functional mutation. It would have a similar steric effect to that proposed for V660L, but would affect the progesterone-binding region of the PGR protein, possibly causing suboptimal ligand binding.
It is also possible that we could be seeing a true haplotype effect due to the combined effects of multiple variants. Both Leu660 and Thr344 add an extra methyl group to the tertiary protein structure, but neither is predicted to have a particularly dramatic effect on its own. However, because these two methyl-adding alleles, as well as the PROGINS Alu insertion, are inherited together on the same haplotype, they may have a much greater effect on PGR function in combination than alone.
In summary, we have found evidence that the haplotype associated with the V660L allele is associated with a small but significant increased risk of breast cancer, and we showed that other PGR haplotypes are unlikely to be associated with a measurably different risk of the disease. Further epidemiologic studies are required to confirm the risk associated with V660L and to determine if the risks in certain subgroups of carriers are sufficiently large to warrant cancer-preventative intervention. Other approaches would be needed to evaluate the functional basis of this association.
| Acknowledgments |
|---|
| Footnotes |
|---|
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Supplementary data for this article are available at Cancer Epidemiology Biomarkers & Prevention Online (http://cebp.aacrjournals.org/).
Note: Current address for L. Tee: Birmingham University, Department of Medical and Molecular Genetics, Division of Paediatrics and Child Health, Birmingham Women's Hospital, Birmingham B15 2TG, United Kingdom. B.A.J. Ponder is a Cancer Research UK Gibb Fellow, P.D.P. Pharoah is a Cancer Research UK Senior Clinical Research Fellow, and D.F. Easton is a Cancer Research UK Principal Research Fellow.
3 http://www.srl.cam.ac.uk/search/Homepage.htm/.
5 http://egp.gs.washington.edu/.
6 http://www.sph.umich.edu/csg/abecasis/GOLD/.
7 http://www-rcf.usc.edu/~stram/tagSNPs.html.
8 http://pupasnp.bioinfo.cnio.es.
10 P.D.P. Pharoah, personal communication.
11 http://pupasnp.bioinfo.cnio.es/.
Received 8/30/05; revised 11/23/05; accepted 1/25/06.
| References |
|---|
, epidermal growth factor receptor, c-fos, and c-myc genes. Mol Cell Biol 1991;11:5032-43.
Leu polymorphism and breast cancer risk. Breast Cancer Res 2004;6:R636-9.