Background: Age is the strongest breast cancer risk factor, with overall breast cancer risk increasing steadily beginning at approximately 30 years of age. However, while breast cancer risk is lower among younger women, young women's breast cancer may be more aggressive. Although several genomic and epidemiologic studies have shown higher prevalence of aggressive, estrogen-receptor negative breast cancer in younger women, the age-related gene expression that predisposes to these tumors is poorly understood. Characterizing age-related patterns of gene expression in normal breast tissues may provide insights into the etiology of distinct breast cancer subtypes that arise from these tissues.
Methods: To identify age-related changes in normal breast tissue, 96 tissue specimens from patients with reduction mammoplasty, ages 14 to 70 years, were assayed by gene expression microarray.
Results: Significant associations between gene expression levels and age were identified for 802 probes (481 increased, 321 decreased with increasing age). Enriched functions included “aging of cells,” “shape change,” and “chemotaxis,” and enriched pathways included Wnt/beta-catenin signaling, Ephrin receptor signaling, and JAK/Stat signaling. Applying the age-associated genes to publicly available tumor datasets, the age-associated pathways defined two groups of tumors with distinct survival.
Conclusion: The hazard rates of young-like tumors mirrored those of high-grade tumors in the Surveillance, Epidemiology, and End Results Program, providing a biologic link between normal aging and age-related tumor aggressiveness.
Impact: These data show that studies of normal tissue gene expression can yield important insights about the pathways and biologic pressures that are relevant during tumor etiology and progression. Cancer Epidemiol Biomarkers Prev; 21(10); 1735–44. ©2012 AACR.
This article is featured in Highlights of This Issue, p. 1611
Age is the strongest demographic risk factor for human cancer overall (1, 2), with breast cancer rates steadily increasing with age. However, while tumors are less common in young women, younger women are more likely to have aggressive tumors. Young women's breast cancer is more often estrogen-receptor (ER) negative, whereas older women's cancers are more often ER-positive (3, 4). Two previous gene expression studies have compared molecular features of breast cancer by age, focusing on the tumor gene expression (5, 6). These studies have shown that young women's tumors have distinct gene expression, ultimately reflecting different prevalence of breast cancer subtypes. In some analyses, age-associated gene expression changes were no longer evident after adjusting for subtype and grade (7). Thus, persistent gaps in our understanding of age-associated changes in tumor aggressiveness remain.
Research is needed to distinguish characteristics of the malignancy from those of the host (5, 8) and to distinguish background effects of aging from those that contribute to carcinogenesis (9). Although the host–tumor interface has important consequences for a nascent tumor (10–12), changes to this interface with aging are poorly understood (7). Beyond the histology of postmenopausal involution, aging of normal breast tissue has had limited study (13), and gene expression studies of normal tissue may provide important insights.
Normal tissue studies can help to identify barriers that must be overcome by tumor cells (14). Barriers to carcinogenesis derive from normal tissue homeostasis in the microenvironment prior to, or early in, disease (15). Because tumors evolve with selective pressure from surrounding stroma, studies of normal breast tissue, typically more than 90% stroma by volume, may provide insights about selective pressures faced by tumors. We hypothesized that older versus younger tissues represent distinct evolutionary environments, resulting in distinct cancer subtypes by age. Under this hypothesis, gene expression in the normal tissue of young women would be reflected in aggressive tumors common in young women. To evaluate this hypothesis, we identified age-associated gene expression in 96 normal tissues from patients with premenopausal reduction mammoplasty and tested this signature in independent normal breast microarray data. We then evaluated the age-associated signature using publicly available tumor tissue gene expression data, asking whether age-associated gene expression from normal defines distinct tumor groups. The results link gene expression in young women's breast tissue with aggressive tumor phenotypes.
Materials and Methods
This study included women, ages 14 to 70 years, who underwent reduction mammoplasty surgery at Baystate Medical Center (Springfield, MA) between 2007 and 2009. Patient characteristics are presented in Table 1 and Table 1a. Institutional Review Boards (IRBs) at Baystate and University of Massachusetts (Amherst, MA) approved the study. Women consented to provide tissues not needed for diagnostic purposes and to complete a telephone interview following surgery. Tissues were snap frozen and stored at −80°C. An independent dataset of isolated glands from reduction mammoplasty was obtained from University of California at San Francisco Cancer Center and the Cooperative Human Tissue Network, with patients consented under an IRB at those institutions.
RNA isolation and microarrays
Isolation of RNA from reduction mammoplasty tissue was conducted as described in Sun and colleagues (16). Agilent 4 × 44K V1 and custom 244k arrays were run according to manufacturer protocols for linear amplification and 2-color hybridization. Spots that had an intensity of greater than 10 in at least 80% of samples were selected for subsequent analyses. Data were lowess normalized and missing data were imputed using k-nearest neighbors with k = 10. Duplicate arrays (N = 15) were averaged. A total of 114 microarrays representing 99 patients were included (GSE33526 for n = 72, GEO Submission in progress for n = 42). All statistical analyses were conducted in R with Bioconductor. For age-associated and menopause-associated gene expression analyses (n = 76 and n = 99, respectively), probes with no Entrez Gene ID or with less than median variability were eliminated. Linear regression with Linear Models for Microarray Data (LIMMA) (17) was used to identify significant probes associated with chronologic age (in decades) or menopausal status (pre/perimenopausal vs. postmenopausal). Unadjusted P values were used with the q value package (Bioconductor) to identify probes with q values <0.10 (defined as statistical significance). Hierarchical clustering was used to visualize gene expression, with samples ordered by chronologic age and genes clustered by Pearson correlation. Gene ontology analyses were conducted using Ingenuity Pathway Analysis (IPA).
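The filtering and testing steps above can be illustrated with a minimal sketch. It is not the pipeline used in the study: ordinary least squares stands in for the moderated LIMMA model, and the Benjamini–Hochberg adjustment stands in for the q value package. All function names and the toy data are hypothetical.

```python
from statistics import mean


def filter_probes(expr, min_intensity=10.0, min_fraction=0.8):
    """Keep probes whose intensity exceeds min_intensity in at least
    min_fraction of samples. expr maps probe id -> list of intensities."""
    kept = {}
    for probe, values in expr.items():
        passing = sum(1 for v in values if v > min_intensity)
        if passing >= min_fraction * len(values):
            kept[probe] = values
    return kept


def slope_per_decade(values, ages_years):
    """Ordinary least-squares slope of expression on age in decades
    (a plain stand-in for the LIMMA fit, without variance moderation)."""
    decades = [a / 10.0 for a in ages_years]
    mx, my = mean(decades), mean(values)
    sxx = sum((x - mx) ** 2 for x in decades)
    sxy = sum((x - mx) * (y - my) for x, y in zip(decades, values))
    return sxy / sxx


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values (a stand-in for Storey q values)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical toy data: two probes measured in four samples.
ages = [20, 30, 40, 50]
expr = {
    "probe_up": [11.0, 11.5, 12.0, 12.5],   # passes the intensity filter
    "probe_dim": [12.0, 8.0, 9.0, 20.0],    # only 2 of 4 samples above 10
}
kept = filter_probes(expr)
slopes = {p: slope_per_decade(v, ages) for p, v in kept.items()}
```

In this sketch a probe would be called significant when its adjusted P value falls below 0.10, mirroring the threshold stated above.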
We tested a second dataset of isolated glands from patients with reduction mammoplasty (N = 30). Tissue samples were minced and enzymatically dissociated using 0.1% (w/v) collagenase I in Dulbecco's Modified Eagle Medium at 37°C for 12 to 18 hours. Organoids remaining after digestion were collected by centrifugation at 100 × g for 2 minutes and stored frozen. RNA was extracted using Qiagen RNeasy and Affymetrix GeneChip Human Gene 1.0 ST microarrays were conducted at University of Wisconsin (Madison, WI). Microarray data were processed using Robust Multiarray Average (Bioconductor).
To confirm expression changes for CDKN2A and TP53 as identified in microarray data, 1 μg RNA per sample for 26 samples was reverse transcribed using the QuantiTect Reverse Transcription kit (Qiagen). Resultant cDNAs (10 ng) were analyzed in triplicate (using miScript SYBR Green PCR Kit and miScript-derived primers for p16 (CDKN2A), p53, or GAPDH) by real-time, quantitative PCR (qPCR) on an Applied Biosystems 7900HT thermocycler. An outlier was detected in regression diagnostics and removed to yield a final qPCR dataset (n = 25).
Public microarray datasets and test set of isolated glands
On the basis of previous evidence that increasing age is associated with qualitative shifts in tumor subtype (5, 18), we hypothesized that gene expression of normal aging would be manifest in tumors, with more aggressive tumors similar to younger, normal tissue. To test this, we projected the age-associated gene set onto publicly available microarray datasets: the NKI295 (19), Naderi and colleagues (20), and UNC337 (21). To classify tumors as young-like or older-like, we applied methods described in Creighton and colleagues (22) to obtain a correlation coefficient describing the relation between each sample and the 802-probe age-associated signature. Probes were collapsed by statistical mean to a list of 719 unique Entrez genes. The vector summarizing the age signature on these genes was constructed as median expression in patients with reduction mammoplasty younger than 30 years of age minus that in patients older than 39 years of age. Genes with differences less than zero (lower expression in young) were set to −1, and genes with differences larger than zero (higher expression in young) were set to 1. This vector was compared with the sample gene expression data to calculate Pearson correlation coefficients. If the correlation was positive, the patient was classified as young-like; if the correlation was negative, the patient was classified as older-like. Because all patients in the reduction mammoplasty dataset and the NKI295 dataset were 55 years or younger, all 3 tumor datasets were restricted to this age range for combined analysis. These data were median-centered by gene and filtered to include only those genes with top 50% variability prior to correlation analysis. The 719 genes mapped to 380 variable genes in the NKI295 dataset, 317 variable genes in Naderi and colleagues, and 268 variable genes in UNC337. Following classification, all samples were aggregated to a single dataset (n = 459).
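The correlation-based classification described above can be sketched as follows. This is an illustrative reading of the text, not the code used in the study; the gene names, expression values, and function names are hypothetical, and dropping genes with a difference of exactly zero is an assumption because the text does not say how such genes were handled.

```python
from math import sqrt
from statistics import median


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def age_signature(young, older):
    """young/older map gene -> expression values in that age group.
    Returns gene -> +1 (higher in young) or -1 (lower in young)."""
    signature = {}
    for gene in young:
        diff = median(young[gene]) - median(older[gene])
        if diff > 0:
            signature[gene] = 1
        elif diff < 0:
            signature[gene] = -1
        # Assumption: genes with zero difference are left out.
    return signature


def median_center(dataset):
    """Subtract each gene's median across samples (gene -> list of values)."""
    centered = {}
    for gene, values in dataset.items():
        m = median(values)
        centered[gene] = [v - m for v in values]
    return centered


def classify_samples(dataset, signature):
    """Return one (label, r) pair per sample column in dataset."""
    genes = [g for g in signature if g in dataset]
    sig_vector = [signature[g] for g in genes]
    n_samples = len(dataset[genes[0]])
    results = []
    for j in range(n_samples):
        r = pearson(sig_vector, [dataset[g][j] for g in genes])
        results.append(("young-like" if r > 0 else "older-like", r))
    return results


# Hypothetical signature and a toy tumor dataset with three samples.
signature = {"G1": 1, "G2": -1, "G3": 1}
tumors = {"G1": [3.0, 1.0, 2.5], "G2": [0.5, 2.5, 0.2], "G3": [2.2, 1.0, 1.6]}
labels = [label for label, _ in classify_samples(median_center(tumors), signature)]
```

Because the signature vector holds only +1 and −1, the sign of the correlation simply records whether a sample's "high in young" genes sit above its "low in young" genes on average.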
To test whether young-like tumors mirror aggressive tumors, we used data for women ages 55 years or younger in the Surveillance, Epidemiology, and End Results (SEER) Program (23). Hazard curves for public microarray data and SEER data were generated using the muhaz library in R. Hazard rates for young-like and older-like tumors (microarray data) were compared with those for “aggressive” (grade 3) tumors and “less aggressive” tumors (grade 1 or 2) in SEER. Previous articles (3) have shown that several clinical characteristics (ER status, PR status, race, grade, tumor size, etc.) duplicate the same general patterns for age at incidence and hazard rate, so grade was representative.
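To make the idea of a hazard rate over time concrete, the following sketch computes a piecewise-constant (life-table) hazard as events divided by person-time at risk in each interval. This is a simpler stand-in and not the kernel-smoothed estimator implemented in muhaz; the function name and the follow-up data are hypothetical.

```python
def interval_hazard(times, events, width=1.0):
    """Piecewise-constant hazard estimate.

    times  : follow-up time for each patient (e.g., years since diagnosis)
    events : 1 if the event occurred at that time, 0 if censored
    Returns a list of (interval_start, hazard), where hazard is the number
    of events in the interval divided by the person-time at risk in it
    (None when no one contributes time to the interval).
    """
    n_bins = int(max(times) // width) + 1
    event_counts = [0] * n_bins
    exposure = [0.0] * n_bins
    for t, e in zip(times, events):
        last = min(int(t // width), n_bins - 1)
        for b in range(last):
            exposure[b] += width            # full intervals survived
        exposure[last] += t - last * width  # partial final interval
        if e:
            event_counts[last] += 1
    return [
        (b * width, event_counts[b] / exposure[b] if exposure[b] > 0 else None)
        for b in range(n_bins)
    ]


# Hypothetical follow-up data for four patients.
hazards = interval_hazard([0.5, 1.5, 2.5, 2.5], [1, 0, 1, 0])
```

A curve that peaks in the first few intervals and then declines corresponds to the early-peaking pattern described for aggressive tumors; a kernel-smoothed estimator replaces the step function with a smooth curve.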
Creighton correlation–based classifications were also used for analyzing age-associated gene expression in a test set of isolated glands. Entrez IDs from the age signature were mapped to the University of Wisconsin data, with 138 probes identified in common. The association between the Creighton correlation (coded as “positive” if ≥0 or “negative” if <0) and the true age of tissue (<30, 30–39, >39) was estimated using the nonzero correlation statistic (1 degree of freedom Cochran–Mantel–Haenszel statistic, PROC FREQ in SAS 9.2).
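For a single stratum, the 1 degree of freedom nonzero correlation statistic equals (n − 1)r², where r is the Pearson correlation between row and column scores. The sketch below computes it for a contingency table; the integer scores and the example counts are assumptions for illustration and are not the study data.

```python
from math import erfc, sqrt


def mh_correlation_statistic(table, row_scores, col_scores):
    """Mantel-Haenszel nonzero correlation statistic for one stratum.

    table      : list of rows of counts
    row_scores : numeric score for each row (e.g., ordered age group)
    col_scores : numeric score for each column
    Returns (Q, p), with Q = (n - 1) * r**2 referred to chi-square, 1 df.
    """
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    mean_r = sum(s * t for s, t in zip(row_scores, row_tot)) / n
    mean_c = sum(s * t for s, t in zip(col_scores, col_tot)) / n
    sxx = sum(t * (s - mean_r) ** 2 for s, t in zip(row_scores, row_tot))
    syy = sum(t * (s - mean_c) ** 2 for s, t in zip(col_scores, col_tot))
    sxy = sum(
        table[i][j] * (row_scores[i] - mean_r) * (col_scores[j] - mean_c)
        for i in range(len(table))
        for j in range(len(table[0]))
    )
    r = sxy / sqrt(sxx * syy)
    q = (n - 1) * r * r
    p = erfc(sqrt(q / 2))  # upper tail of chi-square with 1 df
    return q, p


# Hypothetical counts: rows are age groups (<30, 30-39, >39),
# columns are correlation sign (positive, negative).
q_stat, p_value = mh_correlation_statistic(
    [[6, 2], [4, 4], [2, 6]], row_scores=[1, 2, 3], col_scores=[1, 2]
)
```

Using ordered scores for age group is what lets the statistic detect a trend across the three categories with a single degree of freedom.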
Comparison of age-associated signature with previously published signatures
Previously, a signature of aging on the basis of tumor gene expression data (5) and a meta-analysis of common signatures of aging across many tissues were published (24). These signatures were evaluated in our reduction mammoplasty samples to test whether they predicted age in normal human breast tissue and to assess correlation with our aging signature. These signatures were mapped to our filtered reduction mammoplasty dataset, resulting in 85 genes for the 145-gene Yau and colleagues (5) signature and 52 genes for the 73-gene de Magalhaes signature (24). A Creighton correlation was computed (genes with high expression in young coded as 1, high expression in older coded as −1). Positive correlation indicated patients were young-like for a given signature. These classes were then evaluated for association with our reduction mammoplasty–based young-like and older-like signature and for association with chronologic age. The χ2 tests, or Fisher exact tests where cell counts were <5, were used to test statistical significance.
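For small cell counts, a Fisher exact test on a 2 × 2 table can be computed directly from hypergeometric probabilities. The sketch below assumes a two-sided test that sums the probabilities of all tables no more likely than the observed one; the text does not state which sidedness was used, and the function name and example counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Probability of x in the top-left cell with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum tables that are as likely as, or less likely than, the observed one.
    total = sum(
        prob(x) for x in range(low, high + 1) if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, total)


# Hypothetical table: signature class (rows) by age group (columns).
p_fisher = fisher_exact_two_sided(3, 1, 1, 3)
```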
Results
Age-associated gene expression in reduction mammoplasty patients
A substantial proportion of genes had expression changes associated with age. Given a False Discovery Rate (FDR) of 5% or 10%, 2% or 4% of genes were age-associated, respectively. A striking 15% of genes were age-associated at FDR of 20%, demonstrating broad changes induced by aging. Figure 1A shows a one-dimensional (gene only) cluster for 802 age-associated genes (with FDR < 10%) across 76 samples (representing 62 patients), ordered by chronologic age. A qualitative shift is evident in gene expression in the late thirties, with substantial interindividual variation.
To test this gene set, we used data from isolated glands (enriched for epithelium) of patients with reduction mammoplasty. Results in this independent test set show that even in epithelium, the age-related changes observed in whole tissue are preserved. Figure 1B shows a strong correlation between age and the aging signature, with young women positively correlated with the younger women's signature (9 of 14 samples positively correlated) and older women (12 of 16) negatively correlated with young signature. There was a trend toward decreasing correlation with the young-like signature as age increased (odds ratio for young expression was 5.4 [95% CI, 1.1 to 26.0] comparing younger with older). The association is particularly striking given distinct specimen processing methods and microarray platforms. These data suggest that our signature reflects aging in multiple cell types.
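The reported odds ratio is consistent with a 2 × 2 table built from the counts above (9 of 14 younger samples positively correlated, 12 of 16 older samples negatively correlated). The sketch below assumes a Wald (Woolf) interval on the log odds ratio; the text does not state which interval method was used, and the function name is hypothetical.

```python
from math import exp, log, sqrt


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for the table [[a, b], [c, d]]."""
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Younger: 9 positive, 5 negative. Older: 4 positive, 12 negative.
odds_ratio, lower, upper = odds_ratio_wald_ci(9, 5, 4, 12)
print(round(odds_ratio, 1), round(lower, 1), round(upper, 1))  # prints 5.4 1.1 26.0
```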
Among the 802 differentially expressed probes, several processes and pathways were overrepresented (Table 2). Aging of cells, cell flattening, and shape change were significant processes, whereas JAK2-associated hormone-like cytokine signaling, Wnt/β-catenin signaling, and Ephrin receptor signaling were significant pathways. The complete list of genes from Fig. 1 is in Supplementary Table S1 with the average fold change comparing youngest (<30, n = 20) with oldest patients (>49, n = 6).
We confirmed the direction of change for CDKN2A (p16) and TP53 by conducting qRT-PCR on a subset of samples with remaining RNA (Supplementary Fig. S1). The p16INK4a tumor suppressor has an established role in aging and its expression increases with increasing age (25, 26). p53 plays a critical role in determining cellular senescence and in vitro lifespan (27), and has declining activity in aging rodents (28). qPCR showed p53 levels decrease with age and p16 levels increase, consistent with the microarray data.
Menopause-associated gene expression in reduction mammoplasty patients
In contrast to a broad gene expression response to age, there were few genes associated with menopausal status. In comparing 76 pre/perimenopausal with 23 postmenopausal women, only 273 genes were statistically significant with an unadjusted P value <0.05. No genes were significantly associated with menopausal status after correction for multiple testing (q value >0.10 for all genes). Despite the weak association, we conducted an IPA analysis with the 273 genes that had P value <0.05. No Functional Annotation or Canonical Pathway categories were differentially expressed with Benjamini–Hochberg P < 0.05. These results show that within the age range 20 to 70 years, menopausal status did not strongly influence breast tissue gene expression.
Age-associated gene expression in the breast cancer patients
According to evolutionary theories of cancer (11), tumors use the transcriptional programs and pathways that are active in normal tissue to advance growth and survival. Thus, we expected that age-associated genes in normal tissue would also be dysregulated in tumors. By applying the age-associated gene set from Fig. 1 to 3 public microarray datasets, we identified 2 groups of patients. “Aggressive” high-grade patients in SEER (Fig. 2A) and patients with young-like gene expression (Fig. 2B) both showed a left shift in the age-at-incidence distribution, documenting earlier age at incidence. Aggressive tumors (Fig. 2C) and young-like tumors (Fig. 2D) also had peak hazard rates early after diagnosis, with declining hazard rates thereafter. “Less aggressive” SEER tumors and older-like tumors did not show this pattern. In total, the patterns of age at incidence and hazard rate over time for young-like breast tumors are very similar to patterns presented for aggressive breast cancers on the basis of SEER data (3).
Young-like tumors were more likely to have clinically aggressive characteristics (Table 3), with statistically significant associations in the largest of the 3 datasets (NKI): ER-negative (P = 0.02), high grade (P = 0.005), and larger (P = 0.04). Substantial, although nonsignificant, associations in the same direction (more aggressive tumors given young-like gene expression class) were observed for the Naderi and colleagues and UNC337 datasets. Considering the combined dataset, significant associations held with numerous clinically aggressive phenotypes (large size, high grade, and young age). The strongest association was for high tumor grade. Young-like tumors were also more prevalent among young women in all datasets except for Naderi and colleagues, in which the young-like signature did not correlate with patient age. However, Naderi and colleagues (n = 52) had an older patient population (mean age of 47 years, compared with UNC and NKI, both with a mean age of 44 years). These results document that the normal biology of younger women is reflected in aggressive tumors that are more common in this age group.
Evaluation of correlations with previous age-related signatures
A previous study identified common signatures of aging across multiple tissues (including heart, lung, brain, muscle, and others, but not breast) and species (24). As shown in Table 4, young-like samples on the basis of this de Magalhaes signature were younger chronologically and more likely to be young-like according to our signature. Previous studies of aging human breast were not available for comparison, but a tumor-based signature that evaluated age-related gene expression among ER-positive breast tumors (5), avoiding some of the problems of confounding by tumor subtype as described in Anders and colleagues (7), was evaluated. The Yau and colleagues signature was significantly associated with our young-like signature, but was not associated with age of patient with reduction mammoplasty. The weaker correlation with age for this signature may reflect the evolution and divergence of tumor biology from normal age-related biology.
Discussion
Age-associated changes in normal breast gene expression have not been reported previously. This is a gap in the literature given that age-associated gene expression has been reported for human fibroblasts and lymphocytes (29, 30) as well as brain (31), kidney (32), and skeletal muscle (33, 34). A recent meta-analysis compared aging-related changes across species and tissues, but without inclusion of mammary gland (24). Histologically, there are important compositional and morphometric changes in breast tissue with aging (35). Prior to menopause, a decline in ovarian function causes regression of both epithelium and stroma from the third to sixth decades of life, independent of previous reproductive history (36). Aging as a process spanning decades is reflected in our observation of progressive changes, and the observation that menopause was not strongly associated with changes in breast gene expression.
In our data, age-associated gene expression was functionally linked with “aging” gene expression categories, but also included individual genes of interest. CDKN2A (p16) is an established biomarker and effector of mammalian aging (26), its upregulation being accompanied by shortened telomeres. As expected from the previous studies (37), we found increasing expression of p16 and decreasing expression of TERT with age. Transcripts of the gene coding tumor suppressor p53 (TP53) also changed, with p53 expression declining in older patients. Studies of p53 and aging have emphasized mouse models of breast cancer (28), but in human studies, p53 mutations are more common in younger women and basal-like tumors that occur more frequently in the young (38). These observations raise the hypothesis that increased mutation frequency in young women may reflect greater activity of p53 in young women and a resulting pressure to inactivate p53.
Other interesting developmentally regulated pathways were also altered in aging breast. The JAK2-associated hormone-like cytokine signaling, Wnt/β-catenin signaling, and Ephrin receptor signaling were differentially expressed with age. The hormone-dependent JAK2 signaling alterations (including higher expression of STAT5A) may reflect changes in ovarian function/estrogen signaling over time; this pathway regulating mammary gland development is responsive to estrogen and progesterone (39). The latter 2 pathways (Wnt/β-catenin and Ephrin receptor signaling) are known to be involved in maintaining stem cell dynamics in cancer (40, 41), but their specific roles in normal breast are relatively unexplored. Given that mammary progenitor cells are a rare cell population, these signals are unlikely to be derived specifically from stem cell populations, but may reflect the role of these pathways (and cross-talk between them) in tissue architecture or cellular differentiation [reviewed in refs. (42) and (43), for Ephrins and Wnt pathway, respectively]. We hypothesize that the changes we observe reflect alterations in survival and proliferation potential of the normal cell types that are susceptible to carcinogenesis. Alteration of these signals and normal tissue homeostasis with age may dictate pathways to malignancy and determine the aggressiveness of resulting tumors. In fact, a recent commentary emphasized the importance of altered homeostasis in age-dependent cancer rates, countering the previous notion that oncogenic mutation rates alone limit carcinogenesis at young ages (44).
Our study also identified trends that are relevant for the epidemiology of breast cancer and aging. Previous analyses of epidemiologic data have used breast cancer incidence data to draw inferences about aging of breast tissue (45, 46). These articles have suggested that the rate of aging is most rapid in the early years after menarche and before the first pregnancy, decreases with each subsequent birth, and decreases further with menopause. Use of very large datasets and anchoring of changes to particular reproductive events allowed for observation of these trends despite the substantial interindividual variability. While aging-related changes in undiseased tissue are a more direct route to studying aging in tissue, these studies are underpowered to dissect the composite and interactive effects of multiple demographic and exposure variables. However, it is clear in Fig. 1 that there is substantial heterogeneity in the population. Premature expression of signatures reflective of older biology might predict increased risk of breast cancer. Given that a larger number of epidemiologic studies are now collecting histologically normal tissue from both diseased and unaffected individuals, development of novel risk biomarkers from normal tissue may be possible. Indeed, recent studies with peripheral blood T-lymphocytes have shown that molecular markers of aging do show associations with health behaviors, such as physical activity and smoking status (26), suggesting that biomarkers of aging may lead to understanding of etiology.
If gene expression microenvironments in younger tissue apply selective pressures or create optimal conditions for specific subtypes, then (1) these subtypes would be more common in younger women, and (2) the tumors would be expected to differentially express the pathways common to young breast tissue. About the first point, several articles have documented that different subtypes of tumors are more prevalent in younger women (e.g., ER-negative and basal-like cancer; refs. 3, 4), which is echoed in our data showing that tumors with young-like signatures had aggressive clinical features. About the second point, our work illustrates that the young-like tumors have distinct age at incidence patterns and hazard rates over time, similar to the patterns produced by stratifying on aggressive clinical characteristics. Thus, our work provides a strong biologic link between aging processes and the etiology of aggressive breast cancer subtypes.
While links between age-associated gene expression and epidemiologic age at incidence patterns are informative, there are caveats to our analysis. First, we used public microarray datasets to evaluate our age-associated signature classes in comparison with high grade/aggressive tumors in SEER data. While the data convincingly recapitulate the SEER patterns for age-at-incidence and hazard rate over time, it must be noted that these public datasets are not population-based samples, and therefore may have substantial distributional differences from SEER in both age and tumor characteristics. While we restricted the age range of tumors in our analysis to improve the comparability across datasets, the lack of population-based tumor gene expression data for evaluating age-dependent signatures limits our comparability with SEER data. Second, all of the microarray datasets used were moderately sized. Therefore, it was impossible to stratify on relevant demographic variables, such as race, and we were unable to detect weaker changes in gene expression with age. However, these analyses are an important first step toward characterizing some of the strongest changes induced in aging breast. Third and finally, our reduction mammoplasty tissues were not microdissected, and thus, we identified changes that are common to multiple cell types. However, many aging changes may be highly stereotyped across tissue and cell types and highly conserved across organisms, given that our signature correlated with a signature derived from multiple species and multiple tissues, both stroma and epithelium rich (24).
Continued and future research may consider whether other breast cancer risk factors perturb particular breast cancer-related pathways in normal tissue. For example, if particular pathways are altered in normal breast according to body mass index (16) or parity (47), epidemiologic studies could assay these pathways in tumors and stratify cases according to whether they express these pathways. This would help to delineate the distinct causal paths that contribute to heterogeneous breast cancers. Case-only studies can identify etiologic heterogeneity with respect to particular pathways (48), and more recently, concordance of phenotypes between first and second primary cancers has been proposed as a way to establish etiologic distinctiveness (49). Our work shows that evaluating pathways in both normal tissue and in tumors can also advance our understanding of etiologic distinctiveness.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Conception and design: J.R. Pirone, D.J. Jerry, M.A. Troester
Development of methodology: J.R. Pirone, M.A. Troester
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): W.C. Hines, M. Johnson, M.N. Gould, P. Yaswen, D.J. Jerry, S.S. Schneider, M.A. Troester
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): J.R. Pirone, M. D'arcy, D.A. Stewart, D.J. Jerry, M.A. Troester
Writing, review, and/or revision of the manuscript: J.R. Pirone, M. D'arcy, W.C. Hines, M.N. Gould, P. Yaswen, D.J. Jerry, S.S. Schneider, M.A. Troester
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): J.R. Pirone, D.A. Stewart, M.A. Troester
Study supervision: D.J. Jerry, M.A. Troester
This project was supported by the National Cancer Institute and National Institute of Environmental Health Sciences (U01-ES019472, R01-CA138255, R01-ES017400, U01-ES019548, U01-ES019466, and P50CA058223), Avon Foundation, and the University Cancer Research Fund at the University of North Carolina.
The authors thank Maureen Lahti at Baystate Medical Center for contributions to patient data collection.
Note: Supplementary data for this article are available at Cancer Epidemiology, Biomarkers & Prevention Online (http://cebp.aacrjournals.org/).
- Received April 16, 2012.
- Revision received July 13, 2012.
- Accepted July 20, 2012.