
1 Department of Pharmacology and Therapeutics, School of Medicine, University of Liverpool, Liverpool, United Kingdom; 2 Department of Human Genetics, Graduate School of Public Health, 3 University of Pittsburgh Cancer Institute, School of Medicine, and 4 Department of Obstetrics and Gynecology, Magee-Women's Research Institute, University of Pittsburgh, Pittsburgh, Pennsylvania; and 5 Institute for Human and Machine Cognition, University of West Florida, Pensacola, Florida
Requests for reprints: David G. Peters, Department of Pharmacology and Therapeutics, University of Liverpool, The Sherrington Buildings, Ashton Street, Liverpool L69 3GE, United Kingdom. Phone: 44-151-794-5477. E-mail: david.peters{at}liverpool.ac.uk
| Abstract |
|---|
| Introduction |
|---|
Several biomarkers for early detection of ovarian cancer have been evaluated, the best described of which is CA125, the product of the mucin 16 gene. CA125 is detectable in the serum of 80% of women with ovarian tumors (5) and has been used to monitor patients during chemotherapy and to detect relapse. However, the utility of CA125 as an early screening tool is limited because it is also elevated in various benign diseases, including endometriosis, ovarian cysts, uterine fibroids, and chronic liver disease (6), and has been reported to be elevated in only 60% of stage I tumors (7).
To expand our knowledge of the molecular pathology of ovarian carcinoma and identify potential novel markers of diagnosis and prognosis, we have undertaken a large-scale gene expression analysis of primary ovarian tumors and normal surface ovarian epithelium using novel statistical tools. We have also compared our own Serial Analysis of Gene Expression (SAGE) data with publicly available data derived from primary tumors and tumor cell lines.
| Materials and Methods |
|---|
A total of 1,200 colonies were randomly picked, and plasmids with concatemer inserts were cycle sequenced with Big Dye terminator chemistry (Big Dye version 1, Applied Biosystems, Foster City, CA) and analyzed on a 3700 Applied Biosystems DNA sequencer.
SAGE Data Analysis
SAGE data were extracted using the SAGE 2000 software package (version 4.12; http://www.sagenet.org). The number of duplicate dimers for each library was <2% of the total tags for that library. A nonnormalized, side-by-side comparison of all five libraries was done in SAGE 2000, and the resulting tag counts were exported to Microsoft Access for further analysis. A query was run in Microsoft Access to link the UniGene identifier and gene description to each tag. The tag descriptions were downloaded from the National Institute for Biotechnology Information ftp server (ftp://ftpl.nci.nih.gov/pub/SAGE/HUMAN) and imported into Microsoft Access. The data were then exported to Microsoft Excel, where tag counts were normalized to counts per 30,000 tags and sorted based on average differences in expression between HOSE and tumor. Gene matches for significant tags were manually verified using both SAGEGenie (http://cgap.nci.nih.gov/SAGE/AnatomicViewer) and SAGEmap (http://www.ncbi.nlm.nih.gov/SAGE/).
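The normalization and sorting step described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic (counts scaled to a common library size of 30,000 tags, then ranked by the difference between mean tumor and mean HOSE levels); the tag names, library names, and counts below are hypothetical, and the original analysis was done in Microsoft Access and Excel rather than in code.

```python
from statistics import mean


def normalize_counts(raw_counts, library_totals, scale=30_000):
    """Scale raw tag counts in each library to counts per `scale` tags."""
    return {
        tag: {lib: n * scale / library_totals[lib] for lib, n in per_lib.items()}
        for tag, per_lib in raw_counts.items()
    }


def rank_by_mean_difference(normalized, tumor_libs, hose_libs):
    """Sort tags by mean(tumor) - mean(HOSE), largest difference first."""
    diffs = {
        tag: mean(c.get(lib, 0.0) for lib in tumor_libs)
        - mean(c.get(lib, 0.0) for lib in hose_libs)
        for tag, c in normalized.items()
    }
    return sorted(diffs.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical raw counts and library sizes (not data from this study).
raw = {
    "TAG_A": {"HOSE1": 2, "TUMOR1": 30},
    "TAG_B": {"HOSE1": 20, "TUMOR1": 5},
}
totals = {"HOSE1": 60_000, "TUMOR1": 15_000}

normalized = normalize_counts(raw, totals)
ranked = rank_by_mean_difference(normalized, ["TUMOR1"], ["HOSE1"])
```

Scaling every library to the same nominal size keeps a tag from appearing differentially expressed merely because one library was sequenced more deeply than another.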
In addition, the sequence files from four libraries on National Center for Biotechnology Information's public SAGE library database (http://www.ncbi.nlm.nih.gov/SAGE/) were downloaded. Table 2 shows the tissue source descriptions for each of the libraries. These sequence files were analyzed in the same manner as our own libraries.
In this study, for each tag, we compute the above-mentioned score for ovarian carcinoma versus normal HOSE, ovarian carcinoma (inclusive of publicly available tumor data) versus normal HOSE, and ovarian carcinoma (inclusive of publicly available tumor and HOSE cell line data) versus normal bulk HOSE tissue, respectively. To ensure a reasonable reliability, we only consider the tags with a minimum average concentration level of 100 per 1,000,000 tags. The tags with a score of at least 0.5 are reported.
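The two reporting thresholds stated above (a minimum average concentration of 100 per 1,000,000 tags and a score of at least 0.5) amount to a simple filter, sketched below. The score itself is treated as a precomputed input because its definition is not reproduced here; the `TagResult` container, the function name, and the example values are hypothetical.

```python
from typing import NamedTuple


class TagResult(NamedTuple):
    tag: str
    mean_per_million: float  # average concentration, in tags per 1,000,000
    score: float             # differential-expression score, computed elsewhere


def select_tags(results, min_mean_per_million=100.0, min_score=0.5):
    """Keep tags that pass both the abundance and the score threshold."""
    return [
        r for r in results
        if r.mean_per_million >= min_mean_per_million and r.score >= min_score
    ]


# Hypothetical candidates (not data from this study).
candidates = [
    TagResult("TAG_A", 250.0, 0.8),  # passes both thresholds
    TagResult("TAG_B", 40.0, 0.9),   # too rare to be considered reliable
    TagResult("TAG_C", 500.0, 0.3),  # abundant, but score below 0.5
    TagResult("TAG_D", 100.0, 0.5),  # exactly at both thresholds, so kept
]
reported = select_tags(candidates)
```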
Hierarchical Clustering Analysis
Differentially expressed tags (n = 192) identified by the methods described above were analyzed by hierarchical clustering with the GeneSpring package version 4.2 (Silicon Genetics, Redwood City, CA) using the Pearson correlation function. Tags were clustered by expression pattern and 12 major clusters were identified.
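For readers without access to GeneSpring, the general idea of clustering tags by expression pattern with a Pearson correlation measure can be sketched as follows. This is a minimal pure-Python illustration, not the GeneSpring algorithm: the average-linkage rule, the use of 1 minus Pearson r as the distance, and the example profiles are all assumptions made for the sketch.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile is treated as uncorrelated
    return cov / (sx * sy)


def cluster_tags(profiles, n_clusters):
    """Agglomerative clustering (average linkage) on distance 1 - Pearson r."""
    names = list(profiles)
    dist = {
        frozenset((a, b)): 1.0 - pearson(profiles[a], profiles[b])
        for a, b in combinations(names, 2)
    }
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        # Mean pairwise distance between members of the two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Merge the closest pair of clusters; i < j, so deleting j is safe.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical expression profiles across four libraries.
profiles = {
    "TAG_A": [1, 2, 8, 9],
    "TAG_B": [2, 3, 9, 11],
    "TAG_C": [9, 8, 2, 1],
    "TAG_D": [10, 9, 1, 2],
}
clusters = cluster_tags(profiles, n_clusters=2)
```

Because the distance depends on correlation rather than absolute level, tags with the same pattern across libraries group together even when their abundances differ.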
TaqMan Reverse Transcription-PCR
Total RNA was purified with the RNeasy Mini Kit (Qiagen, Valencia, CA), cleared of residual genomic DNA with the DNA-free kit (Ambion, Austin, TX) according to the manufacturer's protocol, and quantified by spectrophotometry (Beckman DU 640). The optimal reverse transcription was carried out in 100 µL volumes as described (11) using two amounts of RNA template (100 and 400 ng). No-reverse transcriptase controls were carried out with 400 ng total RNA. Quantitative PCR was done on this cDNA on the ABI 7700 Sequence Detection Instrument (Applied Biosystems) using TaqMan MGB probes. PCR primers and probes for all genes analyzed were designed using the Primer Express software (Applied Biosystems). PCR amplification of cDNA was done in duplicate in 50 µL volumes as described (11) with the optimal primer and probe concentrations used for each gene (300 nmol/L for primer and 100 nmol/L for probe). Gene expression was measured relative to the endogenous reference gene, human β-glucuronidase (β-GUS), using the comparative CT method described previously (11). Standard t tests and the Wilcoxon two-sample rank sum test were used to generate the P values reported in Table 3A and B, respectively.
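The comparative CT calculation is commonly written as a fold change of 2 raised to the power of minus ΔΔCT, where ΔCT is the target CT minus the reference-gene CT. The sketch below shows that standard form under the assumption of roughly 100% amplification efficiency; the exact procedure of ref. 11 is not reproduced here, and the function names and CT values are hypothetical.

```python
def delta_ct(ct_target, ct_reference):
    """CT of the target gene minus CT of the reference gene (here, beta-GUS)."""
    return ct_target - ct_reference


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Fold change of sample versus calibrator by the 2^(-ddCT) formula.

    Assumes near-100% amplification efficiency for both assays.
    """
    ddct = (delta_ct(ct_target_sample, ct_ref_sample)
            - delta_ct(ct_target_calibrator, ct_ref_calibrator))
    return 2.0 ** (-ddct)


# Hypothetical CT values (not data from this study): a tumor sample
# compared with a normal HOSE calibrator.
fold = relative_expression(
    ct_target_sample=24.0, ct_ref_sample=26.0,          # tumor: dCT = -2
    ct_target_calibrator=29.0, ct_ref_calibrator=26.0,  # normal: dCT = +3
)
```

In this example ΔΔCT is -5, so the target is estimated to be 32-fold more abundant in the tumor sample than in the calibrator.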
| Results |
|---|
1 (COL18A1), carbamoyl-phosphate synthetase 2, aspartate transcarbamylase, and dihydroorotase (CAD), cyclin D1 (CCND1), FLJ12988, and FLJ22795. Note that FLJ12988 and FLJ22795 were found to match the same SAGE tag (TGCTCTGAAT). We initially compared the expression of these genes in eight tumor samples and five normal HOSE specimens. These data are presented in Table 3A. Of the eight genes assayed by qRT-PCR, folate receptor 1 adult (FOLR1A; P = 0.01252), WFDC2 (P = 0.04735), FLJ22795 (P = 0.02723), and CLDN3 (P = 0.00486) were significantly overexpressed in the ovarian carcinoma samples. COL18A1, FLJ12988, and CAD also gave promising results but did not reach statistical significance (Table 3A). We then expanded our analyses to include a further 22 tumor samples and 3 normal HOSE specimens and again determined levels of gene expression by qRT-PCR. As shown in Table 3B, FOLR1A (P = 0.000385), WFDC2 (P = 0.006408), CLDN3 (P = 0.004037), COL18A1 (P = 0.003165), and FLJ12988 (P = 0.003978) were markedly and consistently overexpressed in all the tumor samples relative to normal controls, confirming their potential utility as markers of ovarian carcinoma. Overexpression of all of these genes was detectable in all tumor stages analyzed, including stage 1A, suggesting that it may be useful for the detection of early-stage ovarian tumors. Furthermore, expression levels of FOLR1A, CLDN3, and WFDC2 by qRT-PCR in a metastatic bladder tumor (TP02-635) were equivalent to those found in normal HOSE, suggesting that these markers may be tumor type specific. However, high expression of all genes tested by qRT-PCR was observed in a metastatic gall bladder tumor (TP02-163). There was a trend toward greater expression in higher stages for CLDN3 and FLJ12988 and in more aggressive grades for CLDN3, FOLR1A, and FLJ12988.
Comparison to Publicly Available SAGE Data
To take advantage of the fact that the SAGE technique generates immortal data that can be readily compared with other SAGE data sets generated in different laboratories (12), we directly compared our own results first with publicly available SAGE libraries generated from bulk ovarian tumors (OVC14, OVT6, OVT7, and OVT8) and second with these plus two normal cell lines (IOSE29_11 and HOSE4) derived from HOSE. The results of these comparisons are shown in Supplementary Tables S3 and S4, respectively. We found that the genes identified by this approach were generally similar to those identified in our own data (Table 5A and B). Specifically, 16 (64%) of the genes listed in Table 5A were also found to be differentially expressed when the public tumor data were included in the analysis. These genes are marked with an asterisk in Table 5A. However, only 5 (20%) were retained in the top 25 high-scoring genes. These are rhophilin, Rho GTPase-binding protein 1 (RHPN1; CTGGAGGCTG), CD24 antigen (small cell lung carcinoma cluster 4 antigen; CD24; GGAACAAACA), CLDN3 (CTCGCGCTGG), high mobility group AT-hook 1 (HMGA1; ATTTGTCCCA), and CD9 antigen (CD9; AAGATTGGTG). Similarly, when we also included publicly available SAGE data from two normal HOSE-derived cell lines, 13 (52%) of the top 25 genes identified in our own data remained overexpressed. These genes are marked with a hash (#) in Table 5A. However, only 4 (16%) of these were ranked in the top 25 high-scoring genes. These are FOLR1A (GTCGGGCCTC), RHPN1 (CTGGAGGCTG), CLDN3 (CTCGCGCTGG), and FLJ20297 hypothetical protein (FLJ20297; TCCTTGCTTC). The reasonably good correlation between our own and publicly available data is corroborative evidence that the genes identified as overexpressed in ovarian carcinoma are generally robust.
Clustering Analysis of Differentially Expressed Genes
One clearly important requirement of a tumor biomarker is that its expression be easily detectable and highly specific for disease state. Therefore, the focus of the approaches described above was to identify genes whose overexpression correlates strongly with ovarian cancer. However, we also sought to gain insight into the biological features of the samples assayed by performing clustering analysis of differentially expressed genes. Our aim was to identify coexpressed genes that might reveal information about the biological basis of ovarian tumors and also reveal potential tumor markers that were missed by the analyses described thus far. Therefore, we subjected the differentially expressed tags, identified when all of our own and the publicly available data were analyzed, to hierarchical clustering analysis. We identified 12 distinct clusters of coexpressed genes, which are shown in Supplementary Table S1.
There are some notable features of our data that are revealed by clustering analysis. For example, it is clear that tumors OVCA 1232 and OVT7 express high levels of genes associated with an immune response, suggesting infiltration of leukocytes in those tissue samples. These genes include immunoglobulin heavy constant γ3 (IGHG3), immunoglobulin heavy constant µ (IGHM; cluster 2), MHC class I A, B, and C (HLA-A, HLA-B, and HLA-C, respectively), immunoglobulin κ constant (IGKC), immunoglobulin λ joining 3 (IGLJ3), MHC class II DP α1 (HLA-DPA1), and MHC class II DP β1 (HLA-DPB1; cluster 5). Significantly, one of the putative tumor markers identified by our SAGE analysis (WFDC2) is coexpressed with these genes, suggesting the possibility that WFDC2 is a marker of leukocyte infiltration. This observation reduces the potential of WFDC2 as a useful tumor marker in peripheral blood.
We also found coexpression of genes encoding ribosomal proteins S3, S9, S13, S23, L5, L10, L17, L32, and X4 (RPS3, RPS9, RPS13, RPS23, RPL5, RPL10, RPL17, RPL32, and RPSX4, respectively) in cluster 9, reflecting moderately elevated expression of these genes in normal HOSE samples (HOSE2, HOSE4, and IOSE29_11) relative to tumor samples. Also of interest is the coexpression in cluster 8 of several structural genes of the extracellular matrix in cancer cells. These include collagen type I α1 (COL1A1), collagen type I α2 (COL1A2), collagen type III α1 (COL3A1), lumican (LUM), and biglycan (BGN). Cluster 8 also revealed coexpression of the calcium signal transducers tumor-associated calcium signal transducer 1 and 2 (TACSTD1 and TACSTD2), which are widely expressed in human cancers (13).
Primary and Cultured HOSE Are Distinguishable by Comparison of SAGE Data
It is notable that the tumor suppressor gene junB proto-oncogene (JUNB) is highly expressed in the primary HOSE samples (HOSE1 and HOSE2) relative to all the tumor samples yet undetectable in the HOSE cell lines (HOSE4 and IOSE29_11). Coexpressed with JUNB is the negative regulator of cell cycle progression, cyclin-dependent kinase inhibitor 1A (CDKN1A). Similarly, the cell cycle regulator CCND1 is overexpressed (cluster 3) in all the tumor samples analyzed by SAGE and most of those assessed by qRT-PCR (Table 3A and B) relative to normal HOSE, yet its expression levels were also found to be very high in the "normal" HOSE cell line IOSE29_11. Notably, CCND1 is coexpressed in cluster 3 with TACC3, which is involved in driving cell cycle progression via a mechanism that involves interaction with the histone acetyltransferases (14, 15). Taken together, these observations suggest that the process of cell culture is associated with alterations in cell cycle regulation in the normal HOSE cell lines.
These analyses also identify several potential novel ovarian tumor markers in our data. For example, coexpressed with CCND1 are CD9, lysophospholipase II (LYPLA2), and G protein α inhibiting activity polypeptide 2 (GNAI2). CD9 is involved in cell proliferation (16). Its overexpression has not been previously associated with ovarian carcinoma, although it has been described as a possible marker for gastric cancer (17). Notably, CD9 underexpression has been associated with ovarian tumor progression (18). To our knowledge, neither LYPLA2 nor GNAI2 overexpression has been previously associated with ovarian cancer. Therefore, these genes, along with TACC3, may be novel ovarian tumor markers.
We also found strong coexpression in cluster 4 of genes associated with response to cellular stress. These are glutathione peroxidase 1 (GPX1), chaperonin containing TCP1, subunit 3 (CCT3), and 27-kDa heat shock protein 1 (HSPB1). Coexpressed with these genes is the gene encoding S-adenosylhomocysteine hydrolase (AHCY). These genes are overexpressed in ovarian tumors relative to primary normal HOSE (HOSE1 and HOSE2) but not relative to cultured HOSE (HOSE4 and IOSE29_11), which displayed levels of expression of these genes comparable with those of the primary tumors. In cluster 4, we also found the HMGA1 gene, the overexpression of which has been previously associated with ovarian carcinoma (19). The biological significance of these observations is unclear.
| Discussion |
|---|
We identified several potential biomarkers of ovarian cancer, five of which (FOLR1A, WFDC2, CLDN3, COL18A1, and FLJ12988) were further analyzed, with their expression changes confirmed by qRT-PCR in a larger sample set. High levels of expression of three of these markers (FOLR1A, WFDC2, and CLDN3) have previously been associated with ovarian tumors. In particular, the role of FOLR1A has been extensively studied in the context of ovarian cancer. FOLR1A expression has been reported at moderate levels in the normal epithelia of kidney, lung, and breast and at high levels in placental tissue (20). However, its expression is absent in normal ovarian epithelium (21) and elevated in the majority of nonmucinous ovarian carcinomas (22). CLDN3 and WFDC2 have also been associated with elevated expression in ovarian cancer. For example, Hough et al. (23) reported overexpression of both CLDN3 and WFDC2 by SAGE analysis in ovarian tumors. Similarly, microarray approaches were used to identify WFDC2 overexpression in ovarian tumors (24, 25).
A COOH-terminal fragment of the COL18A1 gene product corresponds to the antiangiogenic factor endostatin and overexpression of endostatin has been correlated with ovarian cancer (26). Because of the central involvement of endostatin in angiogenesis and its role in tumor growth (27), COL18A1 overexpression is a promising biomarker for ovarian cancer. However, in a previous study, no correlation was observed between serum levels of endostatin and incidence of ovarian cancer (28).
Previous searches for ovarian tumor markers by SAGE have only considered normal samples that have been cultured ex vivo (HOSE4) or are SV40 transformed (IOSE29_11; refs. 23, 29). As noted above, our analysis of primary ovarian epithelium samples (HOSE1 and HOSE2) revealed altered expression of several genes not reported by previous SAGE studies. These are most evident in clusters 3 and 4 and include TACC3, CD9, CCND1, LYPLA2, GNAI2, GPX1, AHCY, CCT3, HSPB1, and HMGA1. Corroborative evidence that some of these genes are indeed potentially useful biomarkers for ovarian cancer comes from the fact that overexpression of a subset of them, HMGA1 (19), CCND1 (30), GPX1 (23), and HSPB1 (31), has been associated with ovarian cancer.
Interestingly, although TACC3 has not, to our knowledge, been associated with ovarian carcinoma, it is highly expressed during oogenesis (32). CD9 is associated with reduced tumor progression but is not a biomarker for OVCA (18). LYPLA2 was also overexpressed in ovarian carcinoma. High levels of lysophosphatidic acid, a product of lysophospholipase catalytic activity, have been reported as a potential biomarker of ovarian cancer (33). However, lysophospholipase activity levels in serum do not seem to be associated with ovarian carcinoma (34). To our knowledge, GNAI2, GPX1, and CCT3 are not known to be overexpressed in ovarian cancer and may be entirely novel markers for this disease.
The fact that our analysis of primary HOSE tissue leads to the identification of potentially novel tumor markers underscores the importance of avoiding cultured cells as normal controls for biomarker discovery. Our data suggest the activation of gene expression cascades in cultured HOSE that are involved in cell proliferation. Clearly, this is an undesirable control phenotype when performing biomarker screens in cancer. Therefore, comparison of gene expression patterns in cultured cells with those obtained from bulk tissue must be treated with caution. It should also be noted, however, that the collection of primary HOSE tissue might result in the sampling of contaminating stromal cells.
Clearly, our study has several limitations. One drawback is the use of bulk tumor samples for our analysis. As we have shown, these samples may contain multiple cell types whose distinct transcriptomic signature can create problems at the data analysis stage. One way to overcome this would be the use of technologies for analyzing gene expression in very small samples of laser-captured tissue of interest (35).
One disadvantage of using SAGE for gene expression analysis is that sample throughput is low because the procedure is highly labor intensive. Furthermore, despite our efforts to comprehensively identify differentially expressed genes using novel statistical tools, it may be that we have missed important markers of disease. Similarly, several genes were identified by our analyses that we have not pursued by qRT-PCR in a wider sample set, and there is much work to be done in confirming the utility of the novel markers that we have identified here. This will require extensive follow-up in a gene- and/or protein-directed fashion involving further analysis of gene expression alterations in a wide variety of tumor samples, particularly those that are classified pathologically as stage 1.
The ultimate goal is to identify robust targets for the development of serum-based diagnostic tools. Clearly, this will require significant progress in translational research to develop mRNA tumor markers into reliable serum-based assays. One important consideration when selecting gene products for further analysis at the protein level is predicting the magnitude of altered expression at the mRNA level required to produce a detectable protein change. The combination of mRNA data sets with results from emerging proteomic efforts will likely accelerate biomarker identification and development in this context. Despite these challenges, genome-wide data sets, such as ours, that can be readily shared between investigators will provide a vital foundation for development in this field. The use of an open platform tool, such as SAGE, is an advantage in this context in that it does not rely on any prior knowledge of genes of interest.
In conclusion, we have undertaken a genome-wide screen by SAGE for putative mRNA markers of ovarian cancer in bulk tissue obtained from three adenocarcinomas and two pools of normal HOSE. We further analyzed our data in comparison with publicly available ovarian cancer and HOSE SAGE libraries. The overexpression of a subset of genes was confirmed in a wider sample set of tumors and normal tissue. These data provide an immortal gene expression catalogue for public utility in the identification of potential markers for diagnosis and characterization of ovarian cancer.
| Footnotes |
|---|
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Note: Supplementary data for this article are available at Cancer Epidemiology Biomarkers & Prevention Online (http://cebp.aacrjournals.org/).
Received 9/22/04; revised 4/7/05; accepted 4/14/05.
| References |
|---|
α-folate receptor overexpression. Oncogene 2000;19:4754-63.