Tumor and salivary matrix metalloproteinase levels are strong diagnostic markers of oral squamous cell carcinoma.

Background: The matrix metalloproteinases (MMP) cause degradation of the extracellular matrix and basement membranes, and thus may play a key role in cancer development. Methods: In our search for biomarkers for oral squamous cell carcinomas (OSCC), we compared primary OSCC, oral dysplasia and control subjects with respect to: (i) expression of MMP1, MMP3, MMP10, and MMP12 in oral epithelial tissue using Affymetrix U133 2.0 Plus GeneChip arrays, followed by quantitative reverse transcription-PCR (qRT-PCR) for MMP1, and (ii) determination of MMP1 and MMP3 concentrations in saliva. Results: MMP1 expression in primary OSCC (n = 119) was >200-fold higher (P = 7.16 × 10−40) compared with expression levels in nonneoplastic oral epithelium from controls (n = 35). qRT-PCR results on 30 cases and 22 controls confirmed this substantial differential expression. The exceptional discriminatory power to separate OSCC from controls was validated in two independent testing sets (AUC% = 100; 95% CI: 100–100 and AUC% = 98.4; 95% CI: 95.6–100). Salivary concentrations of MMP1 and MMP3 in OSCC patients (33 stage I/II, 26 stage III/IV) were 6.2 times (95% CI: 3.32–11.73) and 14.8 times (95% CI: 6.75–32.56) higher, respectively, than in controls, and displayed an increasing trend with higher stage disease. Conclusion: Tumor and salivary MMPs are robust diagnostic biomarkers of OSCC. Impact: The capacity of MMP gene expression to identify OSCC provides support for further investigation into MMPs as potential markers for OSCC development. Detection of MMP proteins in saliva in particular may provide a promising means to detect and monitor OSCC noninvasively. Cancer Epidemiol Biomarkers Prev; 20(12); 2628–36. ©2011 AACR.


Introduction
Oral squamous cell carcinoma (OSCC) of the oral cavity and oropharynx is one of the most common cancers in the world, with an estimated 400,000 new cases and 200,000 deaths in 2008 worldwide (1). In 2010, an estimated 36,540 new cases and 7,880 deaths from OSCC occurred in the United States (2). Despite advances in chemotherapy, radiation treatments, screening tools such as the VELscope (LED Dental, Inc.; ref. 3), and improvements in surgical techniques, survival remains extremely poor. The 5-year survival is estimated at 63% for Whites and 43% for Blacks in the United States (2). Prognosis is heavily based on the AJCC staging system, which has limited utility for predicting survival because patients with tumors of the same AJCC stage often have heterogeneous responses to treatment (4).
To improve the mortality and morbidity burden associated with OSCC, there is an urgent need to identify sensitive and specific biomarkers for early detection and prognosis. For utmost clinical utility, these biomarkers should be measurable in specimens that are collected with minimal discomfort to the patient (5).
This study sought to focus on the MMPs as potential diagnostic and prognostic biomarkers. Our primary goal was to determine whether salivary concentrations of the most highly differentially expressed MMPs might be useful as a diagnostic aid. Although others have reported elevated MMP transcript levels in saliva (20), protein biomarkers in saliva should provide more accurate representations of molecular function. There are numerous posttranscriptional processes, including transcript de/stabilization, translation, protein modification, and degradation. As such, mRNA levels may not correlate with protein activity (22,23). In addition, saliva is an ideal diagnostic tool for biomarker assessment due to its accessibility and ease of collection (5,24). An additional goal was to evaluate the use of the MMPs as a prognostic aid by investigating whether MMP expression was associated with an increased risk of death.

Study population
As described by Chen and colleagues (25), cases were previously untreated primary OSCC patients scheduled for biopsy or surgery at the University of Washington Medical Center, Harborview Medical Center, or Veterans Affairs (VA) Puget Sound Health Care System. Patients diagnosed with oral dysplastic lesions were also enrolled through these medical centers during the same time period. Controls were patients who received oral surgery, such as uvulopalatopharyngoplasty and tonsillectomy, for a nonmalignant or nonpremalignant condition, at the same institutions and during the same time period. Subjects were enrolled between December 2003 and April 2007.
Study participants were interviewed using a structured questionnaire eliciting demographic characteristics, medical and lifestyle history, including tobacco and alcohol use. Data on tumor stage and other characteristics were abstracted from medical records. Participants gave informed consent and study procedures were approved by the Institutional Review Boards of the Fred Hutchinson Cancer Research Center, University of Washington, and VA Puget Sound Health Care System.

Tissue and saliva collection
Tumor tissue from cases was obtained at time of resection prior to other treatment. Tissue from dysplasia patients and controls was obtained at time of biopsy or surgery. Oral epithelial tissue from controls was collected from the uvula or anterior tonsillar pillar, avoiding contamination with surrounding lymphoid tissue. The removed tissue was immersed in RNALater (Applied Biosystems, Inc.) for at least 12 hours at 4°C before storage at −80°C until use.
Saliva samples were collected preoperatively during the clinic visit. Patients were asked not to eat or drink for at least 1 hour before collection and to spit into a centrifuge tube. The saliva was stored at 4°C for up to 2 hours, then centrifuged for 10 minutes at 1,300 × g and aliquoted into cryovials for storage at −80°C.

Patient follow-up
Patient follow-up was done from July, 2004 to January, 2011. Subjects were followed through periodic telephone contacts at as close to 6-month intervals as possible following surgery, review of medical records and linkage to data sources. Vital status was checked against the Social Security Death Index (SSDI) and Fred Hutchinson Cancer Research Center's Cancer Surveillance System (CSS), which is part of the Surveillance, Epidemiology, and End Results (SEER) program of the National Cancer Institute, and is updated with the Washington State Death Certificate database and National Death Index. A death was classified as due to OSCC or not due to OSCC based on review of medical records and death certificates by head and neck surgeons involved in the study. If there was no indication of death, we censored that subject at the last known date of follow-up. Only 2 subjects were lost to follow-up.

Laboratory studies
As described previously (25), total RNA was extracted from tumor and nonneoplastic oral epithelium using a TRIzol method (Invitrogen), purified with an RNeasy mini kit (Qiagen), processed using a GeneChip Expression 3′-Amplification Reagents Kit (Affymetrix), and examined with Affymetrix GeneChip Human Genome U133 Plus 2.0 Arrays.
Tumor and nonneoplastic oral epithelium samples were screened for the presence of HPV DNA using a nested PCR protocol (26). All samples that showed a positive result were tested for HPV types using the LINEAR ARRAY HPV Genotyping Test (Roche), under a research-use only agreement. "High-risk" HPV was defined by 13 types, including HPV-16 and HPV-18; "low-risk" HPV corresponded to 24 types. To verify HPV type calls, a subset of the samples was amplified and sequenced using HPV-16-specific primers, and compared against a known HPV-16 sequence (GenBank 333031).
We verified MMP1 gene expression results by quantitative reverse transcription-PCR (qRT-PCR) on a subset of 30 OSCC cases and 22 controls using a QuantiTect SYBR Green RT-PCR kit (Qiagen) and bioinformatically validated QuantiTect primers (Qiagen) on an ABI 7900HT Sequence Detection System (Applied Biosystems, Inc.). Standard curves were generated using Universal Human Reference RNA (Stratagene) for all genes, with the linear correlation coefficient (r²) ≥ 0.99 for all runs. The mean threshold cycle (Ct) values were calculated from triplicate Ct values. Samples that had Ct values with a SD > 0.3 in the triplicate run were repeated. Mean Ct values were standardized to the mean Ct value of the reference gene β-actin.
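The triplicate averaging, SD check, and reference-gene standardization described above can be sketched as follows. This is a minimal illustration, not the study's analysis code: it assumes that "standardized" means subtracting the mean β-actin Ct from the mean target Ct, and that the SD is the sample SD. The function names and Ct values are hypothetical.

```python
from statistics import mean, stdev

SD_LIMIT = 0.3  # triplicates with a larger SD were repeated in the study


def mean_ct(triplicate):
    """Return (mean Ct, needs_repeat) for one triplicate run.

    Assumption: the SD is the sample SD (n - 1 denominator).
    """
    return mean(triplicate), stdev(triplicate) > SD_LIMIT


def delta_ct(target_triplicate, reference_triplicate):
    """Mean target Ct minus mean reference-gene Ct (assumed standardization)."""
    target_mean, target_repeat = mean_ct(target_triplicate)
    ref_mean, ref_repeat = mean_ct(reference_triplicate)
    if target_repeat or ref_repeat:
        raise ValueError("triplicate SD > 0.3; the run should be repeated")
    return target_mean - ref_mean


# Hypothetical Ct values, for illustration only
mmp1_ct = [22.1, 22.3, 22.2]
actb_ct = [17.0, 17.1, 16.9]
result = delta_ct(mmp1_ct, actb_ct)
```

A lower standardized Ct corresponds to higher transcript abundance, so a gene that is higher in cases has a lower standardized Ct in cases than in controls.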
The cycling conditions consisted of 30 minutes at 50°C, 15 minutes at 95°C, and 40 cycles each of 15 seconds at 94°C, 30 seconds at 55°C, and 30 seconds at 72°C.
Saliva samples from 100 subjects (60 primary OSCC, 15 dysplasia, and 25 controls) were tested for MMP1 and MMP3 concentration. These MMPs were selected because they exhibited the largest fold-difference out of MMPs analyzed in gene expression analyses. Each sample in duplicate was analyzed blinded to patient status by Aushon Biosciences Inc. using a SearchLight multiplex sandwich-ELISA proteome array (Aushon BioSystems), with images analyzed using the SearchLight Array Analyst software. Protein concentrations from 2 subjects could not be measured because of excessive viscosity.

MMP expression analyses for primary OSCC compared with control epithelium
Two rounds of quality control checks were done on the microarray gene expression data. First, recommendations by Affymetrix were followed to determine if any GeneChips needed to be excluded (27). Second, the "affyQCReport" and "affyPLM" packages in Bioconductor for the R statistical programming language were used to identify any poor-quality chips (28). In total, 167 cases and 45 controls passed both quality control procedures. The data were divided into a training set (119 cases and 35 controls) and a testing set (48 cases and 10 controls). We used the training set in analyses that identified the top differentially expressed genes, validated results in an independent testing set to evaluate discriminatory capacity of identified genes, and then merged the training and testing sets to provide greater numbers of subjects for secondary analyses on stage, survival, and stratified by site. We extracted gene expression values for ~54,000 probe sets in the training set from CEL files and normalized the data using the RMA algorithm in Partek Genomics Suite (29).
Although overexpression of the MMPs in OSCC had previously been reported by others (6–9, 15–19), our prior gene expression analyses did not identify the MMPs among the list of top differentially expressed genes because we had excluded probe sets based on magnitude (probe sets were excluded if the expression value for that probe set in any of the samples was <3 on the log2 scale; ref. 25). This study did not eliminate such probe sets. Using the training set, we eliminated probe sets if expression showed little variation (defined as interquartile range of expression levels less than 0.3 on the log2 scale), resulting in ~35,000 probe sets for further analyses.
To compare expression levels of tumor tissue to tissue from control subjects in the training set, we conducted ANCOVA (adjusting for age and sex) and a Student t test using Partek Genomics Suite. We set the false discovery rate at 1%. We then created a list for further analyses that contained only those probe sets with (i) t score greater than 6, and (ii) a 1.5-fold or greater difference in gene expression between cases and controls (on the log2 scale). We identified the top MMPs by sorting this list according to case-control fold-difference.
We compared expression levels of these top differentially expressed MMPs in primary OSCC (n = 119) with expression levels in oral epithelium from control subjects (n = 35) by fitting linear regression models, adjusted for age and sex, and estimating 95% CI for all association estimates. We repeated these analyses with additional adjustment for pack years of smoking (continuous) and alcoholic drinks per day in the year prior to the date of diagnosis for OSCC cases or recruitment for controls (categorical). We calculated Pearson correlation coefficients for the top 4 differentially expressed MMPs.
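The low-variation filter described above can be illustrated with a short sketch. It assumes expression values are already on the log2 scale and uses one particular quartile interpolation rule (the "inclusive" method of Python's statistics module); the rule actually used in Partek is not stated in the text. Probe set names and values are hypothetical.

```python
from statistics import quantiles

IQR_CUTOFF = 0.3  # log2 units, the threshold stated in the text


def iqr(values):
    """Interquartile range; the interpolation rule is an assumption."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def filter_probe_sets(expr):
    """Keep probe sets whose IQR across samples is at least IQR_CUTOFF."""
    return {probe: vals for probe, vals in expr.items() if iqr(vals) >= IQR_CUTOFF}


# Hypothetical log2 expression values across five samples
expr = {
    "probe_flat": [5.0, 5.1, 5.0, 5.1, 5.05],
    "probe_variable": [4.0, 6.5, 9.0, 5.0, 8.0],
}
kept = filter_probe_sets(expr)
```

In this toy input only the variable probe set survives, because the flat one has an IQR of about 0.1 log2 units.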
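The two selection criteria can be sketched as below. This is a simplified illustration only: it computes an unadjusted pooled-variance Student t statistic and omits the ANCOVA adjustment and the false discovery rate control used in the study. The default fold threshold, a difference in mean log2 expression of at least log2(1.5), is one possible reading of the stated criterion and is an assumption. All data are hypothetical.

```python
from math import log2, sqrt
from statistics import mean, variance


def student_t(cases, controls):
    """Two-sample Student t statistic with pooled variance (unadjusted)."""
    n1, n2 = len(cases), len(controls)
    pooled = ((n1 - 1) * variance(cases) + (n2 - 1) * variance(controls)) / (n1 + n2 - 2)
    return (mean(cases) - mean(controls)) / sqrt(pooled * (1 / n1 + 1 / n2))


def passes_filter(cases, controls, min_t=6.0, min_log2_diff=log2(1.5)):
    """True if the probe set meets both thresholds.

    min_log2_diff is an assumed interpretation of the fold criterion.
    """
    diff = mean(cases) - mean(controls)
    return student_t(cases, controls) > min_t and diff >= min_log2_diff


# Hypothetical log2 expression values
strong_cases = [10.0, 10.5, 11.0, 10.2, 10.8]
strong_controls = [3.0, 3.4, 2.8, 3.1, 3.2]
flat_cases = [5.0, 5.2, 5.1, 4.9, 5.0]
flat_controls = [5.0, 5.1, 4.9, 5.0, 5.1]

strong_selected = passes_filter(strong_cases, strong_controls)
flat_selected = passes_filter(flat_cases, flat_controls)
```

Sorting the surviving probe sets by the case-control difference would then give a ranked list of the kind described above.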
In secondary analyses using data from both the training and testing sets, we conducted linear regression analyses of MMP1 case-control differences, adjusting for age and sex, after separating by (i) site: oral cavity (115 cases, 45 controls) versus oropharynx (52 cases, 45 controls), and (ii) HPV status: high-risk HPV positive (56 cases, 4 controls) versus HPV negative or low-risk HPV (111 cases, 40 controls).

Validation of MMP1 results using independent testing sets
We extracted gene expression values from files of our independent testing set (48 OSCC cases and 10 controls) and a publicly available data set from Gene Expression Omnibus (GEO; GSE6791; 42 OSCC cases and 14 controls; ref. 30), and normalized the 2 data sets using the RMA algorithm. For each data set, we then calculated the area under the receiver operating characteristic (ROC) curve (AUC) to evaluate the discriminatory performance of the statistical prediction models.
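As an illustration of what the AUC measures, it can be computed directly as the proportion of case-control pairs in which the case has the higher score, with ties counted as one half. The sketch below uses this pairwise definition; the scores are hypothetical and could stand for an expression value or a model-predicted probability. It is not the software used in the study.

```python
def auc(case_scores, control_scores):
    """AUC as the fraction of case-control pairs ranked correctly.

    Ties contribute 0.5, which matches the trapezoidal area under the ROC curve.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical scores, for illustration only
overlapping = auc([9.8, 11.2, 10.5], [3.1, 2.7, 10.0])
separated = auc([9.8, 11.2, 10.5], [3.1, 2.7, 4.0])
```

An AUC of 100% therefore means that every case scored higher than every control, as in the second toy example.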

Salivary MMP1 and MMP3 concentration
We log-transformed the MMP1 and MMP3 concentrations to correct for skewed data. We calculated the ratio of geometric means of their concentrations in cases (n = 59) and control subjects (n = 25) by fitting regression models adjusted for age and sex. We then repeated these analyses for MMP1 after stratifying by site: oral cavity (45 cases, 25 controls) versus oropharynx (15 cases, 25 controls), and HPV status: high-risk HPV positive (15 cases, 1 control) versus high-risk HPV negative (43 cases, 24 controls).
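The link between log-transformation and the ratio of geometric means can be shown with a small unadjusted sketch: the difference in mean log concentration between groups, exponentiated, equals the ratio of geometric means. In a regression of log concentration on case status, the exponentiated case coefficient has the same interpretation, conditional on the covariates. The example omits the age and sex adjustment, and the concentrations are hypothetical.

```python
from math import exp, log
from statistics import mean


def geometric_mean_ratio(cases, controls):
    """Ratio of geometric means, computed from mean log concentrations."""
    mean_log_cases = mean([log(x) for x in cases])
    mean_log_controls = mean([log(x) for x in controls])
    return exp(mean_log_cases - mean_log_controls)


# Hypothetical concentrations in arbitrary units
ratio = geometric_mean_ratio([100.0, 400.0], [10.0, 40.0])
```

Here the geometric means are 200 and 20, so the ratio is 10.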
In secondary analyses, we calculated the geometric mean for control, dysplasia, stage I/II OSCC, and stage III/IV OSCC. To evaluate sensitivity and specificity, we conducted ROC analyses and calculated AUC for salivary MMP1 and MMP3. To determine whether tissue mRNA levels of MMP1 and MMP3 correlated with corresponding salivary concentrations, we calculated the Pearson correlation coefficient.
In all analyses for which software is not specified, we used STATA statistical software (version 10.0, Stata Corp.).

Associations of MMP1 and MMP3 gene expression with stage and survival
We merged the training and testing datasets to compare expression levels of MMP1 and MMP3 (the top 2 differentially expressed genes based on the initial microarray analyses) across different disease categories. We were unable to categorize 2 subjects because of missing data. After categorization, we calculated mean log2 MMP1 and MMP3 expression levels for 45 controls, 17 dysplasia, 54 stage I/II primary OSCC, and 111 stage III/IV primary OSCC.
We conducted survival analyses for primary OSCC cases using data from our training and testing sets. The primary outcome variable was survival time, measured as days after diagnosis until a death occurred, or until subjects were lost to follow-up, or January 25, 2011. We started the accumulation of follow-up time and deaths at 4 months after diagnosis date to exclude any deaths due to treatment-related complications. There were 159 subjects with at least 4 months of follow-up time. To analyze risk of death based on gene expression values of MMP1 and MMP3, we calculated OSCC-specific and all-cause HRs by Cox proportional hazards regression, adjusting for age, sex, stage, and HPV status. We then conducted mutual adjustment for MMP1 and MMP3.
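The follow-up rule above can be sketched as a small function that returns the time at risk after the 4-month starting point and whether the subject died. This is an assumption-laden illustration, not the study's code: "4 months" is taken as 122 days, time is counted from the landmark rather than from diagnosis, and subjects whose follow-up ends at or before the landmark are excluded. The function name and dates are hypothetical.

```python
from datetime import date, timedelta

STUDY_END = date(2011, 1, 25)  # end of follow-up stated in the text
LANDMARK_DAYS = 122            # assumption: "4 months" taken as 122 days


def follow_up(diagnosis, death=None, last_contact=None):
    """Return (days at risk after the landmark, died), or None if excluded."""
    candidates = [d for d in (death, last_contact, STUDY_END) if d is not None]
    end = min(candidates)  # whichever comes first ends follow-up
    start = diagnosis + timedelta(days=LANDMARK_DAYS)
    if end <= start:
        return None  # follow-up ended before time at risk began
    died = death is not None and death == end
    return (end - start).days, died


# Hypothetical subjects
late_death = follow_up(date(2005, 1, 1), death=date(2006, 1, 1))
early_death = follow_up(date(2005, 1, 1), death=date(2005, 3, 1))
censored = follow_up(date(2010, 1, 1))
```

The resulting pairs of time and event indicator are the form of input a Cox proportional hazards routine typically expects.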

Results
Cases tended to be older than control patients, and were more likely to be male, White and current smokers (Table 1). In the training set, 74% of cases had oral cavity tumors and 26% had oropharyngeal tumors. In our internal independent testing set, oral cavity tumors accounted for 60% and oropharyngeal tumors for 40% of the cases.

Gene expression levels of MMPs
ANCOVA analyses on the training set yielded 16,228 significant probe sets out of a total of 33,057. After preprocessing and filtering, we identified 173 probe sets that were substantially differentially expressed between cases and controls (Supplementary Table S1). After sorting by fold-difference (cases vs. controls), the top overexpressed genes in this list were MMP1, MMP3, MMP10, and MMP12. MMP1 had an exceptionally high fold-difference (>200-fold; 3.16-fold on log2 scale) and very low adjusted ANCOVA P value (P = 7.16 × 10−40). We observed significant differential expression for multiple additional MMPs, including MMP2 (3.5-fold-difference; P = 2.01 × 10−20) and MMP9 (8.0-fold-difference; P = 2.29 × 10−13), but elected to conduct further analyses only on the 4 MMPs that were at the top of our list.
Using the combined training and testing sets, after stratifying by site and adjusting for age and sex, the difference in means for MMP1 (compared with control oral epithelium) was somewhat higher for oral cavity cancer (difference in means, 8.32; 95% CI: 7.75-8.90), than for oropharyngeal cancer (6.04; 95% CI: 5.09-6.99). Similarly, the difference in means was higher for OSCC among high-risk HPV-negative subjects (difference in means, 8.30; 95% CI: 7.70-8.90), than among high-risk HPV-positive subjects (6.00; 95% CI: 3.99-8.02). For oropharyngeal cancers only, the difference in means was 6.38 (95% CI: 4.54-8.22) among high-risk HPV-negative subjects and 6.10 (95% CI: 3.90-8.31) among high-risk HPV positive subjects. Similar patterns were observed for the other MMPs.
Our qRT-PCR results confirmed the substantial differential expression of MMP1, with a difference in mean Ct value for controls versus cases of 10.43 (95% CI: 9.31-11.55, P < 0.001). The Pearson correlation coefficient for the microarray gene expression values and the qRT-PCR results was −0.99 (P < 0.001).
In stratified analyses of MMP1 salivary concentrations, results were stronger for patients with oral cavity
In analyses across disease categories (controls, dysplasia, stage I/II OSCC, and stage III/IV OSCC), the geometric means of salivary MMP1 and MMP3 concentrations displayed an increasing trend with higher stage of disease (Fig. 2). However, differences between categories did not reach statistical significance.

Associations of MMP1 and MMP3 transcript levels with survival
The median follow-up time for primary OSCC cases (n = 159) was 51 months. For each unit increase in log2 for death from all causes. We obtained similar results in analyses with adjustment for age and sex only. HRs for OSCC-specific death were lowered for MMP1 expression levels that were additionally adjusted for MMP3 expression (HR, 0.96; 95% CI: 0.62-1.48), and were increased for MMP3 expression values that were additionally adjusted for MMP1 expression (HR, 1.30; 95% CI: 0.89-1.92). These patterns were similar for all-cause mortality.

Discussion
The MMPs are believed to play a key role in tumor metastasis, cell migration, cancer cell growth, and angiogenesis (31–33). They promote cell invasion by cleaving components of the extracellular matrix, with MMP collagenases having the unique ability to cleave native interstitial collagen, such as collagen types I, II, III, and IV (34,35). Among the MMPs, MMP1 is the most ubiquitously expressed interstitial collagenase and plays a key role in initial cleavage of the extracellular matrix.
Results of this study highlight the importance of the MMPs in the development of OSCC. The prominence of these genes is concordant with similar microarray experiments; for example, Ziober and colleagues (7) reported MMPs on their list of 25 genes overexpressed in OSCC. In particular, MMP1 emerged as an exceptionally strong marker in our study. This gene was expressed at a level more than 200-fold greater in primary OSCC compared with oral epithelium from controls. Importantly, MMP1 perfectly distinguished OSCC from control tissue in ROC analyses in our independent testing set (AUC = 100%), and almost perfectly discriminated OSCC from control tissue in an external independent testing set (AUC = 98.4%). The substantial differential expression of MMP1 was confirmed by qRT-PCR. In addition, we analyzed expression of MMP1 and MMP3 across different stages in the natural history of OSCC and observed increasing expression corresponding to progression from normal tissue to dysplasia to OSCC (although not from stage I/II to stage III/IV OSCC). The MMPs may thus be ideal molecular markers for monitoring progression from dysplasia to invasive OSCC.
The gene expression levels of the MMPs were highly correlated. In particular, we observed a correlation of 0.92 (P < 0.001) for MMP1 and MMP3. Therefore, although each of these genes is a strong biomarker for OSCC, the combination of the 2 does not add to their predictive ability. MMP1 and MMP3 are both on chromosome 11 in the region 11q22.3 (36). They are often observed to be coordinately expressed (37,38), and are activated by similar factors, such as interleukin-1 (39).
Protein concentrations of both MMP1 and MMP3 were observed to be highly elevated in the saliva of OSCC patients compared with saliva from cancer-free controls. Although results did not reach statistical significance, we observed a trend toward higher concentrations for both MMP1 and MMP3 with increasing disease severity. Several studies have evaluated salivary proteins as potential diagnostic tools in OSCC (24, 40–43). Hu and colleagues identified a panel of protein biomarkers in the human salivary proteome and observed that a combination of 5 proteins yielded an AUC of 93% for OSCC detection (40). Shpitzer and colleagues observed increased salivary protein concentrations of 75% for MMP2 and 35% for MMP9 in cases versus controls (42,43). The MMPs investigated in our study were not among the proteins reported in these previous studies.
Oral cavity and oropharyngeal squamous cell carcinoma are particularly suitable for salivary biomarker measurements due to the presence of potentially abnormal cells/markers sloughed off directly into saliva. Identification of a sensitive and specific marker of OSCC by noninvasive means such as salivary analysis has great potential to assist diagnosis and prognosis (5). People at high risk of developing OSCC, such as heavy smokers, could potentially be monitored for the first indications of OSCC development. Salivary marker analysis could be done in between biopsies to assist in the monitoring of disease status of dysplasia patients (24). Salivary monitoring has the advantage of being less invasive, and provides a mechanism to possibly detect lesions in locations that are difficult to visualize by general examination. In addition, the best strategy for management of patients presenting with mild to moderate dysplasia or with other atypical lesions is currently unclear. Markers associated with invasive disease (in saliva or the biopsy itself) could assist clinicians in determining which patients should undergo further screening or more invasive treatment.
We obtained different results after stratification by site, with greater case-control differences for MMP gene expression levels and salivary protein concentrations in oral cavity cancer versus oropharyngeal cancer. Similarly, case-control differences for MMP gene expression levels and salivary concentrations were greater among high-risk HPV-negative subjects versus high-risk HPV-positive subjects, although the very small number of high-risk HPV-positive controls limits meaningful interpretation of these results. This pattern is congruent with results obtained after stratification by site, as cancers occurring in the oropharynx, as opposed to the oral cavity, are most strongly associated with HPV infection (44,45). Higher MMP expression in HPV-negative tumors may be reflective of biological processes that differ from HPV-positive tumors. HPV-negative OSCC is more frequently associated with smoking, and tobacco smoke has been shown to induce MMP1 mRNA expression in fibroblasts (46,47) and skin (48). However, we did not observe increased MMP expression associated with smoking for the cell types in our data (results not shown), and the reasons for higher MMP expression in HPV-negative tumors remain unknown. Despite these results, the difference in observed means for gene expression and salivary protein concentrations compared with controls was still substantial within both site and HPV status categories, supporting the use of these MMPs as biomarkers, regardless of site or HPV infection status.
We previously observed LAMC2 expression level to be highly predictive of OSCC-specific survival (AUC = 80%, CI: 69-91 for LAMC2 combined with stage; ref. 49); elevated expression of MMP1 and MMP3 observed in this study was not as strongly associated with poor survival as LAMC2. Nevertheless, we did observe a moderate elevation in risk of all-cause and OSCC-specific death associated with expression levels of MMP1 and MMP3. When MMP1 and MMP3 were mutually adjusted, however, there was less evidence of their association with prognosis.
Using more stringent filtering criteria and a different normalization algorithm, we previously identified 131 probe sets (111 unique genes) with mRNA expression that was substantially different between OSCC cases and controls (25). The top predictive model, comprising the genes LAMC2 and COL4A1, was able to distinguish cases from controls almost perfectly in our testing set (AUC = 99.8%) and an independent data set (AUC = 97.6%). Taken together, these results suggest that markers identified using different statistical analyses may have the same or similar ability to predict phenotype and represent the underlying network of biological processes that are involved. Unlike the situation for MMPs, a commercially available immunoassay against well-characterized epitope(s) of LAMC2 that can be used to test the presence of LAMC2 in saliva is not yet available.

Conclusion
Results from our investigations into MMPs lend support to the importance of this family of genes in the pathogenesis and progression of OSCC. The very strong association of MMP1 with OSCC in particular provides a robust candidate marker for further investigations into its use as a potential marker for OSCC development. The strong associations of MMP salivary protein concentrations with OSCC warrant further investigations into their use as salivary biomarkers.

Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.