Incidence and Demographic Burden of HPV-associated Oropharyngeal Head and Neck Cancers in the United States.

Background: Human papillomavirus (HPV)-positive oropharyngeal head and neck squamous cell carcinoma (OPSCC) is increasing in the United States. Current epidemiologic assessments of the national burden of HPV-positive OPSCC are needed. Methods: The Surveillance, Epidemiology, and End Results HPV Status Database included 12,017 patients with head and neck squamous cell carcinoma of pharyngeal subsites, including OPSCC and non-OPSCC head and neck cancer subsites (hypopharynx, nasopharynx, and “other pharynx”), diagnosed from 2013 to 2014. Age-adjusted incidence rates per 100,000 persons by HPV status were calculated. An exploratory Fine-Gray competing-risks regression determined the associations between HPV status and cancer-specific mortality. Results: From 2013 to 2014, the U.S. incidence of HPV-positive OPSCC was 4.62 [95% confidence interval (CI), 4.51–4.73] versus 1.82 (95% CI, 1.75–1.89) per 100,000 persons for HPV-negative OPSCC. The incidence of HPV-positive versus HPV-negative non-OPSCC of the head and neck was 0.62 (95% CI, 0.58–0.66) versus 1.38 (95% CI, 1.32–1.44). White race (5.47) and male sex (8.00) had the highest incidences of HPV-positive OPSCC, with a unimodal age incidence distribution peaking at ages 60 to 64 years (27.23). HPV positivity was associated with lower cancer-specific mortality than HPV-negative disease for OPSCC [adjusted HR (aHR), 0.40; P < 0.001], but not for non-OPSCC (aHR, 1.08; P = 0.81; P for interaction = 0.002). Conclusions: The U.S. incidence of HPV-positive OPSCC was 4.62 per 100,000 persons. Most cases were found in white male patients younger than 65 years, where it represents the sixth most common incident nonskin cancer. The favorable prognosis associated with HPV appears to be limited to the oropharynx. Impact: This large population-based epidemiologic assessment of the U.S. population defines the incidence and demographic burden of HPV-positive OPSCC.


Introduction
Human papillomavirus (HPV) is etiologically responsible for an increasing subset of oropharyngeal squamous cell carcinomas (OPSCC) in the United States and is associated with a favorable prognosis and significantly better survival compared with HPV-negative cancers (1-3). There are distinct risk factors for HPV-positive OPSCC, which are more strongly linked to sexual behaviors (4,5), compared with HPV-negative OPSCC that are associated with tobacco and alcohol use.
Few national datasets have collected HPV status in head and neck SCC (HNSCC), and therefore accurate epidemiologic estimates of the U.S. national burden, distribution, and outcomes of HPV-positive HNSCC (including oropharynx and non-oropharynx cancers) have been difficult to define. Prior U.S.-based estimates of HPV-positive head and neck cancer are primarily based on anatomic site or extrapolation from small subsets (n = 271) of OPSCCs with known HPV status (6).
The Surveillance, Epidemiology, and End Results (SEER) program of the NCI has recently curated the Head and Neck with HPV Status Database (www.seer.cancer.gov). On the basis of this novel database, we report the largest and most comprehensive contemporary population-based epidemiologic study representative of the U.S. population to further clarify the national incidence of HPV-positive OPSCC.

Study cohort
The novel SEER Head and Neck with HPV Status Database is a nonpublic database that requires proposal and approval of analyses by the SEER custom data group before release. Cases with HPV status in this dataset include anatomic subsites (oropharynx, nasopharynx, hypopharynx, or "other pharynx") that have been reviewed for quality assurance by the SEER data quality team and have been determined to be ready for analysis. Other head and neck subsites outside of the pharynx, such as the oral cavity or larynx, were not included in the SEER Head and Neck with HPV Status Database. HPV status was determined by the results of any HPV testing (including p16, PCR, or in situ hybridization) performed on pathologic specimens from the primary tumor or metastatic site (including lymph nodes). Results of blood tests or serology were not used in determining HPV status.
In this study, we included anatomic subsites which include all data with HPV status that has been determined and reviewed for quality assurance by SEER. These subsites included HNSCC of the oropharynx (OPSCC) and non-oropharynx (non-OPSCC) head and neck subsites, which specifically refer to nasopharynx, hypopharynx, or "other pharynx" (for specific International Classification of Diseases codes see Supplementary Table S1). The study inclusion period of 2013-2014 represents the years in which HPV status has been collected and reviewed for quality assurance. The study cohort was limited to patients ages 25 and older.
Tumor size, nodal, and metastasis staging were determined using the American Joint Committee on Cancer (AJCC) 7th edition (7). SEER age groups were stratified in 5-year increments from 25 to 85+. Race was classified by SEER as white, black, Asian/Pacific Islander, American Indian/Alaskan Native, or unknown, while ethnicity was classified as non-Spanish-Hispanic-Latino or Spanish-Hispanic-Latino, groupings that are compatible with available annual population estimates used as denominators for the incidence rate estimates. County attribute data including median household income and educational attainment (percent greater than high school education) were linked to SEER population data by state-county Federal Information Processing Standards (FIPS) codes. Small area estimates for percent ever smoker were linked to SEER via estimates developed from the Behavioral Risk Factor Surveillance System (BRFSS) and the National Health Interview Survey (NHIS).
The study was deemed institutional review board exempt at our institution, Dana-Farber Cancer Institute (Boston, MA).

Statistical analyses
Primary analyses: estimates of HPV-positive OPSCC and non-OPSCC incidence rates in the United States. The primary measure of this study was age-adjusted incidence rates by HPV status. Specifically, SEER*Stat version 8.3.4 was used to calculate age-adjusted incidence rates and corresponding 95% confidence intervals (CI) of OPSCC and non-OPSCC by HPV status (HPV negative vs. HPV positive), with annual population estimates as the denominator (7). Incidence rates were expressed per 100,000 persons and age-adjusted with the 2000 U.S. standard population from the U.S. Census, with 142,239,885 people representing the population at risk (https://www.census.gov). Incidence rates by HPV status were also determined, stratified by OPSCC and non-OPSCC sites of the head and neck, age group, race, ethnicity, and sex.
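As an illustration of the age adjustment described above, the following is a minimal sketch of direct standardization, in which age-specific rates are weighted by a standard population. It is not the SEER*Stat implementation and does not reproduce its confidence intervals; the function name, the three age bands, and all counts and weights are hypothetical values chosen only to make the arithmetic easy to follow.

```python
def age_adjusted_rate(cases, population, std_weights, per=100_000):
    """Directly standardized rate: weighted sum of age-specific rates.

    cases, population, and std_weights are parallel sequences with one
    entry per age group. std_weights are standard-population counts or
    proportions; they are normalized here so they sum to 1.
    """
    if not (len(cases) == len(population) == len(std_weights)):
        raise ValueError("all inputs must have one entry per age group")
    total_weight = sum(std_weights)
    rate = 0.0
    for d, n, w in zip(cases, population, std_weights):
        # age-specific rate (d / n) weighted by the standard population share
        rate += (w / total_weight) * (d / n)
    return rate * per


# Hypothetical example with three age bands (not SEER data).
example_cases = [10, 40, 50]
example_population = [1_000_000, 800_000, 500_000]
example_weights = [0.5, 0.3, 0.2]

adjusted = age_adjusted_rate(example_cases, example_population, example_weights)
print(f"{adjusted:.2f} per 100,000")
```

In this toy example the age-specific rates are 1, 5, and 10 per 100,000, so the standardized rate is the weighted average 0.5 × 1 + 0.3 × 5 + 0.2 × 10 = 4 per 100,000.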
Of the 12,017 patients with pharyngeal SCC included in the SEER Head and Neck with HPV Status Database, 76.3% (n = 9,169) had disease of the oropharynx. Among patients with OPSCC, 44.2% (n = 4,056) had unknown HPV status at diagnosis, while 71.1% (n = 2,024/2,848) of patients with non-OPSCC of the head and neck had unknown HPV status at diagnosis. To improve the accuracy of incidence rates by HPV status, we estimated the rate of HPV positivity among unknowns by using the proportion of HPV-positive versus HPV-negative disease from this large dataset of patients. While this approach provided a data-driven estimate of the proportion of patients with unknown HPV status that are likely to be HPV positive, it assumes that there is no bias in missingness of HPV status by anatomic site group. The overall incidence of HPV-positive disease, utilizing the estimated HPV status among unknowns, was determined using the following formula: ([total count of HPV-positive patients / (total count of HPV-positive patients + total count of HPV-negative patients) × total count of unknown-HPV patients] + total count of HPV-positive patients) / total population of HPV-positive, -negative, and unknown status × 100,000. This formula was repeated for all incidences reported in this study.
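The formula above can be sketched in a few lines of code. This is a hedged illustration, not the authors' analysis code: the function name and the example counts are hypothetical, the `population` argument is assumed to be the population-at-risk denominator, and the result is a crude rate (no age adjustment is applied here).

```python
def imputed_hpv_positive_rate(n_pos, n_neg, n_unknown, population, per=100_000):
    """Incidence of HPV-positive disease after allocating unknowns.

    Unknown-status cases are assigned to HPV-positive in the same
    proportion observed among cases with known status, which assumes
    HPV status is missing without bias within the site group.
    """
    n_known = n_pos + n_neg
    if n_known == 0:
        raise ValueError("need at least one case with known HPV status")
    share_positive = n_pos / n_known
    # observed positives plus the estimated positives among unknowns
    estimated_positive = n_pos + share_positive * n_unknown
    return estimated_positive / population * per


# Hypothetical counts (not SEER data): 300 positive, 100 negative,
# 200 unknown, in a population at risk of 1,000,000.
rate = imputed_hpv_positive_rate(300, 100, 200, 1_000_000)
print(f"{rate:.1f} per 100,000")
```

Here 75% of known cases are HPV positive, so 150 of the 200 unknown cases are allocated to HPV positive, giving 450 estimated cases and a rate of 45 per 100,000.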
Exploratory analyses: baseline characteristics and cancer-specific mortality estimates by HPV status in patients with nonmetastatic (M0) disease. Patients who were diagnosed at autopsy or by death certificate, or who had multiple primaries, were included in the incidence rate analyses described above, but not in the survival analyses. In patients with known HPV status and no distant metastases (M0 disease; N = 4,476), distributions of continuous and categorical covariates were assessed by HPV status with the Wilcoxon rank-sum and Mantel-Haenszel χ2 tests, respectively. Multivariable logistic regression was used to test for associations between HPV status and patient clinicodemographic characteristics. Exploratory survival analysis applied multivariable Fine-Gray competing-risks proportional hazards regression in patients with M0 disease to examine the relationship between HPV status [HPV-positive vs. HPV-negative (referent)] and cancer-specific mortality (CSM; ref. 8). Other patient factors included in the model were disease subsite, tumor stage, nodal stage, initial definitive treatment, age, race, ethnicity, sex, insurance status, smoking propensity (based on percent ever-smoker small area estimates, a continuous variable provided by SEER), income, and education.
To ascertain whether there is a differential prognosis of HPV status by disease site, another Fine-Gray competing-risks regression model for cancer-specific hazard included the subsite (non-OPSCC vs. OPSCC) × HPV status (positive vs. negative) interaction term.
Cumulative incidence plots for cancer-specific death were created on the basis of the models described above. Two-sided statistical testing was used with α = 0.025 (after Bonferroni correction for n = 2 groups by HPV status). Stata/SE 14.2 (StataCorp) was used for all analyses.

Baseline characteristics and CSM estimates by HPV status in nonmetastatic (M0) disease
Among patients with M0 disease, baseline characteristics including subsite, staging, age at diagnosis, race, initial management, income, education, and smoking propensity were all statistically different for HPV-positive versus HPV-negative disease (Table 1).
On multivariable analyses, there were significant associations between HPV-positive disease and the aforementioned patient characteristics, with higher odds of HPV-positive disease associated with oropharynx site, smaller primary tumors, higher N-stage, ages 60-64, white race, non-Hispanic ethnicity, and male sex (all P < 0.05; Table 1). When stratified by disease site (OPSCC vs. non-OPSCC head and neck sites), smaller T-stage and higher N-stage were associated with HPV-positive disease in OPSCC, but not in non-OPSCC (P for the subsite * tumor stage interaction < 0.001; P for the subsite * nodal stage interaction = 0.002; Supplementary Table S2).

Discussion
This large population-based epidemiologic assessment of the U.S. population defines the incidence and demographic burden of HPV-positive OPSCC. The U.S. incidence of HPV-positive OPSCC was 4.62 per 100,000 persons, approximately 250% of the incidence rate of HPV-negative OPSCC.
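The 250% comparison follows from the two OPSCC incidence rates reported in the abstract (4.62 and 1.82 per 100,000 persons). A short check of that arithmetic, with variable names chosen here only for illustration:

```python
# Age-adjusted incidence rates per 100,000 persons, as reported in the abstract.
hpv_positive_opscc = 4.62
hpv_negative_opscc = 1.82

# Ratio of HPV-positive to HPV-negative OPSCC incidence.
rate_ratio = hpv_positive_opscc / hpv_negative_opscc
print(f"rate ratio: {rate_ratio:.2f} ({rate_ratio:.0%} of the HPV-negative rate)")
```

The ratio is about 2.54, which rounds to the roughly 250% figure quoted in the text.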
Most new cases were found in white male patients ages 64 and younger, where it represents the sixth most common nonskin solid cancer (9). Furthermore, we were able to compare the patient and tumor characteristics of HPV-positive compared with HPV-negative tumors and subsites. These findings provide novel epidemiologic data on the U.S. burden of HPV-positive OPSCC and non-OPSCC of the head and neck and identify patient cohorts at high risk of this disease entity. We were also able to determine the short-term prognostic impact of HPV status stratified by disease subsite due to the large number of non-oropharynx cancers included in this study. These results provide the most accurate and contemporary incidence rates of HPV-positive head and neck cancers available in the United States to inform ongoing policy and research efforts targeted at addressing the increasing national burden of HPV-positive head and neck cancer.
Our study confirmed and established several associations between patient characteristics and HPV-positive OPSCC that have been hypothesized and estimated in prior studies that were limited because of inability to directly assess HPV status or small, potentially nonrepresentative sample sizes. First, it has been hypothesized and observed that there is an increasing trend of oropharyngeal cancers in the United States and that most of these cancers are HPV positive (6,10). Our results suggest that for every two cases of newly diagnosed HPV-negative OPSCC that occur in the United States, there are five cases of newly diagnosed HPV-positive OPSCC. Second, since 1973 there has been an increased incidence in oropharyngeal cancers particularly among white men hypothesized to be attributed to an increase in HPV-positive disease (6,10). In a study of 271 patients with oropharyngeal cancer and known HPV status, most of the patients with HPV-positive disease were found to be of white or of other races (6). Our study confirms and establishes that the greatest epidemiologic incidence and odds of HPV-positive OPSCC by race is seen in white male patients. Furthermore, given the data on multiple races in our study, we demonstrated that the burden of HPV-positive OPSCC in white patients is significantly greater than the burden observed in black or Asian/Pacific Islander patients, even after adjusting for tumor and patient characteristics. These prior studies have also hypothesized that younger age is associated with a higher risk of HPV-positive OPSCC. We confirmed that younger age at diagnosis was associated with HPV-positive disease. Given our robust data across age groups we could demonstrate a unimodal distribution of HPV-positive OPSCC by age with a peak distribution at ages 60-64, in contrast with the bimodal age distribution of oral HPV infection (11). 
The driving factors behind the observed associations between patient characteristics and HPV-positive disease have been hypothesized to be partly attributable to trends in smoking patterns and sexual behaviors in the United States (4,5). Specifically, smoking rates are much lower than they were in the 1980s, and smoking is a significant risk factor associated with the development of HPV-negative OPSCC (12). Yet the interplay between tobacco use and HPV infection is complex, as increased tobacco smoking is associated with increased risk of oral HPV (12). Furthermore, sexual behaviors have changed in the United States, with increased oral sex and oral HPV exposure, which is a recognized risk factor for HPV-positive OPSCC (11). The differential distribution of OPSCC by race/ethnicity may also be related to differences in sexual behaviors (13). These differences in behaviors have not necessarily been established as a causal link to HPV-positive OPSCC, and it is difficult to comprehensively establish the socio-behavioral and biological determinants behind the profound differences seen across patient characteristics, particularly for race/ethnicity and sex. For example, a cross-sectional analysis of oral HPV infection did not find an increased prevalence of oral HPV among white males, those at highest risk of HPV-related OPSCC (11). Therefore, further research is needed into the link between patient characteristics and the risk of viral oncogenesis.
Perhaps the most important observation in our large epidemiologic study is that HPV-positive OPSCC is the sixth most common incident nonskin solid cancer among white male patients younger than 65 (9.34; 95% CI, 9.02-9.66, per 100,000 persons; ref. 9). As such, this group should serve as a target population for scientific and policy or public health initiatives focused on addressing the increasing burden of HPV-positive OPSCC. The high risk for the development of HPV-positive OPSCC in this patient group could potentially guide the study of the efficacy and/or development of vaccinations in the prevention of oral HPV infections. HPV vaccination has been shown to be efficacious in the prevention of noncervical infections in both men and women, but has not yet been established as preventative for oral HPV infection (14-16).
At present, the Centers for Disease Control and Prevention recommends HPV vaccination in females ages 11-26, or males ages 11-21 (or through 26 for gay, bisexual, or other men who have sex with men; ref. 17). If HPV vaccination is determined to be efficacious in the prevention of oral HPV infection, then this recommendation will need to be revisited and potentially modified. Notably, prior authoritative epidemiologic data demonstrated a bimodal age distribution of oral HPV infection in the United States with peak prevalence at ages 30-34 (11). Therefore, the likely time-course from HPV infection to development of OPSCC is approximately 30 years, given that the unimodal peak incidence of HPV-positive OPSCC was at ages 60-64 in this study. Potential vaccine trials and/or future changes to the vaccination recommendations should take this observation into account to effectively and efficiently reduce the burden of HPV-positive OPSCC.
Finally, our study characterized the U.S. national burden of HPV-positive disease in non-oropharyngeal sites. Our findings show that the overall burden of new cases is low. Consistent with multiple prior reports, we found that HPV positivity was associated with a favorable prognosis in OPSCC (1-3). However, there was a differential impact of HPV status on CSM such that HPV status was not predictive of prognosis in non-OPSCC of the head and neck, where there was no difference in outcomes between HPV-positive and HPV-negative patients. Furthermore, among patients with HPV-positive disease, OPSCC was associated with a favorable prognosis compared with non-OPSCC of the head and neck, but there was no difference in CSM by disease site among HPV-negative patients. A prior study observed a similar pattern and significant interaction between SCC subsite (oropharynx vs. non-oropharynx) and HPV status for the outcomes of overall survival and progression-free survival (18). Ultimately, the non-OPSCC group is heterogeneous, and future studies with longer follow-up and more patients will be needed to determine whether this observation persists over time.
Our findings must be viewed within the context of the limitations of the available data within the SEER program. First, SEER does not contain complete information on receipt of chemotherapy or radiotherapy. Therefore, our analyses with regard to CSM focused on the prognostic rather than predictive value of HPV status in HNSCC, which can be accurately measured with our data. Furthermore, we adjusted for available chemotherapy and radiotherapy data through custom linked treatment data. Second, SEER does not contain information on comorbidity status. However, SEER does contain cause of death information, allowing us to account for non-head and neck cancer-related mortality. Third, the follow-up period for our cohort was relatively short. Nevertheless, there were enough events to detect a significant difference in CSM by HPV status stratified by subsite using Bonferroni correction. Future studies with longer follow-up will be needed to determine whether this association persists. Fourth, SEER does not include patient-level smoking data; however, we generated and adjusted for smoking propensity estimates by linking patient data to small area estimates developed from BRFSS and the NHIS. Fifth, non-OPSCC subsites included were nasopharynx, hypopharynx, and sites characterized as "other pharynx," and did not include the oral cavity or larynx, as these were not included in the SEER HPV-specific database. Finally, information on HPV status was missing for a proportion of patients, which may introduce bias in both the OPSCC and the non-OPSCC populations. As described in the Materials and Methods section, HPV status was imputed using the proportion of HPV-positive patients among the patients with known HPV status. It should be noted, however, that prior estimates of the U.S. incidence of HPV-positive head and neck cancer utilized the anatomic site alone (i.e., oropharynx) without specific assays for HPV positivity or were based on a small subset (n = 271) of patients with known HPV status from the SEER registry (5,6).

Conclusions
This large population-based epidemiologic study defines the U.S. burden of HPV-positive head and neck SCC across subsites and patient characteristics. We found that the U.S. incidence of HPV-positive OPSCC was 4.62 per 100,000 persons, with most new cases found in white male patients younger than 65, where it represents the sixth most common incident nonskin solid cancer.