
1 Center for Chronic Disease Outcomes Research, Minneapolis Veterans Affairs Medical Center; 2 University of Minnesota Department of Medicine, Minneapolis, Minnesota; 3 Division of Health Promotion and Behavioral Sciences, University of Texas-Houston School of Public Health, Houston, Texas; 4 University of Minnesota, Department of Family Medicine and Community Health and Division of Epidemiology, Minneapolis, Minnesota; 5 Durham Veterans Affairs Medical Center, and Duke University Medical Center, Durham, North Carolina
Requests for reprints: Melissa R. Partin, Center for Chronic Disease Outcomes Research, Minneapolis Veterans Affairs Medical Center, Minneapolis, MN 55417. Phone: 612-467-3841; Fax: 612-727-5699. E-mail: melissa.partin{at}va.gov
| Abstract |
|---|
Materials and Methods: 890 patients, ages 50 to 75 years, from the Minneapolis Veterans Affairs (VA) Medical Center were surveyed by mail. Phone administration was attempted with mail nonresponders. VA and non-VA records were combined for the reference standard. Sensitivity, specificity, concordance, and report-to-records ratio (R2R) were estimated for overall and test-specific CRC adherence among respondents providing complete medical records. Secondary analyses examined variation in estimates by patient characteristics, treatment of missing and uncertain responses, and whether a strict or liberal time interval was used for assessing concordance.
Results: Complete medical records were available for 345 of the 686 survey responders. For overall adherence, sensitivity was 0.98, specificity was 0.59, concordance was 0.88, and R2R was 1.14. Sensitivity was 0.82 for fecal occult blood test (FOBT), 0.75 for sigmoidoscopy, 0.97 for colonoscopy, and 0.63 for double-contrast barium enema (DCBE). Specificity was 0.89 for FOBT, 0.76 for sigmoidoscopy, 0.72 for colonoscopy, and 0.85 for DCBE. Concordance was >0.80 for all tests other than sigmoidoscopy (0.76). R2R was 1.31 for FOBT, 1.33 for sigmoidoscopy, 1.42 for colonoscopy, and 6.13 for DCBE. The R2R was lower for a combined sigmoidoscopy and colonoscopy measure. Overreporting was more pronounced for older, less-educated individuals with no family history of CRC. Sensitivity and R2R improved using a liberal interval and treating uncertain responses as nonadherent (versus missing), but differences were not statistically significant.
Conclusions: Self-reported CRC screening validity is generally acceptable and robust across definitional decisions, but varies by screening test and patient characteristics. (Cancer Epidemiol Biomarkers Prev 2008;17(4):768–76)
| Introduction |
|---|
Self-report is a critical source of information on CRC screening adherence for several reasons. Most obviously, self-report is often the only available source of this information, particularly at the population level. Self-report may also be the most efficient means of collecting this information in some circumstances, given that (a) the various CRC screening procedures endorsed are not all conducted in the same clinical settings and therefore may not be documented in a single location and (b) the data privacy restrictions imposed by the Health Insurance Portability and Accountability Act markedly complicate gathering medical data from multiple sources. Despite these advantages, self-reported data on screening behavior can be misleading if they are systematically biased. Furthermore, the substantial variation, across prior studies and surveillance efforts, in how self-reported CRC screening behavior is collected and reported makes it difficult to draw conclusions about current adherence levels, trends, determinants, and how to modify them (7).
In recognition of the value of self-reported CRC screening data and the importance of standardized measures for facilitating efforts to improve CRC screening rates, the National Cancer Institute (NCI) developed a core set of questions for measuring self-reported CRC screening behavior (7) that reflect the CRC screening guidelines published by the American Cancer Society and the U.S. Preventive Services Task Force between 2001 and 2002 (6, 8). This questionnaire is called the NCI Colorectal Cancer Screening questionnaire (or NCI CRCS questionnaire). Although previous studies have examined the validity and reliability of self-reported CRC screening measures (9-19), none included DCBE, most did not include colonoscopy, and most were conducted on highly educated, racially homogeneous samples in health maintenance organizations and are not readily generalizable to disadvantaged populations.
This study validated the NCI CRCS questionnaire measures in a population of low-income veterans using a mailed survey with phone follow-up. Common measures of validity (sensitivity, specificity, concordance, and report-to-records ratio) were estimated for overall adherence (by any test) and test-specific adherence (FOBT, sigmoidoscopy, colonoscopy, either sigmoidoscopy or colonoscopy, and DCBE separately). Secondary analyses examined variation in validity estimates by patient demographic and health characteristics, treatment of missing and uncertain responses, and whether a strict or liberal time interval was used for assessing concordance.
| Materials and Methods |
|---|
65 y), male (93%), and Caucasian (>85%). In 2006, 75% of the population were adherent to CRC screening guidelines, and FOBT was the primary screening modality used.
Eligibility Criteria
Study participants included male and female veterans age 50 to 75 y who had one or more primary care visits to Minneapolis VA Medical Center between October 2003 and September 2005. Individuals with CRC diagnoses, individuals enrolled in VA adult day care and nursing home facilities, and individuals with dementia or Alzheimer's disease documented in VA medical records were excluded. Females and African Americans were oversampled to facilitate sex and race comparisons.
Sampling
Figure 1 illustrates the sample selection details. A total of 29,882 subjects were identified as eligible for the sample (715 females, 490 African American males, and 28,677 non–African American males). A random sample of 900 eligible patients, stratified by sex and race, was drawn from the pool of eligible patients to produce a survey recruitment sample containing 300 females, 300 African American males, and 300 non–African American males. Ten patients were withdrawn from this sample before survey recruitment, either because they were deceased (n = 6) or nonveterans/patients not enrolled at the Minneapolis VA Medical Center (n = 4). A total of 686 (77%) of the remaining 890 participants completed the survey, 627 (70%) by mail and 59 (7%) by phone. Survey nonrespondents were more likely than respondents to be of ages 50 to 64 y (81% versus 67%, P = 0.0002), African American (52% versus 29%, P < 0.0001), male (73% versus 65%, P = 0.04), and to have a mental health diagnosis (78% versus 63%, P < 0.0001) but did not differ statistically significantly from respondents on the Charlson Comorbidity Index.
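The stratified draw and the response-rate arithmetic described above can be sketched as follows. This is an illustration only: the patient identifiers are synthetic, the stratum labels and the `draw_stratified_sample` helper are hypothetical names, and the seed is arbitrary. The source does not describe the software actually used for sampling.

```python
import random

# Eligible pool sizes reported in the text.
ELIGIBLE = {
    "female": 715,
    "african_american_male": 490,
    "non_african_american_male": 28_677,
}
PER_STRATUM = 300  # recruitment target per stratum (from the text)


def draw_stratified_sample(pool_sizes, n_per_stratum, seed=0):
    """Simple random sample without replacement within each stratum.

    Hypothetical helper: patients are represented by synthetic ids 0..size-1.
    """
    rng = random.Random(seed)
    return {
        stratum: rng.sample(range(size), n_per_stratum)
        for stratum, size in pool_sizes.items()
    }


sample = draw_stratified_sample(ELIGIBLE, PER_STRATUM)
total_eligible = sum(ELIGIBLE.values())
total_sampled = sum(len(ids) for ids in sample.values())

# Sampling fractions differ sharply across strata; this is what
# oversampling females and African American males amounts to.
sampling_fraction = {s: PER_STRATUM / n for s, n in ELIGIBLE.items()}

# Response rates among the patients remaining after 10 withdrawals.
recruited = total_sampled - 10
response_rate = 686 / recruited
mail_rate = 627 / recruited
phone_rate = 59 / recruited
```

Because the sampling fractions are so unequal, estimates meant to describe the whole eligible population would ordinarily need stratum weights; the validity estimates in this study are reported for the analytic sample itself.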
Data Collection
Self-Reported Survey Data. Self-reported CRC screening behaviors were collected from a mixed-mode survey, in which a mailed questionnaire was the primary mode and phone was the secondary mode. The survey was conducted between March and June 2006. The initial survey mailing included a cover letter, the questionnaire, and a $2 bill cash incentive. A reminder postcard was mailed 1 wk after the first survey mailing. A second survey mailing (with no incentive) was mailed to those who did not return a questionnaire within 3 to 4 wk of the first mailing. Phone administration of the survey was attempted with all participants who did not return a questionnaire within 3 wk of the second survey mailing. The 14-page questionnaire included the NCI CRCS measures (7), as well as measures of CRC screening knowledge and attitudes and patient demographic and health characteristics. The self-reported CRC screening behavior section followed the NCI CRCS question wording and order with two exceptions: (a) we asked only for the timing and date of the most recent test received and (b) we added the following question directly after the "ever heard of" question regarding physician recommendation: "In the past 12 months, did a doctor, nurse, or other health professional advise you to have a [screening test type]?"
Medical Records Data. To obtain a complete history of testing behavior, medical record information was compiled from both VA medical records and non-VA medical records. VA medical records were available for all subjects in the study sample. However, we needed to request a release of information form for any non-VA medical records. For participants who completed and signed a non-VA medical records release form, medical records for FOBTs, sigmoidoscopies, colonoscopies, and DCBEs were requested from all sources of health care services identified by the respondent as received outside the VA. We requested records for subjects who reported being screened, as well as those who reported not being screened. We received medical records from all hospitals and clinics where we requested them. Including these non-VA medical records added 4 FOBTs, 2 sigmoidoscopies, 58 colonoscopies, and 2 DCBEs to our counts of the number adherent with test-specific guidelines for each of these procedures.
All VA and non-VA records were reviewed to determine the types of tests (FOBT, sigmoidoscopies, colonoscopies, and DCBEs), dates of tests, and if possible, indications for the tests. For the VA medical records, this information was abstracted electronically. For non-VA medical records received from clinics and hospitals, all relevant outpatient notes and laboratory reports in the specified time period were reviewed by a trained member of the study team using a standardized medical record abstraction form. To ensure quality control, 10% of these records were independently reviewed by a second trained team member, with 97% agreement across all category assignments. When necessary, an expert physician was consulted to resolve disagreements.
Measures
We examined sensitivity, specificity, concordance, and report-to-records ratio (defined below) for both overall and test-specific adherence measures. Because we were concerned about possible confusion between sigmoidoscopy and colonoscopy among respondents, we calculated validity estimates for sigmoidoscopy and colonoscopy separately, as well as for a combined endoscopy measure (adherent with either sigmoidoscopy or colonoscopy). Below, we describe how adherence status was assessed from self-reported and medical records data sources.
Assessing Self-Reported Adherence Status. Three items from the NCI CRCS questionnaire (7) were used to assess self-reported CRC screening adherence status: (a) a dichotomous question regarding whether they ever had the CRC test in question, (b) a categorical interval question asking when they had their most recent (CRC test type), and (c) an exact-date question asking for the specific month and year of the most recent test. Coding of adherence status using these three questions is described in detail below.
Our decision about which question to prioritize in our coding procedures was informed by the relative distribution of missing responses across the categorical and exact-date questions. Among respondents who said they had ever been screened, participants were more likely to respond to the interval question (which appeared first in the questionnaire order) than to the exact-date question. Indeed, although consistently >97% provided a response to the interval question (regardless of test type), only 47% answered the exact-date question for FOBT and colonoscopy, only 37% for sigmoidoscopy, and only 19% for DCBE. Because it had fewer missing responses than the exact-date question, we first examined responses to the categorical question "when did you have your most recent [screening test type]." If the response to this question reflected adherence with the recommended time interval for that procedure (i.e., "a year ago or less" for FOBT; either "a year ago or less" or "more than 1 but not more than 5 years ago" for sigmoidoscopy and DCBE; and either "a year ago or less," "more than 1 but not more than 5 years ago," or "more than 5 but not more than 10 years ago" for colonoscopy), the respondent was coded as adherent with screening guidelines based on self-report. For the combined sigmoidoscopy and colonoscopy estimates, we coded, as adherent, any respondent who reported receiving either a sigmoidoscopy in the past 5 y or a colonoscopy in the past 10 y. If a respondent did not answer the above timing question or indicated they were not sure but provided a specific month and year for their most recent procedure in the follow-up date question, adherence was determined based on the comparison between the survey completion date and the procedure date provided. If adherence status was still not clear (i.e., no month and year were provided), we then checked the participant's response to the question regarding whether the procedure was ever done. If the answer to this question was "no," then nonadherence status was assigned. We compared estimates that treated the remaining cases as "missing" to estimates that assigned these cases a status of "nonadherent."
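The coding cascade described above (interval question first, then exact date, then the "ever had" question) can be sketched as a small function. This is a minimal sketch, not the authors' code: the function and argument names are hypothetical, the response strings use "more than 1 but not more than 5 years ago" for the 1-to-5-year category, and two details are assumptions the text does not spell out (an answered interval outside the recommended window is coded nonadherent, and the month-level comparison treats the window boundary as inclusive).

```python
from datetime import date

Y1 = "a year ago or less"
Y5 = "more than 1 but not more than 5 years ago"
Y10 = "more than 5 but not more than 10 years ago"

# Interval answers that count as adherent, per test type (from the text).
ADHERENT_INTERVALS = {
    "fobt": {Y1},
    "sigmoidoscopy": {Y1, Y5},
    "dcbe": {Y1, Y5},
    "colonoscopy": {Y1, Y5, Y10},
}
# Recommended windows in months, used only for the exact-date fallback.
WINDOW_MONTHS = {"fobt": 12, "sigmoidoscopy": 60, "dcbe": 60, "colonoscopy": 120}


def self_report_status(test, ever_had, interval, month_year, survey_date,
                       uncertain_as_nonadherent=False):
    """Return 'adherent', 'nonadherent', or 'missing' for one test type.

    ever_had:   'yes', 'no', or None (unanswered)
    interval:   an interval-question answer, 'not sure', or None
    month_year: (year, month) of the most recent test, or None
    """
    # Step 1: the categorical interval question takes priority.
    if interval is not None and interval != "not sure":
        # Assumption: any answered interval outside the window is nonadherent.
        return "adherent" if interval in ADHERENT_INTERVALS[test] else "nonadherent"

    # Step 2: fall back to the exact month and year, if provided.
    if month_year is not None:
        year, month = month_year
        elapsed = (survey_date.year - year) * 12 + (survey_date.month - month)
        # Assumption: boundary month counts as inside the window.
        return "adherent" if 0 <= elapsed <= WINDOW_MONTHS[test] else "nonadherent"

    # Step 3: a "no" on the ever-had question means nonadherent.
    if ever_had == "no":
        return "nonadherent"

    # Remaining cases: the two treatments compared in the secondary analyses.
    return "nonadherent" if uncertain_as_nonadherent else "missing"


def combined_endoscopy(sig_status, colo_status):
    """Adherent if either endoscopic procedure is adherent (assumed tie-breaks)."""
    statuses = (sig_status, colo_status)
    if "adherent" in statuses:
        return "adherent"
    return "missing" if "missing" in statuses else "nonadherent"
```

For example, a respondent who answered "not sure" to the interval question for colonoscopy but gave June 1999 as the date, on a survey completed in April 2006, would be coded adherent under this sketch (82 months elapsed, within the 120-month window).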
Assessing Medical Records Adherence Status. If a participant had documentation in either their VA or non-VA medical records (if applicable) of having received a test within the recommended timeframe (i.e., 12 mo before the survey completion date for FOBT, 5 y before the survey completion date for sigmoidoscopy and DCBE, and 10 y before the survey completion date for colonoscopy), they were coded as adherent with screening guidelines based on medical records (strict interval definition). For the combined sigmoidoscopy and colonoscopy measures, individuals were coded as adherent if their medical records documented receipt of either a sigmoidoscopy in the past 5 y or a colonoscopy in the past 10 y.
Secondary analyses recalculated the validity estimates defined below using a more liberal agreement window (FOBT within 15 mo of the survey completion date, sigmoidoscopy or DCBE within 5.5 y of the survey completion date, or colonoscopy within 11 y of the survey completion date). Because we classified self-reported screening status first using the categorical response question and used responses regarding the exact month and year only when the categorical response was missing or "don't know" (which occurred in <12% of cases across all test types), this definition change was more likely to affect adherence status based on medical records data than adherence status based on self-reported data. Overall CRC screening adherence for both self-report and medical records was defined as adherence with any one of the test-specific adherence measures.
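The strict and liberal record-based definitions differ only in the length of the look-back window, which the following sketch makes explicit. The window lengths come from the text (5.5 y and 11 y expressed as 66 and 132 months); the function names, the data layout (a mapping from test type to a list of documented test dates), and the calendar-month arithmetic are assumptions made for illustration.

```python
import calendar
from datetime import date

STRICT_MONTHS = {"fobt": 12, "sigmoidoscopy": 60, "dcbe": 60, "colonoscopy": 120}
LIBERAL_MONTHS = {"fobt": 15, "sigmoidoscopy": 66, "dcbe": 66, "colonoscopy": 132}


def shift_back(d, months):
    """Return the date `months` calendar months before `d` (day clamped to month end)."""
    total = d.year * 12 + (d.month - 1) - months
    year, month0 = divmod(total, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))


def records_adherence(test_dates, survey_date, windows):
    """Code adherence from medical records for each test, endoscopy, and overall.

    test_dates: dict mapping test type -> list of documented test dates
                (VA and non-VA records already pooled).
    """
    status = {}
    for test, months in windows.items():
        cutoff = shift_back(survey_date, months)
        status[test] = any(cutoff <= d <= survey_date
                           for d in test_dates.get(test, []))
    # Combined endoscopy measure: either procedure within its own window.
    status["endoscopy"] = status["sigmoidoscopy"] or status["colonoscopy"]
    # Overall adherence: adherent by any one test-specific measure.
    status["overall"] = any(status[t] for t in windows)
    return status


# Hypothetical patient: an FOBT 14 months and a colonoscopy 10.25 years
# before a survey completed on 15 April 2006.
records = {"fobt": [date(2005, 2, 1)], "colonoscopy": [date(1996, 1, 10)]}
survey = date(2006, 4, 15)
strict = records_adherence(records, survey, STRICT_MONTHS)
liberal = records_adherence(records, survey, LIBERAL_MONTHS)
```

This hypothetical patient is nonadherent on every measure under the strict definition and adherent for FOBT, colonoscopy, endoscopy, and overall under the liberal one, which illustrates how the definitional choice can move a case from a false positive to a true positive.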
Validity Measures. Table 1 provides the raw data from which all validity measures were derived. Sensitivity is the proportion of participants that reported having the test among those who had documentation of having a test in the medical record. Specificity is the proportion of participants that reported not having a test among those who had no documentation of having the test in the medical record. Concordance is the proportion of participants whose self-reported and medical records adherence status agrees. The report-to-records ratio (R2R) is the number of participants reporting a test divided by the number of tests in the record. A report-to-records ratio of >1 indicates overreporting, and a ratio of <1 indicates underreporting. There are no established standards for acceptable ranges on the above measures. However, based on the approach used in one recent validation study (20), we considered estimates of ≥0.9 to indicate excellent agreement for sensitivity, specificity, and concordance; estimates of ≥0.8 to indicate good agreement; estimates ≥0.7 to indicate fair agreement; and estimates <0.7 to indicate poor agreement.
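The four validity measures follow directly from a 2×2 cross-classification of self-report against the medical record. The sketch below computes them from cell counts; the counts in the example are hypothetical round numbers chosen for easy arithmetic, not values from Table 1, and the names `validity_measures` and `agreement_label` are illustrative. The agreement cut points are read as lower bounds (0.9, 0.8, 0.7), which is an interpretation of the thresholds given in the text.

```python
def validity_measures(tp, fp, fn, tn):
    """Validity of self-report against the medical record (reference standard).

    tp: reported test, documented in record
    fp: reported test, not documented in record
    fn: did not report test, documented in record
    tn: did not report test, not documented in record
    """
    n = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "concordance": (tp + tn) / n,
        # Number reporting a test divided by number with a test on record:
        # >1 indicates overreporting, <1 underreporting.
        "r2r": (tp + fp) / (tp + fn),
    }


def agreement_label(estimate):
    """Label for sensitivity, specificity, or concordance (thresholds assumed inclusive)."""
    if estimate >= 0.9:
        return "excellent"
    if estimate >= 0.8:
        return "good"
    if estimate >= 0.7:
        return "fair"
    return "poor"


# Hypothetical counts for illustration only.
m = validity_measures(tp=90, fp=20, fn=10, tn=80)
labels = {k: agreement_label(v) for k, v in m.items() if k != "r2r"}
```

With these hypothetical counts, sensitivity is 90/100, specificity 80/100, concordance 170/200, and R2R 110/100. Note that R2R can sit close to 1 even when individual reports are often wrong, because false positives and false negatives offset each other in the ratio; this is why it is read alongside sensitivity and specificity rather than instead of them.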
high school, some college, college graduate), income ($20,000, $20-40,000, >$40,000), marital status (married versus unmarried), and family history (yes versus no) were obtained from the questionnaire. Comorbidities were summarized using the Charlson Comorbidity Index score (21, 22) and a measure of mental health diagnoses, which categorized individuals into three groups: (a) no mental health diagnoses, (b) single psychiatric (ICD-9 codes 290-302 and 306-311) or substance abuse–related (ICD-9 codes 303-305) diagnosis, or (c) dual diagnosis (psychiatric and substance abuse). We included measures of mental health diagnoses because they could conceivably affect the accuracy of recall and are not captured in the Charlson Comorbidity Index.
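The three-group mental health measure can be derived from a patient's ICD-9 codes roughly as follows. This is a sketch under stated assumptions: the code ranges are those given in the text, but the function name, the string format of the input codes, and the reading of "single" as "one diagnosis type present, regardless of how many codes" are illustrative choices rather than details confirmed by the source.

```python
def mental_health_group(icd9_codes):
    """Classify a patient as 'none', 'single', or 'dual' from ICD-9 codes.

    icd9_codes: iterable of code strings such as '296.30' or '305.1'.
    Non-numeric codes (e.g., V or E codes) are ignored.
    """
    psychiatric = False
    substance = False
    for code in icd9_codes:
        try:
            category = int(str(code).split(".")[0])  # three-digit category
        except ValueError:
            continue
        if 303 <= category <= 305:
            substance = True  # substance abuse-related range from the text
        elif 290 <= category <= 302 or 306 <= category <= 311:
            psychiatric = True  # psychiatric ranges from the text
    if psychiatric and substance:
        return "dual"
    if psychiatric or substance:
        return "single"
    return "none"


# Hypothetical code lists for illustration.
groups = [
    mental_health_group(["250.00", "401.9"]),   # diabetes, hypertension
    mental_health_group(["296.30"]),            # psychiatric only
    mental_health_group(["303.90", "311"]),     # substance abuse + psychiatric
]
```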
Analysis and Sample Size Determination
The sample size for this study was selected to maximize power for comparing validity estimates by sex and race. Power calculations were based on the assumption that both subgroups would have a minimum estimate of 70% for sensitivity, specificity, or concordance and that, in each subgroup, at least 50% of the participants had complete medical records available. We had at least 81% power to detect a 15-percentage-point-or-greater difference in sensitivity, specificity, and concordance across sex and race subgroups.
Confidence intervals for the validity measures were calculated using bootstrapping methods due to nonnormal sampling distributions. This involved randomly drawing 10,000 samples, with replacement, from the original sample (n = 345), ordering the point estimates computed from each of these bootstrap samples from low to high, and selecting the point estimates corresponding to the 2.5th and 97.5th percentiles as the limits defining the 95% confidence interval.
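The percentile bootstrap described above can be sketched with the standard library alone. The data here are hypothetical (345 synthetic participants built from made-up cell counts, not the study's records), the function names are illustrative, and the index rule used to pick the 2.5th and 97.5th percentiles is one common convention among several.

```python
import random


def sensitivity(pairs):
    """pairs: list of (self_report, record) booleans. Returns None if undefined."""
    documented = [report for report, record in pairs if record]
    if not documented:
        return None
    return sum(documented) / len(documented)


def bootstrap_ci(pairs, statistic, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for `statistic`."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        # Resample participants with replacement, same size as the original.
        resample = rng.choices(pairs, k=len(pairs))
        value = statistic(resample)
        if value is not None:
            estimates.append(value)
    estimates.sort()  # order point estimates from low to high
    lower_idx = round((alpha / 2) * len(estimates))
    upper_idx = round((1 - alpha / 2) * len(estimates)) - 1
    return estimates[lower_idx], estimates[upper_idx]


# Hypothetical sample of n = 345 (cell counts invented for illustration).
pairs = (
    [(True, True)] * 250      # reported and documented
    + [(False, True)] * 5     # not reported but documented
    + [(True, False)] * 37    # reported but not documented
    + [(False, False)] * 53   # neither
)
point = sensitivity(pairs)
ci_low, ci_high = bootstrap_ci(pairs, sensitivity)
```

The same `bootstrap_ci` function works for specificity, concordance, or R2R by passing a different statistic. Resampling whole participants, rather than resampling numerators and denominators separately, keeps the paired structure of self-report and record intact in every bootstrap sample.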
Human Subjects
The Minneapolis VA Medical Center Subcommittee for Human Studies approved the study protocol.
| Results |
|---|
Variation in Validity Estimates Across Population Subgroups
Subgroup estimates for overall adherence, shown in Table 4, were calculated using a strict interval for assessing adherence in the medical record and treating missing and uncertain self-report responses as missing (rather than nonadherent). Although sensitivity and concordance were consistently good-to-excellent across all subgroups and specificity was consistently fair-to-poor across all subgroups, the extent of overreporting (as assessed by the R2R) was statistically significantly more pronounced for older individuals, those with less than a high-school education, and those without a family history of CRC.
| Discussion and Conclusions |
|---|
Sensitivity, specificity, concordance, and R2R estimates for overall adherence and DCBE have not been reported in prior studies. However, the validity estimates for FOBT, sigmoidoscopy, and colonoscopy found in this study are comparable with those reported in prior studies. Our R2R estimates for FOBT (1.14-1.31) are higher than the estimates reported in the only prior study reporting R2R estimates (1.03; ref. 14), but our sensitivity and concordance estimates for FOBT (0.79-0.82 and 0.87-0.88, respectively) are in the upper end of the range of estimates reported in prior studies (0.55-0.96 and 0.72-0.94; refs. 9, 13-19), and our specificity estimates (0.89-0.91) are higher than those reported in prior studies (0.58-0.87; refs. 9, 13-19). R2Rs have not been reported for sigmoidoscopy in prior studies, but our sensitivity (0.66-0.75), specificity (0.76-0.78), and concordance (0.74-0.76) estimates for sigmoidoscopy are in the middle of the range of estimates for these measures reported in prior studies (0.33-1.00, 0.53-0.96, and 0.63-0.96, respectively; refs. 9, 13, 15, 18, 19). Finally, our estimates of sensitivity for colonoscopy (0.93-0.97) are in the upper range of sensitivity estimates reported in prior studies (0.56-0.95; refs. 13, 18, 19), but our specificity estimates (0.72-0.74) are lower than those reported in prior studies (0.87-0.97; refs. 13, 18, 19), and the one prior study reporting concordance for colonoscopy (13) found a higher rate of concordance (0.94) than we did (0.81).
Consistent with prior studies examining the validity of self-reported CRC screening, we found overreporting was a more prevalent source of error in self-reported CRC screening than underreporting (9, 13, 14, 16, 19). However, under certain definitional choices (liberal interval and uncertain and missing responses treated as nonadherent), the estimated extent of overreporting for overall adherence observed in our study was minimal (R2R, 1.07). The fact that overreporting was most pronounced for colonoscopy may reflect the fact that this is the newest screening procedure of the four CRC screening procedures and, hence, the one with which our population is the least familiar. The primary screening tests used in the population sampled for this study are FOBT and sigmoidoscopy. Colonoscopy is primarily used for diagnostic purposes in this VA setting. The low sensitivity estimates for sigmoidoscopy and higher overreporting of colonoscopy found in this study may, in part, reflect confusion between sigmoidoscopy and colonoscopy. Indeed, we chose to report estimates for a combined endoscopy measure because we found that 55% of patients with false positive reports for either one of these endoscopic procedures had documentation of receiving the other endoscopic procedure in their medical record. Although we found the R2R and specificity decreased with the combined definition, the one prior study reporting validity estimates for a combined endoscopy measure found that the combined estimates did not differ markedly from the separate estimates (19).
Our analyses of subgroup variation in validity estimates for overall adherence found statistically significant subgroup differences for age, education, and family history. This variation across subgroups could have important implications for the comparability across studies, as well as for inference about subgroup differences in screening behavior within studies. Several prior studies validating self-report CRC screening behavior examined subgroup differences in validity estimates (12, 14-19). However, the findings regarding subgroup differences are inconsistent across studies. This variation in subgroup differences in validity estimates across studies makes it difficult to draw conclusions about the comparability of reporting patterns across population subgroups and may, in part, reflect the substantial variation in what specific reported screening patterns were being validated in each study. Indeed, no two prior studies that investigated the validity of self-reported CRC screening behavior validated the same measure and no prior study validated CRC screening measures using a definition driven by evidence-based and current recommended screening guidelines, as we did. Future validation studies should focus on evaluating the validity of those screening measures most likely to be used in surveillance and clinical practice (i.e., measures, such as those examined in this study, which assess adherence to recommended CRC screening guidelines).
Additional sources of variation in the screening definitions validated in existing studies include the time interval used to assess concordance between self-report and medical records data (i.e., exact-date match versus some other more liberal criterion) and how missing and uncertain responses to survey questions are handled in the analysis. The findings from this study suggest that validity estimates were improved when using a more liberal time interval for assessing adherence between self-report and medical records and when cases with missing or uncertain responses are treated as nonadherent with CRC screening guidelines (rather than missing). However, improvements in these estimates were not statistically significant. The effect of how missing and uncertain responses are handled on validity estimates will be the most consequential for studies with a high proportion of missing and uncertain responses. Only three prior CRC screening studies specified how missing and uncertain responses were handled in the analysis (9, 10, 13), and none provided information on the proportion of responses that were missing or uncertain. To facilitate comparisons, future studies should provide information on the extent of missing and uncertain responses and how they were handled in the analysis. Because a more liberal screening interval definition has been recommended for studies examining determinants of screening behavior and evaluating interventions designed to enhance adherence with screening recommendations (23), the liberal estimates reported in this study may be more appropriate for most applications.
The findings from this study must be qualified by a number of limitations. First, the results of this study may not generalize to other populations and settings because they are based on the population of low-income veterans receiving care from just one VA medical facility. Second, although we structured our choice of measures and analysis plan based on the assumption that the medical records data used as the reference standard were complete, it is possible that for some participants this information was not complete (i.e., participants may have forgotten to identify some of their non-VA health care service providers). Finally, our analyses excluded those patients for whom we could not confirm that we had complete medical records information, and minorities and unmarried individuals were overrepresented among those excluded. However, the fact that neither of these characteristics was statistically significantly associated with overall adherence validity estimates suggests that the overrepresentation of these groups among those excluded is unlikely to have biased our estimates. Despite these limitations, as one of the first studies to document the sensitivity of validity estimates to key definitional choices, the findings from this study make an important contribution to the literature on CRC screening behavior.
| Footnotes |
|---|
Received 8/17/07; revised 1/22/08; accepted 2/19/08.
| References |
|---|