
1 Mel and Enid Zuckerman College of Public Health and 2 Arizona Cancer Center, University of Arizona, Tucson, Arizona
Requests for reprints: Mary Clouser, Arizona Cancer Center, University of Arizona, 1430 East Fort Lowell, Suite 301, Tucson, AZ 85719. E-mail: mclouser{at}u.arizona.edu
| Abstract |
|---|
(κ = 0.76; 95% confidence interval, 0.65-0.85) and agreement did not differ by gender. Agreement for self-reported sun sensitivity was moderate (weighted κ = 0.46; 95% confidence interval, 0.36-0.56) with higher agreement for women. For self-reported NMSC lesion history between two interviews, 24 days apart, κ estimates ranged from 0.66 to 0.78 and were higher for women than men. Overall, there was evidence for substantial reproducibility related to risk group assignment and self-reported history of NMSC, with self-reported sun sensitivity being less reliable. In all comparisons, women had higher κ values than men. These results suggest that self-reported measures of skin cancer risk are reasonably reliable for use in screening subjects into studies. (Cancer Epidemiol Biomarkers Prev 2006;15(11):2292-7)
| Introduction |
|---|
20% of the general population at some point in life, and 50% of Americans who live to be 65 years old will have skin cancer at least once (2, 3). Skin cancer, including melanoma, has been mainly associated with particular skin phenotypes (fair complexion, tendency to sunburn, freckles) and sun exposure (4). Participant self-report is heavily relied upon in epidemiologic studies; however, the reliability of this information may vary. In cancer prevention studies, self-report is frequently used to classify participants into risk groups for future disease and to identify potential risk factors. Self-report is useful because it reduces time and length of recruitment. It is important to determine the reliability of items included in questionnaires because this variability will affect the validity of measurements and comparability between studies.
In the current analysis, we sought to determine how reliably participants were being placed in risk groups by comparing the probable risk group determined by trained staff interviewers during initial telephone screening with the final risk assignment made by a dermatologist. Second, we sought to determine the level of consistency at two different time points for participant perception of their sun sensitivity. Last, we examined the consistency of participant self-report of their nonmelanoma skin cancer (NMSC) and actinic keratosis (AK) history between the initial telephone screening interview and a later self-administered health history form.
| Materials and Methods |
|---|
3-month period. This study was designed to assess the reproducibility of various surrogate end point biomarkers within the skin carcinogenesis pathway, specifically the variability of polyamine levels, p53 expression, and proliferating cell nuclear antigen expression. Subjects were recruited from university and community dermatology clinics, advertisements, and a skin cancer registry. Eligible subjects were males and females of at least 18 years of age who were willing to apply sun protection factor 50 sunscreen daily. Subjects included three different probable risk groups: sun damage on forearms with no visible AKs (the pre-AK group), visible AKs (the AK group), and history of resected squamous cell carcinoma (SCC) in the last 12 months (the SCC group). The Biomarkers Study assessed 851 people via telephone for eligibility. Of those, 199 seemed to be eligible, agreed to participate, and consented at the eligibility clinic visit. At some point after consent, 29 were found to be ineligible. Of those who remained eligible, 91 were assigned to the pre-AK group, 38 to the AK group, and 35 to the SCC group (Fig. 1). Of the 164 subjects assigned to probable risk groups, 143 completed the 3-month study and are the focus of these analyses. Information on the design and some results of the study have been previously published (5, 6).
At the eligibility visit, the participant was assessed by the study dermatologist to confirm eligibility and to assign a final risk group. Participants also returned a completed self-administered health history form. This form asked about medical conditions and included a dermatology history section with questions about past diagnoses of AK, skin cancer, and skin biopsy results. A self-administered participant profile form was returned at the time of visit 1, 43 days after the initial telephone screen. Phenotypic characteristics, demographics, sun exposure, use of sunscreen, history of sunburn, occupational and environmental exposures, residential history, medical exposures, and smoking history were included on this profile.
Statistical Analysis
To compare basic demographic characteristics between the three final risk groups, χ² tests were used. ANOVA and Bonferroni multiple comparisons were used to look at potential differences in mean age between groups.
To test the reliability between independent groups, Cohen's κ was used. For comparison between the multiple levels of self-reported sun sensitivity, weighted κ was estimated. This statistic took advantage of the ordered categories so that partial credit was given to small error versus large error (7). Because there was no clear-cut "gold standard" for any of the interviews or questionnaires, equal weight was applied to both sets of readings (7). A κ statistic <0 would suggest poor agreement, 0 to 0.20 slight, 0.21 to 0.40 fair, 0.41 to 0.60 moderate, 0.61 to 0.80 substantial, and 0.81 to 1.00 almost perfect (8, 9). Confidence intervals (CI) were calculated for the κ statistic using the STATA command "kapci." This command uses an analytic method for simple two-by-two comparisons (dichotomous variables) and a bootstrap method for variables with more than two categories. When the bootstrap method was used, STATA was asked to perform 1,000 repetitions.
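The agreement statistics described above can be sketched in a few lines of Python. This is a minimal illustration only, not the STATA routine used in the study: the function names (`kappa`, `bootstrap_ci`, `interpret`) are hypothetical, the weighted version assumes linear agreement weights (the weighting scheme is not stated in the text), and the interval is a plain percentile bootstrap, which may differ from what "kapci" reports.

```python
import random
from collections import Counter


def kappa(rater_a, rater_b, categories, weighted=False):
    """Cohen's kappa for two paired ratings.

    With weighted=True, linear agreement weights 1 - |i - j| / (k - 1) are
    assumed, so near misses on an ordered scale earn partial credit.
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    def weight(i, j):
        if not weighted or k == 1:
            return 1.0 if i == j else 0.0
        return 1.0 - abs(i - j) / (k - 1)

    pairs = Counter((index[a], index[b]) for a, b in zip(rater_a, rater_b))
    row = Counter(index[a] for a in rater_a)  # marginal counts, first rating
    col = Counter(index[b] for b in rater_b)  # marginal counts, second rating

    observed = sum(weight(i, j) * c for (i, j), c in pairs.items()) / n
    expected = sum(
        weight(i, j) * row[i] * col[j] for i in range(k) for j in range(k)
    ) / (n * n)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (observed - expected) / (1.0 - expected)


def bootstrap_ci(rater_a, rater_b, categories, weighted=False, reps=1000, seed=0):
    """Percentile bootstrap 95% interval, resampling subjects with replacement."""
    rng = random.Random(seed)
    n = len(rater_a)
    estimates = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]
        est = kappa([rater_a[i] for i in idx], [rater_b[i] for i in idx],
                    categories, weighted)
        if est == est:  # drop undefined (NaN) replicates
            estimates.append(est)
    estimates.sort()
    lower = estimates[int(0.025 * len(estimates))]
    upper = estimates[int(0.975 * len(estimates)) - 1]
    return lower, upper


def interpret(value):
    """Map a kappa value onto the agreement labels quoted in the text."""
    if value < 0:
        return "poor"
    for bound, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial")]:
        if value <= bound:
            return label
    return "almost perfect"


# Made-up ratings on a three-level ordered scale (not study data).
first = [1, 1, 2, 3]
second = [1, 2, 2, 1]
print(kappa(first, second, [1, 2, 3]))
print(kappa(first, second, [1, 2, 3], weighted=True))
print(interpret(0.76), interpret(0.46))
```

The weighted and unweighted values differ because the weights change both the observed and the chance-expected agreement, which is why the two statistics are not directly interchangeable.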
| Results |
|---|
(κ = 0.76; 95% CI, 0.65-0.85) and there were no differences in agreement by gender of the participant. The telephone interviewers misclassified 12 (17.4%) of true pre-AK subjects as AK. Only five (15.6%) of the true AK subjects were misclassified, four as pre-AK and one as SCC. A total of three (9.7%) SCC subjects were misclassified and all three were placed in the AK group instead of the SCC group.
For self-reported sun sensitivity, agreement was moderate (weighted κ = 0.46; 95% CI, 0.36-0.56), with higher agreement for women (weighted κ = 0.53 versus 0.36).
The κ value for AK history was 0.66 (95% CI, 0.54-0.78), for SCC was 0.78 (95% CI, 0.65-0.91), and for basal cell carcinoma (BCC) was 0.75 (95% CI, 0.55-0.94), all considered to indicate substantial agreement and all significantly different from 0. The values of κ differed by gender but the differences were not significant because the CIs overlapped. In addition, although not shown, individuals were more likely to report a diagnosis of NMSC at the telephone recruitment than on the self-administered health history form.
| Discussion |
|---|
In this study, the objectives of the analyses were 3-fold. We sought to determine how reliably participants were placed into risk groups by trained telephone interviewers compared with a study dermatologist's assessment. Second, the consistency of self-reported sun sensitivity between two forms was assessed. Last, we examined the consistency of participant self-reported history of skin lesions, specifically NMSC and AK. We noted substantial agreement (κ = 0.76; 95% CI, 0.65-0.85) between the classification of potential study participants into risk groups by trained telephone interviewers and the final assignment by a study dermatologist. During the recruitment phase of a clinical study, large numbers of people are often screened to find the few who qualify. Recruitment and screening are time-consuming, and study costs increase dramatically if study dermatologist time is necessary for initial screening. In addition, if the study seeks to recruit specific numbers into each risk group, each participant's risk group must be immediately and accurately identified.
Despite the good level of agreement, misclassification existed, and not surprisingly, this misclassification centered on assignment to the pre-AK and AK risk groups. The telephone interviewers misclassified 17.4% of true pre-AK subjects as AK. Similarly, 15.6% of the true AK subjects were misclassified, with four people (12.5%) classified by the screener as pre-AK and one (3.1%) as SCC. Among the final SCC group, three SCC subjects (9.7%) were misclassified and all three had been placed in the AK group instead of the SCC group by the screener. The screeners placed subjects into risk groups based on information they collected during the interview, whereas the dermatologists made group assignments based on skin examinations. Because the telephone interviewers based their decisions on participant report, it would seem that subjects were more likely to report having AKs when they did not actually have any. This could be due to lack of knowledge about how to identify an AK, or to the difficulty an untrained person with sun-damaged skin has in differentiating an AK from other sun damage. It may also be that potential participants exaggerated their skin damage on the telephone because they had a strong desire to be eligible and participate in the study.
As risk group classification increased in seriousness, misclassification decreased. There was more misclassification in the pre-AK risk group and much less in the SCC risk group. If only the telephone interviewer were used to classify subjects into risk groups, there would be more true pre-AK subjects in the AK group. This could then make it more difficult to distinguish between groups during analysis of biomarkers. For example, if there was a specific biomarker associated with development of SCC, this type of misclassification could decrease the likelihood of detecting any gradient between the disease groups.
In our study, agreement was not as strong for self-reported sun sensitivity measures (weighted κ = 0.46; 95% CI, 0.36-0.56). One caveat needs to be highlighted. The sun sensitivity questions being compared on the two forms were not worded in precisely the same manner. The question on the interviewer-administered telephone recruitment form and the self-administered participant profile differed slightly. The telephone recruitment form was more focused toward assessment of whether an individual's untanned skin burns in the sun, and the self-reported participant profile focused on descriptions of tanning in addition to burning.
The concept of sun-reactive skin typing was created in 1975 to classify persons with white skin so that the correct initial dose of UVA (in joules per square centimeter) could be selected for oral methoxsalen photochemotherapy in the treatment of psoriasis (11). It was decided that a brief personal interview regarding the history of the person's sunburn and suntan experience was one approach to estimate the skin tolerance to UV radiation exposure and the Fitzpatrick skin-typing system was created (11). The Fitzpatrick skin-typing system has been used by the Food and Drug Administration in its guidelines for sunscreen products for over-the-counter human use (11).
Self-reported sun sensitivity is used to assess skin type and, therefore, risk for skin cancer. Only a few studies looking at the reliability of these measures are available in the literature, and they report better reliability than our study. Reliability, assessed by comparing answers to the same question at different time points, is used because the measures do not have a gold standard. In the multicenter South European case-control study, a subsample of participants were reinterviewed and reaction to sun exposure was assessed on a four-level scale (4). Weighted κ for skin reaction to sun exposure was 0.61 (95% CI, 0.53-0.70), which is higher than the five-level weighted κ from our current study (κ = 0.46; 95% CI, 0.36-0.56). (Recall that the value of κ is affected by the number of categories.) In a case-control study of melanoma that included test-retest reliability of self-reported sun sensitivity, there was good consistency with κ values for ability to tan and tendency to burn of 0.66 and 0.62, respectively (12).
In a case-control study nested within the Nurses' Health Study cohort, Weinstock (13) reported that test-retest reliability of tanning questions was high in the prevalent case group (Spearman's r = 0.78) and control group (Spearman's r = 0.76), but lower in the incident case group (Spearman's r = 0.59). Their study had a similar caveat in that the questions were not worded identically between the two questionnaires. Weinstock et al. (14) also found that, among women diagnosed with melanoma after the first questionnaire and before the second, there was a substantial shift toward reporting a reduced ability to tan.
This highlights an important issue for development of study questionnaires. Burnability and tannability are separate issues to subjects and need to be considered separately. In a study by Rampen et al., neither tannability nor burnability was linked very closely to the minimal erythemal dose, which would be the gold standard of sun sensitivity (15). Rampen et al. (15) investigated burning and tanning histories in 790 White students, 18 to 30 years old, with a self-administered questionnaire to classify them into skin types based on the Fitzpatrick scheme (burning tendency after 1 hour of sun exposure in early summer and the tanning ability after regular sun exposure during summer were recorded as follows: 0, none; 1, mild; 2, moderate; and 3, severe/intense). Minimal erythemal dose was measured in a subgroup of this population. There was no statistically significant correlation between the self-reported burning tendency and the minimal erythemal dose. Skin typing on the basis of self-reported burning tendency and tanning ability may be subjective because subjects tended to overrecord no burning and underrecord no tanning. The correlation with biological complexion factors, such as hair and eye color and freckling tendency, was somewhat better for self-reported tanning than for the burning propensity (15).
The authors concluded that self-reported burning-tanning histories do not provide a valid means of skin typing when compared with the minimal erythemal dose (15). It may be that a better way to characterize sun sensitivity would be through proxy measures, such as hair and eye color and freckling tendency, which seem to be more reliably reported by subjects. Weinstock et al. found that test-retest reliability of hair color assessment by questionnaire was high, with the Spearman correlation coefficient between 0.76 and 0.87. Sun sensitivity may be subject to recall bias when assessed by ability to tan, but not when assessed by hair color (14). There is a need for further studies to look at the issue of skin-type classification more closely.
Based on Weinstock et al.'s results, we might have expected that the SCC risk group would have been more reliable reporters of sun sensitivity or that there would be a gradient of response, with the pre-AK group being less reliable reporters than the AK or SCC groups. However, we found that all risk groups were equally reliable when reporting sun sensitivity (data not shown).
Results from previous studies are consistent with our findings on agreement for self-reported history of NMSC. We found that more serious skin conditions had higher agreement (for AK history, κ = 0.66; 95% CI, 0.54-0.78; for SCC, κ = 0.78; 95% CI, 0.65-0.91; and for BCC, κ = 0.75; 95% CI, 0.55-0.94). In a study by Ming et al. (16), self-reported history of skin cancer was compared with the gold standard of chart documentation. Patients were found to recall their cancer history quite well, with correct identification highest for melanoma (95% of cases) and lowest for basal cell carcinoma (84% of cases). In a study by Bergmann et al. (17), assessing agreement of self-reported medical history using an in-person interview versus a self-administered questionnaire, κ values of 0.83-0.88 were found for cancer reporting. Lower values were found for less severe or more transient disease, with the disease being reported at the interview but not on the questionnaire. Our study also found that NMSC diagnosis was more likely to be reported to the phone interviewer than on the self-administered health history form.
The results of our study suggest that women may be more reliable reporters than men; however, the literature does not always support this finding. In an Australian study of ocular melanoma that gathered information on sun exposure in the first four decades of a person's life, questionnaires administered 1 year apart gave an intraclass correlation coefficient of 0.65 for ranked total sun exposure between two interviews, with the coefficient higher for men (0.73) than for women (0.54), although, as in our study, the difference was not statistically significant (18).
The use of κ to measure reliability can be problematic because κ values will depend on the prevalence of the condition and distribution of the marginals (7). We used a weighted κ for ordered categories so that partial credit would be given to small error versus large error (7). Additionally, κ values depend on the number of categories, with more categories resulting in lower values (19). The value of κ does, however, account for agreement that may occur by chance alone. The use of κ is more appropriate than the use of percentage agreement. Percentage agreement is the simplest method of summarizing agreement for categorical variables and has the advantage of being useful for any number of categories. However, percentage agreement is artificially increased when the proportion of negative-negative results is high or when the prevalence of the condition is high.
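The contrast between percentage agreement and κ can be illustrated with a small worked example. The two 2×2 tables below are made-up counts, not study data, and the function names are hypothetical; both tables have the same percentage agreement, but the second is dominated by negative-negative pairs, so most of its agreement is what chance alone would produce.

```python
def percent_agreement(table):
    """Share of subjects on the diagonal of a square agreement table."""
    n = sum(sum(r) for r in table)
    return sum(table[i][i] for i in range(len(table))) / n


def cohen_kappa(table):
    """Unweighted Cohen's kappa from a square table of counts."""
    k = len(table)
    n = sum(sum(r) for r in table)
    row = [sum(table[i]) for i in range(k)]
    col = [sum(table[i][j] for i in range(k)) for j in range(k)]
    observed = sum(table[i][i] for i in range(k)) / n
    expected = sum(row[i] * col[i] for i in range(k)) / (n * n)
    return (observed - expected) / (1 - expected)


# Rows: first report (yes, no); columns: second report (yes, no).
balanced = [[40, 10], [10, 40]]  # condition reported by about half
skewed = [[5, 10], [10, 75]]     # condition rarely reported

for name, table in [("balanced", balanced), ("skewed", skewed)]:
    print(name, round(percent_agreement(table), 2), round(cohen_kappa(table), 2))
```

Both tables agree on 80% of subjects, yet κ is about 0.60 for the balanced table and about 0.22 for the skewed one, which is the sense in which percentage agreement overstates reliability when one category dominates.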
| Conclusions |
|---|
| Footnotes |
|---|
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Received 5/16/06; revised 8/25/06; accepted 9/13/06.
| References |
|---|
κ coefficient. J Clin Epidemiol 1988;41:949-58.
κ statistic. Am J Epidemiol 1987;126:161-9.