The Arg/Arg genotype versus Arg/Pro or Pro/Pro at codon 72 of the p53 gene has been implicated as a risk marker in cervical neoplasia. However, research on this topic has produced inconsistent results. We reviewed the published literature to summarize the association and to identify methodological features that may have contributed to the heterogeneity. Information on specific methodological features of studies addressing this topic published between 1998 and 2002 was obtained. Study-specific odds ratios (ORs) were combined in a meta-analysis, assuming random effects. To identify characteristics that significantly contributed to heterogeneity, we used meta-regression analysis. We identified 50 articles, of which 45 were included in the meta-analyses and regressions. No evidence of association or heterogeneity was detected for preinvasive lesions. For invasive cervical cancer with undefined histology, the Arg/Arg genotype was not found to affect risk (OR, 1.1; 95% confidence interval (CI), 0.9–1.3). However, a slightly increased risk was observed for squamous cell carcinoma (OR, 1.5; 95% CI, 1.2–1.9) and adenocarcinoma (OR, 1.7; 95% CI, 1.0–2.7). Meta-regression analysis identified departure from Hardy-Weinberg equilibrium in the control group as the most important factor contributing to heterogeneity among results for invasive lesions. Summary ORs for studies in equilibrium were essentially null. A possible role for the p53 codon 72 polymorphism in susceptibility at a late stage of cervical carcinogenesis cannot be ruled out. However, various methodological features can contribute to departures from Hardy-Weinberg equilibrium and consequently to less than ideal circumstances for the examination of this polymorphism. Future investigations require appropriate attention to design and methodological issues.
Cervical infection by human papillomavirus (HPV) has been established as a necessary event in the development of cervical neoplasia (1, 2, 3). An important feature of the mechanism of action involves the expression of two HPV oncogenes, E6 and E7, which bind to and degrade the host tumor suppressor proteins, p53 and Rb, respectively (4). Persistence of infection by oncogenic HPV types is the central risk factor, but many women with normal Pap smear tests are HPV-positive; prevalence estimates have ranged from 5 to 20% (5). Given that only a fraction of women with an HPV infection eventually develop cervical neoplasia, research has focused on identifying factors that influence whether an HPV infection will spontaneously clear or progress toward malignant disease. Candidate factors include environmental agents (e.g., smoking, sexually transmitted infections), virus characteristics (e.g., HPV intra-type variants, viral load), and host biological attributes. One such host marker is a polymorphism in the p53 gene, which codes for two structurally distinct forms of the p53 protein depending on the DNA sequence (6). Specifically, a CGC sequence at codon 72, coding for the amino acid Arg, is substituted by a CCC, coding for Pro (6, 7), yielding two protein forms with different biological and biochemical properties (8).
The association between this polymorphism and cervical disease has received considerable attention following experimental research demonstrating that the Arg form of the p53 protein was more vulnerable than the Pro form to binding and degradation by the HPV-E6 oncoprotein (9). Additionally, it was found that women with cancer were more likely than control subjects to have the Arg/Arg genotype (9). Thus, it appeared that homozygosity for Arg at codon 72 conferred susceptibility to cervical cancer. To date, about 50 investigations examining this polymorphism in relation to cervical disease have been published; however, only a few have corroborated the results of Storey et al. (9–58). The various populations studied as well as certain methodological and study design features may have contributed to the inconsistency in results. The purpose of this article is to review and summarize the published literature on the p53 codon 72 polymorphism and risk of cervical neoplasia and to identify methodological characteristics that may explain the heterogeneity in results. We performed meta-analyses to summarize the risk of cervical neoplasia associated with this polymorphism and meta-regression analyses to examine possible influences related to study context, design, and other methodological characteristics.
Materials and Methods
We searched PubMed combining the search terms cervical cancer/cervix neoplasm/cervix/cervical with p53 codon 72/p53 residue 72/p53 polymorphism to identify published studies that examined the p53 codon 72 polymorphism in relation to cervical neoplasia between 1998, the date of the initial study, and 2002. To further extend our search, the reference list from all identified studies was examined. Because we conducted a meta-regression, we did not use any exclusion criteria, and all studies addressing this topic were considered. However, to be used in the meta-analyses and regressions, the study had to include a comparison group of disease-free women so that effect estimates (i.e., odds ratios) were available or could be calculated.
The identified studies investigated both preinvasive lesions and invasive cancer. Whenever multiple lesion types were examined within the same investigation, each outcome was analyzed separately. The preinvasive outcome categories that we investigated were preinvasive low-grade squamous intraepithelial lesions (LSIL) and high-grade squamous intraepithelial lesions (HSIL), and the invasive outcomes were invasive cervical cancers (ICC) (histology not specified), squamous cell carcinomas (SCC) and adenocarcinomas (ADC). When the grade of squamous intraepithelial lesion (SIL) was not specified, these studies were classified as LSIL.
In the majority of studies, only genotype frequencies and χ² tests were presented. Thus, for these studies we calculated the odds ratios (ORs) and respective 95% confidence intervals comparing the Arg/Arg genotype versus Arg/Pro or Pro/Pro. For studies in which ORs were calculated and adjustments for other covariates were made, we used these ORs under the assumption that adjusted values provided a more valid estimate of the association with p53 codon 72 genotype. However, if the adjusted OR did not conform to the main comparison of Arg/Arg versus Arg/Pro or Pro/Pro, the crude OR was used. The Arg/Pro and Pro/Pro genotypes were grouped as the referent category because the Pro/Pro genotype was relatively rare in the majority of the published studies, and in some studies there were no Pro/Pro cases and/or controls, precluding the calculation of ORs and 95% confidence intervals. As well, the grouping of these categories was consistent with the original study, where the Arg form of p53 was more susceptible to HPV-E6-mediated degradation, and there was a 7-fold-increased risk of ICC for the Arg/Arg genotype compared with the Arg/Pro genotype (9).
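For studies that report only genotype counts, the crude OR and its interval can be computed directly from the 2×2 table of Arg/Arg versus the combined Arg/Pro and Pro/Pro category. The following is a minimal sketch and not the authors' code: the function name, the log-based (Woolf) interval, and the 0.5 correction for empty cells are assumptions, because the text does not state which interval method was applied. The counts in the usage line are hypothetical.

```python
import math


def odds_ratio_ci(case_argarg, case_other, ctrl_argarg, ctrl_other, z=1.96):
    """Crude OR for Arg/Arg vs. (Arg/Pro or Pro/Pro) with a log-based interval.

    z=1.96 gives an approximate 95% confidence interval.
    """
    cells = (case_argarg, case_other, ctrl_argarg, ctrl_other)
    if min(cells) == 0:
        # Assumption: add 0.5 to every cell when any cell is empty,
        # so that the OR and its standard error remain defined.
        cells = tuple(c + 0.5 for c in cells)
    a, b, c, d = cells
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR): square root of the summed reciprocal counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se_log_or)
    high = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, low, high


# Hypothetical counts: 30 Arg/Arg and 20 other cases, 40 Arg/Arg and 60 other controls.
print(odds_ratio_ci(30, 20, 40, 60))
```

Grouping Arg/Pro with Pro/Pro into a single referent keeps every study reducible to one 2×2 table, even when Pro/Pro counts are zero.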
To maintain consistency with the original study (9) , where no restrictions according to HPV status were made on the study participants, the ORs used in the meta-analysis and meta-regression were those calculated using the full case and control series in each study, even if genotype frequencies by certain subgroups were given (e.g., HPV-16-positive cases). In studies where restrictions were made, we were confined to these ORs, but we explored such restrictions in the meta-regression analyses. As well, when multiple control groups were used (e.g., screening patients and blood donors), we used the group that best represented the population at risk rather than combining convenience samples with risk-based control groups.
For each study, we extracted information on specific study characteristics, which were possible sources of heterogeneity, for consideration in the meta-regression (Table 1). These variables included publication year (2000s versus 1990s), geographical region (Europe versus other), DNA source used for cases (blood versus cervical specimens; fresh versus archival cervical specimens), DNA source used for controls (fresh versus archival cervical specimens), genotyping method (allele-specific polymerase chain reaction (PCR) versus other methods; single method versus more than one method), strategy to sample control group (convenience versus risk-based), whether ethnicity/race was controlled for (matching/restriction/adjustments versus no control/not specified), whether the case group was restricted to women who were positive for high-risk HPV types (high-risk positive versus unrestricted), and whether the control group was in Hardy-Weinberg equilibrium (significant deviation from equilibrium versus in equilibrium, based on the χ² test). Calculations of the expected genotype frequencies given Hardy-Weinberg equilibrium were based on the observed allele frequencies (59). A liberal α value of 0.1 was used to indicate statistically significant differences between observed and expected genotype frequencies. Data were interpreted from the information provided in each article. However, when the DNA source or the genotyping method was not specified, original investigators were contacted to obtain this information.
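The Hardy-Weinberg check described above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the function names are hypothetical, the test is the standard 1-degree-of-freedom χ² goodness-of-fit test for a biallelic locus, and the genotype counts in the usage line are invented. Only the α value of 0.1 comes from the text.

```python
import math


def hwe_chi2(n_argarg, n_argpro, n_propro):
    """Chi-square test of Hardy-Weinberg equilibrium for control genotype counts."""
    n = n_argarg + n_argpro + n_propro
    # Arg allele frequency estimated from the observed genotypes.
    p = (2 * n_argarg + n_argpro) / (2 * n)
    q = 1 - p
    observed = (n_argarg, n_argpro, n_propro)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def deviates_from_hwe(n_argarg, n_argpro, n_propro, alpha=0.1):
    """Flag a control group as out of equilibrium at the liberal alpha of 0.1."""
    _, p_value = hwe_chi2(n_argarg, n_argpro, n_propro)
    return p_value < alpha


# Hypothetical control group with an excess of homozygotes.
print(hwe_chi2(60, 20, 20), deviates_from_hwe(60, 20, 20))
```

One degree of freedom applies because three genotype classes are compared after the total and one allele frequency have been fixed by the data.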
We used meta-analysis to determine the overall summary OR for each lesion outcome. Study-specific ORs comparing Arg/Arg versus Arg/Pro or Pro/Pro were combined using the DerSimonian and Laird random-effects model, which takes into account heterogeneity among studies in addition to within-study variance (60). The presence of heterogeneity between studies was tested using an inverse-variance-weighted statistic, which approximately follows a χ² distribution (60, 61). We also examined the tendency for small studies to exhibit large effect estimates by testing for funnel plot asymmetry (62).
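The pooling step can be illustrated with a short sketch of the DerSimonian and Laird estimator. It is a simplified illustration rather than the Stata routine the authors used: the function name and the returned dictionary keys are hypothetical, and the inputs in the usage line are invented. The heterogeneity statistic Q is returned together with its degrees of freedom so that it can be compared against a χ² distribution.

```python
import math


def dersimonian_laird(log_ors, variances, z=1.96):
    """Random-effects summary OR using the DerSimonian-Laird moment estimator."""
    # Inverse-variance (fixed-effect) weights and pooled estimate.
    w = [1 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_ors)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_ors))
    df = len(log_ors) - 1
    # Moment estimate of the between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau2 to each within-study variance.
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_ors)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {
        "or": math.exp(pooled),
        "ci_low": math.exp(pooled - z * se),
        "ci_high": math.exp(pooled + z * se),
        "q": q,
        "df": df,
        "tau2": tau2,
    }


# Hypothetical study-specific log ORs and their variances.
print(dersimonian_laird([0.10, 0.55, -0.20, 0.80], [0.04, 0.09, 0.06, 0.20]))
```

When Q does not exceed its degrees of freedom, the between-study variance is set to zero and the result coincides with the fixed-effect summary.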
We used meta-regression analysis to identify characteristics contributing to heterogeneity. A random-effects weighted linear regression model was used, whereby the study-specific log OR was regressed on the study characteristic variable of interest (63 , 64) . The weights for the regression incorporated both the within-study variance as well as the between-study variance, estimated using restricted maximum likelihood (63 , 65) . Because of the small number of studies in each meta-regression analysis, we examined each study characteristic in a univariable model. However, we attempted to explore for the presence of any important confounding in multivariable analysis. Results were expressed as ratios of the odds ratios (RORs) comparing the mean OR in studies with the characteristic to the mean OR among studies without the characteristic (62) . RORs equal to 1 indicated no difference in the mean ORs between groups. All statistical tests were two sided, and P < 0.05 was considered as statistically significant. All analyses were conducted in Stata version 7 (Stata Corp., College Station, TX; Ref. 64 ).
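A univariable meta-regression on a single binary study characteristic can be sketched as a weighted least-squares fit of the log OR on a 0/1 indicator. The sketch below is a simplification and not the authors' procedure: it takes the between-study variance `tau2` as a supplied value instead of estimating it by restricted maximum likelihood, and the function name and example inputs are hypothetical. The exponentiated slope is the ROR.

```python
import math


def ratio_of_odds_ratios(log_ors, variances, has_feature, tau2, z=1.96):
    """ROR comparing studies with a characteristic (1) to those without (0).

    tau2 is assumed known here; the paper estimated it by REML.
    """
    # Weights combine within-study and between-study variance.
    w = [1 / (v + tau2) for v in variances]
    sum_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, has_feature)) / sum_w
    y_bar = sum(wi * yi for wi, yi in zip(w, log_ors)) / sum_w
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, has_feature))
    s_xy = sum(
        wi * (xi - x_bar) * (yi - y_bar)
        for wi, xi, yi in zip(w, has_feature, log_ors)
    )
    slope = s_xy / s_xx  # difference in mean log OR between the two groups
    se = math.sqrt(1 / s_xx)  # valid because the weights are inverse variances
    ror = math.exp(slope)
    return ror, math.exp(slope - z * se), math.exp(slope + z * se)


# Hypothetical inputs: two studies with the characteristic and two without.
print(ratio_of_odds_ratios([0.9, 0.6, 0.1, -0.1], [0.1, 0.2, 0.1, 0.2], [1, 1, 0, 0], tau2=0.05))
```

With a binary covariate, the slope equals the difference between the weighted mean log ORs of the two groups, so an ROR of 1 corresponds to identical mean ORs.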
Fifty articles published between 1998 and 2002 were identified (Table 1). All studies were in the English language except for one that was in Spanish (53). An additional study in the Chinese language based on 15 cases and 20 controls was identified; however, the English abstract did not provide sufficient information for data extraction (66). Three studies did not include a disease-free control group; thus, ORs were not available for the meta-analyses and meta-regressions (30, 37, 40). Crude ORs were extracted or calculated for all studies except two where adjusted ORs were used (39, 57). In one study, individual age matching was used; however, the comparison did not conform to our main comparison of Arg/Arg versus other genotypes (32). Therefore, we calculated the OR using an unmatched analysis. One investigation in the meta-analysis of ADC was actually conducted on cases of ADC in situ (17). Six investigations in the meta-analysis of LSIL were among unspecified SILs (20, 23, 33, 35, 42, 45). In total, there were 24 analyses of ICC (no histology specified), 22 of SCC, 4 of ADC, 22 of HSIL, and 18 of LSIL.
Results of the meta-analyses on each of the outcomes are presented in Figs. 1–5. The summary OR for the effect of Arg/Arg versus Arg/Pro or Pro/Pro on ICC was 1.1 [95% confidence interval (CI), 0.9–1.3], indicating little effect of the p53 polymorphism. However, statistically significant heterogeneity was observed (P = 0.009) with study-specific ORs lying both above and below the null. This result did not change when excluding the study that had originally conducted a matched analysis. For both SCC and ADC, there were slightly elevated summary ORs of 1.5 (95% CI, 1.2–1.9) and 1.7 (95% CI, 1.0–2.7), respectively. In studies of SCC, the summary OR did not change substantially when using the crude ORs for the two studies with adjusted estimates (OR, 1.5; 95% CI, 1.2–1.8). Although most ORs were above the null value for both invasive outcomes, visual inspection revealed heterogeneity among studies, and this was statistically significant for SCC (P = 0.0002). As well, there was evidence of funnel plot asymmetry for both ICC (P = 0.001) and SCC (P = 0.003). This was difficult to assess for studies of ADC because there were only four investigations. Because presumably 80% of ICCs are in fact of squamous origin, an analysis of ICC and SCC combined was done and showed a summary OR of 1.3 (95% CI, 1.1–1.5).
For preinvasive outcomes, an elevated summary OR for the effect of the Arg/Arg genotype was not observed for either HSIL or LSIL. In fact, a slight inverse association was observed for LSIL. Exclusion of studies that did not specify the grade of SIL did not substantially alter the summary OR for LSIL (OR, 0.93; 95% CI, 0.7–1.2). Study-specific ORs appeared heterogeneous, although studies at the extremes were relatively small in size while the larger studies were generally near the null value. Tests for heterogeneity were not statistically significant among studies of SIL. As well, we observed no evidence of funnel plot asymmetry for either HSIL (P = 0.310) or LSIL (P = 0.125).
Results of the meta-regression are presented in Table 2. A meta-regression was not performed for ADC, because there were too few studies (n = 4). Values in Table 2 refer to the univariable RORs. Variables pertaining to the context of the study (year of publication, geographical region) did not account for much heterogeneity in any of the outcomes. Likewise, characteristics concerning laboratory methods did not significantly contribute to variation in ORs, although there was some evidence that analysis of archival case specimens resulted in higher ORs compared with fresh tissue specimens. Study design variables that showed some contribution to heterogeneity included control group sampling strategy and case group restriction, although this was specific to studies of SCC and not statistically significant for control group sampling. Controlling for possible confounding by ethnicity/race did not appear to affect results among any outcomes. On the other hand, studies in which there were significant deviations from Hardy-Weinberg equilibrium in the control group had higher ORs, particularly for invasive outcomes. Results were similar when using a more conventional α-level of 0.05 (results not shown) to define statistically significant deviations from equilibrium.
Despite the strong association observed between Arg homozygosity and cervical cancer risk in the initial study (9), few subsequent investigations have replicated these findings. The phenomenon where the first study of a genetic marker indicates a strong association while later studies reveal weaker results has been shown to occur in many other genetic association studies (67, 68). Several factors can contribute to such an occurrence, including that the first study was a spurious finding or that the first study overstated the effect (67). Indeed, the original study has been criticized as being a chance finding given the small sample size (16, 19). However, a strong association was reproduced in other studies conducted among different populations, thus suggesting that perhaps the initial study was not just a statistical fluctuation (68). Furthermore, a biological mechanism for the role of the polymorphism has been demonstrated in experimental research (9). It is possible that the initial study represented an overestimate of the association. In fact, only three subsequent studies reported associations of similar magnitude (33, 36, 39). Thus, if an association truly exists and it is of relatively low magnitude, the majority of subsequent studies may have missed it because of a lack of statistical power.
The summary ORs from the meta-analyses revealed that any effect of the Arg/Arg genotype was specific to invasive cancers with no strong association with SILs. Such evidence points to the possibility that the p53 codon 72 polymorphism may have a principal role in progression to cancer, rather than in initiation of lesions, which has been suggested by others (44 , 46) . However, an association was not observed for ICC with undefined histology (summary OR, 1.1). One could speculate that the overall effect of the polymorphism on ICC was masked by the inclusion of a mixture of SCC and ADC cases. However, an increased risk of ICC is still expected because 80% of ICCs are of squamous origin (69) , and furthermore, separate meta-analysis of SCC and ADC indicated a similar slightly elevated risk associated with Arg/Arg. Given the lack of reliance on histological ascertainment in studies of ICC, it is possible that in some studies women with preinvasive lesions were inadvertently included as cases. Thus, possible misclassification of case status may have resulted in shifting the OR toward the null.
The most noteworthy finding from our meta-analyses was that there was substantial heterogeneity, particularly among studies of invasive cancer. The controversial findings on this topic have generated speculations regarding specific design and methodological characteristics that may be a source of heterogeneous results (16 , 37) . Interestingly, the most important factor that contributed to between-study heterogeneity was whether or not the genotype frequencies were in Hardy-Weinberg equilibrium. Because the equilibrium may not hold among a case group if the genotype is truly associated with disease, we tested departures from equilibrium in the controls (70) . Studies in which the genotype frequencies significantly deviated from that expected under Hardy-Weinberg equilibrium had a greater mean OR than those studies in which the control group was in equilibrium. Meta-analyses among studies restricted to those in Hardy-Weinberg equilibrium showed summary ORs that were essentially null, as follows: ICC: OR, 0.96; 95% CI, 0.8–1.2; SCC: OR, 0.96; 95% CI, 0.8–1.2; HSIL: OR, 0.93; 95% CI, 0.8–1.1; LSIL: OR, 0.87; 95% CI, 0.7–1.1. Although the differences among studies of HSIL and LSIL were of low magnitude and not statistically significant, the RORs were in the same direction as that observed for ICC and SCC, suggesting that a lack of equilibrium consistently led to an increase in the mean OR. As well, the effect remained even when an α level of 0.05 was used to judge statistically significant deviations (not shown).
The Hardy-Weinberg law states that in the situation of random mating (with respect to genotype) and the absence of mutation, migration, natural selection, or random genetic drift, the amount of genetic variation, represented by the frequency distribution of genotypes, will remain constant from one generation to the next (71) . Thus, given these conditions, we expect the equilibrium to hold for a nonsex-linked Mendelian-segregating gene like p53. Observed departures from equilibrium therefore suggest possible issues with the control group, or the study population in general, that might have generated less than ideal circumstances for the investigation of the p53 polymorphism and cervical neoplasia (72) . In fact, we examined certain study characteristics that may have led to departures from Hardy-Weinberg equilibrium in the meta-regression analyses; however, no characteristic was found to be significantly and consistently associated with heterogeneity between studies.
For instance, a lack of equilibrium can indicate that the genotype distribution in the control group was not representative of the general population from which the cases presumably arose, suggesting the possibility of selection bias (73, 74, 75) . In more than half of the studies, all of which followed a case-control design, control selection was based on convenience sampling (e.g., blood donors, laboratory personnel) and did not follow basic principles of epidemiological study design (73) . We observed no significant differences between studies that used convenience controls compared with those that used a strategy that approached risk-based sampling. Although an elevated ROR was observed for SCC, four of the ten studies that used convenience controls were carried out by the same researchers (21 , 44) ; thus, we cannot exclude the possibility that other common characteristics may have influenced the ORs. In fact, this effect was no longer present when adjusted for confounding by the variable pertaining to case group restriction (discussed below). Thus, if selection bias was present in studies using convenience controls, we found that the bias did not consistently increase or decrease ORs.
A departure from Hardy-Weinberg equilibrium can also imply possible ethnic admixture in the population, if the polymorphic site varies in genotype by race or ethnicity (70 , 76) . In fact, race-specific variation in the distribution of genotypes in the p53 codon 72 polymorphism has been demonstrated (77 , 78) . Because ethnicity/race may be related to disease, either through common risk factors or other genes in linkage disequilibrium with p53, confounding by ethnicity/race, or population stratification, may have biased results in studies conducted on ethnically diverse populations that did not account for possible confounding (74 , 79) . We found that the mean OR among studies that controlled for potential confounding by race, either in the design or in the analysis, was not significantly different from the ORs among studies that did not address the issue. Thus, if population stratification were an issue within some studies, it did not appear to be an important source of heterogeneity between studies.
In fact, the majority of studies were conducted on ethnically restricted populations. Several investigators have therefore suggested that heterogeneity may be due to the various ethnicities examined (47, 51), thus implying effect modification by ethnicity. Unfortunately, only a few studies were conducted in any given geographic region, even when categorized broadly by continent, except for Europe. Thus, we were limited to comparing investigations from Europe to other regions. We found no significant differences between studies conducted in European populations versus those from other regions. However, the “other” regions included studies from North and South America, Asia, as well as Africa. As well, European studies were conducted in the North, South, East, and West. Thus, if the association between the p53 codon 72 polymorphism and cervical neoplasia did differ according to ethnicity, we likely missed an effect because of the diversity within the categories compared.
Interestingly, of the 15 studies with control groups that deviated from Hardy-Weinberg equilibrium, nine were conducted on populations that were restricted to a single ethnic group. Although selection bias may have been a factor, departures from Hardy-Weinberg equilibrium can also be a manifestation of misclassification of the genotype consequent to technical aspects related to the procedures for genotyping (72). Our own investigation of this issue in a Brazilian population revealed that the identification of the genotype at codon 72 using allele-specific PCR, the method used in the initial study as well as in the majority of subsequent studies, differed substantially across three independent laboratories that tested samples blindly (39). Furthermore, it was demonstrated that when analyses were conducted with data that agreed among all three laboratories and were presumably less misclassified compared with unconfirmed results from a single laboratory, the OR comparing Arg/Arg versus Arg/Pro or Pro/Pro increased substantially. These findings suggested that the absence of an association between the Arg/Arg genotype and cervical neoplasia in many previous studies might have arisen because of bias toward the null stemming from nondifferential misclassification of genotype. On the other hand, the opposite was observed in another study, where an elevated OR was seen when using allele-specific PCR alone but was shifted toward the null when results were confirmed with a second genotyping method (37). However, the comparison group comprised women with SILs rather than a disease-free population and may not have adequately represented the base population with respect to genotype.
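The attenuating effect of nondifferential misclassification on a binary genotype comparison can be illustrated with an expected-value calculation. This is a hypothetical sketch: the function name, the true Arg/Arg proportions, and the sensitivity and specificity values are invented for illustration and are not taken from any of the reviewed studies. The same error rates are applied to cases and controls, which is what makes the misclassification nondifferential.

```python
def observed_odds_ratio(p_case, p_ctrl, sensitivity, specificity):
    """Expected OR after misclassifying Arg/Arg status equally in cases and controls.

    p_case, p_ctrl: true proportions of Arg/Arg among cases and controls.
    sensitivity: probability that a true Arg/Arg is called Arg/Arg.
    specificity: probability that a non-Arg/Arg is called non-Arg/Arg.
    """

    def observed_proportion(p):
        # True positives plus false positives among all genotyped subjects.
        return p * sensitivity + (1 - p) * (1 - specificity)

    oc = observed_proportion(p_case)
    ok = observed_proportion(p_ctrl)
    return (oc / (1 - oc)) / (ok / (1 - ok))


# Hypothetical scenario: 60% Arg/Arg in cases, 40% in controls.
true_or = observed_odds_ratio(0.6, 0.4, 1.0, 1.0)
biased_or = observed_odds_ratio(0.6, 0.4, 0.9, 0.9)
print(true_or, biased_or)
```

In this scenario the expected OR moves from the true value toward 1 once genotyping errors are introduced, which is the direction of bias discussed above for a two-category exposure.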
Thus, we were concerned that the use of allele-specific PCR compared with other genotyping methods may have contributed to the heterogeneity in past studies. However, meta-regression analyses revealed no significant differences, indicating that potential misclassification by allele-specific PCR did not consistently bias ORs upwards or downwards compared with other methods. It should be noted that although the other methods used, including single-strand conformation polymorphism (SSCP) analysis, restriction fragment length polymorphism (RFLP) analysis, and DNA sequencing, are not subject to the false positives and negatives consequent to inadequate allele-specific amplification, they are not immune to errors (80 , 81) . Thus, it is possible that a similar level of misclassification existed among studies that used other genotyping methods. Presumably, misclassification would be reduced in studies that confirmed genotyping results with a second method. However, significant differences were not observed between studies that used a single method versus those that used more than one method. Unfortunately, there were only eight studies overall that used more than one genotyping method; thus, the comparisons were hindered by a lack of statistical power.
We also examined other methodological characteristics that could contribute to heterogeneity via the potential for genotype misclassification among the cases because of the quality and source of DNA for genotyping. Of concern with tumor specimens is the possibility of gene sequence deletions or “loss of heterozygosity” (82), which would lead to an erroneous classification of heterozygous individuals as homozygous for a particular allele. Loss of heterozygosity generally occurs as a result of a gene mutation in tumor cells (83); however, research has shown that the p53 gene is most likely to be wild-type in cervical cancers (9, 84). Nonetheless, the use of nontumor specimens, such as blood, would avoid genotype errors because of loss of heterozygosity. Results from our meta-regression analysis of this characteristic revealed no substantial differences in studies that used cervical specimens from cases versus blood specimens, which is consistent with the notion that loss of heterozygosity is a rare phenomenon in cervical neoplasia, at least for this gene locus. However, among studies using cervical specimens, archival samples were associated with a greater mean OR than the use of fresh specimens for ICC, HSIL, and LSIL. The detection of artificial mutations has been shown to be more frequent in archival tissues, thus suggesting a greater level of misclassification when using such specimens (85). However, spurious mutations, if truly random, would more likely bias OR estimates toward the null, while we observed a greater OR among studies that used archival specimens for cases. In any case, such a finding was not observed for studies of SCC, and it was only statistically significant for studies of HSIL. We also examined the effect of the use of archival versus fresh cervical specimens among controls and found no significant associations, but because of the limited number of studies, this comparison lacked statistical precision.
We were also interested in whether restrictions applied to the case group based on their HPV status had an impact on the ORs. Specifically, in some studies, only those patients positive for HPV-16 and/or HPV-18 were included as cases, although in one study, SIL cases were restricted to those positive for a high-risk type as defined by the hybrid-capture method of HPV detection (20). Meta-regression analyses revealed no significant differences except among studies of SCC, where the mean OR was two times greater in studies that examined an HPV-restricted case group (ROR = 2.1). This suggests that genotype-related susceptibility may be specific to certain HPV types. In fact, the in vivo predisposition of the Arg form of p53 to E6-mediated degradation was demonstrated only for HPV types 16 and 18 (9). Although it is reasonable to extrapolate these findings to other high-risk HPV types, it is possible that the E6 oncoprotein from other HPV types is less discriminating with respect to the codon 72 polymorphism. Of note, there was some suggestion in the original results of Storey et al. that degradation of p53 may be slightly stronger by the E6 protein from HPV-18 compared with that of HPV-16. It is therefore interesting that the elevated summary OR for ADC, a histological type commonly associated with HPV-18, was slightly greater and more consistent between studies than that for SCC, which is more often associated with HPV-16 (86).
Moreover, there is evidence that particular molecular variants of HPV-16 are more likely to be associated with cervical cancer in women with the Arg/Arg genotype (27 , 35) . Molecular variants are defined by having up to 2% nucleotide variation in specific regions of the HPV genome, including E6 (87 , 88) . It is possible that the elevated mean OR because of Arg/Arg observed among studies of SCC where cases were restricted might reflect an interaction between HPV-16 or HPV-16 intratype variants with p53 Arg/Arg. In fact, all SCC studies with restricted case groups were among HPV-16-positive cases, except in one study where HPV-18 was predominant (33) . As mentioned previously, four of these investigations were by the same investigators and thus share other common characteristics (21 , 44) . However, other methodological features did not confound this finding. Therefore, these findings highlight the possibility of important type- and variant-specific interactions between p53 and HPV.
One other factor that we examined was the timing of publication. We were interested to know whether stronger associations were observed shortly after the original publication compared with more recent studies. We did not observe this phenomenon among any outcome when comparing studies published within 2 years of the original publication with recent articles. In fact, the most recent publications were supportive of the original results by Storey et al. (50, 51, 53, 57, 58). Thus, time lag bias was not apparent in studies of p53 codon 72 polymorphism and cervical neoplasia (62). As well, traditional publication bias, where there is a tendency for only positive studies to be published, was considered. On the contrary, the majority of published studies on this topic reported negative findings. If positive results were more likely to be rejected, then publication bias may be present. However, this is unlikely given that the most recently published studies reported positive findings. Results from the funnel plot tests of asymmetry for ICC and SCC suggested that small studies might have been more likely to show an elevated OR. If in fact results from smaller studies tended to be statistical fluctuations, then bias may be present in the meta-analyses because of the inclusion of spurious findings. However, this did not lead to an overestimation of the overall effect for ICC, because the summary OR was essentially null. Indeed, funnel plot asymmetry can arise by chance (62). In any case, caution should be applied when interpreting the summary effects from the meta-analyses, especially in light of the observed heterogeneity between studies.
The overall finding from our meta-regression analysis was that departures from Hardy-Weinberg equilibrium were the principal source of between-study heterogeneity, specifically among studies of ICC and SCC. Although there were indications that some other methodological characteristics may have contributed to heterogeneity, each study possessed different combinations of both desirable and undesirable methodological features, such that no single factor consistently increased or decreased ORs. Unfortunately, because of sample size limitations [i.e., the largest analysis was conducted on n = 24 studies (ICC)], we were unable to carry out detailed multivariable analysis. In any case, we were not able to detect important confounding of the effect of control group Hardy-Weinberg equilibrium by any of the other variables considered. However, it is possible that other important sources of heterogeneity (and possible sources of departures from Hardy-Weinberg equilibrium) would have become apparent had appropriate adjustment for confounding been achievable.
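As an illustration of how control group genotype counts can be checked against Hardy-Weinberg equilibrium, the following is a minimal sketch of the standard chi-square goodness-of-fit test with 1 degree of freedom. The function name `hardy_weinberg_test` and the counts are hypothetical, and it is an assumption that this particular test matches the one applied to the reviewed studies.

```python
import math


def hardy_weinberg_test(n_arg_arg, n_arg_pro, n_pro_pro):
    """Chi-square test (1 df) of Hardy-Weinberg equilibrium.

    Takes observed genotype counts and returns (chi2, p_value).
    """
    n = n_arg_arg + n_arg_pro + n_pro_pro
    if n == 0:
        raise ValueError("no genotypes observed")

    # Arg allele frequency estimated from the genotype counts
    p = (2 * n_arg_arg + n_arg_pro) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or p == 1.0:
        raise ValueError("only one allele observed")

    observed = (n_arg_arg, n_arg_pro, n_pro_pro)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    # Survival function of a chi-square variable with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical control group with a deficit of heterozygotes
chi2, p_value = hardy_weinberg_test(50, 20, 30)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2g}")
```

With small expected counts an exact test would be preferable to the chi-square approximation.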
Thus, future investigations of the p53 codon 72 polymorphism warrant close attention to design and methodological features. In addition to research on cervical neoplasia, this also has implications for other HPV-related malignancies. Adherence to epidemiological design principles when selecting the study population can minimize selection bias, and confirming genotype results with a second method can reduce misclassification of genotype. In doing so, deviations from Hardy-Weinberg equilibrium can be avoided. However, when conducting studies among ethnically diverse populations or using hospital-based controls, departures from Hardy-Weinberg equilibrium may still occur. Nonetheless, population stratification can be minimized either by restriction or matching by ethnicity in the design, or by measurement of ethnicity and adjustment for it in the analysis. Stratified analysis can also reveal whether any effect modification by ethnicity exists, although such an undertaking requires a very large sample size. Other sources of misclassification, such as that introduced by the use of archival specimens, can be avoided by using fresh tissue or blood specimens. Additionally, further research can be enhanced by exploring interactions between p53 genotype and specific HPV types or intra-type variants. Although this requires a very large sample size in the traditional case-control setting, alternative designs such as the case-only approach could be used (89 , 90) .
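As an illustration of the case-only approach, the following minimal sketch estimates a genotype-by-HPV-variant interaction OR from cases alone, with a Woolf (log-based) 95% CI. The estimate is interpretable as a departure from multiplicative joint effects only under the assumption that genotype and HPV variant are independent in the source population. The function name `case_only_interaction_or`, the cell labels, and the counts are hypothetical.

```python
import math


def case_only_interaction_or(a, b, c, d, z=1.96):
    """Case-only interaction OR with a Woolf confidence interval.

    Cases are cross-classified by genotype and HPV variant status:
      a = Arg/Arg, variant present     b = Arg/Arg, variant absent
      c = other genotype, variant      d = other genotype, no variant
    Returns (odds_ratio, lower, upper).
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")

    odds_ratio = (a * d) / (b * c)
    # Standard error of the log OR (Woolf method)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts among 120 cases
or_, lower, upper = case_only_interaction_or(40, 20, 30, 30)
print(f"interaction OR = {or_:.2f} (95% CI, {lower:.2f}-{upper:.2f})")
```

Because no controls are involved, this design cannot estimate the main effect of genotype or of HPV variant; it addresses only their interaction.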
We thank Martyn Plummer from the International Agency for Research on Cancer for providing the programs to generate forest plots; Javier Pintos for verification of data extraction in Spanish; and Jeff Boyd, Aleksandra Dybikowska, Allan Hildesheim, Stuart Lanham, Giovanni Rezza, and Ingeborg Zehbe for cooperation in providing data on the genotyping method and the DNA source used.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Note: Anita Koushik is a Research Student of the National Cancer Institute of Canada with funds provided by the Terry Fox Run. Robert Platt holds a New Investigator Award, and Eduardo Franco holds a Distinguished Scientist Award, both from the Canadian Institutes of Health Research.
Requests for reprints: Eduardo Franco, Division of Cancer Epidemiology, McGill University, 546 Pine Avenue West, Montréal, Québec, Canada H2W 1S6. Phone: (514) 398-6032; Fax: (514) 398-5002; E-mail:
- Received February 28, 2003.
- Revision received August 21, 2003.
- Accepted September 12, 2003.