
| HOME | HELP | FEEDBACK | SUBSCRIPTIONS | ARCHIVE | SEARCH | TABLE OF CONTENTS |
1 Center for Human Genetics Research and Department of Molecular Physiology and Biophysics, Vanderbilt University Medical Center, Nashville, Tennessee; 2 Paul P. Carbone Comprehensive Cancer Center and 3 Department of Population Health Sciences, University of Wisconsin-Madison, Madison, Wisconsin; 4 Norris Cotton Cancer Center, Dartmouth Medical School, Lebanon, New Hampshire; 5 Cancer Prevention Program, Fred Hutchinson Cancer Research Center, Seattle, Washington; 6 Core Genotyping Facility, Division of Cancer Epidemiology and Genetics, Department of Health and Human Services, National Cancer Institute, NIH, Gaithersburg, Maryland; and 7 H. Lee Moffitt Cancer Center and Research Institute, Tampa, Florida
Requests for reprints: Kathleen M. Egan, H. Lee Moffitt Cancer Center and Research Institute, Medical Research Center, 2nd Floor, 12902 Magnolia Drive, Tampa, FL 33612. Phone: 813-745-6149; Fax: 813-745-6525. E-mail: Kathleen.Egan{at}Moffitt.org
| Abstract |
|---|
| Introduction |
|---|
Several whole-genome amplification protocols exist and are under continuous modification and improvement (3, 4). Recently, multiple displacement amplification has become more widely used. Whole-genome amplification using multiple displacement amplification can produce a representative product of gDNA from small amounts of starting gDNA and may optimize buccal cell gDNA yield. Multiple displacement amplification is a non-PCR–based whole-genome amplification method that can amplify the whole genome, generating microgram quantities of product from as few as 1 to 10 copies of input DNA (5). Multiple displacement amplification uses φ29 DNA polymerase and random exonuclease-resistant hexamer primers and has been shown to be very efficient for balanced amplification and generation of long DNA products (>10 kb) from a small amount of DNA. Studies indicate that multiple displacement amplification exhibits less amplification bias and results in greater yield, product length, and fidelity than PCR-based whole-genome amplification methods (6).
In this report, we present results of our experience with whole-genome amplification applied to oral rinse-derived buccal cell DNA samples collected through the mail from 3,377 subjects enrolled in a breast cancer case-control study. Fidelity of amplification was assessed by the correlation of SNP calls from genomic and amplified DNA (aDNA). We also tested the effect of amplification on Hardy-Weinberg equilibrium (HWE). To our knowledge, this is one of the first large-scale genotyping efforts based on whole-genome aDNA.
| Materials and Methods |
|---|
Multiple Displacement Amplification
Whole-genome amplification was achieved using the REPLI-g kit according to the manufacturer's instructions (Qiagen, Inc.). gDNA was amplified in a total volume of 50 µL at 30°C for 6 h, and the reaction was then terminated at 65°C for 3 min. The aDNA product was then stored at –20°C.
SNP Genotyping in the Case-Control Study
SNP genotyping was done at the Core Genotyping Facility, National Cancer Institute, using Assays-On-Demand or Assays-by-Design with the ABI Prism 7900HT Sequence Detection System (Applied Biosystems). The Cancer Genome Anatomy Project SNP500Cancer Database (footnote 8) and Breast and Prostate Cancer Cohort Consortium (footnote 9; ref. 10) were mined to select SNPs for each candidate gene. Genotyping was done by TaqMan (5'-nuclease assay) reaction or, for SNPs in close proximity, by the MGB Eclipse Assay (Epoch Biosciences).
Quality Control Analysis
Before genotyping, all aDNA samples were profiled using the Applied Biosystems AmpFLSTR Identifiler PCR Amplification kit. Identifiler PCR amplifies 15 tetranucleotide short tandem repeat loci as well as the Amelogenin marker used for gender determination and is useful to detect samples with contamination or low probability of genotyping success. For all SNPs examined, we incorporated paired gDNA and aDNA from a total of 95 subjects (49 cases and 46 controls) for quality control. In addition, we incorporated another set of 198 independent replicate sets involving only whole-genome amplification–derived DNA. Women included in the quality control comparison were similar to the remaining women in religious (2% were Jewish in both groups) and ethnic (96% reported European ancestry in both parents) background, and all women included in the analysis were Caucasian; thus, the quality control comparison group should have provided a reasonable representation of the genetic background in the study population. Genotype determinations were conducted independently for each replicate.
Statistical Analysis
We used the percentage of agreement to evaluate concordance between genotypes in quality control replicate samples. Genotyping error rates were calculated among gDNA-aDNA pairs as the total number of discordant alleles (0, 1, or 2) across all informative replicate sets divided by the number of sets multiplied by 2, the total number of chromosomes. Under the assumption that gDNA is more likely to produce the correct genotype, we refer to discrepancies as errors. However, in some cases, the aDNA sample may have yielded the correct genotype, making this estimate conservative. Tests for HWE were computed using exact formulations as suggested by Wigginton et al. (11). Statistical significance of results was evaluated using t tests or χ² tests as appropriate; nonparametric tests (rank sum and Spearman correlation) were used for nonnormally distributed variables. Statistical analyses were conducted using Stata 8.1, SAS, and R programs.
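As an illustration of the exact HWE test mentioned above, the following is a minimal Python sketch, not the code used in the study (the analyses were run in Stata, SAS, and R). It conditions on the observed allele counts, enumerates every possible heterozygote count, and sums the probabilities of all configurations that are no more likely than the observed one. The function name `hwe_exact_pvalue` and its argument order are hypothetical choices made for this example.

```python
from math import factorial


def hwe_exact_pvalue(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Two-sided exact test of Hardy-Weinberg equilibrium for one biallelic SNP.

    Illustrative sketch only: the P value is the total probability, conditional
    on the observed allele counts, of all heterozygote counts that are no more
    likely than the observed heterozygote count.
    """
    if min(n_aa, n_ab, n_bb) < 0:
        raise ValueError("genotype counts must be non-negative")
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 1.0
    n_a = 2 * n_aa + n_ab      # copies of allele A
    n_b = 2 * n - n_a          # copies of allele B

    def weight(hets: int) -> int:
        # Number of ways to arrange the alleles into genotypes with this many
        # heterozygotes; exact integer arithmetic avoids rounding problems.
        aa = (n_a - hets) // 2
        bb = (n_b - hets) // 2
        return factorial(n) // (factorial(aa) * factorial(hets) * factorial(bb)) * 2 ** hets

    # Heterozygote counts must share the parity of the allele count.
    weights = {h: weight(h) for h in range(n_a % 2, min(n_a, n_b) + 1, 2)}
    total = sum(weights.values())
    observed = weights[n_ab]
    tail = sum(w for w in weights.values() if w <= observed)
    return tail / total
```

For two subjects with genotypes AA and BB, `hwe_exact_pvalue(1, 0, 1)` gives 1/3, because two of the six equally likely allele arrangements produce no heterozygotes. Integer factorials keep the sketch short and exact; for very large samples a recursive formulation would be faster.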
| Results |
|---|
Genotyping Completion Rates
Genotyping of whole-genome amplification aDNA was attempted for a total of 67 SNPs in 18 breast cancer susceptibility candidate genes in the 3,357 study samples plus quality control replicates. High call rates were obtained for all but one of the attempted SNPs, ARVCF-172 (rs165849), which had minimal cluster separation in the TaqMan assay. Call rates in the remaining 66 SNPs had a mean of 97% (range, 95-99%; Table 1). Genotyping success was significantly related to DNA quality: among the 66 SNPs examined, genotype failure occurred in a mean of 13.5 SNPs in samples with lower quality DNA that failed profiling (n = 273) compared with 0.87 SNPs in the remaining samples (n = 3,084; P < 0.0001).
Allele Bias
Genotyping error rates were calculated by comparing results in aDNA and gDNA in 95 subjects across the 66 SNPs. The total number of discordant alleles (0, 1, or 2) was summed across all informative replicate sets divided by the number of sets multiplied by 2, the total number of chromosomes. The allelic error rate had a median of 0.5% (upper 25th percentile, 1.1%) and ranged from 0% to 4.5% for individual SNPs. Perfect concordance was observed in 26 of the 66 SNPs, whereas SNPs in two of the examined genes (VDR and TXNRD2) contributed disproportionately to discordant results in the aDNA (range, 2.7-4.5% in five total examined SNPs). A similar percentage of heterozygotes was observed in the aDNA (36.0%) and the gDNA (36.4%; paired t test t = 1.18; P = 0.25) in 40 SNPs with any allelic errors.
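The allelic error rate defined above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study's code: genotypes are represented as two-letter strings such as "AG", a missing call is `None`, and a replicate set is treated as "informative" when both the gDNA and the aDNA genotype were called. The function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

Genotype = Optional[str]  # e.g. "AA", "AG", "GG"; None marks a failed call


def discordant_alleles(gdna: str, adna: str) -> int:
    """Count the alleles (0, 1, or 2) that differ between two unordered genotypes."""
    # Counter subtraction keeps only alleles present in gDNA but unmatched in aDNA.
    return sum((Counter(gdna) - Counter(adna)).values())


def allelic_error_rate(pairs: Iterable[Tuple[Genotype, Genotype]]) -> float:
    """Discordant alleles divided by the chromosomes compared (2 per informative pair)."""
    # Assumption for this sketch: a pair is informative when both calls succeeded.
    informative = [(g, a) for g, a in pairs if g is not None and a is not None]
    if not informative:
        raise ValueError("no informative gDNA-aDNA pairs")
    discordant = sum(discordant_alleles(g, a) for g, a in informative)
    return discordant / (2 * len(informative))


# Three informative pairs, one heterozygote miscalled as a homozygote: 1 / 6.
example_pairs = [("AA", "AA"), ("AG", "AA"), ("GG", "GG"), ("AG", None)]
example_rate = allelic_error_rate(example_pairs)
```

Counting alleles instead of whole genotypes means that a heterozygote miscalled as a homozygote contributes one error, whereas a homozygote called as the opposite homozygote contributes two.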
Hardy-Weinberg Equilibrium
We tested the effect of whole-genome amplification on HWE among the controls with gDNA results (n = 46). P values for tests of HWE in gDNA and aDNA samples were highly correlated (r = 0.84; P < 0.0001). A total of 5 (7.6%) of the SNPs among aDNA samples and 3 (4.5%) SNPs in gDNA samples exhibited significant departure from HWE at P < 0.05. Two of the three SNPs out of HWE in gDNA were also out of HWE in aDNA. However, in aDNA from all women without breast cancer (n = 1,492), >60% of the SNPs failed Hardy-Weinberg, averaging a 5% loss of expected heterozygotes (Table 2). After the approximately 9% of the controls with lower quality DNA were excluded from analysis, fewer than a third of the SNPs failed Hardy-Weinberg, and the heterozygote loss dropped from 5% to <3%, on average. The degree of allele bias varied by gene (data not shown): we observed little if any allele bias in the haplotype tagging SNPs included for TP53. Two of the SNPs had a borderline significant departure from Hardy-Weinberg, in both cases with an excess of observed heterozygotes. In contrast, tests for departure from HWE were highly significant in 7 of 23 cytochrome P450 SNPs (in CYP1A1, CYP1A2, and CYP1B1), in each situation, with a shortfall of observed heterozygotes. We examined Spearman correlations between the HWE P values and the concordance rate across 66 SNPs in the 198 paired aDNA samples. A moderate positive association was noted (ρ = 0.35; P = 0.004), indicating higher concordance rates in samples that had less evidence of deviation from Hardy-Weinberg equilibrium.
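The "loss of expected heterozygotes" reported above can be illustrated with a short Python sketch. It assumes, as one plausible reading of the text, that the loss is the shortfall of observed heterozygotes relative to the number expected under HWE (2pqn), expressed as a fraction of that expectation. The function name and the genotype counts in the example are hypothetical and are not study data.

```python
def heterozygote_loss(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Fractional shortfall of observed heterozygotes relative to the HWE expectation.

    Positive values indicate fewer heterozygotes than expected (a loss);
    negative values indicate an excess.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype is required")
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele A
    expected_het = 2 * p * (1 - p) * n   # heterozygotes expected under HWE
    if expected_het == 0:
        raise ValueError("SNP is monomorphic; heterozygote loss is undefined")
    return (expected_het - n_ab) / expected_het


# Hypothetical counts: 100 subjects, allele frequency 0.5, 40 observed heterozygotes
# against 50 expected, i.e. a 20% loss.
example_loss = heterozygote_loss(30, 40, 30)
```

Allele frequencies are estimated from the same genotype counts, so the measure reflects how alleles are distributed across genotypes and not a change in allele frequency.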
| Discussion |
|---|
Our findings show a realistic picture of whole-genome amplification as applied to DNA collected using a common protocol in molecular epidemiology studies (17-19). Mouthwash samples were collected by self-administration and returned by regular mail to a central laboratory for DNA extraction and storage. The oral rinse method has been shown by us (9) and others (19, 20) to generally yield appreciable high molecular weight human DNA, although amounts are highly variable and some samples may contain limited human DNA. An advantage of whole-genome amplification is that these latter samples can be rescued and incorporated in genotyping. However, they may also contribute disproportionately to allele amplification problems in the data because samples that failed the profiling step had substantially lower concentrations of aDNA product.
The problem with allele bias in the data did not seem to be consistent and varied by gene. For example, in the haplotype tagging SNPs included for TP53, we saw little if any evidence of allele bias based on Hardy-Weinberg assumptions (results of the TP53 analysis based on aDNA have been published; ref. 21). Two of the SNPs had a borderline significant departure from Hardy-Weinberg and, in both of these cases, there was an excess of observed compared with expected heterozygotes. In contrast, for the 23 cytochrome SNPs included, tests of departure from HWE were highly significant in 7, with a shortfall of observed heterozygotes in every instance. The cytochrome gene family is complex with extensive homology, duplication, and pseudogenes. Thus, it is possible that technical aspects related to genotyping influenced the overall performance of these assays. However, it is likely that amplification problems suggested in our analysis exacerbated these problems. Overall, we saw nearly a 4% heterozygote loss in the aDNA for the cytochrome SNPs. Other possible explanations for variable results across genes include differential amplification, guanine-cytosine content, or structure (e.g., repeat sequence or nearby SNPs) around the examined SNP.
The present results suggest cautious interpretation of genetic association studies based on aDNA where there is suspicion of amplification bias. Amplification bias may lead to genotype miscalls at heterozygous sites. Random and nondifferential (with respect to case-control status) misclassification error is well known to reduce power to detect genetic association for SNPs, with bias inverse to the minor allele frequency (22, 23). Moreover, even small amounts of genotype misclassification can substantially attenuate the parameter estimate for gene-environment interactions (24). The cumulative effect of allele-calling errors may lead to misrepresentation of haplotypes (25), particularly at loci with multiple polymorphic sites (26). Some genotyping error is inevitable due to instability of genotyping reagents, operator error, and chance. However, the additional error introduced by poor amplification and loss of heterozygosity in studies based on amplified oral epithelial DNA, typical in many population-based studies, may lead to significant loss of power and lack of replication across studies. For these reasons, the multiple displacement amplification strategy must be optimized to minimize allele amplification bias especially in studies examining haplotype associations.
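To make the link between miscalls at heterozygous sites and departure from HWE concrete, the following Python sketch uses a deliberately simplified model that is an assumption of this example and not part of the study: a fixed fraction of true heterozygotes is miscalled as homozygous, with either allele equally likely to drop out. Function names are hypothetical.

```python
from typing import Tuple

Freqs = Tuple[float, float, float]  # (AA, AB, BB) genotype frequencies


def genotype_freqs_with_dropout(p: float, dropout: float) -> Freqs:
    """Observed genotype frequencies when a fraction of heterozygotes is miscalled.

    Assumes true genotypes are in HWE with allele A frequency p, and that lost
    heterozygotes are split evenly between the two homozygous classes.
    """
    q = 1.0 - p
    het_lost = 2 * p * q * dropout
    return (p * p + het_lost / 2, 2 * p * q - het_lost, q * q + het_lost / 2)


def hwe_chi_square(freqs: Freqs, n: int) -> float:
    """Pearson chi-square statistic (1 df) for HWE from observed genotype frequencies."""
    f_aa, f_ab, f_bb = freqs
    p = f_aa + f_ab / 2              # allele frequency estimated from the observed calls
    q = 1.0 - p
    expected = (p * p, 2 * p * q, q * q)
    return n * sum((o - e) ** 2 / e for o, e in zip(freqs, expected))


# A 5% heterozygote loss in 1,492 subjects with allele frequency 0.7 (illustrative values).
example_freqs = genotype_freqs_with_dropout(0.7, 0.05)
example_chi2 = hwe_chi_square(example_freqs, 1492)
```

Under this symmetric model the allele frequencies are unchanged and the statistic reduces to n times the squared dropout fraction, about 3.7 in the example, which is close to the 3.84 threshold for P = 0.05 at 1 degree of freedom. Larger samples or larger dropout fractions push the statistic well past that threshold.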
The success of whole-genome amplification depends on the quality and quantity of template human DNA. Barker et al. (27) recently assessed the fidelity of multiple displacement amplification for use in genome scans based on lymphocytic DNA and determined a >99.8% concordance in SNP genotype calling between original and amplified gDNA at 2,320 markers in 5 individuals. A minimum of 4 ng of human DNA in whole-genome amplification reactions is recommended by the manufacturer (Qiagen) to ensure high-fidelity amplification. However, it is likely that the major contributing factor to the amplification problems in many of our samples was inadequate high molecular weight DNA; in other applications, we have found that a minimum of 50 ng, 10 times the manufacturer's recommendation, should be used to attain adequate and balanced allelic coverage in samples that contain degraded DNA (oral rinse, cytobrush, etc.; footnote 10).
The presence of allele bias in the current data was not immediately apparent. The source gDNA collected on these women has been used extensively for multiple genotypes with excellent success (7, 28-31). Call rates for individual SNPs generally exceed 95%, and quality control concordance rates are high in examined genotypes. This was the first genotyping exercise in which we attempted to use aDNA from these samples. Preliminary genotyping of a limited series of duplicate gDNA and aDNA samples, undertaken before large-scale genotyping, indicated near-perfect concordance of aDNA and gDNA replicates (data not shown). Likewise, aDNA-gDNA replicates included in the present genotyping (of 66 SNPs) showed near-perfect concordance of genotypes. However, results for many of the examined SNPs showed obvious allele bias in the wider series. Quality control samples were specifically chosen from among the samples with large amounts of harvested DNA that were, for this reason, more expendable. The DNA in these samples may have been of better quality, resulting in less amplification bias at heterozygous loci. It would be expected that allele bias if present would be more demonstrable in aDNA replicates; although agreement overall was high in 198 aDNA pairs, we did note a modest tendency for deviation from HWE to occur in samples with lower genotype agreement, consistent with amplification problems in a subset of these samples. Consistent with this, Identifiler screening of 15 short tandem repeats indicated slightly less agreement in 198 aDNA pairs compared with 95 aDNA-gDNA pairs (data not shown). As noted, allele bias in the data was not apparent in the quality control data. The full extent of the amplification problem only came to light by examining HWE in hundreds of control samples across multiple SNPs (32). We have since repeated genotyping on 21 of the CYP and COMT SNPs included in this report using gDNA and observed excellent concordance in the gDNA duplicates. 
None of the repeated genotypes exhibited departure from HWE (data not shown).
Based on all of these findings, the following recommendations can be offered: (a) the multiple displacement amplification strategy should be optimized to minimize allele amplification bias; 50 ng of human template gDNA is recommended; (b) screening before genotyping using Identifiler or a similar protocol should be considered to identify problem samples; and (c) validation with a large number of randomly selected gDNA replicates may identify problem genotypes. Tests for HWE may be useful to help identify loss of heterozygosity at specific loci. Because of the potential for allele amplification bias with this technique, the use of aDNA for genotyping should always be acknowledged in published reports.
| Acknowledgments |
|---|
| Footnotes |
|---|
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
8 http://snp500cancer.nci.nih.gov/assay_list.cfm
9 http://cgf.nci.nih.gov/cohort.cfm
10 R.A. Welch, personal communication.
Received 2/5/07; revised 5/8/07; accepted 5/23/07.
| References |
|---|
(TNF-LTA) and breast cancer risk. Hum Genet 2007;121:483–90.
This article has been cited by other articles:
J. M. Cunningham, T. A. Sellers, J. M. Schildkraut, Z. S. Fredericksen, R. A. Vierkant, L. E. Kelemen, M. Gadre, C. M. Phelan, Y. Huang, J. G. Meyer, et al. Performance of Amplified DNA in an Illumina GoldenGate BeadArray Assay. Cancer Epidemiol. Biomarkers Prev., July 1, 2008; 17(7): 1781-1789.
S. M. Beckett, S. J. Laughton, L. D. Pozza, G. B. McCowage, G. Marshall, R. J. Cohn, E. Milne, and L. J. Ashton. Buccal Swabs and Treated Cards: Methodological Considerations for Molecular Epidemiologic Studies Examining Pediatric Populations. Am. J. Epidemiol., May 15, 2008; 167(10): 1260-1267.