
Cancer Epidemiology Biomarkers & Prevention 16, 1185, June 1, 2007. doi: 10.1158/1055-9965.EPI-06-0759
© 2007 American Association for Cancer Research
Triallelic Single Nucleotide Polymorphisms and Genotyping Error in Genetic Epidemiology Studies: MDR1 (ABCB1) G2677/T/A as an Example
Claudia Hüebner1,4,
Ivonne Petermann1,4,
Brian L. Browning1,3,4,
Andrew N. Shelling2,4 and
Lynnette R. Ferguson1,4
1 Discipline of Nutrition and 2 Department of Obstetrics and Gynecology, Faculty of Medical and Health Sciences and 3 Department of Statistics, Faculty of Science, The University of Auckland and 4 Nutrigenomics New Zealand, Auckland, New Zealand
Requests for reprints: Lynnette R. Ferguson, Discipline of Nutrition, Faculty of Medical and Health Sciences, University of Auckland, Private Bag 92019, Auckland, New Zealand. Phone: 6493737599, ext. 86372; Fax: 6493035962. E-mail: l.ferguson{at}auckland.ac.nz
Abstract
Accurate measurement of allele frequencies between population groups with differing sensitivities to disease is fundamental to genetic epidemiology. Genotyping errors can markedly influence the biological conclusions of a study. This issue may be especially important now that there is increasing recognition of triallelic single nucleotide polymorphisms (SNPs) in the genome and their possible role in diseases like inflammatory bowel disease. For example, the MDR1 (ABCB1) SNP G2677/T/A was, like many other triallelic SNPs, originally described as diallelic. Here, we report a comprehensive analysis of estimated allele frequencies of this SNP in a set of 73 human DNA samples, comparing six commonly used genotyping methods (Applied Biosystems Taqman, Roche LightCycler melting analysis, allelic discrimination PCR, DNA sequencing, Sequenom, and RFLP) from the angle of their error potential. Only Sequenom and DNA sequencing would have provided accurate measurements had we not had prior knowledge of the triallelic nature of this SNP. The other tested methods (with the exception of LightCycler) failed to show any indication of the presence of the rare third allele (A) in a diallelic assay. Although most of the errors were due to the inability to detect the third allele, all methods except Sequenom and sequencing produced errors for the detection of the two common alleles G and T (LightCycler, 6 errors; PCR, 4 errors; RFLP, 2 errors; Taqman, 1 error). There is considerable variability in the reported frequencies of the different alleles of the MDR1 G2677/T/A SNP, and the role of this SNP in the etiology of inflammatory bowel disease has been controversial. Our data emphasize the importance of choosing the appropriate method for SNP detection and lead us to suggest that part of the previously reported variation may reflect artifacts associated with the different genotyping methodologies used. 
The failure to recognize the triallelic nature of a SNP may lead to underestimations of real genetic associations. (Cancer Epidemiol Biomarkers Prev 2007;16(6):1185-92)
Introduction
The International HapMap Project (1) has opened the door for a new generation of diagnostic tools aimed at identifying and characterizing human diversity. In particular, it has provided a large resource of single nucleotide polymorphisms (SNPs) that account for much of the variation between different individuals and different ethnic groups. Although most of the SNPs associated with human disease have been described as diallelic, in the last few years, an increasing number of these have been recognized to be triallelic and possibly even tetra-allelic. Most of the multiplex techniques that are being increasingly used for genotyping are based on discerning one allele from the other (i.e., they start with the assumption that the SNP is diallelic; refs. 2, 3). We wished to consider whether starting with such an assumption could impede the discovery of novel triallelic SNPs, and whether alleles may have been mistyped in the past. This would have implications for the accurate estimation of population data.
The group of cancer-prone inflammatory bowel diseases (IBD) includes ulcerative colitis and Crohn's disease. We used the National Center for Biotechnology Information SNP database to identify variants of genes that are described in the literature as associated with IBD susceptibility: MDR1 (4), DLG5 (5), OCTN1/2 (6), NFkB1 (7), TNF and TNFRSF1B (8), MIF (9), IL4 (10), and IL11 (11). Eleven triallelic SNPs have been reported in eight of the identified genes. Interestingly, six of the triallelic SNPs in four of the IBD-associated genes (MDR1, MIF, NFkB1, and TNFRSF1B) have been previously described as diallelic (Table 1).
Table 1. Known triallelic SNPs in IBD-associated genes as shown on the National Center for Biotechnology Information database
The human MDR1 gene, located on chromosome 7, encodes an ATP-dependent efflux transporter pump (P-glycoprotein) that is highly expressed in various tissues, including the epithelial surfaces of the intestine. The level of expression of P-glycoprotein is critical in determining the pharmacokinetics of a wide-ranging number of substrates, including anticancer drugs (12-15). There is considerable interindividual variability in P-glycoprotein expression that has implications not only for the development of resistance to various pharmaceutical agents but also for disease susceptibility (16). Several SNPs in the MDR1 gene have been associated with susceptibility to the development of various types of cancer (16), HIV susceptibility (17), hypercholesteremia (18), and Parkinson's disease (19). They have also, arguably, been associated with IBD (4, 20-24).
The MDR1 gene is 209 kb in length and composed of 28 exons, and at least 314 SNPs have been described (25-28). Thus far, three variants within the gene (G2677T/A in exon 21, C3435T in exon 26, and T129C in exon 1B) have been shown to correlate with a lower P-glycoprotein expression in normal tissues (26, 29-31). G2677T and C3435T SNPs are in linkage disequilibrium (multiallelic D' = 0.85; refs. 22, 32, 33). Considering the triallelic SNP in exon 21, the reference G2677 is Ala893, with the T variant being Ser893, and the less frequent A variant coding for Thr893. Various research groups studying IBD have studied SNPs within MDR1 to determine whether they might be associated with susceptibility to the development of disease. To date, results for the G2677T/A polymorphisms have been controversial (4, 20, 21, 23, 24); however, a recent meta-analysis reported evidence for association of the 3435T allele with ulcerative colitis [odds ratio (OR), 1.12; 95% confidence interval, 1.02-1.23] but not Crohn's disease (22).
We genotyped DNA from a small set of control subjects and Crohn's disease patients using a variety of genotyping methods to ask whether method-specific genotyping errors could explain why different studies have not consistently found associations with MDR1 SNPs.
Materials and Methods
Study Population
Seventy-three human subjects were recruited either from the Auckland District Health Board gastroenterology clinics or as healthy volunteers to provide approximately equal numbers of male and female subjects and of controls and IBD patients. Blood samples were collected into heparinized tubes, and DNA was isolated using the Puregene DNA Purification kit (Gentra Systems) according to the manufacturer's protocol. The amount of DNA extracted was quantified by absorbance spectroscopy (260 and 280 nm) and diluted to 10 ng/µL for working solutions. The isolated DNA was stored at -20°C, and the working solutions were stored at 4°C. The study was conducted under ethical protocol MEC/04/12/011, authorized through the New Zealand Multi-Region Human Ethics Committee.
Genotyping Methods
The PCR, RFLP, and Taqman SNP Genotyping Assay methods were designed to detect a diallelic rather than a triallelic SNP. The allelic discrimination PCR and the Taqman SNP Genotyping Assay tested for the presence of G and T alleles, whereas the RFLP detected G allele dosage. All primers used for the different assays (except for those obtained for the Taqman SNP Genotyping Assay) were obtained from Invitrogen. The techniques were done as follows.
PCR for DNA Sequencing or RFLP
Details of the primers used for amplification of exon 21 are provided in Table 2. The sequence of the primers was designed using OligoPerfect Designer free software (footnote 5) and checked for specificity using the National Center for Biotechnology Information BLAST server (footnote 6). The PCR reactions were done in a 25-µL reaction volume containing 20 ng genomic DNA, 100 pmol of each primer, 0.2 mmol/L of each deoxynucleotide triphosphate, 1x PCR buffer, 1.5 mmol/L MgCl2, and 1 unit Taq polymerase (Qiagen). The PCR program for exon 21 consisted of 30 cycles at 94°C for 30 s, 58°C for 30 s, and 72°C for 30 s and a final elongation step at 72°C for 10 min. The PCR products were checked on a 1.5% agarose gel and photographed before being subjected to a RFLP analysis or DNA sequencing.
Table 2. Oligonucleotide sequences for primers used for DNA sequencing, RFLP, allelic discrimination PCR, Sequenom, and Taqman SNP Genotyping Assay
RFLP Analysis
To determine the respective genotype (G or T), RFLP analysis with the restriction endonuclease BseYI was conducted after PCR-based amplification (primers listed in Table 2). PCR product (10 µL) was combined with 4 units enzyme, 2 µL of 10x Restriction Enzyme Digestion Buffer 3, and 0.5 µL of bovine serum albumin (all reagents from New England Biolabs) in a total volume of 20 µL. Samples were digested for 4 h at 37°C. As the enzyme BseYI remains bound to DNA after digestion and alters the migration rate of DNA during electrophoresis, 1 µL of 10% SDS was added after 4 h to disrupt binding. The digestion products were separated on a 2% agarose gel and stained with ethidium bromide.
DNA Sequencing
Amplicons from exon 21 were cleaned according to the manufacturer's instructions using the ChargeSwitch PCR Clean-Up kit (Invitrogen). Automated DNA sequencing was done on an ABI 3130XL Genetic Analyzer sequencer by using BigDye Terminator version 2 reactions (Perkin-Elmer/Applied Biosystems) using the 2677 forward primer.
Conventional Allelic Discrimination PCR
To achieve allelic discrimination between wild-type and mutant allele, two physically separate PCR reactions containing the 2677 forward primer and the corresponding wild-type (2677W) or mutant-specific primer (2677M) were done (Table 2). All reactions were carried out in total volume of 25 µL containing 20 ng genomic DNA, 100 pmol of each primer, 0.2 mmol/L of each deoxynucleotide triphosphate, 1x PCR buffer, 1.5 mmol/L MgCl2, and 1 unit Taq polymerase (Qiagen). The PCR program for allelic discrimination consisted of 30 cycles at 94°C for 30 s, 60°C for 30 s, and 72°C for 30 s and a final elongation step at 72°C for 10 min. The PCR products were electrophoresed on a 1.5% agarose gel, and the genotype assignment was selected on the basis of the following criteria: no visible band represents the absence of the analyzed allele, whereas a band indicates the presence of the analyzed allele.
Applied Biosystems Taqman SNP Genotyping Assay
The SNP at position 2677 of MDR1 was genotyped using the Taqman MGB diallelic discrimination system (34). Probes and oligonucleotides were obtained from Applied Biosystems using the Assay-by-Design product (listed in Table 2). The reactions were prepared by using 2x Taqman Universal Master Mix, 40x SNP Genotyping Assay Mix, DNase-free water, and 10 ng genomic DNA in a final volume of 5 µL per reaction. The PCR amplification was done using the ABI Prism 7900 HT sequence-detector machine under the following conditions: 10 min at 95°C enzyme activation followed by 40 cycles at 92°C for 15 s and 60°C for 1 min (annealing/extension). The allelic discrimination results were determined after the amplification by performing an end-point read.
Roche LightCycler Melting Curve Analysis
The LightCycler combines rapid thermal cycling for PCR with real-time fluorescence monitoring (35, 36). After amplification, the fluorescence signal allows genotyping by analysis of the allele-specific melting behavior of the hybridization probe. The reaction mixture (20 µL) contained 1 unit Taq polymerase, 2 µL of 10x Taq buffer (GeneCraft), 2.5 mmol/L MgCl2, 0.1 mmol/L deoxynucleotide triphosphates (GeneCraft), 30 mg/L bovine serum albumin (New England Biolabs), 50 mL/L dimethyl sulfoxide (Merck), 0.25 µmol/L forward primer, 0.1 µmol/L reverse primer, 0.15 µmol/L of the anchor, 0.05 µmol/L of the locked nucleic acid-modified sensor, 1 µL DNA (40-60 ng/µL), and water (PCR grade) up to 20 µL. The following program was run: an initial denaturation at 94°C for 2 min at 20°C/s, followed by a 50-cycle program consisting of heating to 94°C at 20°C/s with no hold, cooling to 58°C at 20°C/s with a 10-s hold, and heating to 72°C at 2°C/s with a 15-s hold. The melting curve was determined by a 20-s denaturation at 94°C, cooling to 32°C at 20°C/s with a 20-s hold, and then a continuous temperature increase from 32°C to 70°C at 0.1°C/s. Fluorescence was recorded continuously while heating.
Sequenom MassARRAY Genotyping System
Genotyping was carried out with a MassARRAY technique (Sequenom; refs. 37, 38) using a chip-based matrix-assisted laser desorption/ionization time-of-flight mass spectrometer (39). Multiplex SNP assays were designed using SpectroDesigner software (Sequenom); 384-well plates containing 2.5 ng DNA in each well were amplified by PCR following the specifications of Sequenom. After PCR, shrimp alkaline phosphatase (Sequenom) was added to samples to prevent future incorporation of unused deoxynucleotide triphosphates that could interfere with the primer extension assay. Allele discrimination reactions were conducted by adding the extension primer(s), DNA polymerase, and a cocktail mixture of deoxynucleotide triphosphates and dideoxynucleotide triphosphates to each well. MassExtend clean resin (Sequenom) was added to the mixture to remove extraneous salts that could interfere with matrix-assisted laser desorption/ionization time-of-flight analysis. Genotypes were determined by spotting an aliquot of each sample onto a 384 SpectroChip (Sequenom), which was subsequently read by the matrix-assisted laser desorption/ionization time-of-flight mass spectrometer. Assay conditions are available upon request and primer sequences are shown in Table 2.
Estimating the Incidence of Triallelic SNPs in Human Populations
The Seattle SNP database (SeattleSNPs, National Heart, Lung, and Blood Institute Program for Genomic Applications, SeattleSNPs, Seattle, WA; footnote 7) was used to estimate the proportion of SNPs that are triallelic. This database contains polymorphisms identified by sequencing 5.9 Mb of DNA from 280 genes in a panel of unrelated subjects. The 280 genes were selected because they are thought to influence inflammatory response in humans. Each gene was genotyped in one panel of subjects. The panel of subjects was either a set of 20 individuals of African-American descent and 19 individuals of European descent, or a set of 20 Yoruba from Ibadan, Nigeria from the YRI HapMap panel and 20 Caucasians from Utah from the CEU HapMap panel (1). For simplicity, we refer to European and African samples without distinguishing whether the panel with the African-American samples or the panel with the African samples was used.
Results
Allele Frequencies for MDR1 G2677/T/A as Estimated using Six Different Methodologies
The allele frequencies, as estimated by different methods, are shown in Table 3. The true genotype of each sample was defined as the result of matching genotypes of at least four methods. In the case of only three methods with matching results (six samples), Sequenom MassARRAY Genotyping system had to be one of them (note that in all six cases, DNA sequencing agreed with the Sequenom results). In this population, estimates of the proportion of the G allele ranged from 0.543 (judged by PCR) to 0.589 (LightCycler). Conversely, the T allele appeared lowest when estimated by LightCycler (0.377) and highest using the PCR-based allelic discrimination method (0.457). As most of the known SNPs are diallelic, we wanted to determine what effect the presence of a third allele would have when it is detected in a two-dimensional assay. Thus, the A/G genotype appears as a G/G genotype and an A/T as a T/T genotype in the Taqman assay. None of the PCR methods, RFLP, or Taqman SNP Genotyping Assay provided evidence for the presence of the A allele. However, although our LightCycler method was not designed using knowledge of the third allele, this allele became obvious from the spectra generated (Fig. 1).
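The consensus rule used above to define the true genotype can be written out as a short sketch. This is an illustration only, not the procedure the authors ran: the function name `consensus_genotype`, the method labels, and the example calls are hypothetical, and genotypes are assumed to be given as strings such as "G/T".

```python
from collections import Counter


def normalize(genotype):
    """Return the genotype with alleles in sorted order, so "T/G" equals "G/T"."""
    return "/".join(sorted(genotype.split("/")))


def consensus_genotype(calls, anchor_method="Sequenom"):
    """Apply the rule described in the text to one sample.

    calls maps a method name to its genotype call, or to None when the
    method produced no call. A genotype is accepted when at least four
    methods agree, or when exactly three agree and the anchor method
    (Sequenom in the text) is one of them. Otherwise None is returned.
    """
    observed = {m: normalize(g) for m, g in calls.items() if g is not None}
    counts = Counter(observed.values())
    for genotype, n in counts.items():
        if n >= 4:
            return genotype
        if n == 3 and observed.get(anchor_method) == genotype:
            return genotype
    return None


# Hypothetical sample: three diallelic assays miss the A allele,
# three other methods (including Sequenom) report A/G.
example_calls = {
    "Taqman": "G/G",
    "PCR": "G/G",
    "RFLP": "G/G",
    "LightCycler": "A/G",
    "Sequencing": "A/G",
    "Sequenom": "A/G",
}
print(consensus_genotype(example_calls))
```

In a three-against-three split like the one in the example, the rule resolves the tie in favor of the genotype that Sequenom supports.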
Table 3. The observed genotype frequencies obtained using the different methods of analysis (see Materials and Methods)
Figure 1. MDR1 G2677T/A allelic discrimination PCR by melting curve analysis using the LightCycler. A. Three common genotypes: T/T, G/T, and G/G. B. Genotypes T/T and G/G and the rare genotypes T/A or G/A, respectively.
Genotype Errors as Estimated Using Six Different Methodologies
The genotype error analysis is shown in Table 4. Even when the genotype was called incorrectly, one of the two alleles was usually correct. The only exceptions in our data set were for two T/T genotypes that the LightCycler incorrectly called as G/G and the one T/T that RFLP incorrectly called as G/G.
Most of the errors (6 of 8 RFLP errors, 6 of 7 Taqman SNP Genotyping Assay errors, and 6 of 10 PCR errors) were due to the inability to detect the A allele in the six samples that carried the A allele. RFLP incorrectly called two T/T genotypes (as G/T and G/G), and PCR incorrectly called three G/G genotypes as G/T and one G/T genotype as T/T. The seven LightCycler errors did not have an obvious pattern at the genotypic level or the allelic level. This method called six T alleles incorrectly as G alleles and called two G alleles and one A allele incorrectly as T alleles. Even if the six samples with an A allele are ignored, allelic discrimination PCR still showed four errors (error rate = 0.063), and RFLP had two errors in 63 non-missing genotypes. Neither DNA sequencing nor Sequenom MassARRAY Genotyping system generated any errors (Table 4).
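The effect of a G/T-only assay on apparent allele frequencies can be illustrated with a minimal sketch. It assumes the behavior reported for the diallelic Taqman assay (an A/G sample reads as G/G and an A/T sample as T/T) and adds the assumption that an A/A sample would give no call. The genotype counts are invented for illustration and are not the study data.

```python
from collections import Counter


def call_with_diallelic_assay(true_genotype, detected=("G", "T")):
    """Return the genotype a two-allele assay would report.

    Alleles outside `detected` are invisible to the assay. A heterozygote
    carrying one invisible allele is read as a homozygote for the visible
    allele; a genotype with no visible allele gives no call (None).
    """
    seen = [a for a in true_genotype.split("/") if a in detected]
    if not seen:
        return None
    if len(seen) == 1:
        seen = seen * 2
    return "/".join(sorted(seen))


def allele_frequencies(genotypes):
    """Estimate allele frequencies by counting alleles in called genotypes."""
    counts = Counter(
        allele
        for genotype in genotypes
        if genotype is not None
        for allele in genotype.split("/")
    )
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


# Hypothetical sample of 50 people (NOT the study data).
true_genotypes = (
    ["G/G"] * 20 + ["G/T"] * 20 + ["T/T"] * 6 + ["A/G"] * 3 + ["A/T"] * 1
)
observed = [call_with_diallelic_assay(g) for g in true_genotypes]

print("true:    ", allele_frequencies(true_genotypes))
print("observed:", allele_frequencies(observed))
```

In this toy sample the A allele disappears from the observed frequencies, and the G and T frequencies are both inflated, because each hidden A allele is counted as a second copy of its partner allele.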
To exclude the possibility that the A allele itself would not be detectable with methods where the knowledge of the third allele is necessary for the assay design, we specifically redesigned two of the assays with respect to the A allele (A/T Taqman assay and allelic discrimination PCR). Both the Taqman assay and allelic discrimination PCR detected this allele (data not shown). However, the two-dimensional nature of the assay design restricts the Taqman assay to detecting only two alleles (A/T in our case), so that another allele (here the G allele) is missed. Accordingly, all samples with a G/G genotype failed to amplify, and most of the G/T samples were detected as T/T or failed to amplify. On the other hand, all A/T and T/T genotypes were called correctly. However, in the case of an A/G genotype, the Taqman assay calls it as either an A/A or an A/T genotype. No method called a G or T allele as an A allele. The six samples that contained an A allele were genotyped correctly by the DNA sequencing and Sequenom MassARRAY Genotyping system methods. The LightCycler correctly genotyped five of the six samples containing the A allele. However, the number of A alleles in our sample was too small to determine the accuracy of these methods when assaying samples carrying the A allele.
Missing Genotype Analysis
It seemed that particular genotypes failed with certain methods (Table 5). Homozygotes (G/G or T/T) seemed to be preferentially missing when using allelic discrimination PCR and Sequenom, whereas heterozygotes (G/T) seemed to be preferentially missing when using DNA Sequencing. The LightCycler and RFLP methods had no missing genotypes.
Estimation of Population Frequencies of Triallelic SNPs
The Seattle SNP database contained 29,827 diallelic SNPs, 67 triallelic SNPs, and 2,070 insertion/deletion polymorphisms. Therefore, 0.224% of the SNPs in the Seattle SNPs database are triallelic. Of the 67 triallelic SNPs, 12 were triallelic in the European samples, and 53 were triallelic in the African samples. Ten of the SNPs were diallelic in the European samples and in the African samples but were triallelic in the combined samples because the European and African samples had different minor alleles. In the African samples, 19 triallelic SNPs had all three allele frequencies >0.05, and in the European samples, five triallelic SNPs had all three allele frequencies >0.05.
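The percentage quoted above follows directly from the reported counts, with insertion/deletion polymorphisms left out of the denominator. A minimal check, in which the function name `triallelic_percentage` is illustrative only:

```python
def triallelic_percentage(n_diallelic, n_triallelic):
    """Percentage of SNPs (diallelic plus triallelic) that are triallelic."""
    return 100 * n_triallelic / (n_diallelic + n_triallelic)


# Counts reported in the text; the 2,070 insertion/deletion
# polymorphisms are not SNPs and are excluded from the denominator.
pct = triallelic_percentage(29_827, 67)
print(f"{pct:.3f}% of SNPs are triallelic")
```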
Discussion
It is recognized that some techniques (DNA sequencing and Sequenom MassARRAY Genotyping system analysis) can detect a third allele without knowing of its existence. Our data set suggested this was also true, at least for this allele at this locus, for the LightCycler method. However, most of the multiplex techniques that are being increasingly used for genotyping start with the assumption that the SNP is diallelic (Taqman SNP Genotyping Assay and allelic discrimination PCR) and would need the knowledge of a third allele being present for the assay design, although in some cases, a third allele can be detected by examination of the raw data before analysis (40). This is also true for RFLP, which is still commonly used for genotyping. Thus, our assay designs for genotyping analysis were based on the assumption that there are only two alleles (G and T), ignoring the presence of the rare A allele. As anticipated, several of the methods failed to provide signals that would have led us to suspect a third allele. Unexpectedly, however, it was not only the A allele that provided difficulties in genotyping with some of the tested methodologies.
Contrary to our hypothesis, it was not apparent that any of the different detection techniques favored one allele over the other. Among all methods tested, the LightCycler and RFLP methods were the only methods that showed no unclear or failed results. Five of seven RFLP genotype errors, five of the six Taqman SNP Genotyping Assay errors, and five of the eight allelic discrimination PCR errors were due to the inability to detect the A allele, as would be expected for these methods. To analyze this further, we designed two sets of assays (Taqman and allelic discrimination PCR) to detect the A allele and reran these new assays on our sample set. Both the Taqman assay and allelic discrimination PCR provided accurate measurement of the rare A allele.
For family-based studies, genotype error can be a serious problem because it can increase the false positive proportion (41). For case control studies, genotype error generally will cause a loss in power to detect marker-disease associations but not an increase in the false-positive proportion.
The consequences of not detecting a null (i.e., unknown) allele of a triallelic SNP are serious when the null allele affects risk of disease. This can be shown by calculating the population OR for each of the detected alleles in cases and controls for a population that is in Hardy-Weinberg equilibrium. We assume that there are three alleles (a, b, and c) with frequencies pa, pb, and pc, and that allele c is a null allele such that null allele homozygotes have missing genotypes and null allele heterozygotes are miscalled as homozygote genotypes for the non-null allele. Denote the population disease prevalence by
and the genotypic relative risk for c allele heterozygotes as r. Then the apparent population allelic OR for the a allele is:
where s = (1
r) / (1
).
For example, when the disease prevalence is <0.1, and the risk allele of a triallelic SNP has an allelic OR of 3.0, if the risk allele is not detected, the allelic OR for each of the detected alleles will be <1.25. In the worst case, when the disease model is recessive, or when the detected alleles have equal population frequency, the allelic OR for each of the detected alleles will appear as 1.0, and there will be no power to detect the true disease association.
When the null allele has low frequency, and there is a sufficiently small difference in the frequencies of the detected alleles, the presence of a null allele is unlikely to cause high levels of missing data or departures from Hardy-Weinberg equilibrium. In cases where a null allele does contribute to an unacceptably high level of missing data or departure from Hardy-Weinberg equilibrium, this evidence of error in the genotyping assay will often result in the SNP being dropped from the analysis.
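The scale of these effects can be explored with a small sketch built on the null-allele assumptions stated earlier in this Discussion: the population is in Hardy-Weinberg equilibrium, null homozygotes (c/c) give no call, and null heterozygotes are read as homozygotes for the detected allele. The function names and the example frequencies are hypothetical, and the heterozygote deficit is used here only as a simple indicator of apparent departure from Hardy-Weinberg equilibrium.

```python
def apparent_genotypes(pa, pb, pc):
    """Genotype frequencies seen by an assay that cannot detect allele c.

    Returns (missing_fraction, frequencies), where frequencies are
    expressed among the samples that received a call.
    """
    if abs(pa + pb + pc - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    missing = pc ** 2                      # c/c samples give no call
    called = 1.0 - missing
    frequencies = {
        "a/a": (pa ** 2 + 2 * pa * pc) / called,   # true a/a plus miscalled a/c
        "a/b": (2 * pa * pb) / called,
        "b/b": (pb ** 2 + 2 * pb * pc) / called,   # true b/b plus miscalled b/c
    }
    return missing, frequencies


def heterozygote_deficit(frequencies):
    """Expected minus observed heterozygosity, using apparent allele frequencies."""
    qa = frequencies["a/a"] + frequencies["a/b"] / 2
    expected_het = 2 * qa * (1 - qa)
    return expected_het - frequencies["a/b"]


# Hypothetical allele frequencies with a rare null allele.
missing, freqs = apparent_genotypes(pa=0.55, pb=0.42, pc=0.03)
print("fraction with no call:", missing)
print("apparent genotype frequencies:", freqs)
print("heterozygote deficit:", heterozygote_deficit(freqs))
```

With a null allele this rare, both the fraction of missing calls and the heterozygote deficit are small, so neither signal would be easy to distinguish from sampling noise in a modest study.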
If an assay (like the Taqman SNP Genotyping Assay) is not designed to detect an allele of a triallelic SNP, there will be little or no power to detect an association of the undetected allele with a disease. Even if all three alleles of a triallelic SNP are detected, high error rates, such as those observed with the LightCycler, can cause substantial loss in power (42).
It was apparent that particular genotypes failed with certain methods. Homozygotes (G/G and T/T) seemed to be preferentially missing when using allelic discrimination PCR and Sequenom MassARRAY Genotyping system, whereas heterozygotes (G/T) seemed to be preferentially missing when using DNA sequencing. The genotyping failures of all methods are not based on the DNA quality, as no sample failed with more than one method.
Another explanation for this failure rate could be the occurrence of allelic dropouts, whereby an unknown polymorphism exists on the template DNA strand where the PCR primer anneals (43, 44). This is unlikely to explain our results, as all our samples were previously sequenced over the whole area of primer annealing and did not reveal any further unknown polymorphisms. We have reviewed the SNP database and are unable to find any reported SNPs within the design of the primers. Although we were unable to obtain the sequence of the area spanning the forward sequencing primer binding site and possible/linked variants, we note that we have successfully used this primer for the allelic discrimination PCR, and we have never found the same samples failing with both methods. This makes it highly unlikely that there are other SNPs in that region. Although we cannot exclude the possibility of introduced errors during the process of primer synthesis, which might lead to the occurrence of null alleles as a consequence of inefficient amplification due to primer/template mismatches (43, 45), we consider that this is unlikely.
Non-random patterns of missing genotypes introduce noise into case-control studies but can cause apparent overtransmission to affected offspring in family-based studies even if the polymorphism is not associated with the disease (46). Our sample sizes were too small to give strong evidence that any of the genotyping platforms gave non-random patterns of missing genotypes. We note that 54.4% of the genotypes in our sample were homozygotes; yet, all four missing genotypes for the Sequenom MassARRAY Genotyping system platform were homozygotes, and all four of the missing genotypes from DNA sequencing were heterozygotes.
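A rough calculation shows why four missing genotypes cannot give strong evidence of a non-random pattern. The sketch below assumes that missingness is independent of genotype and that the four missing samples are independent draws from a sample in which 54.4% of genotypes are homozygous, as reported above. The function name is illustrative, and this is not an analysis reported by the authors.

```python
def prob_all_in_class(class_fraction, n_missing):
    """Chance that every one of n_missing random samples falls in one class."""
    return class_fraction ** n_missing


homozygote_fraction = 0.544   # reported in the text
n_missing = 4                 # missing genotypes per platform, from the text

# Sequenom: all four missing genotypes were homozygotes.
p_all_hom = prob_all_in_class(homozygote_fraction, n_missing)
# DNA sequencing: all four missing genotypes were heterozygotes.
p_all_het = prob_all_in_class(1 - homozygote_fraction, n_missing)

print(f"P(all four homozygous by chance)   = {p_all_hom:.3f}")
print(f"P(all four heterozygous by chance) = {p_all_het:.3f}")
```

Under these assumptions the two probabilities come out at roughly 0.09 and 0.04, which is suggestive at most and fits the caution expressed in the text.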
For case-control studies, genotype error and non-random missing genotypes can also inflate type 1 error above the nominal rate when using allelic tests that assume Hardy-Weinberg equilibrium (47), such as the χ² test or Fisher's exact test.
The distribution of SNPs at the MDR1 G2677T/A locus (rs2032582) has been reported to vary across population groups and has shown variable association with IBD. We have summarized published information on ethnic variations in unselected populations and reports on IBD patients (Table 6). Although Schwab et al. (48) suggested that the Ser893 variant increased susceptibility in ulcerative colitis but not Crohn's disease, Brant et al. (20) suggested that the reference genotype (G2677) increased risk, whereas other studies have failed to show an association. The allelic frequency of the A variant was reported to range from 4.4% to 21% in Asians (49, 50) compared with 0.7% to 10% in White subjects (24, 51) and 0.5% in Black subjects (52). It is noteworthy that several different techniques have been used across different laboratories. From our data, comparisons across studies using different methodologies could be considerably misleading.
Some of the techniques that we have used, such as the Taqman SNP Genotyping Assay and RFLP, were designed on the assumption that the target was diallelic. We are aware that these assays can be re-designed to accommodate additional alleles; however, this would add to the cost of the assays and also the time involved in analyzing the samples. For RFLP, it would also depend on the presence of a suitable restriction enzyme. Clearly, when considering the type of genotyping platform to be used in a large association study, along with considering such things as the cost of genotyping, ability to multiplex, the time and handling involved, and/or access to the technology, one also needs to consider the likelihood of encountering a multiallelic SNP in the collection of SNPs being analyzed.
Using the triallelic variant G2677T/A in the MDR1 gene as an example, we have compared different detection techniques (Taqman SNP Genotyping Assay, LightCycler, allelic discrimination PCR, DNA sequencing, Sequenom MassARRAY Genotyping system, and RFLP). Our data lead us to suggest that multiallelic SNPs may be more common than generally realized, may have been overlooked in some studies, and could lead to erroneous overestimation of the frequency of certain alleles. In general, we conclude that more attention is required in the initial analysis of SNPs to determine whether they are multiallelic, perhaps by DNA sequence analysis of a reasonable number of samples. We consider that in some situations, failure to recognize the triallelic nature of the SNP may lead to the over- or underestimation of real genetic associations.
Footnotes
Grant support: Nutrigenomics: Tailoring New Zealand foods to people's genes New Zealand Foundation for Research Science and Technology grant C02X0403.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Note: Nutrigenomics New Zealand is a collaboration among AgResearch Ltd., Crop & Food Research, HortResearch, and The University of Auckland, with funding through the Foundation for Research Science and Technology.
5 http://www.invitrogen.com 
6 http://www.ncbi.nlm.nih.gov/blast/ 
7 http://pga.gs.washington.edu, accessed June 25, 2006. 
Received 9/8/06; revised 2/18/07; accepted 3/30/07.
References
- The International HapMap Consortium. A haplotype map of the human genome. Nature 2005;437:1299-320.
- Wijsman EM, Daw EW, Yu CE, et al. Evidence for a novel late-onset Alzheimer disease locus on chromosome 19p13.2. Am J Hum Genet 2004;75:398-409.
- Huang YT, Zhang K, Chen T, Chao KM. Selecting additional tag SNPs for tolerating missing data in genotyping. BMC Bioinformatics 2005;6:263.
- Potocnik U, Ferkolj I, Glavac D, Dean M. Polymorphisms in multidrug resistance 1 (MDR1) gene are associated with refractory Crohn disease and ulcerative colitis. Genes Immun 2004;5:530-9.
- Stoll M, Corneliussen B, Costello CM, et al. Genetic variation in DLG5 is associated with inflammatory bowel disease. Nat Genet 2004;36:476-80.
- Waller S, Tremelling M, Bredin F, et al. Evidence for association of OCTN genes and IBD5 with ulcerative colitis. Gut 2006;55:809-14.
- Karban AS, Okazaki T, Panhuysen CI, et al. Functional annotation of a novel NFKB1 promoter polymorphism that increases risk for ulcerative colitis. Hum Mol Genet 2004;13:35-45.
- Sashio H, Tamura K, Ito R, et al. Polymorphisms of the TNF gene and the TNF receptor superfamily member 1B gene are associated with susceptibility to ulcerative colitis and Crohn's disease, respectively. Immunogenetics 2002;53:1020-7.
- Nohara H, Okayama N, Inoue N, et al. Association of the -173 G/C polymorphism of the macrophage migration inhibitory factor gene with ulcerative colitis. J Gastroenterol 2004;39:242-6.
- Klein W, Tromm A, Griga T, et al. Interleukin-4 and interleukin-4 receptor gene polymorphisms in inflammatory bowel diseases. Genes Immun 2001;2:287-9.
- Klein W, Rohde G, Arinir U, et al. A promotor polymorphism in the Interleukin 11 gene is associated with chronic obstructive pulmonary disease. Electrophoresis 2004;25:804-8.
- Thorgeirsson SS, Silverman JA, Gant TW, Marino PA. Multidrug resistance gene family and chemical carcinogens. Pharmacol Ther 1991;49:283-92.
- Gupta KP, Ward NE, Gravitt KR, Bergman PJ, O'Brian CA. Partial reversal of multidrug resistance in human breast cancer cells by an N-myristoylated protein kinase C-alpha pseudosubstrate peptide. J Biol Chem 1996;271:2102-11.
- Hsia TC, Lin CC, Wang JJ, Ho ST, Kao A. Relationship between chemotherapy response of small cell lung cancer and P-glycoprotein or multidrug resistance-related protein expression. Lung 2002;180:173-9.
- Burger H, Foekens JA, Look MP, et al. RNA expression of breast cancer resistance protein, lung resistance-related protein, multidrug resistance-associated proteins 1 and 2, and multidrug resistance gene 1 in breast cancer: correlation with chemotherapeutic response. Clin Cancer Res 2003;9:827-36.
- Ferguson LR, De Flora S. Multiple drug resistance, antimutagenesis and anticarcinogenesis. Mutat Res 2005;591:24-33.
- Verstuyft C, Marcellin F, Morand-Joubert L, et al. Absence of association between MDR1 genetic polymorphisms, indinavir pharmacokinetics and response to highly active antiretroviral therapy. AIDS 2005;19:212731.[Medline]
- Kajinami K, Brousseau ME, Ordovas JM, Schaefer EJ. Polymorphisms in the multidrug resistance-1 (MDR1) gene influence the response to atorvastatin treatment in a gender-specific manner. Am J Cardiol 2004;93:104650.[Medline]
- Furuno T, Landi MT, Ceroni M, et al. Expression polymorphism of the blood-brain barrier component P-glycoprotein (MDR1) in relation to Parkinson's disease. Pharmacogenetics 2002;12:52934.[CrossRef][Medline]
- Brant SR, Panhuysen CI, Nicolae D, et al. MDR1 Ala893 polymorphism is associated with inflammatory bowel disease. Am J Hum Genet 2003;73:128292.[CrossRef][Medline]
- Ho GT, Nimmo ER, Tenesa A, et al. Allelic variations of the multidrug resistance gene determine susceptibility and disease behavior in ulcerative colitis. Gastroenterology 2005;128:28896.[Medline]
- Onnie CM, Fisher SA, Pattni R, et al. Associations of allelic variants of the multidrug resistance gene (ABCB1 or MDR1) and inflammatory bowel disease and their effects on disease behavior: a case-control and meta-analysis study. Inflamm Bowel Dis 2006;12:26371.[Medline]
- Palmieri O, Latiano A, Valvano R, et al. Multidrug resistance 1 gene polymorphisms are not associated with inflammatory bowel disease and response to therapy in Italian patients. Aliment Pharmacol Ther 2005;22:112938.[Medline]
- Urcelay E, Mendoza JL, Martin MC, et al. MDR1 gene: susceptibility in Spanish Crohn's disease and ulcerative colitis patients. Inflamm Bowel Dis 2006;12:337.[Medline]
- Birney E, Andrews D, Caccamo M, et al. Ensembl 2006. Nucleic Acids Res 2006;34:D55661.[Abstract/Free Full Text]
- Hoffmeyer S, Burk O, von Richter O, et al. Functional polymorphisms of the human multidrug-resistance gene: multiple sequence variations and correlation of one allele with P-glycoprotein expression and activity in vivo. Proc Natl Acad Sci U S A 2000;97:34738.[Abstract/Free Full Text]
- Cascorbi I, Gerloff T, Johne A, et al. Frequency of single nucleotide polymorphisms in the P-glycoprotein drug transporter MDR1 gene in white subjects. Clin Pharmacol Ther 2001;69:16974.[CrossRef][Medline]
- Kim RB, Leake BF, Choo EF, et al. Identification of functionally variant MDR1 alleles among European Americans and African Americans. Clin Pharmacol Ther 2001;70:18999.[CrossRef][Medline]
- Ameyaw MM, Regateiro F, Li T, et al. MDR1 pharmacogenetics: frequency of the C3435T mutation in exon 26 is significantly influenced by ethnicity. Pharmacogenetics 2001;11:21721.[CrossRef][Medline]
- Slovak ML, Kopecky KJ, Wolman SR, et al. Cytogenetic correlation with disease status and treatment outcome in advanced stage leukemia post bone marrow transplantation: a Southwest Oncology Group study (SWOG-8612). Leuk Res 1995;19:3818.[Medline]
- Tanabe M, Ieiri I, Nagata N, et al. Expression of P-glycoprotein in human placenta: relation to genetic polymorphism of the multidrug resistance (MDR)-1 gene. J Pharmacol Exp Ther 2001;297:113743.[Abstract/Free Full Text]
- Lewontin RC. The interaction of selection and linkage. I. General considerations;heterotic models. Genetics 1964;49:4967.[Free Full Text]
- Hedrick PW. Gametic disequilibrium measures: proceed with caution. Genetics 1987;117:33141.[Abstract/Free Full Text]
- Livak KJ. Allelic discrimination using fluorogenic probes and the 5' nuclease assay. Genet Anal 1999;14:1439.[Medline]
- Arjomand-Nahad F, Diefenbach K, Landt O, Gaikovitch E, Roots I. Genotyping of the triallelic variant G2677T/A in MDR1 using LightCycler with locked-nucleic-acid-modified hybridization probes. Anal Biochem 2004;334:2013.[Medline]
- Pals G, Pindolia K, Worsham MJ. A rapid and sensitive approach to mutation detection using real-time polymerase chain reaction and melting curve analyses, using BRCA1 as an example. Mol Diagn 1999;4:2416.[CrossRef][Medline]
- Leushner J, Chiu NH. Automated mass spectrometry: a revolutionary technology for clinical diagnostics. Mol Diagn 2000;5:3418.[CrossRef][Medline]
- Jurinke C, Denissenko MF, Oeth P, et al. A single nucleotide polymorphism based approach for the identification and characterization of gene expression modulation using MassARRAY. Mutat Res 2005;573:8395.[Medline]
- Bray MS, Boerwinkle E, Doris PA. High-throughput multiplex SNP genotyping with MALDI-TOF mass spectrometry: practice, problems and promise. Hum Mutat 2001;17:296304.[CrossRef][Medline]
- Carlson CS, Smith JD, Stanaway IB, Rieder MJ, Nickerson DA. Direct detection of null alleles in SNP genotyping data. Hum Mol Genet 2006;15:19317.[Abstract/Free Full Text]
- Mitchell AA, Cutler DJ, Chakravarti A. Undetected genotyping errors cause apparent overtransmission of common alleles in the transmission/disequilibrium test. Am J Hum Genet 2003;72:598610.[CrossRef][Medline]
- Gordon D, Finch SJ, Nothnagel M, Ott J. Power and sample size calculations for case-control genetic association tests when errors are present: application to single nucleotide polymorphisms. Hum Hered 2002;54:2233.[CrossRef][Medline]
- Pompanon F, Bonin A, Bellemain E, Taberlet P. Genotyping errors: causes, consequences and solutions. Nat Rev Genet 2005;6:84759.[CrossRef][Medline]
- Walsh PS, Erlich HA, Higuchi R. Preferential PCR amplification of alleles: mechanisms and solutions. PCR Methods Appl 1992;1:24150.[Medline]
- Paetkau D, Strobeck C. The molecular basis and evolutionary history of a microsatellite null allele in bears. Mol Ecol 1995;4:51920.[Medline]
- Hirschhorn JN, Daly MJ. Genome-wide association studies for common diseases and complex traits. Nat Rev Genet 2005;6:95108.[Medline]
- Sasieni PD. From genotypes to genes: doubling the sample size. Biometrics 1997;53:125361.[CrossRef][Medline]
- Schwab M, Schaeffeler E, Marx C, et al. Association between the C3435T MDR1 gene polymorphism and susceptibility for ulcerative colitis. Gastroenterology 2003;124:2633.[CrossRef][Medline]
- Tang K, Ngoi SM, Gwee PC, et al. Distinct haplotype profiles and strong linkage disequilibrium at the MDR1 multidrug transporter gene locus in three ethnic Asian populations. Pharmacogenetics 2002;12:43750.[CrossRef][Medline]
- Komoto C, Nakamura T, Sakaeda T, et al. MDR1 haplotype frequencies in Japanese and Caucasian, and in Japanese patients with colorectal cancer and esophageal cancer. Drug Metab Pharmacokinet 2006;21:12632.[Medline]
- Gerloff T, Schaefer M, Johne A, et al. MDR1 genotypes do not influence the absorption of a single oral dose of 1 mg digoxin in healthy white males. Br J Clin Pharmacol 2002;54:6106.[CrossRef][Medline]
- Kroetz DL, Pauli-Magnus C, Hodges LM, et al. Sequence diversity and haplotype structure in the human ABCB1 (MDR1, multidrug resistance transporter) gene. Pharmacogenetics 2003;13:48194.[CrossRef][Medline]
- Song P, Li S, Meibohm B, et al. Detection of MDR1 single nucleotide polymorphisms C3435T and G2677T using real-time polymerase chain reaction: MDR1 single nucleotide polymorphism genotyping assay. AAPS PharmSci 2002;4:E29.[Medline]