Cancer Epidemiology Biomarkers & Prevention Vol. 13, 801-807, May 2004
© 2004 American Association for Cancer Research

Identifying Functional Genetic Variants in DNA Repair Pathway Using Protein Conservation Analysis

Sevtap Savas¹,³, David Y. Kim¹, M. Farhan Ahmad¹, Mehjabeen Shariff¹ and Hilmi Ozcelik¹,²,³

¹ Fred A. Litwin Centre for Cancer Genetics, Samuel Lunenfeld Research Institute, ² Department of Pathology and Laboratory Medicine, Mount Sinai Hospital, Ontario, Canada and ³ Department of Laboratory Medicine and Pathobiology, University of Toronto, Ontario, Canada

Requests for reprints: Hilmi Ozcelik, Mount Sinai Hospital Samuel Lunenfeld Research Institute, 600 University Avenue Room 992A, Toronto, ON M5G 1X5, Canada. Phone: (416) 586-4996; Fax: (416) 586-8869. E-mail: ozcelik{at}mshri.on.ca


    Abstract
 Top
 Abstract
 Introduction
 Methods
 Results
 Discussion
 References
 
The role of DNA repair in the initiation, promotion, and progression of malignancy suggests that variations in DNA repair genes confer altered cancer risk. Accordingly, DNA repair gene variants have been studied extensively in the context of cancer predisposition. Single nucleotide polymorphisms (SNPs) are the most common genetic variations in the human genome. A fraction of SNPs are located within genes and are likely to alter gene expression and function. SNPs that change the encoded amino acid sequence of proteins (non-synonymous SNPs; nsSNPs) are potential determinants of genetic disease. However, because not all amino acid substitutions are expected to change protein function, the functional consequences of amino acid substitutions need to be predicted a priori, and their possible association with a trait then studied together with other genetic and environmental factors. Here we report the analysis of nsSNPs in 88 DNA repair genes and their functional evaluation based on the conservation of amino acids among the protein family members. Our analysis indicated that >30% of the variants of DNA repair proteins are highly likely to affect the function of the proteins drastically. Three nsSNPs that were predicted to have functional consequences (XRCC1-R399Q, XRCC3-T241M, XRCC1-R280H) have already been found to be associated with cancer risk. The strategy developed and applied in this study has the potential to identify functional protein variants of the DNA repair pathway that may be associated with cancer predisposition.


    Introduction
 
Nuclear DNA is under constant DNA damage stress induced by both endogenous (such as reactive oxygen species) and exogenous sources (such as irradiation). Proper recognition and repair of the DNA damage are essential for normal homeostasis and functioning of multicellular organisms (1, 2). DNA repair activities are maintained by the presence of five different DNA damage sensor and repair mechanisms (homologous recombinational repair, non-homologous end-joining, nucleotide excision repair, base excision repair, and mismatch repair). Defects in the DNA repair pathways are often associated with excessive cell death (by apoptosis) or transformation of the cells (1, 2), and variations in DNA repair genes were hypothesized to modify individual and population cancer risk (3).

To date, much success has been obtained in the identification of highly penetrant cancer predisposition genes using linkage analysis. However, the remaining challenge is to identify the alleles conferring low to moderate cancer risk. It is hypothesized that genetic variation contributes to the susceptibility to complex traits such as cancer (4–6). Molecular epidemiological and genetic approaches use single nucleotide polymorphisms (SNPs) in the human genome to study disease susceptibility. Because genome-wide scans are still challenging, a candidate gene/pathway approach may often prove more efficient. Given the enormous number of SNPs, systematic prioritization on the basis of biological function and relevance to cancer will accelerate the identification of such susceptibility alleles (4).

SNPs are the most common form of genetic variation in the human genome (5–8). They are relatively stably inherited genomic variations with an estimated density of 1 in 1000 bp. SNPs are usually bi-allelic, their occurrence rates vary across genomic regions, and their allelic frequencies may differ among ethnic groups. A fraction of SNPs alter the encoded amino acid sequence (non-synonymous SNPs; nsSNPs) and have the potential to affect the structure, function, and interactions of proteins. Thus, nsSNPs are excellent candidates for candidate-gene association studies (7). However, not all nsSNPs are anticipated to have functional consequences, so it is essential to develop strategies to select the variations that may alter or disrupt the proper functions of proteins. Studying the functional consequences of genetic variants has been challenging because of the enormous number of variants present in the genome. Although there is an increasing effort to establish in vivo functional strategies for studying the effects of variants, such strategies are still far from being available for a large number of variants of interest. Recently, several approaches have been developed and used to study the nature of genetic variants (9–15). Among these, computational tools provide an efficient and high-throughput source of candidates for in vivo functional analyses and/or population studies. SIFT (Sorting Intolerant From Tolerant) (10, 11) is a powerful tool that predicts the functional importance of an amino acid based on the alignment of highly similar proteins (orthologous, paralogous, or both) with the protein of interest. The predictions rely on whether an amino acid is conserved (or substituted only by a similar amino acid) in the protein family, which can suggest its importance for the function/structure of the protein.

Here, using the public SNP databases, we have identified a wide range of DNA repair nsSNPs, and we have carried out a computational study to characterize the evolutionary importance of these DNA repair nsSNPs. This study has the potential to provide a pool of functional SNPs, which may play important roles in the predisposition to cancer as well as other DNA repair-associated genetic diseases.


    Methods
 
Database Mining for SNPs
The list of DNA repair pathway genes studied was obtained from the CGAP-GAI web-site² (16). A total of 88 DNA repair genes were screened for SNPs using five different public SNP databases: dbSNP³ (17), HGVbase⁴ (18), CGAP-GAI⁵ (16), SNP500⁶, and GeneSNP⁷. During this work, we noticed several problems concerning the specificity and accuracy of the SNP information (e.g., errors in the annotated location of SNPs along the protein sequence, in the annotated nature of the SNP as sSNP or nsSNP, and in the specificity of the SNP). Accordingly, we developed a strategy to standardize the SNP collection procedure and to validate the specificity of the SNPs by performing two independent BLAST analyses. The sequences flanking the variants (preferably 100 bp) were retrieved from the SNP databases. These sequences were then aligned against the mRNA sequences in GenBank using the "BLAST against gene transcripts" tool⁸ to confirm the location of the nsSNPs. Once a gene-specific accession number was obtained, it was used as a reference template to locate the other variants in the same gene. We also aligned the same SNP-flanking sequence against the non-redundant human genome database of NCBI using the BLAST tool⁹ (19), and the genome view option was used to visualize the exact chromosomal location of the hit. The Unigene¹⁰ resource of NCBI (19) was used to validate the chromosomal location of the gene of interest, which was compared with the location of the SNP-flanking sequence obtained from the genome view result. When only a SNP-flanking sequence shorter than 100 bp was available, we checked the genome location using the "search for short nearly exact matches" option of the NCBI nucleotide BLAST, choosing "Homo sapiens" as the advanced BLAST option. The retrieved hits were then manually inspected to find the chromosomal location of the SNP sequence.
We specified a SNP as gene specific if and only if the SNP-flanking sequence hit (without any other mismatch or gap) (a) the transcript(s) of interest, and no other gene's or expressed pseudogene's transcript, in the BLAST against gene transcripts, and (b) no genomic sequence other than the location of the gene of interest. For alternatively spliced genes, only the information (such as nsSNP location and evolutionary conservation) for the amino acid sequence encoded by one alternatively spliced transcript is reported in this study. SNPs annotated as splice site SNPs in HGVbase were excluded from this study. Whenever available, the frequency information of the nsSNPs was extracted from the SNP databases as well as from Mohrenweiser et al. (20). The entire list of genes and nsSNPs analyzed during this study can be found at http://www.ozceliklab.com/savas2004a/.
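
The gene-specificity rule described above can be illustrated with a minimal Python sketch. The Hit record, its field names, and the function is_gene_specific are hypothetical and are not part of the BLAST tools used in this study; the sketch assumes that the hits have already been parsed and that mismatches are counted outside the SNP position itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One alignment of a SNP-flanking sequence (hypothetical record)."""
    target: str      # transcript accession or genomic locus label
    gene: str        # gene annotated at the target ("" if none)
    mismatches: int  # mismatches other than the SNP position itself
    gaps: int


def is_gene_specific(gene, transcript_hits, genome_hits):
    """Return True if the flanking sequence hits only the gene of interest.

    Only hits without any additional mismatch or gap are considered.
    (a) At least one transcript hit belongs to `gene`, and no transcript hit
        belongs to another gene or expressed pseudogene.
    (b) No genomic hit lies outside the locus of `gene`.
    """
    def exact(hit):
        return hit.mismatches == 0 and hit.gaps == 0

    transcript_exact = [h for h in transcript_hits if exact(h)]
    genome_exact = [h for h in genome_hits if exact(h)]

    hits_gene = any(h.gene == gene for h in transcript_exact)
    only_gene_transcripts = all(h.gene == gene for h in transcript_exact)
    only_gene_locus = all(h.gene == gene for h in genome_exact)
    return hits_gene and only_gene_transcripts and only_gene_locus
```

Under these assumptions, a flanking sequence that also matches a pseudogene locus elsewhere in the genome without mismatch or gap would be rejected, which mirrors the loss of sensitivity in exchange for specificity that the procedure accepts.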

Mutation Data Set
Mutations with known functional consequences were retrieved from the SWISS-PROT database¹¹ (21) using the key words "Human AND mutations AND functional" in September 2002. After manual inspection, a total of 231 mutations in 55 human genes constituted the final list. The mutations in this list were characterized by complete/partial loss of activity, gain of function, effects on protein-macromolecule interactions, interference with the cellular localization of the mutant protein, altered protein stability, alteration of a protein-critical site, or interference with protein dimerization, as indicated in the feature table of each SWISS-PROT entry.

Evolutionary Conservation Analysis
Protein conservation analysis was performed using the SIFT¹² software developed by Ng and Henikoff (10, 11). The SIFT algorithm predicts whether an amino acid substitution may have an impact on protein function by aligning similar proteins and calculating a score that is used to determine the evolutionary conservation status of the amino acid of interest. It evaluates identity (for example, if only a single amino acid is observed at that position in all aligned proteins, its alteration is predicted to affect the protein) and two physicochemical characteristics of amino acids, hydrophobicity and polarity (if the substituted amino acid differs in these characteristics from the wild-type amino acid, and this kind of substitution is not observed at that position in the other proteins in the alignment, it is also predicted to affect the protein). These predictions are based on the assumption that amino acids conserved within protein families are important for the function of the proteins. Whenever frequency information was available, the conservation analysis was performed for the common allele. Because we considered them unreliable, we did not consider SIFT predictions based on fewer than six proteins in the alignment. We used the default median sequence conservation value of 3.0. In no case was the median sequence conservation score <=2.25. However, there were many amino acid substitutions for which the score was >3.25. Such scores indicate that the aligned proteins might not have had time to diverge at that position yet, and consequently the prediction may be misleading (11). Thus, we designated the predictions with a median sequence conservation score of >3.25 as either possibly damaging or possibly tolerated. This evaluation differs from that of Ng and Henikoff (11), where such predictions were not accepted at all.
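
The modified interpretation described above can be summarized in a short Python sketch. The function name and its arguments are hypothetical. The thresholds of six aligned proteins and a median sequence conservation of 3.25 are taken from the text; the score cutoff of 0.05 is an assumption based on SIFT's documented default and is not stated in this paper.

```python
MIN_SEQUENCES = 6                 # from Methods: fewer aligned proteins give no prediction
MAX_MEDIAN_CONSERVATION = 3.25    # from Methods: above this, calls are hedged as "possibly"
SIFT_SCORE_CUTOFF = 0.05          # ASSUMPTION: SIFT default cutoff, not given in the text


def modified_sift_call(sift_score, n_sequences, median_conservation):
    """Map one SIFT result to the categories used in this study (sketch)."""
    if n_sequences < MIN_SEQUENCES:
        return "no prediction"
    # Scores below the cutoff are read as affecting protein function.
    call = "damaging" if sift_score < SIFT_SCORE_CUTOFF else "tolerated"
    if median_conservation > MAX_MEDIAN_CONSERVATION:
        # Aligned proteins may be too similar; keep the call but hedge it.
        return "possibly " + call
    return call
```

Keeping the hedged calls, rather than discarding them as in Ng and Henikoff (11), is the design choice that distinguishes this interpretation.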

Statistical Analysis
The statistical analyses were done using a χ² test (22). We applied the Yates continuity correction for 2 × 2 tables. The test was conducted at the α = 0.05 level of significance. This test was applied to examine possible significant differences in the evolutionary conservation status of the amino acids altered in the mutation and DNA repair nsSNP data sets, and between the rare and common DNA repair nsSNPs.
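
As an illustration of this test, the following Python sketch computes a Yates-corrected χ² statistic and its P value (1 degree of freedom) for a 2 × 2 table. It is not the software used in this study. The example counts are assumptions: 176 and 54 are back-calculated from the percentages reported in Results for the 230 mutations (damaging plus possibly damaging versus tolerated plus possibly tolerated), and 39 and 67 are the corresponding counts for the 106 nsSNPs.

```python
import math


def yates_chi2_2x2(a, b, c, d):
    """Yates-corrected chi-square for the 2 x 2 table [[a, b], [c, d]].

    Returns (chi2, p) where p is the upper-tail probability with 1 degree
    of freedom, computed as erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    # Continuity correction: shrink |ad - bc| by n/2, never below zero.
    diff = max(abs(a * d - b * c) - n / 2.0, 0.0)
    chi2 = n * diff ** 2 / (row1 * row2 * col1 * col2)
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


ALPHA = 0.05
# Rows: mutation panel, nsSNP panel; columns: (possibly) damaging, (possibly) tolerated.
chi2, p = yates_chi2_2x2(176, 54, 39, 67)
significant = p < ALPHA
```

With these assumed counts the statistic is close to 48, which is in line with the P < 0.0001 reported for the comparison of the two panels.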


    Results
 
We compiled a total of over 1000 SNP entries from 88 DNA repair genes using five web-based public SNP databases (see "Methods"). Extensive manual inspection of all SNP entries showed at least one gene-specific nsSNP in 51.1% (45 of 88) of the proteins (a total of 150 nsSNPs resulting in an amino acid substitution). Four of the nsSNPs were unique to the CGAP-GAI database. No nsSNP was unique to the SNP500 database. Most of the nsSNPs were found in dbSNP (n = 128, 85.3%), GeneSNP (n = 105, 70.0%), and HGVbase (n = 89, 59.3%). The average number of nsSNPs for genes with at least one nsSNP was 3.3. Among all the genes studied, ATM had the highest number of nsSNPs (n = 19).

In this study, we used a modified interpretation of the SIFT algorithm results to define the nature of the variations (see "Methods"). To determine the sensitivity of the modified SIFT interpretation, we used a panel of 231 missense mutations supported by functional evidence (see "Methods"; Table 1). For all but one mutation, the alignment contained at least six proteins (n = 230). Mutations in this group were predicted as either damaging (57.39%) or possibly damaging (19.13%), whereas 17.83% and 5.65% of the mutations were predicted as tolerated or possibly tolerated, respectively. Thus, the sensitivity of the modified SIFT predictions (damaging together with possibly damaging) reported in this study was 76.52%.
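
The sensitivity figure can be reproduced with a few lines of Python. The counts below are not given in the text; they are back-calculated from the reported percentages of the 230 mutations and should be read as an assumption, as should the names used.

```python
# ASSUMPTION: counts back-calculated from the reported percentages of n = 230.
mutation_calls = {
    "damaging": 132,           # 57.39%
    "possibly damaging": 44,   # 19.13%
    "tolerated": 41,           # 17.83%
    "possibly tolerated": 13,  # 5.65%
}


def sensitivity_percent(calls):
    """Percentage of known functional mutations flagged as (possibly) damaging."""
    total = sum(calls.values())
    flagged = calls["damaging"] + calls["possibly damaging"]
    return 100.0 * flagged / total


sensitivity = round(sensitivity_percent(mutation_calls), 2)
```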


Table 1. Comparison of SIFT evolutionary conservation status of mutations versus DNA repair nsSNPs

 
We also applied the modified SIFT predictions to our panel of 150 nsSNPs in DNA repair genes. For 44 of the 150 variants, the predictions were based on an alignment of fewer than six sequences and were considered inconclusive (NP nsSNPs). Reliable predictions were obtained for 106 (70.7%) nsSNPs, and the results are depicted in Table 1. Within this group, 11 (10.38%) nsSNPs were predicted to damage protein function. Twenty-eight of the 106 variants (26.42%) were predicted as possibly damaging, indicating that they are also likely to have functional consequences. On the other hand, 67 nsSNPs (63.2%) were predicted as either tolerated or possibly tolerated by our SIFT analysis. SIFT detected a significantly higher proportion of damaging alterations (including the possibly damaging alterations) in the mutation panel than in the DNA repair nsSNP panel (P < 0.0001) (Table 1).

Frequency information for 102 (68.0%) of the 150 nsSNPs¹³ was available either in the SNP databases or in Mohrenweiser et al. (20) (herein called validated/proven SNPs). For 68 of the validated nsSNPs, there were reliable SIFT predictions (Table 1). We classified an nsSNP as rare if the frequency of the minor allele was <=5% and as common if it was >5%. A total of seven nsSNPs (6.86%) were reported in independent submissions as both common and rare according to our classification. Of the remaining nsSNPs, we categorized 76.47% (78 of 102) as rare and 16.67% (17 of 102) as common. The comparisons of the evolutionary conservation status of the amino acids and the frequency ranges of the nsSNPs substituting these amino acids are depicted in Table 2. In addition, there were >20 nsSNPs in our set with minor allele frequencies >=10% in at least one submission (see http://www.ozceliklab.com/savas2004a/).
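
The frequency classification can be sketched in Python as follows. The function name and labels are hypothetical; the 5% cutoff and the handling of nsSNPs reported as both rare and common in independent submissions follow the description above.

```python
RARE_MAX_FREQUENCY = 0.05  # minor allele frequency <= 5% is rare, > 5% is common


def classify_by_frequency(minor_allele_freqs):
    """Classify one nsSNP from its minor allele frequencies across submissions."""
    if not minor_allele_freqs:
        return "no frequency data"
    common_flags = [freq > RARE_MAX_FREQUENCY for freq in minor_allele_freqs]
    if all(common_flags):
        return "common"
    if not any(common_flags):
        return "rare"
    # Independent submissions disagree across the 5% cutoff.
    return "rare and common"
```

A variant reported at 2% in one submission and 8% in another would fall into the third group, as was the case for seven nsSNPs in this study.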


Table 2. Comparison of evolutionary conservation status of the rare and common nsSNPs

 
Among the rare nsSNPs, we predicted 4 as damaging and 11 as possibly damaging (Table 3). None of the 17 nsSNPs with allelic frequencies >5% were predicted to be damaging, whereas 3 of them (IGHMBP2-T671A, XRCC1-R399Q, and XRCC3-T241M) were predicted to be possibly damaging (Table 3). Two nsSNPs, ERCC4-P379S (HGVbase SNP ID: SNP000000067; Ref. 23) and XRCC1-R280H (SNP000000031/rs25489; see also GeneSNP entry), were predicted as damaging and possibly damaging by SIFT analysis, respectively, although their reported minor allele frequencies were inconsistent (Table 3).


Table 3. List of DNA repair nsSNPs found as damaging and possibly damaging during this study

 

    Discussion
 
To enrich the SNP information for each gene studied, we used five different public SNP databases. While dbSNP and HGVbase contain SNP information for almost all kinds of genes, the SNP500, CGAP-GAI, and GeneSNP databases focus on candidate genes/pathways that may play a role in cancer susceptibility. The majority of the nsSNPs (97.33%) were found in the dbSNP, HGVbase, and GeneSNP databases. All nsSNPs reported here were curated using a highly stringent SNP extraction procedure to eliminate false annotations. Although such a stringent procedure reduces SNP mining sensitivity, we strongly suggest evaluating SNP information using the same or similar approaches to those described in this study to increase the specificity of the curated data.

Among over 1000 entries in the SNP databases, we extracted a total of 150 nsSNPs resulting in an amino acid substitution from 51.1% (45 of 88) of the DNA repair genes analyzed in this study. The number of SNPs in these genes is likely to increase as more SNPs are discovered and the SNP databases continue to be updated. Several factors may lead to underestimation of the number of SNPs in genes of interest. For example, a considerable number of SNPs in these databases have not been validated to distinguish them from sequencing errors, and thus these nsSNPs represent "suspected" or "non-proven" SNPs. For suspected SNPs, which are described on the basis of DNA/RNA sequence alignments, there may be a bias toward genetic variations near the 3' end of the transcripts as well as toward abundant transcripts, common variations, and variations in less complex regions of the genome (24–26). Therefore, sequencing the entire coding region of the genes of interest in a significant number of DNA samples may reveal additional SNPs. Sequencing might especially help to show whether the genes found to have no nsSNPs in this study are really devoid of nsSNPs. This information could be useful for assessing the conservation status of the genes, or the different mutation/recombination rates at the genomic regions containing the genes of interest (7, 8, 26).

Protein conservation analyses based on the alignment of similar proteins (either among species or within species) can reveal the amino acids that are important for the function, and probably for the structure, of protein families. Although such analyses would not indicate newly evolved critical amino acids with a particular function, or amino acids that are under positive selection under today's conditions, they may still be critical in identifying evolutionarily conserved residues along the proteins. SIFT (10, 11) is an automated tool that calculates the conservation score of each amino acid residue along a given protein sequence. Originally, the prediction sensitivity of SIFT for damaging amino acid substitutions was found to be 69% (10). The SIFT predictions reported in this paper differ in two respects from those of Ng and Henikoff (11). First, we considered only predictions based on at least six protein sequences in the alignment at the amino acid position of interest. Second, whenever the median sequence conservation was >3.25, Ng and Henikoff (11) did not accept any predictions (a median sequence conservation score >3.25 indicates that the proteins in the alignment have not yet diverged, so the predictions are less reliable than those obtained from alignments of diverged proteins, where conserved residues are more easily identified). However, considering that 19.03% of the mutations were also found with median sequence conservation scores of >3.25 (Table 1), we preferred to include such predictions in our results, stating only that they were either "possibly tolerated" or "possibly damaging."

The sensitivity of the modified SIFT prediction system was tested on a mutation set with experimentally determined functional consequences (see "Methods"). According to our results, it can be concluded that approximately 57.39% of the mutations occurred at amino acids that are conserved within the protein family in our set (median sequence conservation score 2.75–3.25). On the other hand, 19.03% of the mutations occurred either at regions of proteins that are highly conserved, or in the proteins for which homologous proteins from only close species were available (median sequence conservation score >3.25). Further analyses may be performed to investigate the latter possibility. The mutations that were not detected by SIFT as damaging could be those that occurred at query specific functional residues or are the variations in linkage disequilibrium with yet unidentified causative mutations (10). As far as DNA repair genes are concerned, over one third of the nsSNPs turned out to be likely to have functional consequences (i.e., found damaging and possibly damaging). Eleven DNA repair nsSNPs were found damaging, suggesting that they are excellent candidates for disease-predisposition studies. Another 28 nsSNPs were predicted as possibly damaging. We suggest that along with the damaging SNPs, these possibly damaging nsSNPs may also be good candidates for functional and association studies.

We were not able to make predictions for 44 DNA repair nsSNPs, due to the lack of sufficient sequence information available from homologous proteins (<6 proteins in the alignment at the position of the nsSNPs). As these analyses are based on the availability of the similar proteins in the public databases, we believe that as the number of curated proteins increases in protein databases, the predictions will become possible for these nsSNPs, and the reliability of the predictions for other nsSNPs will also improve.

Classification of the proven (validated) nsSNPs based on allele frequencies showed that only 16.2% of the nsSNPs were present in the population(s) with an allele frequency of >5%, suggesting that most of the nsSNPs presented here are actually rare. These nsSNPs may be rare because they are either under negative selection, or newly evolved and thus not yet fixed in the population. None of the common nsSNPs investigated in this study were predicted to be damaging, whereas three of them were predicted to be possibly damaging (Table 3). We were unable to find any published reports on the IGHMBP2-T671A variant, which was found to be possibly damaging in this study. The IGHMBP2 (immunoglobulin µ binding protein 2) protein is presumably involved in a variety of cellular functions such as immunoglobulin-class switching, pre-mRNA processing, and transcription, and mutations in this protein have been shown to result in a neurodegenerative disease (27). On the other hand, the XRCC1-R399Q and XRCC3-T241M variants have been intensively studied in the context of cancer association. The XRCC1-R399Q SNP was shown to be associated with altered breast (28, 29) and lung (30) cancer risk. XRCC3-T241M has also been shown to confer increased risk of breast cancer¹⁴ (31), bladder cancer (32), and melanoma (33). Both the XRCC1-R399 and XRCC3-T241 residues are conserved in mammalian orthologues, suggesting that they may be important for the function of these proteins.¹⁵ For two nsSNPs (ERCC4-P379S and XRCC1-R280H), the minor allele frequencies were reported as both lower and higher than the 5% cutoff. The ERCC4-P379S variation was reported in the SNP databases as well as in the literature (23) as both rare and common in different sample sets. Our SIFT analysis predicted ERCC4-P379S to be damaging. To our knowledge, the association of this SNP with cancer risk has not been studied yet.
On the other hand, the XRCC1-R280H nsSNP was predicted to be possibly damaging by SIFT analysis and has already been found to be associated with nasopharyngeal carcinoma (34), prostate cancer (35), and lung cancer (36), and to have a role in mutagen sensitivity (37). There were 4 rare nsSNPs predicted as damaging and 11 rare nsSNPs predicted as possibly damaging (Table 3). A literature search for these nsSNPs did not reveal any association with cancer risk. To sum up, our strategy has the ability to select potentially disease-related SNPs, and we propose that the other nsSNPs found at evolutionarily conserved positions in this study are good candidates for further cancer-association studies.

Mutations that reduce the fitness of individuals will be subject to purifying selection that eventually eliminates them from the gene pool of a population, and thus they never reach high frequencies (38), unless they confer a selective advantage because of disease resistance in carriers of such mutations (39). Therefore, we analyzed the common and rare DNA repair nsSNPs for their conservation status. We could not detect any statistically significant difference at the α = 0.05 level of significance (Table 2). Thus, it is tempting to speculate that some deleterious nsSNPs with moderate to high frequencies do not reduce the fitness of individuals. In this context, the presence of such deleterious variations in proteins may be explained if (a) the protein's function can be compensated by other proteins, (b) the protein's function is required only under certain environmental exposures/conditions, or (c) the protein is a rapidly evolving one, thus accumulating more mutations without affecting the fitness of the individual. Alternatively, these new substitutions may be either neutral or even positively selected. Analysis of a much larger data set will be helpful to fully characterize the relationship between the frequency and conservation status of genetic variations.

Genetic variation has been suggested to alter disease-susceptibility risk. SNPs being the most common variation in the human genome have been extensively studied in the context of disease predisposition. SNPs that alter important molecular features such as the expression, function, structure, stability, and interaction of candidate proteins are excellent candidates to study a possible association/direct involvement of a SNP and a phenotypic expression. However, both the presence of an enormous number of SNPs and the search for biologically relevant SNPs in candidate gene approaches require the application of reliable and logical selection systems. Here we presented results obtained using a highly stringent SNP mining strategy and a modified version of the previously developed SIFT tool to select DNA repair nsSNPs that are conserved within the protein family. Our results suggest that more than one third of the nsSNPs in the DNA repair genes are likely to have functional consequences. These nsSNPs are excellent candidates for cancer association as well as for experimental functional studies. In addition, these genetic variations are likely to be critical in studies aiming to elucidate the disparity in cancer-treatment responses among patients as well as to improve the effectiveness of the cancer treatments (40).


    Acknowledgments
 
We thank the groups that have developed the databases and the web-based tools used in this study. We are indebted to Michael Edmenson and Pauline Ng for their invaluable assistance with the Blast against gene transcripts and SIFT tools, respectively.


    Footnotes
 
Grant support: Grant (BCTR0100627) from Susan Komen Breast Cancer Foundation, USA and "CIHR Strategic Training Program Grant—The Samuel Lunenfeld Research Institute Training Program: Applying Genomics to Human Health" fellowship (S. Savas).

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

2 Internet address: http://lpgws.nci.nih.gov/html-cgap/cgl/DNA_damage.html.

3 Internet address: http://www.ncbi.nlm.nih.gov/SNP/.

4 Internet address: http://hgvbase.cgb.ki.se/.

5 Internet address: http://lpgws.nci.nih.gov/.

6 Internet address: http://snp500cancer.nci.nih.gov/home.cfm.

7 Internet address: http://www.genome.utah.edu/genesnps/.

8 M. Edmenson, K. Buetow. The BLAST against gene transcripts tool (unpublished). Internet address: http://lpgws.nci.nih.gov:80/perl/blast2.

9 Internet address: http://www.ncbi.nlm.nih.gov/BLAST/.

10 Internet address: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=unigene.

11 Internet address: http://us.expasy.org/sprot/.

12 Internet address: http://blocks.fhcrc.org/sift/SIFT_seq_submit2.html.

13 A small number of nsSNPs were screened in population(s) but could not be detected; we still report them because they may not have been validated owing to being either ethnic group specific or rare nsSNPs.

14 J. C. Figueiredo, J. A. Knight, L. Briollais, I. L. Andrulis, H. Ozcelik. Polymorphisms XRCC1-R399Q and XRCC3-T241M and the risk of breast cancer at the Ontario site of the breast cancer family registry, in press.

15 J. C. Figueiredo, N. Diaz-Granados, J. A. Knight, S. Savas, L. Briollais, H. Ozcelik. XRCC1-R399Q and XRCC3-T241M: a systematic review of biological importance and role in cancer, in preparation.

Received 9/24/03; revised 12/4/03; accepted 12/24/03.


    References
 

  1. Bernstein C, Bernstein H, Payne CM, Garewal H. DNA repair/pro-apoptotic dual-role proteins in five major DNA repair pathways: fail-safe protection against carcinogenesis. Mutat Res 2002;511:145-78.[CrossRef][Medline]
  2. Rouse J, Jackson SP. Interfaces between the detection signaling and repair of DNA damage. Science 2002;297:547-51.[Abstract/Free Full Text]
  3. Mohrenweiser HW, Wilson DM III, Jones IM. Challenges and complexities in estimating both the functional impact and the disease risk associated with the extensive genetic variation in human DNA repair genes. Mutat Res 2003;526:93-125.
  4. Daly AK, Day CP. Candidate gene case-control association studies: advantages and potential pitfalls. Br J Clin Pharmacol 2001;52:489-99.
  5. Miller RD, Kwok P-Y. The birth and death of human single-nucleotide polymorphisms: new experimental evidence and implications for human history and medicine. Hum Mol Genet 2001;10:2195-8.
  6. Taylor JG, Choi EH, Foster CB, Chanock SJ. Using genetic variation to study human disease. Trends Mol Med 2001;7:507-12.
  7. Gray IC, Campbell DA, Spurr NK. Single nucleotide polymorphisms as tools in human genetics. Hum Mol Genet 2000;9:2403-8.
  8. Shastry BK. SNP alleles in human disease and evolution. J Hum Genet 2002;47:561-6.
  9. Chasman D, Adams RM. Predicting the functional consequences of non-synonymous single nucleotide polymorphisms: structure-based assessment of amino acid variation. J Mol Biol 2001;307:683-706.
  10. Ng PC, Henikoff S. Predicting deleterious amino acid substitutions. Genome Res 2001;11:863-74.
  11. Ng PC, Henikoff S. Accounting for human polymorphisms predicted to affect protein function. Genome Res 2002;12:436-46.
  12. Sunyaev S, Ramensky V, Koch I, Lathe W 3rd, Kondrashov AS, Bork P. Prediction of deleterious human alleles. Hum Mol Genet 2001;10:591-7.
  13. Wang Z, Moult J. SNPs, protein structure, and disease. Hum Mutat 2001;17:263-70.
  14. Ferrer-Costa C, Orozco M, de la Cruz X. Characterization of disease-associated single amino acid polymorphisms in terms of sequence and structure properties. J Mol Biol 2002;315:771-86.
  15. Ramensky V, Bork P, Sunyaev S. Human non-synonymous SNPs: server and survey. Nucleic Acids Res 2002;30:3894-900.
  16. Clifford R, Edmonson M, Hu Y, Nguyen C, Scherpbier T, Buetow KH. Expression-based genetic/physical maps of single-nucleotide polymorphisms identified by the cancer genome anatomy project. Genome Res 2000;10:1259-65.
  17. Sherry ST, Ward MH, Kholodov M, et al. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res 2001;29:308-11.
  18. Fredman D, Siegfried M, Yuan YP, Bork P, Lehvaslaiho H, Brookes AJ. HGVbase: a human sequence variation database emphasizing data quality and a broad spectrum of data sources. Nucleic Acids Res 2002;30:387-91.
  19. Wheeler DL, Church DM, Lash AE, et al. Database resources of the National Center for Biotechnology Information: 2002 update. Nucleic Acids Res 2002;30:13-6.
  20. Mohrenweiser HW, Xi T, Vazquez-Matias J, Jones IM. Identification of 127 amino acid substitution variants in screening 37 DNA repair genes in humans. Cancer Epidemiol Biomark Prev 2002;11(10 Pt 1):1054-64.
  21. O'Donovan C, Martin MJ, Gattiker A, Gasteiger E, Bairoch A, Apweiler R. High-quality protein knowledge resource: SWISS-PROT and TrEMBL. Brief Bioinform 2002;3:275-84.
  22. Pagano M, Gauvreau K. Principles of biostatistics. 2nd ed. Pacific Grove, CA: Duxbury; 2000. p. 342-6.
  23. Shen MR, Jones IM, Mohrenweiser H. Nonconservative amino acid substitution variants exist at polymorphic frequency in DNA repair genes in healthy humans. Cancer Res 1998;58:604-8.
  24. Gu Z, Hillier L, Kwok PY. Single nucleotide polymorphism hunting in cyberspace. Hum Mutat 1998;12:221-5.
  25. Cox DG, Boillot C, Canzian F. Data mining: efficiency of using sequence databases for polymorphism discovery. Hum Mutat 2001;17:141-50.
  26. Lercher MJ, Hurst LD. Human SNP variability and mutation rate are higher in regions of high recombination. Trends Genet 2002;18:337-40.
  27. Grohmann K, Schuelke M, Diers A, et al. Mutations in the gene encoding immunoglobulin µ-binding protein 2 cause spinal muscular atrophy with respiratory distress type 1. Nat Genet 2001;29:75-7.
  28. Duell EJ, Millikan RC, Pittman GS, et al. Polymorphisms in the DNA repair gene XRCC1 and breast cancer. Cancer Epidemiol Biomark Prev 2001;10:217-22.
  29. Kim SU, Park SK, Yoo KY, et al. XRCC1 genetic polymorphism and breast cancer risk. Pharmacogenetics 2002;12:335-8.
  30. Divine KK, Gilliland FD, Crowell RE, et al. The XRCC1 399 glutamine allele is a risk factor for adenocarcinoma of the lung. Mutat Res 2001;461:273-8.
  31. Kuschel B, Auranen A, McBride S, et al. Variants in DNA double-strand break repair genes and breast cancer susceptibility. Hum Mol Genet 2002;11:1399-407.
  32. Matullo G, Guarrera S, Carturan S, et al. DNA repair gene polymorphisms, bulky DNA adducts in white blood cells and bladder cancer in a case-control study. Int J Cancer 2001;92:562-7.
  33. Winsey SL, Haldar NA, Marsh HP, et al. A variant within the DNA repair gene XRCC3 is associated with the development of melanoma skin cancer. Cancer Res 2000;60:5612-6.
  34. Cho EY, Hildesheim A, Chen CJ, et al. Nasopharyngeal carcinoma and genetic polymorphisms of DNA repair enzymes XRCC1 and hOGG1. Cancer Epidemiol Biomark Prev 2003;12:1100-4.
  35. van Gils CH, Bostick RM, Stern MC, Taylor JA. Differences in base excision repair capacity may modulate the effect of dietary antioxidant intake on prostate cancer risk: an example of polymorphisms in the XRCC1 gene. Cancer Epidemiol Biomark Prev 2002;11:1279-84.
  36. Ratnasinghe D, Yao SX, Tangrea JA, et al. Polymorphisms of the DNA repair gene XRCC1 and lung cancer risk. Cancer Epidemiol Biomark Prev 2001;10:119-23.
  37. Tuimala J, Szekely G, Gundy S, Hirvonen A, Norppa H. Genetic polymorphisms of DNA repair and xenobiotic-metabolizing enzymes: role in mutagen sensitivity. Carcinogenesis 2002;23:1003-8.
  38. Graur D, Li W-H. Dynamics of genes in populations. Fundamentals of molecular evolution. 2nd ed. Sunderland, MA: Sinauer Associates, Inc; 2000. p. 41.
  39. Dean M, Carrington M, O'Brien SJ. Balanced polymorphism selected by genetic versus infectious human disease. Annu Rev Genomics Hum Genet 2002;3:263-92.
  40. Martin NMB. DNA repair inhibition and cancer therapy. J Photochem Photobiol 2001;63:162-70.



