| HOME | HELP | FEEDBACK | SUBSCRIPTIONS | ARCHIVE | SEARCH | TABLE OF CONTENTS |
1 Channing Laboratory, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School; 2 Department of Epidemiology, Harvard School of Public Health; 3 Division of Hematology/Oncology, 4 Dana-Farber/Harvard Cancer Center Proteomics Core and BIDMC Genomics Center; and 5 Division of Interdisciplinary Medicine and Biotechnology, Beth Israel Deaconess Medical Center, Boston, Massachusetts
Requests for reprints: Shelley S. Tworoger, Channing Laboratory, 181 Longwood Avenue, 3rd Floor, Boston, MA 02115. Phone: 617-525-2087; Fax: 617-525-2008. E-mail: nhsst{at}channing.harvard.edu
| Abstract |
|---|
9, plasma fraction on a CM10 chip, and the organic fraction on the H50 chip, all with a low- and high-energy transfer protocol. Participant and quality control samples were aligned to a reference sample and then peak intensity was assessed for all peaks identified in the reference sample. The average coefficient of variation (CV) of the peak intensity within conditions ranged from 16% (H50, organic, low protocol) to 63% (CM10, pH 9, high protocol). Generally, the CV and mean peak intensity of the quality control samples were inversely correlated (median –0.48). The mean intraclass correlation (ICC) within conditions ranged from 0.37 (H50, unfractionated, low protocol) to 0.68 (CM10, unfractionated, high protocol). For a signal-to-noise cutoff of 2.0, we observed 334 peaks, of which 241 (72%) had an ICC of ≥0.40. Although we observed a large range of CVs and ICCs, sufficient numbers of peaks had reasonable ICCs to suggest that protein peak reproducibility over 3 years was reasonable among postmenopausal women not taking hormones. (Cancer Epidemiol Biomarkers Prev 2008;17(6):1480–5)
| Introduction |
|---|
Although large projects, including the HUPO Plasma Proteome Project, are exploring the effects of various blood collection methods on proteomic profiles, no studies have assessed within-person variability of plasma protein profiles over time. This information is important for understanding whether one measure of the proteome is reflective of longer-term patterns, and is one key prerequisite for determining whether variations in markers of risk are meaningful.
Large epidemiologic cohorts with prospectively banked blood specimens, such as the Nurses' Health Study (NHS), offer the opportunity to perform studies of sample collection and processing methods as well as assess within-person variation over time. This study had three research goals: (a) to assess laboratory variation in measurement of proteomic peaks, (b) to examine the effect of delayed processing of up to 48 hours on the plasma proteome, and (c) to evaluate the reproducibility of proteomic profiles in postmenopausal NHS women not taking postmenopausal hormones over a 3-year period, using surface-enhanced laser desorption and ionization time-of-flight mass spectrometry (SELDI-TOF).
| Materials and Methods |
|---|
90%. In 1989 to 1990, heparin blood samples (two 10-mL tubes) from 32,826 NHS participants were obtained and transported via overnight courier with a cold pack to our laboratory; 97% of the samples arrived within 26 h of being drawn (8). A brief questionnaire asked for time of day and date of the blood draw, number of hours since the woman had last eaten before blood draw, and current medication use. On arrival, blood samples were centrifuged and aliquotted into plasma, WBC, and RBC components. Cryotubes have been stored in the vapor phase of liquid nitrogen freezers at less than –130°C since that time.
Mass Spectrometry
All samples for this study were run using a SELDI-TOF (Ciphergen, PBS II) platform. The detailed procedures of sample preparation and preprocessing for the different SELDI-TOF surfaces have been reported previously (9). Briefly, protein detection was based on a spectrum of signals generated when a plasma sample was spotted on a chip surface and subjected to energy transfer (low or high energy) by a nitrogen laser beam. The low-energy protocol was optimized to detect proteins up to 20,000 Da and the high-energy protocol was optimized for larger proteins. The laboratory was blinded to quality control (QC) status and to the identity of samples from the same individual; samples from the same participant were run in the same batch.
Assessment of Laboratory Variability
QC samples were obtained from two sources. Individual women (herein called donors) who participated in the 1989 to 1990 NHS blood collection were selected to serve as QCs if they had contributed an extra plasma vial (i.e., the woman sent three 10-mL heparin tubes as opposed to the two we requested). For this type of QC, we sent duplicate samples from individual donors in the subsequent assays. Additionally, two large QC pools were used; these were created shortly after the initial blood collection using discarded plasma from blood donation centers; one pool was of premenopausal women and another pool was of postmenopausal women. Multiple aliquots of each pool were included in subsequent assays.
To assess laboratory error, we calculated the coefficients of variation (CV) for each peak separately by QC donor/pool by dividing the SD by the mean peak intensity (10). Then, we averaged the CVs across the QC types. This method of CV calculation was used for all subsequent analyses of laboratory variation. CVs of <20% are considered desirable (10), although if the between-person variability is very large, higher CVs may be acceptable (11).
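The CV calculation described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names and the intensities are hypothetical, and the use of the sample SD (n - 1 denominator) is an assumption, since the text does not state which SD was used.

```python
from statistics import mean, stdev

def peak_cv(intensities):
    """CV (%) of one peak's replicate intensities: sample SD divided by the mean."""
    return 100.0 * stdev(intensities) / mean(intensities)

def average_cv(replicates_by_source):
    """Average the per-source CVs for a single peak.

    replicates_by_source maps a QC donor/pool label to that source's
    replicate intensities for the peak (at least two replicates each).
    """
    return mean(peak_cv(values) for values in replicates_by_source.values())

# Hypothetical intensities for one peak, for illustration only.
qc_peak = {
    "pool_pre": [10.0, 12.0, 11.0],
    "pool_post": [20.0, 22.0],
    "donor_1": [5.0, 7.0],
}
print(round(average_cv(qc_peak), 1))
```

Computing the CV within each donor or pool before averaging keeps true differences between QC sources from being counted as laboratory error.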
Delayed Processing Study
To assess the effect of our blood collection methods on the reproducibility of proteomic profiles, we obtained heparin samples from male and female volunteers that were split into three equal parts. The first part was processed and frozen immediately after collection, whereas the second and third parts were shipped via overnight mail back to the laboratory (with a cold pack), where they were processed 24 and then 48 h after blood collection, thus mimicking the collection conditions of the NHS participants. We then compared profiles in samples processed and frozen immediately to those stored as heparinized whole blood for 24 or 48 h before processing.
For this aim, we included the 0-, 24-, and 48-h samples from 12 donors, in addition to a total of 12 QC samples from 2 plasma pools (2 replicates of each) and 4 individual donors (2 replicates of each), for a total of 48 samples. Samples were spotted in duplicate with no plasma fractionation on an H50 chip and run in one batch. We calculated the CV across the three processing method times within each donor and then averaged across all donors. Finally, we considered the number of shared peaks [e.g., peaks with the same mass/charge (m/z) ratio] across the three processing method times within donors.
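One way to count shared peaks across processing times is sketched below. The text defines shared peaks only as those with the same m/z ratio; the relative matching tolerance, the function name, and the m/z values here are all assumptions made for illustration.

```python
def shared_peaks(peak_lists, tol=0.003):
    """Return m/z values from the first list that have a match in every other list.

    Two peaks match when their relative m/z difference is at most `tol`
    (0.3% here, an illustrative choice rather than the study's setting).
    """
    first, *others = peak_lists
    return [
        mz for mz in first
        if all(any(abs(mz - other) <= tol * mz for other in peaks) for peaks in others)
    ]

# Hypothetical m/z lists for one donor at 0, 24, and 48 h.
t0 = [4150.0, 6630.0, 8940.0, 13880.0]
t24 = [4152.0, 6632.0, 8938.0, 13900.0]
t48 = [4151.0, 8941.0, 13950.0]

shared_0_24 = shared_peaks([t0, t24])
shared_all = shared_peaks([t0, t24, t48])
print(len(shared_0_24), len(shared_all))
```

Comparing the count for the 0- and 24-h lists with the count across all three lists gives the kind of drop in shared peaks that the study reports when the 48-h time point is included.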
Within-Woman Reproducibility Study
To assess within-person stability over time, over 300 NHS women were asked to collect two additional blood samples over the following 2 to 3 years after the initial blood collection. We randomly selected a subset of 60 women for the current study who provided blood samples at baseline and 2 years later; this subset has been used previously to assess reproducibility over time of various hormonal markers (12). Selection criteria included that women were postmenopausal, had no prior diagnosis of cancer (except nonmelanoma skin cancer), the blood samples for both draws were collected in the morning hours after at least an 8 h fast, samples were processed by our laboratory within 24 to 30 h of collection, and women were not using postmenopausal hormones at any blood draw. At the first collection, women had not used postmenopausal hormones for at least 3 months before the blood draw; after this period, sex hormone levels, which are strongly affected by exogenous hormone use, have returned to pre-use concentrations (13, 14). Further, eligible women did not use postmenopausal hormones between the first and third blood draws.
The final assay set for the reproducibility study included 120 participant samples (2 samples each, i.e., year 1 and year 3 sample, from 60 women) and 45 QC samples from 2 plasma pools (12 replicates of each) and 11 individual donors (2 replicates of each). We examined four different protein chip surface conditions on the SELDI, with two batches for each condition: unfractionated plasma on a CM10 chip, unfractionated plasma on an H50 chip, pH 9, plasma fraction on a CM10 chip, and the organic fraction on the H50 chip. Details of the mass spectrometry protocol together with the raw data are provided on a supplementary Web page (footnote 6).
We created a reference sample for this study by combining a small amount of plasma from each study participant and QC sample and stirring to create a homogenous mixture. Initial peak detection was then based on the reference sample, resulting in a reference profile, obtained in duplicate. The reference profile was assessed for protein peaks above a 2.0, 2.5, and 3.0 signal-to-noise ratio cutoff, and was used to interrogate and align sample spectra in the study participants and QC samples. Thus, an aligned, quantitative matrix was generated for the reference peaks in the entire data set. All samples were run in duplicate; we averaged the peak intensities for each identified peak from the duplicate runs.
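The two data-reduction steps in this paragraph (selecting reference peaks by signal-to-noise cutoff, then averaging duplicate runs) might look like the sketch below. The data layout, function names, and values are hypothetical, and treating the cutoff as inclusive is an assumption.

```python
def reference_peaks(peaks, cutoff):
    """Keep reference peaks whose signal-to-noise ratio is at or above `cutoff`."""
    return [p["mz"] for p in peaks if p["snr"] >= cutoff]

def average_duplicates(run_a, run_b, kept_mz):
    """Average duplicate-run intensities for each retained reference peak."""
    return {mz: (run_a[mz] + run_b[mz]) / 2 for mz in kept_mz}

# Hypothetical reference profile and duplicate runs of one sample.
reference = [
    {"mz": 4150.0, "snr": 3.4},
    {"mz": 6630.0, "snr": 2.6},
    {"mz": 8940.0, "snr": 2.1},
    {"mz": 13880.0, "snr": 1.7},
]
run_a = {4150.0: 10.0, 6630.0: 4.0, 8940.0: 2.0, 13880.0: 1.0}
run_b = {4150.0: 12.0, 6630.0: 5.0, 8940.0: 3.0, 13880.0: 1.5}

for cutoff in (2.0, 2.5, 3.0):
    kept = reference_peaks(reference, cutoff)
    print(cutoff, average_duplicates(run_a, run_b, kept))
```

Because the peak list comes from the reference profile alone, every sample contributes an intensity for the same set of m/z positions, which is what makes the resulting matrix aligned.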
For each of the four protein surface chip conditions used, we obtained data for both the high- and low-energy protocols, using three signal-to-noise ratio cutoffs (2.0, 2.5, and 3.0), for a total of 24 data sets (4 conditions × 2 energy protocols × 3 cutoffs). All analyses were run separately by data set, that is, for each energy protocol, signal-to-noise ratio cutoff, and chip type/fractionation condition. The CVs for QC samples were determined as mentioned above (10). Additionally, we determined the mean peak intensity for each peak across all QC specimens. For each data set, we then calculated the Spearman correlation (10) between the CV of a peak, mean peak intensity, and the mass to charge (m/z) ratio of the peak. We also determined the percent of peaks in each data set with CVs <30% or <40%.
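For readers unfamiliar with the Spearman correlation used here, a self-contained sketch follows. It ranks each variable (averaging tied ranks) and takes the Pearson correlation of the ranks. The function names and the CV and intensity values are hypothetical and are not data from the study.

```python
from statistics import mean

def ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical per-peak CVs (%) and mean QC intensities for one data set.
cv = [45.0, 38.0, 30.0, 25.0, 22.0, 18.0]
intensity = [1.2, 2.0, 3.5, 9.0, 6.0, 14.0]
print(round(spearman(cv, intensity), 2))
```

A rank-based measure is a natural choice here because peak intensities are typically right-skewed, so a few very intense peaks would dominate a Pearson correlation on the raw values.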
For the participant samples, we calculated the intraclass correlation (ICC) within-woman over time for each peak within each data set, using a mixed model with participant as the random variable (10); all peak intensities were natural log transformed for this analysis. This model estimated the within- and between-person variances for each peak. The ICC was calculated as the between-person variance divided by the total variance (10). For each peak, we also determined the mean peak intensity across all study participant specimens and, within data set, the number of peaks with at least a fair ICC (≥0.40; ref. 10). Finally, we calculated the Spearman correlation between the ICC, mean peak intensity, and the m/z ratio of the peak in each data set.
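The ICC for a single peak can be illustrated with the sketch below. Note the substitution: instead of fitting a mixed model as the authors did, it uses the one-way ANOVA (method-of-moments) variance-component estimator, which is a common stand-in for a balanced design such as two samples per woman. The function name and the intensities are hypothetical.

```python
from math import log
from statistics import mean

def icc_oneway(pairs):
    """ICC for one peak from repeated measures per woman (balanced design).

    pairs: list of per-woman intensity lists, all of the same length k.
    Intensities are natural-log transformed, then variance components are
    estimated from the one-way ANOVA mean squares.
    """
    logged = [[log(v) for v in woman] for woman in pairs]
    n, k = len(logged), len(logged[0])
    grand = mean(v for woman in logged for v in woman)
    woman_means = [mean(woman) for woman in logged]
    ms_between = k * sum((m - grand) ** 2 for m in woman_means) / (n - 1)
    ms_within = sum(
        (v - m) ** 2 for woman, m in zip(logged, woman_means) for v in woman
    ) / (n * (k - 1))
    # Truncate a negative between-person estimate at zero.
    var_between = max((ms_between - ms_within) / k, 0.0)
    var_within = ms_within
    return var_between / (var_between + var_within)

# Hypothetical year 1 and year 3 intensities of one peak for four women.
example = [[10.0, 12.0], [20.0, 18.0], [40.0, 44.0], [5.0, 6.0]]
print(round(icc_oneway(example), 2))
```

An ICC near 1 means that most of the variation in a peak is between women rather than within a woman over time, which is the situation in which a single sample is informative about her longer-term level.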
| Results |
|---|
Delayed Processing Study
Among plasma from healthy donors, we observed an overall CV of 23% across the three processing conditions (immediate processing or a 24- or 48-hour delay). To better assess the effect of different processing delays on proteome stability, we calculated the CVs comparing 0- to 24-hour samples and 0- to 48-hour samples. Although 92% of the CVs comparing samples processed at 0 and 24 hours were <20%, only 80% of the comparable CVs were <20% for the 0- and 48-hour samples. Furthermore, the number of shared peaks across the delayed processing times dropped by an average of 10% when including the 48-hour time point.
Within-Woman Reproducibility Study
NHS women in this study ranged in age from 51 to 68 years old (mean age 61 years) and were on average overweight (Table 1). Less than 30% of women reported past use of postmenopausal hormones or ever use of oral contraceptives. Use of other medications at blood collection was low. All samples were fasting, and, except one drawn around 11 a.m., were drawn between 6 a.m. and 10 a.m.
9 fraction, high protocol; Table 2). The intra-assay CVs were similar to the interassay CVs (data not shown); however, the former were based on fewer QC samples, as only 6 of 11 QC donors had their two replicates in the same batch. The mean CVs decreased when increasing the signal-to-noise ratio cutoff; however, these changes were relatively small. In general, there was a strong inverse correlation between the CV and mean peak intensity of the QC samples (median r across data sets = –0.48), whereas there was no consistent correlation between the CV and m/z ratio of the peak. The percentage of peaks with CVs <30% ranged from 3% to 96% (median = 30%) across the conditions; for CVs <40%, the range was 14% to 100% (median = 57%).
0.40. For a signal-to-noise ratio cutoff of 3.0, we assessed 188 peaks, of which 118 (63%) had an ICC of ≥0.40. The ICC was inversely correlated with the average peak intensity in each data set examined. Interestingly, in general, the ICC was inversely correlated with m/z ratio for the low-energy protocol data sets, but positively correlated with the m/z ratio for the high-energy protocol data sets. Adjustment for participant age, time of day of each blood draw, and date of blood draw did not notably change the ICC estimates (data not shown). Furthermore, the mean and median ICCs did not change after excluding peaks with CVs higher than 30% or 40% or when restricting to never postmenopausal hormone users (data not shown).
| Discussion |
|---|
The first goal of this study was to assess laboratory variability within blinded replicates of QC samples. In general, there was a wide range of CVs across the various protein chip surface types and plasma fractions. In the reproducibility study, the average peak intensity CVs within a data set ranged from 16% to 64%. Interestingly, for most of the data sets, there was a strong inverse correlation between the peak CV and the mean peak intensity of the QC samples for that peak. This is expected given that mass spectrometry platforms have a certain amount of "noise" in the peak spectra, particularly in the low-intensity range, which has been shown to contain matrix-associated variability (1-3). This increase in assay variability at low peak intensities raises an important methodologic issue in choosing QC samples for proteomic profiling studies. For many biomarker studies, only one to three QC pools are included across the sample set (11). However, within each QC pool, only some peaks will exist at a high intensity. Thus, more QC pools (>10) are needed to increase the likelihood that at least some of the QC samples will have reasonably high peak intensities for all peaks. This will allow for a more accurate estimation of assay variability and removal of peaks with a very high CV from further analysis.
The second goal of this study was to examine the effect of delayed sample processing on proteomic profiles. We observed that a 24-hour delay in processing did not substantially affect proteomic profiles in heparin plasma. This is consistent with previous studies of this sample type, although for serum samples even a 4-hour processing delay can substantially alter protein profiles (4, 7). In our study, a 48-hour delay appeared to alter the observable proteins for at least some of the samples and/or peaks, as evidenced by fewer shared peaks between the 0- and 48-hour samples versus the 0- and 24-hour samples. Thus, epidemiologic studies should pilot and standardize their collection methods on proteomic platforms before proceeding with analyses as long delays in processing can alter at least some protein peaks.
The third goal was to examine the amount of within-person variability over time of proteomic profiles. When conducting this analysis, we excluded data sets with extremely high CVs to reduce error in our ICC measures. In general, the majority of peaks had at least fair ICCs. The ICCs in this study are likely underestimated, because the measure of total variability (the denominator of the ICC) includes the assay variability, which was somewhat high. The various sample fractions and chip surfaces differed in the number of peaks with good ICCs (ranging from 35% to 91%). This suggests that certain chip types and plasma fractions may reflect plasma proteins with differing levels of reproducibility within person over time. Thus, future studies should consider including a reproducibility study from the population of interest within the primary study. This will allow the investigator to examine peaks with high reproducibility within a person over time in the analysis.
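The direction of the bias noted above can be made explicit with a short derivation. It assumes that the total variance of a log-transformed peak intensity separates into independent between-person, within-person biological, and assay components; the symbols σ_b², σ_w², and σ_a² are notation introduced here for illustration and do not appear in the original analysis.

```latex
\mathrm{ICC}_{\text{observed}}
  = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_w^2 + \sigma_a^2}
  \;\le\;
  \frac{\sigma_b^2}{\sigma_b^2 + \sigma_w^2}
  = \mathrm{ICC}_{\text{biological}}
```

Under this assumption, any nonzero assay variance σ_a² enlarges the denominator and pulls the observed ICC below the value that would be obtained with an error-free assay, so peaks with high CVs have the most attenuated ICCs.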
We observed that plasma protein peaks with either a low or high m/z ratio tended to have higher ICCs. Although it is unclear why this might be, it is possible that peaks in the middle of the m/z range (1) are more likely to represent multiple proteins, in which case the ICC would reflect the combined reproducibility of all the proteins in that peak, or (2) are less likely to be under strong homeostatic regulation of protein levels. However, it is not possible to definitively determine whether these or other factors are important in this observation. We also observed that the ICC was inversely correlated with the average peak intensity across the participant samples. Given that the CVs tend to be smaller for higher peak intensities, this finding was somewhat counterintuitive, and it is unclear whether this was a chance finding or has some biological significance.
The major strength of our study is the relatively homogenous population of postmenopausal women who were not taking postmenopausal hormones and whose samples were collected prospectively over 3 years, although our results can be directly generalized only to this population. Nonetheless, other unmeasured factors, such as changes in medication use or diet, may have increased the within-person variability over time, thus lowering the observed ICCs. However, populations in longitudinal epidemiologic studies rarely maintain exactly the same characteristics over time, so our results may better reflect the within-person variation that would be observed in such studies. The primary limitation is the collection method used to obtain the samples in this study. Because NHS participants live across the entire United States and are nurses, we asked them to collect their own blood sample and mail it back to our laboratory, where it was processed. This means that there was a delay in processing into the plasma, buffy coat, and RBC fractions. Although previous studies have recommended immediate processing (4), our pilot study suggested that delayed processing up to 24 hours was acceptable for a substantial number of proteins. We used samples that were processed within 24 to 30 hours of collection for the reproducibility study to minimize the effect of this issue on the estimated ICCs. We also specifically selected women whose two blood samples were collected in the morning and after fasting for at least 10 hours to reduce effects of circadian rhythms and eating on protein profiles. Another limitation is that we assessed a relatively small subset of the overall proteome because we did few fractionations and only examined peaks with a signal-to-noise ratio of at least 2.0. Future studies should examine additional protein subsets. Also, it is difficult to separate the effects of within-person variability from those of long-term storage in this study. However, few analytes seem to be altered by a difference of only 2 years of storage (11, 15), especially when kept at very cold temperatures (all samples were kept at less than –130°C). Further, the CVs were relatively high in the QC samples of the reproducibility study; although this is not optimal, the between-person variation in peak intensities is very large and seems to outweigh the assay variability for the majority of peaks.
It also should be noted that SELDI-TOF mass spectrometry has been associated with certain strengths and limitations. The major strength is that it lends itself to a high-throughput and technically straightforward application to large-scale epidemiologic studies. Limitations include primarily the inability to directly identify protein IDs within the peak patterns, somewhat limited peak resolution, and issues related to interlaboratory reproducibility (9). However, the purpose of this study was not to directly compare the advantages and disadvantages of different proteomic platforms but rather to evaluate the stability of proteomic profiles over time, as a prerequisite for the design of major biomarker discovery studies.
In conclusion, this is the first study to our knowledge to examine the reproducibility of plasma proteomic profiles within an individual over time. Our results suggest that plasma protein peak reproducibility over 3 years was reasonable for the majority of peaks among postmenopausal women not taking postmenopausal hormones, such that one sample may reflect the proteome pattern over time. We also observed that the laboratory variability of this method is somewhat higher than is desirable; however, this may be acceptable, particularly in discovery studies, given the wide between-person variability in most protein peaks. Further, delayed processing of up to 24 hours seems acceptable for heparin plasma. Future studies should consider eliminating peaks with poor CVs or a low ICC by including multiple QC and reproducibility samples in their design.
| Disclosure of Potential Conflicts of Interest |
|---|
| Acknowledgments |
|---|
We thank Mr. Christopher Murphy for his invaluable assistance in programming the statistical analysis.
| Footnotes |
|---|
Note: S.S. Tworoger and D. Spentzos contributed equally and should be considered co-first authors. T.A. Liebermann and S.E. Hankinson contributed equally and should be considered co-senior authors.
6 http://www.dfhccproteomics.org/nhs/reproducibility/1
Received 10/22/07; revised 3/12/08; accepted 3/31/08.
| References |
|---|