Creation of a federated database of blood proteins: a powerful new tool for finding and characterizing biomarkers in serum
© Marshall et al.; licensee BioMed Central Ltd. 2014
Received: 23 April 2013
Accepted: 22 October 2013
Published: 29 January 2014
Protein biomarkers offer major benefits for diagnosis and monitoring of disease processes. Recent advances in protein mass spectrometry make it feasible to use this very sensitive technology to detect and quantify proteins in blood. To explore the potential of blood biomarkers, we conducted a thorough review to evaluate the reliability of data in the literature and to determine the spectrum of proteins reported to exist in blood with a goal of creating a Federated Database of Blood Proteins (FDBP). A unique feature of our approach is the use of a SQL database for all of the peptide data; the power of the SQL database combined with standard informatic algorithms such as BLAST and the statistical analysis system (SAS) allowed the rapid annotation and analysis of the database without the need to create special programs to manage the data. Our mathematical analysis and review shows that in addition to the usual secreted proteins found in blood, there are many reports of intracellular proteins and good agreement on transcription factors, DNA remodelling factors in addition to cellular receptors and their signal transduction enzymes. Overall, we have catalogued about 12,130 proteins identified by at least one unique peptide, and of these 3858 have 3 or more peptide correlations. The FDBP with annotations should facilitate testing blood for specific disease biomarkers.
Most human diseases involve changes in the expression of normal proteins, or the creation of abnormal proteins, that disrupt physiology; such protein changes can arise from mutations, viruses, chemicals, radiation, free radicals, ischemia, disease or other sources. In many instances these proteins may appear in blood, thus providing an easily accessible biomarker that may give insight into the disease process.
Proteins that are specifically expressed in certain tissue types or cells may be useful markers of disease. Prostate specific antigen (PSA) is not selective for malignant disease but is a tissue-specific kallikrein protease that indicates benign and cancerous proliferation of prostate tissue and cells. Transcription factors, DNA binding and RNA binding zinc finger proteins, chromatin remodeling proteins, cellular receptors or agonists, and cell signaling proteins including oncogenes that may play a role in cell and tissue differentiation, were first reported from the tryptic peptides of blood fluid after partition chromatography [3, 4]. The presence of so many nuclear and cellular factors was unexpected given previous views on the nature of plasma proteins [5, 6]. Now it appears that the nucleic acid and protein binding factors that play a regulatory or signal function in specific cell types may frequently be detected in blood and have clinical diagnostic/prognostic or even therapeutic value [7–9].
The proteins and endogenous peptides of human plasma may be randomly and independently sampled, identified by liquid chromatography and tandem mass spectrometry, followed by log transformation and comparative statistical analysis by ANOVA [10–12]; with the current generation of mass spectrometers, these processes are relatively simple and accessible to many laboratories. However, to facilitate routine use in biomarker studies, we have created a federated database of all proteins and peptides reported in blood. Here we provide for the first time the specific identity of the set of cellular regulatory proteins detected with high confidence and good agreement internationally from human serum/plasma by liquid chromatography and tandem mass spectrometry that show great promise as potential biomarkers and therapeutic targets. The good agreement on peptides from the same cellular proteins by different research groups, together with the capacity of mass spectrometry to compare blood samples at the level of the parent peptide and fragment ion intensity using ANOVA, indicates that conditions for the detailed analysis of blood fluids between diseases and physiological states may now exist [10–12].
Although several organizations and individuals have published large data sets of proteins in blood, an SQL database built by systematic compilation and analysis of the publicly available data sets has not yet been provided. This mathematical review of published data evaluates the reliability of the data in 11 published reports [13–23] and includes a federated database of all of the proteins and peptides detected in blood. The new SQL based, federated database of blood proteins (FDBP) available in toto with this publication will provide a good starting point for groups wishing to evaluate various peptides in blood as biomarkers of disease.
Construction and analysis of the federated database of blood proteins
Federated protein sequence library
A federated database of protein sequences was assembled by downloading protein sequences from RefSeq, Ensembl, SwissProt, TrEMBL or UniProt, which contain sequences from many sources. The result is a non-redundant set of proteins, protein fragments, splice variants and alleles that contains over 193,000 possible protein sequences representing different genes, isoforms, transcript lengths, splice variants, alleles, and recorded nucleic acid sequences.
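The assembly of a non-redundant library can be pictured as collapsing records from several sources onto their distinct sequences. The following is only a minimal sketch of that idea, not the procedure actually used here: the function name, the accession strings and the toy sequences are hypothetical, and the semicolon-joined accession list follows the convention described elsewhere in this report.

```python
from collections import defaultdict


def build_nonredundant_library(records):
    """Collapse (accession, sequence) records to one entry per distinct sequence.

    Accessions that point to exactly the same sequence are joined with
    semicolons so that no source information is lost.
    """
    accessions_by_sequence = defaultdict(list)
    for accession, sequence in records:
        seq = sequence.strip().upper()
        if accession not in accessions_by_sequence[seq]:
            accessions_by_sequence[seq].append(accession)
    return {seq: ";".join(accs) for seq, accs in accessions_by_sequence.items()}


# Hypothetical accessions and toy sequences, for illustration only.
records = [
    ("REFSEQ_0001", "MKWVTFISLL"),
    ("UNIPROT_0001", "MKWVTFISLL"),   # identical sequence from a second source
    ("ENSEMBL_0002", "MKWVTFISLLA"),  # differs by one residue, kept as distinct
]
library = build_nonredundant_library(records)
```

Sequences that differ by even one amino acid remain separate entries, so splice variants and alleles are preserved rather than merged.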
Meta database approach
A separate computer program was written to parse each of the publicly available serum and plasma proteomic data sets [13–23]. The data of Zhu et al. is a reanalysis of the raw data of Marshall et al. The data of Bowden et al. is the analysis of the raw data of the HuPO PPP consortium downloaded from the TRANCHE website and calculated by X!TANDEM. The results were reported in different formats with accession numbers and FASTA protein sequences obtained from RefSeq, SwissProt, ENSEMBL, Uniprot, IPI or other predicted protein sequences. To facilitate analysis, the peptides and proteins from the independently published reports of human serum/plasma proteins were then transferred to a single SQL database. It has been previously suggested that it is challenging to derive high confidence identifications of human serum/plasma proteins and that most proteomic identifications are false positive results using the “empirical model” based on putatively pure protein standards versus a decoy library. However, statistical confidence in protein identification certainly increases through replication and independent identification of the same proteins [25, 26]. Moreover, the so-called false discovery rate (FDR) used in proteomics, based on the empirical model, has been shown to disagree with classical statistical analyses by many orders of magnitude and to incorrectly reject well known blood proteins, including albumin, resulting in a large type II error (false negative) of protein identification [10, 27]. Comparing the goodness-of-fit scores of authentic MS/MS spectra versus random or noise spectra [27, 28] shows that the score distributions of real spectra correlations can be easily separated from false positive results [27, 28] so long as the data are collected with a high signal to noise ratio [10–12].
Confidence is known to increase with the number of peptides identified from each protein [25, 26] and so the peptide to protein distribution of a data set is a simple means to infer the statistical reliability of the data [10–12]. Previously, control experiments using random libraries and random or noise spectra showed that roughly 88% of false positive proteins were identified by 1 peptide, about 11% of false positive hits show 2 peptides, and about 1% of false positive hits show 3 or more peptides. Experiments with random libraries of amino acid sequences, decoy libraries and random spectra all agree that proteins observed with at least 3 peptides show a false positive rate (type I error) of less than 1% [25, 27, 28]. LC-ESI-MS/MS data sets that have been recorded with a high signal to noise ratio in replicated experiments show up to tens or hundreds of peptide detections per protein and thus a low probability of false positive results. In contrast, the peptide to protein distribution of noise spectra was not different from random spectra [27, 28]. An emphasis on replication, and good agreement between independent experiments, is a pragmatic method to provide confidence in sensitive ion trap data.
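The peptide to protein distribution described above can be tabulated in a few lines. This is a minimal sketch, assuming the identifications are available as (peptide, protein) pairs; the function names and the toy data are hypothetical, and the threshold of 3 distinct peptides is the one discussed in the text.

```python
from collections import Counter


def peptide_to_protein_distribution(peptide_protein_pairs):
    """Return distinct peptides per protein and the distribution of those counts."""
    distinct_pairs = set(peptide_protein_pairs)  # ignore repeat detections
    peptides_per_protein = Counter(protein for _, protein in distinct_pairs)
    distribution = Counter(peptides_per_protein.values())
    return peptides_per_protein, distribution


def proteins_with_min_peptides(peptides_per_protein, minimum=3):
    """Proteins supported by at least `minimum` distinct peptides."""
    return sorted(p for p, n in peptides_per_protein.items() if n >= minimum)


# Toy identifications, for illustration only.
pairs = [
    ("PEPA", "prot1"), ("PEPB", "prot1"), ("PEPC", "prot1"),
    ("PEPA", "prot1"),  # repeat detection of the same peptide
    ("PEPD", "prot2"),
    ("PEPE", "prot3"), ("PEPF", "prot3"),
]
per_protein, distribution = peptide_to_protein_distribution(pairs)
confident = proteins_with_min_peptides(per_protein)
```

Here `distribution` maps a peptide count to the number of proteins showing that count, which is the quantity compared against random or noise expectation.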
Analysis of the data sets
The database of HuPO plasma proteome results and all previously published blood results were obtained as previously described. The HuPO consortium raw data was obtained from the TRANCHE website and analyzed with X!TANDEM [29, 30]. The resulting proteins were detected from independent experiments using different instruments and/or correlation algorithms including X!TANDEM, MASCOT, OMSSA or others and SEQUEST [31, 32]. The advantages and disadvantages of these approaches have previously been considered and the statistical reliability of the data set has previously been estimated. The protein and peptide sequences reported, or where required the accession numbers, were used to map all of the previous blood fluid results to the federated protein library to create the FDBP.
The federated sum of the peptide and protein sequences reported, and the number of observations, were calculated using Structured Query Language (SQL). The SQL Server database system was installed on a personal computer (PC) using the Windows Server operating system to create the FDBP. The redundant and distinct protein sequences were determined using SQL. The resulting FDBP was subsequently analyzed on a 64 bit PC with SQL Access as previously described.
The BLAST (Basic Local Alignment Search Tool) algorithm provides a standardized way to search for homologous regions of proteins and quantify the level of similarity within a set of proteins; BLAST was downloaded from NCBI and installed on a PC. A subset of blood protein results was organized into a BLAST matrix for the purpose of comparing BLAST to the automatic features of SQL. The results of the BLAST analysis for each sequence were also captured in the SQL database alongside the corresponding proteins. The top scoring BLASTp alignment for every query protein was considered a “match” if the identity was greater than 75% of the full length and if the alignment contained a perfect match string of at least 20 amino acids.
The SQL database of the peptides and proteins mapped to the federated protein library was explored using SAS to generate the graphs provided. The data were graphed in SAS JMP after sorting the parameter(s) of interest in descending order using the Tables - Sort menu. The resulting data were plotted versus increasing protein number using the graphic features inherent to SAS. The peptide to protein distributions of the blood data compared to that of randomized libraries or randomized spectra or noise indicate that the proteins detected by three or more different peptides show a low expectation of false positive results, ≤ 1% FDR [10, 23, 27–29]. The graphs were exported from SAS as rich text format files. The data were analyzed on a 64 bit PC with SAS JMP as previously described. After comparing BLAST versus SQL we imported the peptides, proteins, and ion characteristics including intensity values from Liu et al. 2007 and included these in the SQL database only for summary of the FDBP.
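The redundant and distinct counts can be expressed with ordinary SQL aggregate queries. The sketch below uses Python's built-in `sqlite3` module as a stand-in for the SQL Server installation described above; the table name, column names and rows are hypothetical and serve only to show the shape of the queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE identifications ("
    "source TEXT, protein_sequence TEXT, peptide_sequence TEXT)"
)

# Toy rows: one row per reported peptide-to-protein correlation.
rows = [
    ("study_A", "MKWVTFISLL", "LVNEVTEFAK"),
    ("study_B", "MKWVTFISLL", "LVNEVTEFAK"),
    ("study_B", "MKWVTFISLL", "YLYEIAR"),
    ("study_C", "MAAGGGKHHH", "AEFAEVSK"),
]
conn.executemany("INSERT INTO identifications VALUES (?, ?, ?)", rows)

# Redundant count: every reported correlation, including repeats.
redundant_count = conn.execute(
    "SELECT COUNT(*) FROM identifications"
).fetchone()[0]

# Distinct counts: each peptide or protein sequence counted once.
distinct_peptides = conn.execute(
    "SELECT COUNT(DISTINCT peptide_sequence) FROM identifications"
).fetchone()[0]
distinct_proteins = conn.execute(
    "SELECT COUNT(DISTINCT protein_sequence) FROM identifications"
).fetchone()[0]

# Number of observations per peptide, most frequently observed first.
observations = conn.execute(
    "SELECT peptide_sequence, COUNT(*) AS n FROM identifications "
    "GROUP BY peptide_sequence ORDER BY n DESC, peptide_sequence"
).fetchall()
```

The same `COUNT`, `DISTINCT` and `GROUP BY` constructs are part of standard SQL, so the queries should carry over to other database engines with little change.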
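The match criterion can be written as a small test applied to an alignment. The following is a sketch under stated assumptions rather than the code used in this study: it takes the two gapped strings of a pairwise alignment as input, and it interprets "identity greater than 75% of the full length" as the number of identical aligned positions divided by the full query length. The function names and example sequences are hypothetical.

```python
def longest_identical_run(aligned_query, aligned_subject):
    """Length of the longest uninterrupted stretch of identical residues."""
    longest = current = 0
    for q, s in zip(aligned_query, aligned_subject):
        if q == s and q != "-":
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def is_match(aligned_query, aligned_subject, query_length,
             min_fraction=0.75, min_run=20):
    """Apply the two-part criterion: identity fraction and perfect match string."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned strings must have equal length")
    identical = sum(
        1 for q, s in zip(aligned_query, aligned_subject) if q == s and q != "-"
    )
    fraction = identical / query_length
    run = longest_identical_run(aligned_query, aligned_subject)
    return fraction > min_fraction and run >= min_run


# Toy 30-residue query, for illustration only.
query = "ACDEFGHIKLMNPQRSTVWYACDEFGHIKL"
subject_end_change = query[:-1] + "M"            # one mismatch at the last residue
subject_mid_change = query[:14] + "A" + query[15:]  # one mismatch in the middle

end_result = is_match(query, subject_end_change, len(query))
mid_result = is_match(query, subject_mid_change, len(query))
```

Both toy subjects are about 97% identical to the query, but only the first retains a perfect string of 20 or more residues, which shows why the second condition is not redundant with the first.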
The peptide and protein sequences were used as queries to obtain the protein descriptions from RefSeq, Ensembl, Swiss Prot and other protein libraries that were concatenated with semicolons. Similarly BLAST was used to obtain the molecular identity or molecular functions, biological processes and sub cellular localization of the proteins found in human blood where gene ontology terms (GO) and gene symbols were available. The peptide and protein sequences along with the accession numbers, GO terms and the results of bio-informatic calculations from the GO and BLAST analysis are organized and stored together.
The cellular factors of specific interest from the FDBP are made available in the Additional files. The cellular proteins of blood fluids listed in the FDBP are presented in a graphical form using the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING)  that automatically generated a list of gene symbols from the protein descriptions provided in the FDBP. However, more complete information can be found in the Additional files.
Comparison of SQL versus BLAST analysis
We compared the BLAST analysis of amino acid sequences versus SQL analysis at the level of peptides and proteins on a large subset of the data to compare the methods and characterize the distributions of blood protein data. BLAST analysis of a set of 44,019 reported proteins, at a standard of 75% full length and 20 contiguous amino acids, compressed these to a set of 17,506 protein types. Among the proteins, 14,224 had no close homologues in the reference library of protein sequences; the remaining proteins occurred at least twice in the FDBP. After compression by BLAST, a set of 7,707 protein types were detected by at least three peptides. Based on the BLAST analysis, the available annotation such as descriptions, GO data and accession numbers could be connected with the appropriate database entry.
[Figure panels: homology expectation value; sequence gap analysis; protein alignment length; BLAST percent identity]
SQL analysis is based on the peptide or protein sequences. Liquid chromatography, coupled to electrospray ionization with tandem mass spectrometry can identify thousands of protein types, but there may be ambiguity in the results when there is a low level of peptide coverage and the peptides are shared by more than one protein. A total of 75,432 peptides produced a list of 57,784 peptides after the removal of duplicates using the distinct function of SQL. However, some of these peptides represented smaller pieces of other peptides and removal of these subsets of peptides gave 50,452 unique peptide sequences.
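The two reduction steps, removal of exact duplicates followed by removal of peptides that are wholly contained in a longer peptide, can be sketched as follows. This is an illustrative stand-in for the SQL operations described above, with a hypothetical function name and toy peptides; the pairwise comparison is quadratic and would need indexing for tens of thousands of peptides.

```python
def remove_duplicates_and_subsets(peptides):
    """Keep distinct peptides that are not substrings of a longer kept peptide."""
    # Longest first, so any containing peptide is seen before its fragments.
    distinct = sorted(set(peptides), key=lambda p: (-len(p), p))
    kept = []
    for peptide in distinct:
        if not any(peptide in longer for longer in kept):
            kept.append(peptide)
    return kept


# Toy peptide list, for illustration only.
reported = ["LVNEVTEFAK", "VNEVTEF", "LVNEVTEFAK", "YLYEIAR"]
unique_peptides = remove_duplicates_and_subsets(reported)
```

In this toy case four reported peptides reduce to three distinct sequences and then to two unique sequences, mirroring the 75,432 to 57,784 to 50,452 reduction reported above.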
[Figure panels: redundant proteins by SQL; distinct proteins by SQL]
Unique or characteristic peptide sequence summary by SQL
There are many methods that can be used to estimate the vital statistics of the blood proteome, and perhaps the most conservative method would be to consider only proteins identified by at least one peptide that is unique to that protein and not characteristic of any other protein. An analysis of all the data reveals a set of 91,373 peptides from published studies on human serum/plasma of which 12,130 proteins that were detected by at least one unique peptide not shared with other proteins and of these 3858 had a total of at least three peptide identifications, conferring near certain molecular identity.
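The conservative summary described here can be sketched as a mapping of each distinct peptide to the library proteins that contain it. This is a minimal sketch only: the function name, protein identifiers and toy sequences are hypothetical, and "three peptide identifications" is interpreted as three distinct peptides per protein.

```python
def summarize_unique_peptides(peptides, library, min_total=3):
    """Return proteins with >= 1 unique peptide, and those also having >= min_total peptides.

    `library` maps a protein identifier to its amino acid sequence.
    """
    total = {}   # distinct peptides matching each protein
    unique = {}  # peptides matching this protein and no other
    for peptide in set(peptides):
        hits = [pid for pid, seq in library.items() if peptide in seq]
        for pid in hits:
            total[pid] = total.get(pid, 0) + 1
        if len(hits) == 1:
            unique[hits[0]] = unique.get(hits[0], 0) + 1
    with_unique = sorted(unique)
    confident = sorted(pid for pid in with_unique if total[pid] >= min_total)
    return with_unique, confident


# Toy library and peptides, for illustration only.
toy_library = {
    "P1": "AAAKCCCKDDDKEEEK",
    "P2": "AAAKFFFK",
    "P3": "GGGKHHHK",
}
toy_peptides = ["AAAK", "CCCK", "DDDK", "FFFK", "GGGK"]
with_unique, confident = summarize_unique_peptides(toy_peptides, toy_library)
```

The peptide `AAAK` is shared by two toy proteins and therefore counts toward their totals but not toward their unique peptides, which is the distinction drawn in the text.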
The types of proteins detected in blood/serum
Broad spectrum of proteins detected in blood/serum
The distribution of cell location in the blood protein SQL database
Membrane, integral to membrane,
Integral to membrane,
Plasma membrane, integral to membrane,
Extracellular region, extracellular space,
Ubiquitin ligase complex,
Ubiquitin ligase complex,
Extracellular region, proteinaceous extracellular matrix,
Plasma membrane, integral to plasma membrane,
Integral to plasma membrane, membrane,
Plasma membrane, integral to plasma membrane,
Proteinaceous extracellular matrix,
Endoplasmic reticulum, endoplasmic reticulum membrane, membrane, integral membrane,
Nucleosome, nucleus, chromosome,
Intracellular, nucleus, cytoplasm,
Plasma membrane, integral to membrane,
Membrane fraction, integral to plasma membrane,
Ubiquitin ligase complex, nucleus,
Membrane fraction, integral to plasma membrane, membrane,
The distribution of molecular functions in the blood protein SQL database
Calcium ion binding,
Transcription factor activity,
Structural molecule activity,
Calcium ion binding, protein binding,
DNA binding, zinc ion binding,
DNA binding, zinc ion binding, metal ion binding,
Structural constituent of ribosome,
Nucleic acid binding, zinc ion binding,
Protein binding, zinc ion binding, metal ion binding,
Nucleic acid binding,
Calcium ion binding, protein binding,
GTPase activator activity,
Extracellular matrix structural constituent,
Ubiquitin-protein ligase activity, zinc ion binding,
Serine-type endopeptidase inhibitor activity,
Receptor activity, olfactory receptor activity,
Transcription factor activity, zinc ion binding,
Signal transducer activity,
Nucleotide binding, protein serine/threonine kinase
Nucleotide binding, ATP binding,
Nucleotide binding, RNA binding,
Transcription factor, sequence-specific DNA binding
Nucleotide binding, RNA binding, protein binding,
Protein binding, zinc ion binding,
Structural constituent of cytoskeleton,
Nucleic acid binding, zinc ion binding, metal ion binding,
Molecular_function, protein binding,
RNA binding, protein binding,
Transcription factor activity, RNA polymerase II
DNA binding, protein binding, zinc ion binding,
Actin binding, actin binding, actin binding, structural
Growth factor activity,
The distribution of biological processes in the blood protein SQL database
Transcription, regulation of transcription, DNA-dependent,
Regulation of transcription, DNA-dependent,
Protein amino acid phosphorylation,
Intracellular signaling cascade,
Protein amino acid dephosphorylation,
Multicellular organismal development,
Cell adhesion, homophilic cell adhesion,
Signal transduction,G-protein coupled receptor protein
mRNA processing, RNA splicing,
Cell adhesion, homophilic cell adhesion,
Small GTPase mediated signal transduction,
Cation transport, calcium ion transport,
Cell surface receptor linked signal transduction,
Signal transduction, G-protein coupled receptor protein factor
Signal transduction, G-protein coupled receptor protein
Nucleosome assembly, chromosome organization
Muscle contraction, cytoskeletal anchoring, development
Carbohydrate metabolic process,
Intracellular protein transport,
DNA binding factors and transcription factors
Chromatin remodelling and nucleic acid modification enzymes
Zinc finger proteins and RNA proteins
SNAPS, SNARES, kinesin, secretion apparatus and exosome components
Cellular receptors and signal transduction factors
Cytokines, chemokines, interleukins and tumor necrosis factor receptors and binding proteins
Growth factor receptors and binding proteins
A number of strategies have been suggested for the analysis of proteomic data that require proteomic-specific data storage and analysis platforms. The data analysis strategies generally advocate the storage of raw data in xml or text files with proteomics-specific software routines to manage, analyze and summarize the data [45–47]. In contrast, we proposed that the generic data analysis systems such as SQL, BLAST, and a generic statistical analysis system (SAS) may be used to organize and analyze proteomic data, using this broadly available and well tested software [10, 12, 20, 23, 29, 48]. When considering the choice of a data analysis system, it is important to note the differences in proteomic versus genomic data. Genomic data is the linear character sequence of about 3 billion base pairs. The proteome corresponds to only about 1% of the genome, comprising 22,000 protein types to date. In addition, there is a 3 to 1 compression of the data from bases to amino acids and so the protein sequence data is no more than 0.3% that of the genome. In many instances, only a few representative peptides have been recorded from each protein, and so the sequence data collapses to less than 0.1% of the genome sequence. However, individual peptides may be detected repetitively and these detections can be stored as numeric information. Hence proteomics data sets will contain at least a thousand fold less sequence information than genomic databases but have much more numerical data including m/z values and continuous intensity values from the parent and fragment ions [10, 11]. The large amount of continuous fragment m/z and intensity data must be connected to the relatively small amount of protein and peptide sequences or masses [M+H], that are ordinal or nominal variables, in order to compute the differences in intensity values over treatments [10, 12, 20, 23, 29, 48]. 
The ion intensity data must be linked to the protein, peptide, and m/z information in a format that will permit immediate statistical analysis by generic routines [10–12].
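The proportions quoted in this comparison follow from simple arithmetic. The short calculation below reproduces them as a rough check, using only the round figures given in the text (about 3 billion bases, about 1% coding, three bases per amino acid); the variable names are illustrative.

```python
GENOME_BASES = 3_000_000_000   # approximate genome length in base pairs
BASES_PER_AMINO_ACID = 3       # one codon encodes one amino acid

# About 1% of the genome corresponds to the proteome.
coding_bases = GENOME_BASES // 100

# Three bases compress to one amino acid character.
amino_acids = coding_bases // BASES_PER_AMINO_ACID

# Protein sequence characters as a percentage of genome characters.
protein_percent_of_genome = 100 * amino_acids / GENOME_BASES
```

The result is about 0.33%, in line with the figure of roughly 0.3% given above, and any data set that records only a few peptides per protein falls well below it.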
Analytical error in protein identification
When a highly purified protein is analyzed by LC-MS/MS it is sometimes possible to achieve complete sequence coverage and therefore unambiguous identification between highly related sequences. However, when many proteins are identified and quantified simultaneously, the peptide coverage of each protein is not complete and so there may be more than one protein sequence that matches the detected peptides. In some cases, where only a few peptides are detected, there may be no way to rule out related proteins without subsequent investigation. Most proteomic scientists support the concept of creating large databases of proteins from different sources, but there are no universally accepted processes for creating such databases. We have chosen to collect data on serum/plasma proteins from several published sources to create a FDBP that depends on the veracity of the methods used to collect, combine and analyze the data to avoid the pitfalls that might spuriously incorporate inappropriate molecules into the FDBP. The proteins of human blood have been separated by various methods, including a variety of chromatographic strategies for separation prior to ionization, and the MS/MS spectra were collected with commercially available quadrupole or ion trap instruments [23, 29]. Together these methods yield a large number of peptides correlated to a small number of proteins, in sharp contrast to random expectation. It has been suggested, based on random protein libraries, that three peptides may be a reasonable standard to limit false positive rates in protein databases. Calculations based on random spectra or noise spectra also indicate that the false positive rate (type I error) estimated from peptide-to-protein distributions was in agreement with independent goodness of fit tests of the protein spectra and graphical approaches [25, 27, 28].
BLAST versus SQL
Less than half of the human blood proteins have closely related homologs in the FDBP and there may be partial sequences, splice variants, or allelic forms of proteins in the FDBP that differ by at least one amino acid. In some instances, it may be desirable to collapse proteomic data to a set of unique or representative protein sequences. It has been previously shown that the protein databases can be collapsed into a smaller number of protein types by listing all perfect subsets under the longest representative sequence, or collapsing the proteins using BLAST analysis with similar results [10–12, 20, 23, 29]. The BLAST criterion of 75% full length homology and 20 contiguous amino acids [20, 23, 29] is a standard method to assign structural identity to different proteins. The standard BLAST algorithm can be used to quantify the extent of identity between proteins and domains and to annotate them. We conclude that the BLAST algorithm is a sufficient metric to distinguish protein types and that it flexibly discovers relationships between proteins, which can then be directly captured in an SQL database. The strategy of parsing control or test data sets, using existing software, such as SQL, for assembly prior to BLAST or statistical analysis by generic routines like SAS or R, may be appropriate for annotation and quantitative analysis of proteomic data. Moreover, as shown here, the results of BLAST can be conveniently stored in the same SQL source database, dramatically simplifying the organization and increasing archival quality of the data, with convenient analysis and graphical presentation using generic statistical analysis software such as SAS. We conclude that up to 17,506 protein types might exist in blood based on homology to proteins where peptides were obtained, and of these about 7,707 are identified with reasonable certainty. The redundant protein count is the number of times any MS/MS spectrum has been correlated to a peptide from any protein sequence(s).
The same peptide, and therefore protein(s), may be detected in replicate experiments and the redundant peptide count provided yields an estimate of the relative levels of detection. Some peptides are found in protein sequences that are identical between protein libraries, and the many equivalent library accession numbers may be concatenated with semicolons, for convenience, without losing information. Multiple protein sequences that are exactly the same can be eliminated by SQL with a simple automated function to yield a distinct protein list of all implicated proteins that differ by at least 1 amino acid in the protein sequence. Hence the redundant versus distinct peptide and protein counts, which gave 10,138 distinct proteins with 3 peptides, are convenient and easily reproducible metrics of the relative levels of detection and the number of potential proteins using commonly available software. Considered together, the direct comparison of BLAST versus SQL indicates that about 70% of the proteins detected in blood by three peptides or more have no other close homologues in circulation, while a minority of proteins may have other similar protein variants, isoforms or related sequences in circulation.
Unique or characteristic peptide sequence analysis
Some fourteen thousand of the reported serum/plasma proteins map to only one distinct protein sequence that cannot be related to any other protein by BLAST, but these proteins can still be summarized at the peptide and protein level using SQL. Moreover, it is important to remember that mass spectrometers most typically detect peptides and not proteins. Thus a summary on the basis of unique peptides that can be unambiguously analyzed by LC-ESI-MS/MS is a meaningful metric for mass spectrometry experiments. If we accept the set of proteins detected by at least one unique or characteristic peptide not found in any other protein, a list of 12,130 proteins is apparently present in the blood, and from these a conservative estimate of 3,858 proteins in the blood with reasonable certainty was obtained.
Biological sources of error
It seems unlikely that cellular proteins observed with three or more peptides, and in agreement between different research groups, could be identified erroneously. However, it remains possible that at least some of these proteins could be released from cells during blood collection or processing. Some of the observed blood proteins may have been released from the site of wounding and diffused into the blood from the damaged skin tissue or cells. The activation and degranulation of blood cells is known to sometimes occur during the formation of serum and might release the contents from cells that burst during blood clotting. Red blood cells are anucleate and so they might not seem like a rich source of nuclear factors. Similarly, platelets are anucleate and so, at least superficially, they are an unlikely source of DNA remodeling enzymes and transcription factors. Direct measurements of secreted platelet proteins by LC-MS make little mention of such cellular factors except for well-known secreted proteins such as 14-3-3 proteins and a single PI3K isoform and a few other similar proteins [50, 51]. It is known that neutrophils and potentially other blood cells use expelled DNA as a net or snare to entrap bacteria. It remains possible that white blood cell degranulation during processing results in expulsion of nucleic acids and their binding proteins.
Analysis of the proteins released from leukocytes was used to rule out the degranulation of white blood cells during collection as the source of the transcription factors and other nuclear proteins in the blood. We tested the hypothesis that the observed transcription factors, receptors, signaling enzymes, DNA remodeling and other signaling proteins observed in the FDBP were merely secreted by white blood cells during degranulation. To test whether DNA binding factors and other cellular proteins were released from white cells, human neutrophils were isolated and degranulation was stimulated with the combination of cytochalasin B and the bacterial peptide fMLP. The results of the neutrophil stimulation experiment showed that very few of the observed cellular factors in blood were secreted from these abundant white blood cells during degranulation (not shown). The abundance of cellular and nuclear materials in plasma samples seems to indicate that a very efficient system for releasing proteins from cells, such as secretion or the release of exosomes, must be present to account for such a large concentration of so many proteins [7, 8, 35, 53, 54].
Utility of the federated database of blood proteins
The FDBP will be useful only if the data are reliable and easy to search or to manipulate. The above paragraphs give the reasons for believing that highly reliable data may be derived from the FDBP. To make the FDBP easily useful, we placed all of the data in a SQL database to permit analysis of the data. The generic SQL and SAS system can also be used to capture, organize and analyze the results of bioinformatic algorithms such as BLAST or the results of GO term analysis, as shown here. The FDBP contains the BLAST and GO term data for the proteins listed that can be rapidly and conveniently summarized by a generic statistical analysis system such as R or SAS . The results of the many additional calculations are also made available in the provided excerpts of SQL databases where the data may be analyzed and graphically presented with SAS. The generic data systems SQL and SAS are sufficient to analyze proteomics data and can derive the necessary attributes and distributions of the data.
A further capacity to provide the calculated parent and fragment m/z values for the peptides in the FDBP is a significant advantage in designing experiments for unambiguous identification and quantification by precise mass spectrometric methods [10–12]. The mapping of the peptides to the different protein sequences in the FBPD will help to interpret proteomic results and for the planning of experiments to make unambiguous protein determinations. Comparing the attributes between the different related sequences or subsequences may be informative and so collapsing the data into one representative protein from each protein type may result in the loss of valuable information. Where a feature of interest is discovered in the data that span several similar, but distinct protein sequences, it is a simple task to determine if the data available support the presence of one or more related proteins, and which peptides are unique to each protein, on a case by case basis in SQL so long as all data is made available. A separate intensity or frequency calculation can be made for each different protein sequences regardless of homology to other proteins [10–12, 20, 23, 29]. Where such discrimination between partial sequences, splice variants, predicted proteins or allelic forms is made by subsequent experiments, it will first be required to compare all of the protein sequences together in the same database to look for sequences unique to specific proteins.
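Calculated parent and fragment m/z values of the kind mentioned above can be derived directly from a peptide sequence. The following is a minimal sketch, assuming unmodified residues, standard monoisotopic residue masses and singly charged b and y fragment ions; it is not the routine used to populate the FDBP, and the function names and example peptide are illustrative.

```python
# Monoisotopic residue masses in daltons (unmodified amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # mass of H2O added for the free N and C termini
PROTON = 1.007276   # mass of the charge-carrying proton


def peptide_mass(peptide):
    """Neutral monoisotopic mass M of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def parent_mz(peptide, charge=1):
    """m/z of the parent ion [M + zH] carrying `charge` protons."""
    return (peptide_mass(peptide) + charge * PROTON) / charge


def fragment_mz(peptide):
    """Singly charged b and y ion m/z values for each backbone cleavage.

    Entry i of each list comes from the same cleavage site, so the two
    ions at a given index are complementary.
    """
    b_ions, y_ions = [], []
    for i in range(1, len(peptide)):
        b_ions.append(sum(RESIDUE_MASS[aa] for aa in peptide[:i]) + PROTON)
        y_ions.append(
            sum(RESIDUE_MASS[aa] for aa in peptide[i:]) + WATER + PROTON
        )
    return b_ions, y_ions


example = "LVNEVTEFAK"  # illustrative tryptic peptide
mh1 = parent_mz(example, charge=1)
mh2 = parent_mz(example, charge=2)
b_series, y_series = fragment_mz(example)
```

Fixed or variable modifications, such as alkylated cysteine, would shift these values and are deliberately left out of the sketch.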
The limit of quantification of an LC-ESI-MS/MS experiment for a pure compound is typically about 100 fmol to 1 pmol injected on column. Testing purified protein digests on an LC-ESI-MS/MS system running at 2 μl per minute via electrospray into an ion trap showed that 10 fmol of standard proteins may be reproducibly and confidently identified, that 1 fmol of peptide on column seems to be at the detection limit, and that 100 amol of digest on column was typically beyond the sensitivity of a simple LC-ESI-MS/MS method for automatic identification [19, 55]. Based on these estimates of system sensitivity, we can calculate the range of concentrations required for the above-mentioned regulatory proteins to be detected in the approximate volume of serum/plasma used in the LC-MS experiments summarized here. Since the plasma proteins were apparently detectable by LC-ESI-MS/MS, there must be at least 1 to 10 fmol of the serum/plasma peptide on the column for identification by a simple ion trap. Anderson and Anderson estimated that the concentration of proteins that leak from tissue and diffuse from cells could reach the nanogram per ml range in blood. A protein with a mass of 50,000 Da present at 1 ng per ml has a concentration of about 20 pM. Therefore, in order to detect a protein in the 1 ng per ml range in blood, a starting sample of tens to hundreds of microlitres of blood would have to be efficiently captured and fractionated to deliver 1 to 10 fmol in a single discrete fraction; this is within detection limits and agrees with the sample sizes used in some of the studies cited here. These calculations are consistent with previous observations that proteins known to be present at levels at least as low as 1 ng/ml have been observed by mass spectrometry from a sample volume in the order of tens to hundreds of microlitres [19, 55].
From these calculations, we infer that proteins in the ng/ml, or roughly picomolar, range are near the limit of robust detection by electrospray with a simple ion trap in an unbiased LC-MS experiment after a simple chromatographic pre-fractionation of small samples, and this estimate has been confirmed. Protein biomarkers known to be in the range of 1 ng/ml, such as thyroglobulin and others, have been repeatedly detected by mass spectrometry [19, 55].
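The arithmetic behind these estimates can be reproduced with a short sketch. The 50,000 Da mass and the 1 ng per ml concentration come from the text; the two sample volumes (50 and 500 μl) are assumed points within the stated range of tens to hundreds of microlitres, and complete recovery of the protein into a single fraction is assumed, which a real experiment is unlikely to achieve.

```python
NG_PER_ML_TO_G_PER_L = 1e-6  # 1 ng/ml = 1e-9 g per 1e-3 L


def molar_concentration(mass_conc_g_per_l, molar_mass_g_per_mol):
    """Convert a mass concentration (g/L) to a molar concentration (mol/L)."""
    return mass_conc_g_per_l / molar_mass_g_per_mol


def moles_in_volume(molar, volume_l):
    """Amount of substance (mol) contained in a given volume (L)."""
    return molar * volume_l


# 50,000 Da protein at 1 ng/ml, as in the text.
conc_molar = molar_concentration(1 * NG_PER_ML_TO_G_PER_L, 50_000)

# Assumed sample volumes; 100% recovery into one fraction is also assumed.
fmol_in_50ul = moles_in_volume(conc_molar, 50e-6) / 1e-15
fmol_in_500ul = moles_in_volume(conc_molar, 500e-6) / 1e-15

print(f"{conc_molar / 1e-12:.1f} pM")        # about 20 pM
print(f"{fmol_in_50ul:.1f} fmol in 50 ul")   # about 1 fmol
print(f"{fmol_in_500ul:.1f} fmol in 500 ul") # about 10 fmol
```

Under these assumptions, 50 to 500 μl of sample delivers roughly 1 to 10 fmol, which matches the on-column amounts quoted above for identification by a simple ion trap.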
Cellular proteins in serum/plasma
Tissue or cell leakage, secretion or release of membrane-bound exosomes have been proposed as the pathways by which cellular proteins, such as nucleic acid binding proteins, might reach the plasma. It now appears that there are significant amounts of intact nucleic acid strands in plasma and that enough fetal DNA is released into the blood stream of a pregnant mother to provide a draft fetal genome sequence. The existence of nucleic acid polymers in plasma probably leads to the presence of their binding proteins in circulation. Nucleic acid binding proteins such as histones and high mobility group proteins have previously been detected in serum/plasma at concentrations as high as 1 to 40 ng/ml, using Western blot and ELISA [58–62]. The cytokine receptors or growth factor receptors that are known to exist in serum/plasma in the ng per ml range were detected by LC-MS. In contrast, there were no detections of cytokines or growth factors, which exist at pg per ml levels. From these observations, we conclude that the concentrations of the proteins effectively detected in serum/plasma by mass spectrometry closely match the established detection limits of the LC-MS systems referred to in this review [10, 11, 19, 42–44]. The presence of many serum/plasma proteins associated with circulation or transport functions, proteolysis and metabolic processes agrees with traditional views of the circulating proteins [5, 6]. However, our meta-analysis showed that nuclear proteins with a role in DNA transcription and/or RNA binding or metabolism, as well as proteins associated with signal transduction from the plasmalemma receptor pathways, were identified with impressive agreement between different groups, confirming previous reports [3, 19, 41, 63, 64].
The collection of protein and peptide information, along with the proteins’ cellular locations and molecular functions, together with expression patterns in differentiated tissues and cells, provides a powerful means for elaborating hypotheses about potential biomarkers in serum/plasma before validating them by targeted assays. The detection by mass spectrometry of zinc finger and other nucleic acid binding domains known to bind RNA [3, 19] correlates with the recent discovery of circulating RNA in blood. Purified exosomes from blood may also contain nucleic acids and their binding proteins. It may be possible to identify and quantify the presumably non-coding RNA (ncRNA) in the plasma, which would help shed light on the function of the RNA binding proteins detected in blood by mass spectrometry. Complexes of nucleic acids and proteins, including histones, are estimated to circulate at the level of several hundreds of nanograms per ml. It has been suggested that modified histones, complexed with nucleic acids in the plasma, may be biomarkers of cancer [59, 60]. High mobility group proteins and histones may be secreted by cells in response to immunological activation and have been reported to be biomarkers of lupus or other diseases, reaching concentrations as high as 40 ng/ml in blood [58–60]. If relatively non-specific DNA binding proteins such as HMG or histones may serve as biomarkers, other more specific nucleic acid binding proteins might also have some clinical significance. However, for any plasma protein to serve as a reliable biomarker, it is important that the blood collection and pre-analytical procedures be standardized and documented.
Towards the definitive analysis of blood proteins
The reliability of the data presented here was previously established using a variety of statistical methods, including distribution between NP versus XP protein libraries, agreement between the data sets, peptide to protein distributions, and non-random distribution of the data over specific GO term categories, in agreement with expectation values from goodness of fit tests of the MS/MS spectra [20, 23, 29]. The peptide to protein distribution of the database in toto is consistent with the veracity of the correlation algorithms used by the different research groups [23, 29]. To date, it seems that LC-ESI-MS/MS of serum/plasma has revealed a total of 12,130 proteins detected with at least 1 unique or characteristic peptide not found in any other sequence, and of these 3858 showed reasonable certainty. In contrast, 7,707 high confidence blood proteins were calculated by BLAST. The related protein sequences can be analyzed using SQL or BLAST, but this additional level of collapse is not necessarily required to make comparisons of detection frequency by Chi square or mean ion intensity by ANOVA to detect proteins of potential interest. At present, routine monitoring of proteins in blood requires the use of monoclonal antibodies for standard ELISA assays. However, there are not sufficient immunological reagents to confirm the majority of the blood proteins discovered to date by mass spectrometry. The limit of detection of mass spectrometry may rival that of ELISA after reproducible partition chromatography, and this level of sensitivity has been confirmed. Many blood proteins discovered by mass spectrometry using sensitive ion traps are near or below the present quantification limits of ELISA or LC-ESI-MS/MS for routine analysis.
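A comparison of detection frequency by Chi square, as mentioned above, could be sketched as follows for a single protein observed in two groups of samples. This is a generic Pearson chi-square test on a 2 × 2 table without continuity correction, not the SAS procedure used by the authors; the counts are invented for illustration, and 3.841 is the standard critical value for one degree of freedom at the 0.05 level.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the table

                 detected   not detected
        group 1     a            b
        group 2     c            d
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Critical value of the chi-square distribution, 1 degree of freedom, alpha = 0.05.
CRITICAL_DF1_ALPHA_05 = 3.841

# Invented counts: detected in 18 of 20 samples in group 1, 8 of 20 in group 2.
stat = chi_square_2x2(18, 2, 8, 12)
differs = stat > CRITICAL_DF1_ALPHA_05

print(f"chi-square = {stat:.3f}, differs at 0.05 level: {differs}")
```

With small expected counts an exact test would usually be preferred, and testing many proteins at once would require a correction for multiple comparisons.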
Extensive experimentation based on affinity reagents and/or mass spectrometry will be required to establish the protein or peptide biomarkers of blood with the acceptable standard of certainty provided by three independent biophysical or biochemical methods in agreement.
JGM gratefully acknowledges the receipt of an award for a Scientist mobility grant from the Fonds National de la Recherche Luxembourg under the auspices of CRP Santé. The authors thank Robert A. Phillips for his helpful comments, suggestions and discussion.
- Kulasingam V, Smith CR, Batruch I, Buckler A, Jeffery DA, Diamandis EP: “Product Ion monitoring” assay for prostate-specific antigen in serum using a linear Ion-trap. J Proteome Res. 2008, 7: 640-647. 10.1021/pr7005999
- Dunham I, Kundaje A, Aldred SF, Collins PJ, Davis CA, Doyle F, Epstein CB, Frietze S, Harrow J, Kaul R, Khatun J, Lajoie BR, Landt SG, Lee BK, Pauli F, Rosenbloom KR, Sabo P, Safi A, Sanyal A, Shoresh N, Simon JM, Song L, Trinklein ND, Altshuler RC, Birney E, Brown JB, Cheng C, Djebali S, Dong X, Ernst J: An integrated encyclopedia of DNA elements in the human genome. Nature. 2012, 489: 57-74. 10.1038/nature11247
- Marshall J, Jankowski A, Furesz S, Kireeva I, Barker L, Dombrovsky M, Zhu W, Jacks K, Ingratta L, Bruin J, Kristensen E, Zhang R, Stanton E, Takahashi M, Jackowski G: Human serum proteins preseparated by electrophoresis or chromatography followed by tandem mass spectrometry. J Proteome Res. 2004, 3: 364-382. 10.1021/pr034039p
- States DJ, Omenn GS, Blackwell TW, Fermin D, Eng J, Speicher DW, Hanash SM: Challenges in deriving high-confidence protein identifications from data gathered by a HUPO plasma proteome collaborative study. Nat Biotechnol. 2006, 24: 333-338. 10.1038/nbt1183
- Putnam F: The plasma proteins: structure function, and genetic control. 1975, second ed, New York: Academic
- Tietz: Tietz fundamentals of clinical chemistry. 2001, Philadelphia, USA: Saunders
- Lai RC, Arslan F, Lee MM, Sze NS, Choo A, Chen TS, Salto-Tellez M, Timmers L, Lee CN, El Oakley RM, Pasterkamp G, de Kleijn DP, Lim SK: Exosome secreted by MSC reduces myocardial ischemia/reperfusion injury. Stem Cell Res. 2013, 4: 214-222.
- Al-Nedawi K, Szemraj J, Cierniewski CS: Mast cell-derived exosomes activate endothelial cells to secrete plasminogen activator inhibitor type 1. Arterioscler Thromb Vasc Biol. 2005, 25: 1744-1749. 10.1161/01.ATV.0000172007.86541.76
- Al-Nedawi K, Meehan B, Micallef J, Lhotak V, May L, Guha A, Rak J: Intercellular transfer of the oncogenic receptor EGFRvIII by microvesicles derived from tumour cells. Nat Cell Biol. 2008, 10: 619-624. 10.1038/ncb1725
- Bowden P, Thavarajah T, Zhu P, McDonell M, Thiele H, Marshall JG: Quantitative statistical analysis of standard and human blood proteins from liquid chromatography, electrospray ionization, and tandem mass spectrometry. J Proteome Res. 2012, 11: 2032-2047. 10.1021/pr2000013
- Florentinus AK, Bowden P, Sardana G, Diamandis EP, Marshall JG: Identification and quantification of peptides and proteins secreted from prostate epithelial cells by unbiased liquid chromatography tandem mass spectrometry using goodness of fit and analysis of variance. J Proteomics. 2012, 75: 1303-1317. 10.1016/j.jprot.2011.11.002
- Florentinus AK, Jankowski A, Petrenko V, Bowden P, Marshall JG: The Fc receptor-cytoskeleton complex from human neutrophils. J Proteomics. 2011, 75: 450-468. 10.1016/j.jprot.2011.08.011
- Adkins JN, Varnum SM, Auberry KJ, Moore RJ, Angell NH, Smith RD, Springer DL, Pounds JG: Toward a human blood serum proteome: analysis by multidimensional separation coupled with mass spectrometry. Mol Cell Proteomics. 2002, 1: 947-955. 10.1074/mcp.M200066-MCP200
- Shen Y, Jacobs JM, Camp DG, Fang R, Moore RJ, Smith RD, Xiao W, Davis RW, Tompkins RG: Ultra-high-efficiency strong cation exchange LC/RPLC/MS/MS for high dynamic range characterization of the human plasma proteome. Anal Chem. 2004, 76: 1134-1144. 10.1021/ac034869m
- Tirumalai RS, Chan KC, Prieto DA, Issaq HJ, Conrads TP, Veenstra TD: Characterization of the low molecular weight human serum proteome. Mol Cell Proteomics. 2003, 2: 1096-1103. 10.1074/mcp.M300031-MCP200
- Chan K, Lucas DA, Hise D, Schaefer CF, Xiao Z, Janini GM, Beutow KH, Issaq HJ, Veenstra TD, Conrads TP: Analysis of the human serum proteome. Clinical Proteomics. 2004, 1: 101-225. 10.1385/CP:1:2:101
- Shen Y, Kim J, Strittmatter EF, Jacobs JM, Camp DG, Fang R, Tolie N, Moore RJ, Smith RD: Characterization of the human blood plasma proteome. Proteomics. 2005, 5: 4034-4045. 10.1002/pmic.200401246
- Omenn GS, States DJ, Adamski M, Blackwell TW, Menon R, Hermjakob H, Apweiler R, Haab BB, Simpson RJ, Eddes JS, Kapp EA, Moritz RL, Chan DW, Rai AJ, Admon A, Aebersold R, Eng J, Hancock WS, Hefta SA, Meyer H, Paik YK, Yoo JS, Ping P, Pounds J, Adkins J, Qian X, Wang R, Wasinger V, Wu CY, Zhao X: Overview of the HUPO plasma proteome project: results from the pilot phase with 35 collaborating laboratories and multiple analytical groups, generating a core dataset of 3020 proteins and a publicly-available database. Proteomics. 2005, 5: 3226-3245. 10.1002/pmic.200500358
- Tucholska M, Bowden P, Jacks K, Zhu P, Furesz S, Dumbrovsky M, Marshall J: Human serum proteins fractionated by preparative partition chromatography prior to LC-ESI-MS/MS. J Proteome Res. 2009, 8: 1143-1155. 10.1021/pr8005217
- Zhu P, Bowden P, Pendrak V, Thiele H, Zhang D, Siu M, Diamandis EP, Marshall J: Comparison of protein expression lists from mass spectrometry of human blood fluids using exact peptide sequences versus BLAST. Clinical Proteomics. 2007, 2: 185-203.
- Sennels L, Salek M, Lomas L, Boschetti E, Righetti PG, Rappsilber J: Proteomic analysis of human blood serum using peptide library beads. J Proteome Res. 2007, 6: 4055-4062. 10.1021/pr070339l
- Faca V, Pitteri SJ, Newcomb L, Glukhova V, Phanstiel D, Krasnoselsky A, Zhang Q, Struthers J, Wang H, Eng J, Fitzgibbon M, McIntosh M, Hanash S: Contribution of protein fractionation to depth of analysis of the serum and plasma proteomes. J Proteome Res. 2007, 6: 3558-3565. 10.1021/pr070233q
- Bowden P, Beavis R, Marshall J: Tandem mass spectrometry of human tryptic blood peptides calculated by a statistical algorithm and captured by a relational database with exploration by a general statistical analysis system. J Proteomics. 2009, 73: 103-111. 10.1016/j.jprot.2009.08.004
- Keller A, Nesvizhskii AI, Kolker E, Aebersold R: Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search. Anal Chem. 2002, 74: 5383-5392. 10.1021/ac025747h
- Cargile BJ, Bundy JL, Stephenson JL: Potential for false positive identifications from large databases through tandem mass spectrometry. J Proteome Res. 2004, 3: 1082-1085. 10.1021/pr049946o
- Moore RE, Young MK, Lee TD: Qscore: an algorithm for evaluating SEQUEST database search results. J Am Soc Mass Spectrom. 2002, 13: 378-386. 10.1016/S1044-0305(02)00352-5
- Zhu P, Bowden P, Tucholska M, Zhang D, Marshall JG: Peptide-to-protein distribution versus a competition for significance to estimate error rate in blood protein identification. Anal Biochem. 2011, 411: 241-253. 10.1016/j.ab.2010.12.003
- Zhu P, Bowden P, Tucholska M, Marshall JG: Chi-square comparison of tryptic peptide-to-protein distributions of tandem mass spectrometry from blood with those of random expectation. Anal Biochem. 2011, 409: 189-194. 10.1016/j.ab.2010.10.027
- Bowden P, Pendrak V, Zhu P, Marshall JG: Meta sequence analysis of human blood peptides and their parent proteins. J Proteomics. 2010, 73: 1163-1175. 10.1016/j.jprot.2010.02.007
- Craig R, Beavis RC: TANDEM: matching proteins with tandem mass spectra. Bioinformatics. 2004, 20: 1466-1467. 10.1093/bioinformatics/bth092
- Eng JK, McCormack AL, Yates JR: An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. J Am Soc Mass Spectrom. 1994, 5: 979-989.
- Yates JR, Eng JK, McCormack AL, Schieltz D: Method to correlate tandem mass spectra of modified peptides to amino acid sequences in the protein database. Anal Chem. 1995, 67: 1426-1436. 10.1021/ac00104a020
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol. 1990, 215: 403-410.
- von Mering C, Jensen LJ, Snel B, Hooper SD, Krupp M, Foglierini M, Jouffre N, Huynen MA, Bork P: STRING: known and predicted protein-protein associations, integrated and transferred across organisms. Nucleic Acids Res. 2005, 33: D433-D437.
- Hunter MP, Ismail N, Zhang X, Aguda BD, Lee EJ, Yu L, Xiao T, Schafer J, Lee ML, Schmittgen TD, Nana-Sinkam SP, Jarjoura D, Marsh CB: Detection of microRNA expression in human peripheral blood microvesicles. PLoS One. 2008, 3: e3694. 10.1371/journal.pone.0003694
- Nusslein-Volhard C, Wieschaus E: Mutations affecting segment number and polarity in Drosophila. Nature. 1980, 287: 795-801. 10.1038/287795a0
- Yang Y, Lu J, Rovnak J, Quackenbush SL, Lundquist EA: SWAN-1, a caenorhabditis elegans WD repeat protein of the AN11 family, is a negative regulator of Rac GTPase function. Genetics. 2006, 174: 1917-1932. 10.1534/genetics.106.063115
- Jung DJ, Na SY, Na DS, Lee JW: Molecular cloning and characterization of CAPER, a novel coactivator of activating protein-1 and estrogen receptors. J Biol Chem. 2002, 277: 1229-1234. 10.1074/jbc.M110417200
- Chang KH, Chen Y, Chen TT, Chou WH, Chen PL, Ma YY, Yang-Feng TL, Leng X, Tsai MJ, O’Malley BW, Lee WH: A thyroid hormone receptor coactivator negatively regulated by the retinoblastoma protein. Proc Natl Acad Sci U S A. 1997, 94: 9040-9045. 10.1073/pnas.94.17.9040
- Lanner JT: Ryanodine receptor physiology and its role in disease. Adv Exp Med Biol. 2012, 740: 217-234. 10.1007/978-94-007-2888-2_9
- Tucholska M, Scozzaro S, Williams D, Ackloo S, Lock C, Siu KW, Evans KR, Marshall JG: Endogenous peptides from biophysical and biochemical fractionation of serum analyzed by matrix-assisted laser desorption/ionization and electrospray ionization hybrid quadrupole time-of-flight. Anal Biochem. 2007, 370: 228-245. 10.1016/j.ab.2007.07.029
- Zhu P, Bowden P, Zhang D, Marshall JG: Mass spectrometry of peptides and proteins from human blood. Mass Spectrom Rev. 2011, 30: 685-732.
- Shi T, Sun X, Gao Y, Fillmore TL, Schepmoes AA, Zhao R, He J, Moore RJ, Kagan J, Rodland KD, Liu T, Liu AY, Smith RD, Tang K, Camp DG, Qian WJ: Targeted quantification of Low ng/mL level proteins in human serum without immunoaffinity depletion. J Proteome Res. 2013, 12: 3353-3361. 10.1021/pr400178v
- Addona TA, Abbatiello SE, Schilling B, Skates SJ, Mani DR, Bunk DM, Spiegelman CH, Zimmerman LJ, Ham AJ, Keshishian H, Hall SC, Allen S, Blackman RK, Borchers CH, Buck C, Cardasis HL, Cusack MP, Dodder NG, Gibson BW, Held JM, Hiltke T, Jackson A, Johansen EB, Kinsinger CR, Li J, Mesri M, Neubert TA, Niles RK, Pulsipher TC, Ransohoff D: Multi-site assessment of the precision and reproducibility of multiple reaction monitoring-based measurements of proteins in plasma. Nat Biotechnol. 2009, 27: 633-641. 10.1038/nbt.1546
- Jeong SK, Kwon MS, Lee EY, Lee HJ, Cho SY, Kim H, Yoo JS, Omenn GS, Aebersold R, Hanash S, Paik YK: BiomarkerDigger: a versatile disease proteome database and analysis platform for the identification of plasma cancer biomarkers. Proteomics. 2009, 9: 3729-3740. 10.1002/pmic.200800593
- Mueller LN, Brusniak MY, Mani DR, Aebersold R: An assessment of software solutions for the analysis of mass spectrometry based quantitative proteomics data. J Proteome Res. 2008, 7: 51-61. 10.1021/pr700758r
- Mortensen P, Gouw JW, Olsen JV, Ong SE, Rigbolt KT, Bunkenborg J, Cox J, Foster LJ, Heck AJ, Blagoev B, Andersen JS, Mann M: MSQuant, an open source platform for mass spectrometry-based quantitative proteomics. J Proteome Res. 2010, 9: 393-403. 10.1021/pr900721e
- Florentinus AK, Bowden P, Barbisan V, Marshall J: Capture and qualitative analysis of the activated Fc receptor complex from live cells. Curr Protoc Protein Sci. 2012, Chapter 19: Unit 19 22
- Schwertz H, Koster S, Kahr WH, Michetti N, Kraemer BF, Weitz DA, Blaylock RC, Kraiss LW, Greinacher A, Zimmerman GA, Weyrich AS: Anucleate platelets generate progeny. Blood. 2010, 115: 3801-3809. 10.1182/blood-2009-08-239558
- Holly SP, Chen X, Parise LV: Abundance- and activity-based proteomics in platelet biology. Curr Proteomics. 2011, 8: 216-228. 10.2174/157016411797247512
- Senzel L, Gnatenko DV, Bahou WF: The platelet proteome. Curr Opin Hematol. 2009, 16: 329-333. 10.1097/MOH.0b013e32832e9dc6
- Brinkmann V, Reichard U, Goosmann C, Fauler B, Uhlemann Y, Weiss DS, Weinrauch Y, Zychlinsky A: Neutrophil extracellular traps kill bacteria. Science. 2004, 303: 1532-1535. 10.1126/science.1092385
- Looze C, Yui D, Leung L, Ingham M, Kaler M, Yao X, Wu WW, Shen RF, Daniels MP, Levine SJ: Proteomic profiling of human plasma exosomes identifies PPARgamma as an exosome-associated protein. Biochem Biophys Res Commun. 2009, 378 (2): 433-438.
- Simpson RJ, Jensen SS, Lim JW: Proteomic profiling of exosomes: current perspectives. Proteomics. 2008, 8: 4083-4099. 10.1002/pmic.200800109
- Zhu P, Bowden P, Zhang D, Marshall JG: Mass spectrometry of peptides and proteins from human blood. Mass Spectrom Rev. 2011, 30 (5): 685-732.
- Anderson NL, Anderson NG: The human plasma proteome: history, character, and diagnostic prospects. Mol Cell Proteomics. 2003, 2: 50. 10.1074/mcp.A300001-MCP200
- Lo YM, Chan KC, Sun H, Chen EZ, Jiang P, Lun FM, Zheng YW, Leung TY, Lau TK, Cantor CR, Chiu RW: Maternal plasma DNA sequencing reveals the genome-wide genetic and mutational profile of the fetus. Sci Transl Med. 2010, 2 (61): 61ra91.
- Abdulahad DA, Westra J, Bijzet J, Dolff S, van Dijk MC, Limburg PC, Kallenberg CG, Bijl M: Urine levels of HMGB1 in systemic lupus erythematosus patients with and without renal manifestations. Arthritis Res Ther. 14 (4): R184.
- Li Y, Liu B, Fukudome EY, Lu J, Chong W, Jin G, Liu Z, Velmahos GC, Demoya M, King DR, Alam HB: Identification of citrullinated histone H3 as a potential serum protein biomarker in a lethal model of lipopolysaccharide-induced shock. Surgery. 2011, 150: 442-451. 10.1016/j.surg.2011.07.003
- Gezer U, Mert U, Ozgur E, Yoruker EE, Holdenrieder S, Dalay N: Correlation of histone methyl marks with circulating nucleosomes in blood plasma of cancer patients. Oncol Lett. 2012, 3: 1095-1098.
- Fahmueller YN, Nagel D, Hoffmann RT, Tatsch K, Jakobs T, Stieber P, Holdenrieder S: Predictive and prognostic value of circulating nucleosomes and serum biomarkers in patients with metastasized colorectal cancer undergoing selective internal radiation therapy. BMC Cancer. 2012, 3 (5): 1095-1098.
- Holdenrieder S, Von Pawel J, Nagel D, Stieber P: Long-term stability of circulating nucleosomes in serum. Anticancer Res. 2010, 30: 1613-1615.
- Tucholska M, Florentinus A, Williams D, Marshall JG: The endogenous peptides of normal human serum extracted from the acetonitrile-insoluble precipitate using modified aqueous buffer with analysis by LC-ESI-Paul ion trap and Qq-TOF. J Proteomics. 2010, 73: 1254-1269. 10.1016/j.jprot.2010.02.022
- Williams D, Ackloo S, Zhu P, Bowden P, Evans KR, Addison CL, Lock C, Marshall JG: Precipitation and selective extraction of human serum endogenous peptides with analysis by quadrupole time-of-flight mass spectrometry reveals posttranslational modifications and low-abundance peptides. Anal Bioanal Chem. 2010, 396: 1223-1247. 10.1007/s00216-009-3345-0
- Betsou F, Gunter E, Clements J, DeSouza Y, Goddard KA, Guadagni F, Yan W, Skubitz A, Somiari S, Yeadon T, Chuaqui R: Identification of evidence-based biospecimen quality-control tools: a report of the international society for biological and environmental repositories (ISBER) biospecimen science working group. J Mol Diagn. 2013, 15: 3-16. 10.1016/j.jmoldx.2012.06.008
- McClintock B: The stability of broken ends of chromosomes in Zea Mays. Genetics. 1941, 26: 234-282.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.