Identification of serum proteins AHSG, FGA and APOA-I as diagnostic biomarkers for gastric cancer

Background The development of clinically accessible biomarkers is critical for the early diagnosis of gastric cancer (GC) in patients. High-throughput proteomics techniques could not only effectively generate a serum peptide profile but also provide a new approach to identify potentially diagnostic and prognostic biomarkers for cancer patients. Methods In this study, we aim to identify potentially discriminating serum biomarkers for GC. In the discovery cohort, we screened potential biomarkers using magnetic-bead-based purification and matrix-assisted laser desorption/ionization time-of-flight mass spectrometry in 64 samples from 32 GC patients that were taken both pre- and post-operatively and 30 healthy volunteers that served as controls. In the validation cohort, the expression patterns and diagnostic values of serum FGA, AHSG and APOA-I were further confirmed by ELISA in 42 paired GC patients (pre- and post-operative samples from 16 patients with pathologic stage I/II and 26 with stage III/IV), 30 colorectal cancer patients, 30 hepatocellular carcinoma patients, and 28 healthy volunteers. Results ClinProTools software was used and annotated 107 peptides, 12 of which were differentially expressed among three groups (P < 0.0001, fold > 1.5). These 12 peptide peaks were further identified as FGA, AHSG, APOA-I, HBB, TXNRD1, GSPT2 and CAKP5. ELISA data suggested that the serum levels of FGA, AHSG and APOA-I in GC patients were significantly different compared with healthy controls and had favorable diagnostic values for GC patients. Moreover, we found that the serum levels of these three proteins were associated with TNM stages and could reflect tumor burden. Conclusion Our findings suggested that FGA, AHSG and APOA-I might be potential serum biomarkers for GC diagnosis. Electronic supplementary material The online version of this article (10.1186/s12014-018-9194-0) contains supplementary material, which is available to authorized users.


Background
Gastric cancer (GC) is the fourth most common cancer with almost 1000,000 new cases diagnosed every year [1]. The incidence of GC is highest in Eastern Asia, especially in China, which alone accounts for nearly 50% of the world's cases [2]. Moreover, GC is the second leading fatal cancer subtype, and approximately 498,000 Chinese patients died from GC in 2015 [3]. The high mortality rate of GC is mainly due to delayed diagnosis, at which time the cancer has advanced to an inoperable stage and can no longer be eradicated by surgical resection [4]. There are non-specific symptoms displayed in GC patients at the early stages [5]. Therefore, exploring novel biomarkers for GC patients will help monitor tumor status and guide clinical treatment.
Serum tumor biomarkers can be secreted by tumor cells or by normal cells responding to the malignant behavior of tumors [6]. For decades, serum-based biomarkers were considered the most important biomarkers to reflect tumor burden and have been applied for cancer diagnosis and post-operation monitoring. The conventional serum-based biomarkers for GC, such as carcino-embryonic antigen (CEA), carbohydrate antigen 19-9 (CA19-9) and CA72-4, did not have favorable specificity and sensitivity, which always resulted in delayed diagnosis [7,8]. Ebert MP et al. stated that the sensitivities of above three biomarkers are only 16-63, 20-56, and 18-51%, respectively [9].
In recent years, several high-throughput proteomics techniques have been applied in serum samples to uncover novel diagnostic markers [10]. Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS) is becoming a standard tool in protein analysis in particular [11][12][13]. In this study, we first evaluated a discovery group that included 32 GC patients and 30 healthy volunteers and employed MALDI-TOF-MS to identify peptides that were candidate biomarkers for GC. Next, we evaluated a validation group that included 42 paired GC patients, 30 colorectal cancer (CRC) patients, 30 hepatocellular carcinoma (HCC) patients, and 28 healthy volunteers and performed enzyme-linked immunosorbent assay (ELISA) to validate the diagnostic values of the candidate biomarkers identified in the first step.

Patient selection and sample preparation
The research protocol was approved by the Ethics and the Human Research Review Committee of Xi'an Jiao Tong University. All subjects signed a consent form before participating in this research study, which was approved by the Institutional Review Board of Xi'an Jiao Tong University. All experiments were carried out in accordance with the approved guidelines. A total of 266 serum samples from 192 individuals were collected from the Department of General Surgery, Department of Gastroenterology, the Department of Physical Examination, the First Affiliated Hospital of Xi'an Jiao Tong University, China, from March 2016 to April 2017. The discovery cohort consisted of 32 pairs of serum samples from 32 pre-and post-operative GC patients as well as 30 healthy controls. The validation cohort was composed of 42 pairs of serum samples from GC patients (16 pairs are at I/II stage, and 26 pairs are at III/IV stage), 30 CRC patients, 30 HCC patients, and 28 healthy volunteers. The diagnosis of GC, CRC and HCC was confirmed by pathological diagnosis. The discovery cohort and validation cohort are completely non-overlapping. Moreover, the healthy control groups were gender-and age-matched with the cancer groups. The characteristic information of all subjects is shown in Table 1.
The exclusion criteria for subjects were as follows: (1) patients with a known history of any other tumors and any obvious inflammatory diseases, such as liver cirrhosis, chronic renal disease, and diabetes mellitus; (2) patients with a known history of any surgical operations, chemotherapy or radiotherapy before collection of the serum; and (3) patients with a known history of receiving blood transfusion within a month before collection of the serum.
All blood samples were obtained from non-fasting patients or healthy controls in the morning. The serum samples were collected in 10-cc separator tubes (BD, #367820) and were kept at 4 °C for 1 h, then centrifuged at 3000 rpm for 10 min at 4 °C. The serum samples were distributed into 400-μL aliquots and stored at − 80 °C until use.

MS analysis: magnetic beads-based immobilized metal-ion affinity chromatography (MB-IMAC-Cu) fractionation and MALDI-TOF-MS
Magnetic Beads-based Immobilized Metal-ion Affinity Chromatography (MB-IMAC-Cu) (ClinProt purification reagent sets; Bruker Daltonics, Bremen, Germany) was used for enrichment of serum peptides followed by MALDI-TOF-MS analysis. A total of 94 serum samples were fractionated according to instructions provided by Bruker Daltonics. Briefly, 5 μl of magnetic beads was pretreated with 50 μl of binding buffer, and the supernatant was carefully discarded. The magnetic beads were re-suspended in 20 μl of binding buffer in a PCR tube, and then 5 μl of serum sample was added and mixed gently. The mixtures were incubated at room temperature for 5 min and separated in the magnetic separator. The beads were washed once with 100 μl of wash buffer, and the peptides and proteins were eluted with 10 μl of elution buffer from beads. Then, 1 μl of the eluted peptides and proteins and 1 μl of a mixture containing 3 mg/ml α-cyano-4hydroxy-cinnamic acid (Bruker) in 50% acetonitrile and 0.5% trifluoroacetic acid was spotted onto the MALDI AnchorChip surface. Samples were spotted in triplicate to evaluate the reproducibility of this method.

Data analysis with ClinProTools
Air-dried targets were immediately tested with calibrated Autoflex III MALDI-TOF-MS (Bruker), flexControl version 3.0 software (Bruker), via an optimized measuring protocol. The settings of the instrument were as follows: ion source 1, 20.00 kV; ion source 2, 18.90 kV; lens, 6.50 kV; and pulsed ion extraction, 120 ns. Ionization was achieved by irradiation with a crystal laser operating at 200.0 Hz. A standard calibration mixture of peptides and proteins (mass range 1-10 kDa) was used for mass calibration. For each MALDI spot, 1200 spectra were acquired (200 laser shots at 6 different spot positions). All tests were performed in a blinded manner, including the serum analysis of different groups. The Flex analysis software (version 3.0; Bruker) was applied for all serum data analysis. Recognition of peptide patterns was analyzed by ClinProTools version 2.2 software (Bruker).

Peptide identification by LC-ESI-MS/MS
After completing the statistical analysis, the peptides were identified using liquid chromatography-mass spectrometry, which combined Nano Acquity UPLC liquid chromatography (Waters, USA) with an LTQ Orbitrap XL mass spectrometer system (Thermo Fisher Scientific, USA). The Peptide mixture solutions purified by MB-IMAC-Cu trapping used a captrap C18 (2 mm × 0.5 mm) column (Michrom Corporation, USA) and an analytical Magic C18, AQ (100 µm × 150 mm) column (Michrom Corporation). Mobile phase A was a solution of 5% acetonitrile and 0.1% formic acid, and mobile phase B was a solution of 90% acetonitrile and 0.1% formic acid. Peptide mixtures were injected into the trap column with a flow of 20 μl/min for 5 min and then eluted with a three-step linear gradient, starting from 5% B to 45% B for 40 min, increased to 80% B for 1 min, and then held at 80% B for 4 min. The column was re-equilibrated at the initial conditions for 15 min. The column flow rate was maintained at 500 nl/min, and the column temperature was maintained at 35°C. Electrospray voltage of 1.9 kV versus the inlet of the mass spectrometer was used. The LTQ Orbitrap XL mass spectrometer was operated in the data-dependent mode to switch automatically between MS and MS/MS acquisition. Survey full scan MS spectra with two microscans (m/z 400-2000) were acquired in the Obitrap with a mass resolution of 100,000 at m/z 400, followed by eight sequential LTQ-MS/MS scans. Dynamic exclusion was used with two repeat counts, consisting of a 10 s repeat duration and a 60 s exclusion duration. For MS/ MS, precursor ions were activated using 25% normalized collision energy at the default activation q of 0.25. All MS/MS spectra were profiled with SEQUEST [v.28 (revision 12), Thermo Electron Corp.] which searched the human International Protein index (IPI) database (IPI human v3.64 fasta with 71,983 entries) and the Uni-protKB (http://www.unipr ot.org) for peptide-to-spectral matching. To minimize false positives, a decoy database containing all of the reverse protein sequences was added to this database. The search parameters were as follows: no enzyme digestion, the variable modification was the oxidation of methionine, a peptide mass tolerance of 20 ppm, and a fragment ion tolerance of 1.0 Da.
The resulting filter parameters were as follows: ∆ Cn ≥ 0.10, Xcorr ≥ 2.3 for two charged ions, Xcorr ≥ 2.6 for three charged ions, Xcorr ≥ 3.0 for four or more charged state ions, and FDR < 0.01. The errors were less than 0.1 Da in the m/z of peptide determined by LC-MS.

Enzyme-linked immunosorbent assay (ELISA)
All serum samples were run in triplicate and analyzed in a blinded fashion in triplicate. The concentrations of Isoform I of Fibrinogen alpha chain precursor (FGA), Alpha-2-HS-glycoprotein precursor (AHSG) and Apolipoprotein A-I precursor (APOA1) were quantified with a Human FGA ELISA Kit (Elisa Biotech, #: CK-E93791H), a Human AHSG ELISA Kit (Elisa Biotech, #: CK-E95306H), and a Human ApoA-1 ELISA kit (Elisa Biotech, #: CK-E11517H), respectively. Standard curves were generated and used to determine the concentrations of FGA, AHSG and APOA-1 in the samples analyzed.

Statistical analysis
Statistical analysis was performed with GraphPad Prism version 6.0 (GraphPad Software, La Jolla, CA, USA). All data are shown as the mean ± SD. P < 0.05 was considered significant. Comparisons among multiple groups were performed via repeated measures analysis of variance and the least significant difference test. Student's t test was used to analyze the ELISA data. ROC curves were utilized to assess the diagnostic value of FGA, AHSG, and APOA-I.

Serum proteomic profiling of GC patients and healthy controls
As shown in Fig. 1b, the reproducibility and stability of the mass spectra data were closely reproducible in triplicate samples of each group. We then performed mass spectrometry-based proteomics via the MALDI-TOF-MS method (Fig. 1a). Our data revealed that the mass spectra differed among the pre-operative GC patients (red), post-operative GC patients (blue) and healthy controls (green) (Fig. 1). Additionally, the serum samples fractionated by MB-IMAC-Cu and MALDI-TOF-MS showed that pre-operative GC patients (red), post-operative GC patients (blue) and controls (green) had proteomic profiles that ranged from 1000 to 10,000 Da (Fig. 1c). Principal component analysis revealed that pre-operative GC patients (red), postoperative GC patients (blue) and control (green) samples could be distinguished by several peptides (Fig. 1d, e), which suggested the possibility of exploring serum biomarkers to separate GC patients from control subjects. Fig. 1 Reproducibility of mass spectra generated in individuals from different groups and the comparative analysis of serum proteomic profiling between different groups. a Gel view of mass spectra from healthy control (green), GC pre-operative (red) and GC post-operative (blue) serum samples, in the mass range from 1000 to 10000 Da. b Representative mass spectra of three samples in each group in the mass range from 1000 to 10,000 Da, showing high reproducibility and stability between replicates. c Overall sum of the spectra in the mass range from 1000 to 10,000 Da obtained from all GC patients pre-operation (red) and post-operation (blue), as well as healthy controls (green). d 3D plot and bivariate plot e of pre-operative GC patients (red), healthy controls (green), and post-operative GC patients (blue) in the PCA

Selection of differential expressed peptides and diagnostic model testing
ClinProTools software (version 2.2, Bruker) was used to identify a total of 107 different peaks from serum samples. Among them, 12 peaks were significantly different among pre-operative and post-operative GC patients and healthy controls (P < 0.0001, fold change > 1.5) (Fig. 2) and showed a response to therapy and a tendency to return to healthy control values after the operation. Peptide peaks 1-  Table 2). The relative expression levels of the above 12 peptide peaks among healthy controls (green), pre-operative GC (red), and post-operative GC (blue) are shown in Fig. 3a. The receiver operating characteristic (ROC) curves of 12 differentially expressed peptides between GC patients and healthy controls are shown in Fig. 3b. The area under ROC (AUC) values of all peptides were more than 0.8,

Identification of selected serum peptides in GC patients
All 12 peptide peaks were further identified using LC-ESI-MS/MS and human International Protein Index (IPI) database and UniprotKB. The results of peptide fragmentation identification are shown in Table 3 Table 3. There is no matching information for peptide fragmentation identification of the peak at m/z 1406.85 and m/z 1539.22 in the above databases. The results of identification peptide are shown in Additional file 1.

Validation of expression patterns for three biomarkers in cancers by ELISA
To validate the selected serum proteins that were identified by proteomics, a validation cohort of 42 GC patients, 30 CRC patients, 30 HCC patients and 28 healthy controls was evaluated in the present study. First, we chose three proteins, FGA, AHSG and APOA-I, according to the relative expression levels of the 12 peptides in three different groups and validated in GC by ELISA (Table 4). Our results revealed that the serum FGA levels were significantly higher in GC patients than in controls, and the ROC analysis showed high diagnostic values of serum FGA in GC with AUCs of 0.98 (Fig. 4a, d). In addition, the serum AHSG and APOA-I levels were noticeably elevated in GC patients compared with controls. Moreover, the ROC analysis also demonstrated high diagnostic values of serum AHSG and APOA-I in GC with AUCs of 0.92 and 0.83, respectively (Fig. 4b, c, e, f ). Finally, to further determine the specificity of FGA, AHSG and APOA-I in GC patients, we examined the serum levels in patients with three common digestive carcinomas (CRC, HCC, and above GC) by ELISA. These three candidates had higher serum levels in GC than in CRC and HCC (P < 0.05, both) (Fig. 4g, h, i). All results indicated that FGA, AHSG and APOA-I could be considered valuable diagnostic biomarkers for GC.

Diagnostic values of three biomarkers in GC patients with early or late stage
We also evaluated the expression patterns of three biomarkers in patients with early-or late-stage GC. A total of 16 patients with early-stage GC (pTNM I + II), 26 patients with late-stage GC (pTNM III + IV), and 28 healthy controls were included. Our results showed that the serum level of FGA was elevated along with GC progression (Fig. 5a). Serum FGA showed favorable diagnostic values for GC patients both in the early stage (Fig. 5b) and late stage (Fig. 5c). Similarly, the serum level of AHSG was also elevated with an increase in tumor stage (Fig. 5d). The diagnostic value of serum AHSG in GC patients with late-stage GC (Fig. 5f ) was significantly higher than in patients with early-stage GC (Fig. 5e). In contrast, the serum level of APOA-I was only elevated in GC patients with early-stage GC (Fig. 5g). The diagnostic value of serum APOA-I was significantly higher   in patients (Fig. 5h) with early-stage GC than those with late-stage GC (Fig. 5i). Furthermore, we also compared the expression levels of three biomarkers between preand post-operation groups. Our results showed that the serum levels of three biomarkers were all decreased after the operation, suggesting that those biomarkers could reflect tumor burden (Fig. 6).

Discussion
GC is a highly aggressive cancer associated with high mortality in China. Due to the lack of early detection, most GC patients are diagnosed with advanced-stage disease, for which treatment options are limited, resulting in an overall 5-year survival rate of 10-28% [14]. Serum-based biomarkers are of considerable importance  in the early diagnosis of various diseases including cancer [10]. Due to the complex mixture in serum samples, the identification of potential tumor biomarkers secreted in very tiny amounts is very difficult [15,16]. The development of new proteomics techniques has greatly advanced in this field [17]. The enrichment of protein from serum can be screened with MB-IMAC-Cu, which is designed for extremely sophisticated biomarker profiling studies on a discovery level. Magnetic bead-based fractionation followed by MALDI-TOF-MS combined with advanced bio-informatic tools (ClinProTools software) can identify biomarkers effectively and precisely [18]. In the current study, we employed MALDI-TOF/MS to screen candidate peptides from serum samples. A total of 12 candidate peptides were selected and then identified as the fragment of FGA, AHSG, APOA-I, HBB, CKAP5 and GSPT2 by LC-ESI-MS/MS. In a validation cohort, we confirmed three of them (FGA, AHSG and APOA-I) via ELISA. Our data suggested a relative high diagnostic accuracy of the above three biomarkers in GC patients. Alpha 2-HS glycoprotein (AHSG), also known as Fetuin-A, is synthesized by hepatocytes and secreted into the circulatory system [19]. AHSG is a multifunctional glycoprotein involved in numerous normal and pathological processes such as brain development, bone metabolism regulation, insulin resistance, migration and invasion in human colorectal cancer [20]. In addition, AHSG had also been described as a potential diagnostic biomarker for cancer patients. Yi et al. [21] reported that serum AHSG could be used in breast cancer diagnosis. Chen et al. [22] also suggested that the high serum level of AHSG could be a potential biomarker for patients with pancreatic cancer. Moreover, the diagnostic value of AHSG had also been reported in hepatocellular carcinoma [23], glioblastoma [24], esophageal squamous cell carcinoma [25], and prostate cancer [26]. In our study, we reported for the first time that the serum level of AHSG was higher in GC patients than in healthy controls. Our data suggested that serum AHSG could be a potential biomarker for GC patients, especially for those patients with late-stage GC.
Fibrinogen alpha chain (FGA) is one of three types of non-identical polypeptide chains consisting of fibrinogen, which is a blood-borne glycoprotein. Comprehensive research indicated that the serum level of FGA was up-regulated in many malignant neoplasms, such as breast cancer, nasopharyngeal carcinoma, acute lymphocytic leukemia, esophageal squamous cell carcinoma and renal cell cancer [27][28][29]. Our data based on mass spectrometry revealed that 5 out of 12 candidate peptides were identified as FGA. Moreover, in the ELISA-based validation cohort, the serum level of FGA was significantly higher in GC patients than in healthy controls, and serum FGA showed favorable diagnostic accuracy in patients with both early-and late-stage GC. Our finding is consistent with those of previous studies by Liu et al. [30] and Wu et al. [31]. The high serum level of FGA in cancer patients may be due to the release of pro-coagulant factors from endothelial cells and platelets, the latter of which were stimulated by cancer cells [32,33]. In addition, some studies also reported that FGA was involved in tumor progression and metastasis [34], which also suggested that serum FGA could be a potential tumorassociated biomarker to improve the sensitivity of the diagnosis of GC.
Recently, the relationship between apolipoprotein and cancer has been highlighted. Signe Borgquist et al. reported that overall cancer risk is associated with the circulatory content of apolipoproteins in males [35]. Recent studies have also revealed that some apolipoproteins, such as APOC-I [36] and APOC-III [37], were overexpressed in serum samples derived from GC patients. APOA-I, which is produced in the liver and intestine [38], is a primary structural and functional portion of high-density lipoprotein and plays an indispensable role in cholesterol transportation and metabolism homeostasis [39]. Due to its intricate biological functions, APOA-I has been reported to be involved in various pathological processes, such as cardiovascular disease, myeloproliferative disorders and Alzheimer's disease [40]. Moreover, it was observed that APOA-I could be secreted by cancer cells [41,42], suggesting the tight association between APOA-I and cancer. In the clinic, the diagnostic value of APOA-I as a potential tumor biomarker was also reported in multiple malignancies, such as breast cancer, lung cancer, ovarian cancer, bladder cancer and cholangiocarcinoma [40]. In our study, APOA-I was shown to be up-regulated in GC patients, especially in those with early-stage GC. Moreover, the level of APOA-I was significantly decreased after gastrectomy, suggesting that this biomarker reflects tumor burden. Our findings are consistent with those of a previous study [43] that found an association between the circulating level of APOA-I and tumor burden in a mouse model. However, in a previous study by Muntoni et al. [44], the level of serum APOA-I in female GC patients was decreased compared with healthy controls, and there was no significant difference between male GC patients and healthy controls. The inconsistent results in our study and Muntoni's study may be due to the following reasons. First, the apolipoprotein isoforms and the level of lipoprotein were reported to be highly variable and ethnicity-specific [45]. Second, the apolipoproteins were unstable, resulting in the generation of some vulnerable protein fragments during prolonged storage [46]. In Muntoni's study, the collection period of serum samples was more than 10 years (1984)(1985)(1986)(1987)(1988)(1989)(1990)(1991)(1992)(1993)(1994)(1995)(1996)(1997)(1998). In contrast, our samples were consecutively collected within 10 months, which might reduce the degradation of serum proteins. Finally, the different methods of sample preparation and detection in the two studies could also contribute to the discrepancy, which needs to be validated by a large-sample size and multi-center cohort in the future.

Conclusion
Twelve peptides were identified as candidate biomarkers for GC patients by high-throughput proteomics. Three of them (FGA, AHSG and APOA-I) were validated in a larger cohort with high diagnostic accuracies, which suggested that FGA, AHSG and APOA-I might be developed as potential diagnostic biomarkers. Meanwhile, all proteins identified are highly abundant serum proteins and the presence of different elevated levels of the proteins in the other cohort, which suggested these candidate biomarkers could probably be secondary biomarkers. Nevertheless, they are still potentially valuable to improve the sensitivity and specificity of the diagnosis of GC. Further reliable validation with bigger sample cohorts is still needed for these three potential GC biomarkers, which might be valuable for the clinical diagnosis of GC in the future. Moreover, further studies will focus on validating the diagnostic accuracies of another four non-validated potential biomarkers (HBB, TXNRD1, CAKP5, GSPT2) and on analyzing the relationships between the expressed levels of those biomarkers and the prognosis of GC patients.