Identification of prothymosin alpha (PTMA) as a biomarker for esophageal squamous cell carcinoma (ESCC) by label-free quantitative proteomics and Quantitative Dot Blot (QDB)

Background Esophageal cancer (EC) is a malignant tumor with a poor prognosis. Early-stage EC is asymptomatic, so identifying cancer biomarkers is important for early detection and clinical practice.
Methods In this study, we compared the protein expression profiles of esophageal squamous cell carcinoma (ESCC) tissues and adjacent normal esophageal tissues from five patients by high-resolution label-free mass spectrometry. Differentially expressed proteins in ESCC were identified through bioinformatics analysis. To screen candidate biomarkers rapidly, we adopted Quantitative Dot Blot (QDB), a high-throughput protein detection technique. The QDB results were then verified by classical immunohistochemistry (IHC).
Results In total, 2297 proteins were identified, of which 308 were differentially expressed between ESCC tissues and normal tissues. Based on bioinformatics analysis, four up-regulated proteins (PTMA, PAK2, PPP1CA, HMGB2) and five down-regulated proteins (Caveolin, Integrin beta-1, Collagen alpha-2(VI), Leiomodin-1 and Vinculin) were selected and validated in ESCC by Western Blot. Furthermore, we performed QDB and IHC analysis in 64 and 117 patients, respectively. PTMA expression was gradually up-regulated along the progression of ESCC, and the PTMA expression ratio between tumor and adjacent normal tissue increased significantly with progression. Therefore, we suggest that PTMA might be a potential candidate biomarker for ESCC.
Conclusion In this study, label-free quantitative proteomics combined with QDB revealed that PTMA expression was up-regulated in ESCC tissues, and that PTMA might be a potential candidate biomarker for ESCC. Because Western Blot cannot achieve rapid, high-throughput screening of mass spectrometry results, QDB meets this demand and provides an effective method for the identification of biomarkers.


Introduction
Esophageal cancer (EC) is a malignant tumor with a 5-year survival rate of 20.9% [1,2]. EC ranks as the eighth most common malignant tumor and has the sixth highest mortality rate worldwide. There are two histological subtypes of EC: esophageal squamous cell carcinoma (ESCC) and esophageal adenocarcinoma (EAC). ESCC often occurs in the upper or middle part of the esophagus and starts in the flat, thin cells that make up the lining of the esophagus. EAC is most common in the lower portion of the esophagus and starts in the glandular cells that produce fluids such as mucus. China is a high-risk area for EC, and more than 90% of cases are ESCC [3][4][5]. Moreover, most patients have locally advanced or metastatic EC at the time of diagnosis [6,7]. Therefore, it is urgent to discover biomarkers for early clinical diagnosis to improve survival.
Esophageal cancer biomarkers have been found in saliva, blood, and urine. Sedighi et al. showed that the serum level of matrix metalloproteinase (MMP)-13 was significantly higher in ESCC patients than in the control group, and suggested that MMP-13 was associated with increased ESCC invasion, lymph node involvement and decreased survival rates [8]. In saliva, the miRNAs miR-10b*, miR-144 and miR-451 were found to be up-regulated in EC and showed discriminatory ability for detecting EC [9]. Although these biomarkers contribute to the early diagnosis and prognosis of EC, EC biomarkers are still at the stage of exploration and verification, and are limited by low specificity and sensitivity.
Proteomic technologies have been applied to understand tumor pathogenesis and to discover novel targets for cancer therapy or prognosis. Combining MS-based proteomic data with integrative bioinformatics can predict protein signaling networks and identify more clinically relevant molecules [10][11][12]. To date, quantitative proteomic methods have been applied to the study of various cancers, such as breast cancer, lung cancer, pancreatic cancer and gastric cancer [13]. Mass spectrometric identification of differentially expressed proteins has been a highly successful approach for finding novel cancer-specific biomarkers [14]. For more than a decade, attempts have been made to uncover valid biomarkers for the diagnosis of EC. Various molecules have been identified by proteomics as closely correlated with ESCC, such as transgelin (TAGLN) and proteasome activator 28-beta subunit (PA28β) [15], pituitary tumor transforming gene (PTTG) [6], and transglutaminase 3 (TGM) [2]. However, the number of proteins identified in these studies was limited, and the suggested biomarkers were not validated. Therefore, further in-depth proteomic studies are still needed to explore novel candidate biomarkers for EC and to validate the findings with orthogonal techniques.
Differential proteins obtained from mass spectrometry are commonly validated by Western Blot. However, Western Blot cannot meet the requirements of high-throughput analysis because of its complicated processing steps and its need for a large amount of total protein. Quantitative Dot Blot (QDB), a technology recently developed by our team, achieves high-throughput quantitative detection based on the same principle as traditional Western Blot. In addition, QDB has the advantages of low sample consumption, short processing time and low cost [16]. The technique has been successfully applied to the detection of a biomarker of papillary thyroid carcinoma. With its accuracy and reliability, QDB is a very effective method for protein detection.
The aim of this study was to investigate the protein expression profiles of ESCC tissues and adjacent normal esophageal tissues with a label-free quantitative proteomics approach using nano-liquid chromatography coupled with tandem mass spectrometry (Nano-LC-MS/MS). Differentially expressed proteins were selected and their expression trends were validated in ESCC by Western Blot; high-throughput protein screening was then performed by QDB, and the QDB results were verified by classical IHC. This research provides a new methodological strategy for the validation and identification of ESCC biomarkers by combining quantitative proteomics with QDB.

Tissue samples
The five patients for LC-MS/MS analysis were all male, with an average age of 61 years. Samples of ESCC tissue and adjacent normal esophageal tissue were taken for mass spectrometry analysis. The 64 pairs of matched ESCC and adjacent normal tissue samples for QDB all had a clear pathological diagnosis and came from 35 men and 29 women, with an age range of 46-73 years (mean 61 years). These samples were obtained at the Affiliated Yantai Hospital of Binzhou Medical University. All clinical data were obtained from patient medical records. All specimens were quickly rinsed, frozen immediately in liquid nitrogen and stored at −80 °C until further processing. The tissue microarrays (TMA) (ES701 and ES1922) for immunohistochemistry analysis were purchased from Alenabio; the total sample size reached 117 pairs after removing duplicates between the two arrays (n = 14). This study was approved by the Human Research Ethics Committee of Binzhou Medical University.

Sample preparation
The five pairs of clinical samples were homogenized in lysis buffer containing 9 M urea, 20 mM HEPES and protease inhibitor cocktail. The samples were centrifuged at 12,000×g for 10 min at 4 °C and the supernatants were retained. Then 20 μg of total protein was digested in solution. First, the samples were reduced with 50 mM dithiothreitol (DTT) at 50 °C for 15 min, alkylated with 50 mM iodoacetamide (IAA) for 15 min in darkness, and diluted 4 times with digestion buffer (50 mM NH4HCO3, pH 8.0). The proteins were then digested with trypsin at a final concentration of 5% (w/w) and incubated at 37 °C overnight. The reaction was stopped by diluting the sample 1:1 with trifluoroacetic acid (TFA) in acetonitrile (ACN) and Milli-Q water (1/5/94 v/v). Finally, the peptides were desalted using Pierce C18 Spin Columns and dried completely in a vacuum centrifuge.
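The quantities in this protocol can be checked with a few lines of arithmetic. The sketch below assumes that "diluted 4 times" means a 4-fold dilution and ignores the small volumes of DTT and IAA added beforehand; the function names are illustrative only.

```python
# Worked arithmetic for the in-solution digestion described above.
# Assumption: "diluted 4 times" is read as a 4-fold dilution, and the
# volumes of DTT and IAA added before dilution are neglected.

PROTEIN_UG = 20.0         # total protein per digest (from the text)
TRYPSIN_FRACTION = 0.05   # trypsin at 5% (w/w) relative to protein
UREA_LYSIS_M = 9.0        # urea concentration in the lysis buffer
DILUTION_FOLD = 4.0       # assumed meaning of "diluted 4 times"


def trypsin_mass_ug(protein_ug: float, fraction: float) -> float:
    """Mass of trypsin needed for a given weight-to-weight fraction."""
    return protein_ug * fraction


def diluted_concentration(stock_m: float, fold: float) -> float:
    """Concentration remaining after an n-fold dilution."""
    return stock_m / fold


trypsin_ug = trypsin_mass_ug(PROTEIN_UG, TRYPSIN_FRACTION)
urea_m = diluted_concentration(UREA_LYSIS_M, DILUTION_FOLD)

print(f"trypsin per digest: {trypsin_ug:.2f} ug")
print(f"enzyme:protein ratio: 1:{PROTEIN_UG / trypsin_ug:.0f}")
print(f"urea after dilution: {urea_m:.2f} M")
```

Under these assumptions, each digest uses about 1 μg of trypsin (an enzyme-to-protein ratio of 1:20), and the urea concentration falls to roughly 2.25 M before trypsin is added.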

LC-MS/MS
The peptides were dissolved in 20 μL of 0.5% TFA in 5% ACN and analyzed using a Q Exactive Plus Orbitrap™ mass spectrometer (Thermo Fisher Scientific, Bremen, Germany) coupled with a liquid chromatography system (EASY-nLC 1000, Thermo Fisher Scientific, Bremen, Germany). An 85-min LC gradient was applied, with a

Table 1 The clinical features of ESCC patients for mass spectrometry

No.  Gender  Age  Organ/anatomic site          Grade  TNM
1    Male    69   Mid-thoracic esophagus       II     T2N0M0
2    Male    61   Esophagus                    I      T1N0M0
3    Male    59   Middle-lower esophagus       II     T1N0M0
4    Male    52   Mid-thoracic esophagus       III    T3N0M0
5    Male    64   Middle segment of esophagus  II     T2N1M1

Proteomic data processing
The acquired data were analyzed using MaxQuant (version 1.5.0.1) against the UniProt Homo sapiens database. The search parameters were set to a maximum error tolerance of 10 ppm for the survey scan and 5 ppm for the MS/MS analysis. The enzyme was trypsin, and up to two missed cleavages were allowed. The maximum number of modifications per peptide was 5. For label-free quantification (LFQ), the LFQ minimum ratio count was set to 2. The false discovery rate (FDR) was set to 1% for peptide spectrum matches (PSMs) and protein quantitation. Gene ontology and protein class analyses were performed with the PANTHER system (http://pantherdb.org/). The heat map of significantly different proteins was generated using Morpheus (https://software.broadinstitute.org/morpheus). The protein-protein interaction analysis of the differentially expressed proteins was performed with STRING (https://string-db.org/).
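Two of the search parameters above, the ppm mass tolerance and the number of missed cleavages, can be illustrated with a short sketch. This is not MaxQuant's implementation: the cleavage rule is simplified to "cut after every K or R" (an assumption), and the peptide sequence and m/z value are hypothetical.

```python
def ppm_window(mz: float, tol_ppm: float) -> tuple:
    """Return the (low, high) m/z bounds for a tolerance given in ppm."""
    delta = mz * tol_ppm / 1e6
    return mz - delta, mz + delta


def tryptic_fragments(sequence: str) -> list:
    """Split a sequence after every K or R (simplified trypsin rule)."""
    fragments, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in "KR":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # C-terminal remainder
    return fragments


def tryptic_peptides(sequence: str, max_missed: int = 2) -> list:
    """List peptides formed by joining up to max_missed + 1 adjacent fragments."""
    fragments = tryptic_fragments(sequence)
    peptides = []
    for i in range(len(fragments)):
        last = min(i + max_missed, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


# Hypothetical precursor at m/z 500 with the 10 ppm survey-scan tolerance.
low, high = ppm_window(500.0, 10.0)
print(f"10 ppm window at m/z 500: {low:.4f} to {high:.4f}")

# Hypothetical toy sequence, digested with up to two missed cleavages.
for peptide in tryptic_peptides("AAKCCRDDKEE", max_missed=2):
    print(peptide)
```

Allowing two missed cleavages enlarges the search space, because every run of up to three adjacent fully cleaved fragments is also considered a candidate peptide.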

Western blot (WB)
Tissue lysates were prepared using highly efficient RIPA lysis buffer containing phenylmethanesulfonyl fluoride (PMSF). Total protein was quantified with a BCA protein assay kit and then separated by sodium dodecyl sulphate-polyacrylamide gel electrophoresis (SDS-PAGE). Equal amounts of protein were separated by 6%, 15% and 12% SDS-PAGE, respectively. Subsequently,

Immunohistochemistry (IHC)
PTMA expression was detected by IHC in tissue microarrays (TMA) (ES701, ES1922). First, the tissue microarrays were heated at 60 °C for 30 min, then deparaffinized with xylol and hydrated with gradient alcohol. Next, antigen retrieval was performed by boiling the TMAs for 10 min in citrate buffer (0.01 M, pH 6.0). After cooling at room temperature, the microarrays were treated with 3% hydrogen peroxide for 30 min at 37 °C. The samples were blocked with bovine serum albumin for 30 min at 37 °C and then incubated with the PTMA antibody (YN2871, ImmunoWay; dilution 1:50) overnight at 4 °C in a moist chamber. After applying the Histostain-SP (Streptavidin-Peroxidase) kit (SP-0023) as the secondary antibody according to the manufacturer's operation manual, the samples were washed with PBS (0.01 M, pH 7.2-7.4). Finally, the

Statistical analysis
The WB data are presented as the mean and standard deviation of four independent experiments. The other data were compared between esophageal cancer tissues and adjacent normal esophageal tissues using the two-tailed paired Student's t test. All statistical analyses were performed using SPSS v20.0 (Chicago, Illinois, USA). P < 0.05 was considered statistically significant.
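The two-tailed paired Student's t test named above can be sketched with the Python standard library alone. This is an illustration, not the SPSS procedure used in the study: the p-value is obtained by numerically integrating the t density (Simpson's rule), and the expression values are hypothetical.

```python
import math
import statistics


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t_stat: float, df: int, steps: int = 2000) -> float:
    """Two-tailed p-value from Simpson integration of the density on [0, |t|]."""
    upper = abs(t_stat)
    if upper == 0.0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    area = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * area)


def paired_t_test(tumor, normal):
    """Return (t statistic, degrees of freedom, two-tailed p) for matched pairs."""
    diffs = [t - n for t, n in zip(tumor, normal)]
    n = len(diffs)
    t_stat = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t_stat, n - 1, two_tailed_p(t_stat, n - 1)


# Hypothetical expression values for five matched tumor/normal pairs.
tumor = [5.1, 6.3, 4.8, 7.0, 5.9]
normal = [4.0, 5.1, 4.9, 5.2, 4.6]
t_stat, df, p = paired_t_test(tumor, normal)
print(f"t = {t_stat:.3f}, df = {df}, two-tailed p = {p:.4f}")
```

The test operates on the within-pair differences, so each patient serves as their own control. Note that `math.gamma` overflows for very large degrees of freedom (several hundred), so this sketch is only suitable for sample sizes of the order used here.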

Identification of differently expressed proteins
The clinical information of the five patients is summarized in Table 1. The five pairs of cancer tissues and adjacent normal tissues were analyzed by label-free mass spectrometry. In total, 2297 proteins were identified and 308 proteins with significant differences were selected. Among these proteins, 102 were expressed only in ESCC tissues (Table 2), 155 were significantly up-regulated (Table 3) and 40 were down-regulated in ESCC tissues (Table 4) (P < 0.05).
Using the PANTHER classification system, we analyzed the biological significance of these proteins in terms of cellular component, molecular function and biological process (Fig. 1). The majority of the proteins were cell part proteins (37.3%) or organelle proteins (30.1%), had binding (41.8%) or catalytic activity (25.8%), and were involved in cellular process (29.6%), metabolic process (20.2%), and cellular component organization or biogenesis (16.3%).

Bioinformatics analysis of differentially expressed proteins
A volcano plot was generated based on the differential expression ratio and P value (Fig. 2a). The heat map of significantly different proteins, generated using Morpheus (https://software.broadinstitute.org/morpheus), is shown in Fig. 2b. Protein-protein interaction (PPI) analysis of the differentially expressed proteins was performed with STRING, and the result is shown in Fig. 3. For the four proteins selected for further analysis, the PPI network analysis revealed that PTMA is a valid target of c-myc transcriptional activation, while PPP1CA is involved in down-regulation of TGF-beta receptor signaling. PAK2 plays a role in apoptosis and activation of Rac, while HMGB2 participates in chromatin regulation and retinoblastoma in cancer. Thus, all four proteins are associated with the occurrence and development of cancer. Bioinformatics analysis of the TCGA database revealed that the four corresponding genes were up-regulated at the gene level in EC tissue (Fig. 4). Whether these four genes can be used as biomarkers of esophageal cancer remains to be further studied.
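A volcano plot places each protein at coordinates derived from its expression ratio and P value. The sketch below shows the usual transformation (log2 of the ratio on the x-axis, negative log10 of the P value on the y-axis) and a simple classification using the P < 0.05 threshold stated in the statistical analysis. The protein names and values are hypothetical, and the exact axes used for Fig. 2a are an assumption.

```python
import math

P_CUTOFF = 0.05  # significance level stated in the paper


def volcano_point(ratio: float, p_value: float) -> tuple:
    """Map a tumor/normal ratio and P value to volcano plot coordinates."""
    return math.log2(ratio), -math.log10(p_value)


def classify(ratio: float, p_value: float, p_cutoff: float = P_CUTOFF) -> str:
    """Label a protein by significance and direction of change."""
    if p_value >= p_cutoff:
        return "not significant"
    if ratio > 1:
        return "up"
    if ratio < 1:
        return "down"
    return "unchanged"


# Hypothetical (tumor/normal ratio, P value) pairs; not data from the study.
proteins = {
    "protein_A": (4.0, 0.001),
    "protein_B": (0.25, 0.01),
    "protein_C": (1.5, 0.30),
}

for name, (ratio, p_value) in proteins.items():
    x, y = volcano_point(ratio, p_value)
    print(f"{name}: log2 ratio = {x:.2f}, -log10 P = {y:.2f}, {classify(ratio, p_value)}")
```

On such a plot, up-regulated proteins appear in the upper right and down-regulated proteins in the upper left, while non-significant proteins stay near the bottom.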

Validation of differentially expressed proteins by Western Blot
To further validate the LC-MS/MS results, we evaluated the four up-regulated proteins (PTMA, PAK2, PPP1CA, HMGB2) and the five down-regulated proteins [Caveolin, Integrin beta-1, Collagen alpha-2(VI), Leiomodin-1 and Vinculin] by Western Blot on the same samples. Compared with adjacent normal tissues, the expression of PTMA, PAK2, PPP1CA and HMGB2 was up-regulated (Fig. 5a, b), and the expression of Caveolin, Integrin beta-1, Collagen alpha-2(VI), Leiomodin-1 and Vinculin was down-regulated in ESCC tissues from four pairs of samples (Fig. 5c, d). The expression trends of these proteins were consistent with the LC-MS/MS results.

Validation of PTMA involved in ESCC by QDB and IHC
To validate the proteins identified by mass spectrometry, the QDB technique was applied to a larger set of samples. We collected samples from 64 patients; the relevant clinical information is summarized in Table 5. In 53 of the 64 patients, the esophageal cancer tissue showed higher PTMA expression than the matched normal tissue (P < 0.001) (Fig. 6). This trend was in accordance with the previous data. To further validate the QDB results, we performed tissue microarray analysis by IHC. Among 117 pairs of tissues, the rate of high PTMA expression in tumor tissues was 98% (115/117). PTMA was significantly overexpressed in tumor tissues compared with adjacent normal tissues (P < 0.01) (Fig. 7). The sample information for the arrays is summarized in Tables 6 and 7. We further evaluated the expression pattern of PTMA during progression by analyzing the PTMA expression trend across tumor grades. PTMA expression was gradually up-regulated along the progression of ESCC (Fig. 8), and the PTMA expression ratio between tumor and adjacent normal tissue increased significantly with progression (P < 0.05). We therefore speculate that PTMA might participate in the development of esophageal cancer.
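The counts reported above convert directly into proportions, and the same per-pair comparison can be written as a small helper. The sketch below uses only the counts given in the text (53 of 64 for QDB, 115 of 117 for IHC); the paired values passed to the helper are hypothetical.

```python
def fraction_higher(tumor, normal) -> float:
    """Share of matched pairs in which the tumor value exceeds the normal value."""
    pairs = list(zip(tumor, normal))
    return sum(t > n for t, n in pairs) / len(pairs)


def tumor_normal_ratios(tumor, normal) -> list:
    """Per-pair tumor/normal expression ratios."""
    return [t / n for t, n in zip(tumor, normal)]


# Counts reported in the text.
qdb_rate = 53 / 64     # pairs with higher PTMA in tumor tissue (QDB)
ihc_rate = 115 / 117   # tumor tissues with high PTMA expression (IHC)
print(f"QDB: {qdb_rate:.1%} of pairs, IHC: {ihc_rate:.1%} of tumors")

# Hypothetical relative expression values for three matched pairs.
example_tumor = [2.0, 3.0, 1.0]
example_normal = [1.0, 1.5, 2.0]
print(fraction_higher(example_tumor, example_normal))
print(tumor_normal_ratios(example_tumor, example_normal))
```

Reading the QDB result as a proportion, about 83% of patients showed higher PTMA in the tumor than in the matched normal tissue, alongside the 98% reported for IHC.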

Discussion
At present, most patients with esophageal cancer are diagnosed at late and advanced stages [17]. It is thus urgent to identify biomarkers related to the progression of esophageal cancer for early diagnosis. Recently, several biomarkers have been identified for EC detection, diagnosis, treatment and prognosis. For example, the epidermal growth factor receptor (EGFR), vascular endothelial growth factor (VEGF) and estrogen receptor (ER) are important detection factors for immunohistochemistry in EC [18][19][20]. In blood, the serum p53 antibody has potential diagnostic value for EC; however, its detection is limited by low sensitivity [21]. Therefore, more biomarker candidates need to be discovered and verified for the prediction, diagnosis, treatment and prognosis of esophageal cancer. Mass spectrometry is an effective method for finding distinct molecular regulators between normal tissues and cancer tissues [22]. In the current study, we identified a proteomic profile of 308 differentially expressed proteins. However, compared with previous tissue-based ESCC proteomics studies, a poor overlap of the proteome profiles was noticed. There are several potential reasons. First, like many other cancers, ESCC is a heterogeneous cancer with different gene expression profiles in different populations [23]. Recently, whole-genome sequencing revealed diverse patterns of structural variation in ESCC, which indicated biological differences among patients [24]. Therefore, the proteome variation may be a consequence of distinct molecular signatures that exist in ESCC. Another reason could be differences in experimental design: some studies pooled several individual samples into one sample, which could also lead to differences compared with our analysis of individual samples [25].
Figure caption (fragment): ... SD; b, d). The experiments were repeated at least three times; N represents normal tissues and T represents tumor tissues.
Differences in data analysis methods could be another reason: most label-based MS approaches use the expression fold change as the major criterion. In our study, with a label-free approach, we used the significance of the paired Student's t-test as the main criterion. Such a difference could lead to a different proteome profile. The poor overlap indicates the importance of large-scale validation of biomarkers. We therefore suggest that, in future studies, proposed novel biomarkers should be validated in a larger population of no fewer than 100 samples. Besides TMA, our group recently developed QDB as a novel, fast and accurate validation approach, which can easily validate biomarkers in up to a thousand samples [16].
Human prothymosin-α (PTMA) is a 109 amino acid protein belonging to the α-thymosin family, which is ubiquitously distributed in mammalian blood and tissues and is especially abundant in lymphoid cells. However, its role remains elusive. Growing evidence suggests that PTMA, being an important immune mediator as well as a biomarker, might eventually become a new therapeutic target or diagnostic tool in several diseases such as cancer and inflammation [26]. We therefore focused on the possibility of PTMA as a biomarker of ESCC.
Fig. 6 The relative PTMA expression was tested by QDB in ESCC and adjacent normal tissues from 64 esophageal cancer patients. a The differential expression of PTMA in each pair of tissues. b PTMA expression was up-regulated in esophageal cancer tissues, based on the average of 64 pairs of tissues
Proteomic studies show that PTMA exerts multiple functions in the nucleus and the cytoplasm. In proliferating cells, PTMA is mainly located in the nucleus, depending on its C-terminal signal sequence, but the protein can be transferred from the nucleus into the cytoplasm during cell extraction [27,28]. PTMA may mediate chromatin activity by participating in nuclear-protein complexes. In the cytoplasm, the function of PTMA is related to its phosphorylation state; for example, Thr7 is the only residue phosphorylated in carcinogenic lymphocytes, whereas Thr12 or Thr13 is phosphorylated in normal lymphocytes [29,30]. Co-immunoprecipitation experiments show that PTMA interacts with SET, ANP32A and ANP32B to form a complex that is related to cell proliferation, membrane trafficking, proteolytic processing and other processes [31][32][33].
In this study, we included both exploratory and validation experiments, using early and late stage samples. The results of the exploratory experiment indicated that PTMA was overexpressed in all stages. We further evaluated the expression pattern of PTMA during progression by analyzing the PTMA expression trend across grades. PTMA expression was gradually up-regulated along the progression of ESCC, and the PTMA expression ratio between tumor and adjacent normal tissue increased significantly with progression. It is almost impossible to obtain samples from the extremely early stage (such as the stage without any symptoms, or the stage prior to Grade I); however, from the trend between Grade I and Grade III, we speculate that the expression ratio of PTMA could be a potential indicator of progression, even in early diagnosis.

Conclusions
In our research, we used label-free quantitative proteomics to detect differentially expressed protein profiles in ESCC tissues compared to control tissues. In total 2297 proteins were identified and 308 proteins with significant differences were selected for study. Based on in-depth bioinformatic analysis, the four up-regulated proteins