iTRAQ and PRM-based quantitative proteomics in early recurrent spontaneous abortion: biomarkers discovery

Background Early recurrent spontaneous abortion (ERSA) is a common condition in pregnant women. To prevent ERSA, it is necessary to look for abortion indicators, such as hormones and proteins, at an early stage. Methods Thirty patients with ERSA were enrolled in the case group. In the control group, we recruited 30 healthy women without a history of miscarriage who were undergoing voluntary pregnancy termination. Proteins differentially expressed in the serum of the two groups were identified using iTRAQ and PRM. Results Seventy-eight differentially expressed proteins were identified. Using GO functional annotation and KEGG pathway analysis, we found that the most significant changes occurred in the Fc gamma R-mediated phagocytosis pathway. Using PRM, we identified three proteins that were closely related to abortion: B4DTF1 (highly similar to PSG1), P11464 (PSG1), and B4DF70 (highly similar to Prdx-2). The levels of B4DTF1 and P11464 were down-regulated, while the level of B4DF70 was up-regulated. Conclusions CD45, PSG1, and Prdx-2 were significantly dysregulated in the ERSA samples and could become important biomarkers for the prediction and diagnosis of ERSA. Larger‑scale studies are required to confirm the diagnostic value of these biomarkers.


Background
Early recurrent spontaneous abortion (ERSA), also called recurrent pregnancy loss (RPL), is a disease distinct from infertility, defined by two or more failed pregnancies [1]. According to the guidelines of the Royal College of Obstetricians and Gynaecologists (RCOG), it is defined as a spontaneous abortion that occurs three times or more within the first 12 weeks of pregnancy and with the same sexual partner [2]. There are still many unknown causes of abortion besides those related to genetics, autoimmune abnormalities, endocrine disorders, anatomy, or a pre-thrombotic state [2]. The incidence of ERSA is about 5%, and advancing maternal age and a history of multiple miscarriages are high-risk factors for ERSA [1,3]. In approximately half of the patients with RPL, there is no explanation for their miscarriages [4]. Therefore, early prediction of the potential risk of ERSA is needed to increase the live birth rates in patients with ERSA [5].
Proteomics is an emerging discipline that involves the global analysis of changes in protein expression [6]. The application of proteomics technology has had a significant impact on the assessment of the etiology and pathogenesis of many diseases, especially cancer, cardiovascular disease, diabetes, and neurological disorders [7][8][9]. Klein et al. [10] pointed out that proteomics can be useful for the prediction, diagnosis, management, monitoring, and prognosis of several obstetric conditions that are associated with an increased risk of maternal and/or perinatal morbidity and mortality.
Clinical Proteomics, Open Access. *Correspondence: heling118@126.com. The Affiliated Hospital and Clinical Institute, Jiangxi University of Traditional Chinese Medicine, No. 445, Bayi Avenue, Donghu District, Nanchang 330006, China.

Numerous proteomic studies have shown that the human proteome regulates cellular function and determines the phenotype; therefore, the identification of relevant proteins is likely to reveal reliable biomarkers for disease prediction [11]. Some potential biomarkers for ERSA have been previously reported. Previous studies using LC-MS/MS and ELISA showed a significant decrease in the levels of insulin-like growth factor-binding protein-related protein 1 (IGFBP-rp1)/IGFBP-7, Dickkopf-related protein 3, the receptor for advanced glycation end products (RAGE), and angiopoietin-2 in patients with RSA [12]. Kim et al. [13] conducted a comparative proteomic study using blood samples from healthy women and patients with RPL; they performed 2D-PAGE and analyzed the selected spots with MALDI-TOF/MS. Their results suggested that inter-α trypsin inhibitor heavy chain 4 (ITI-H4) expression might be used as a biomarker. Using isobaric tags for relative and absolute quantification (iTRAQ) and ingenuity pathway analysis (IPA), Pan et al. [14] observed altered protein expression in the placental villous tissue of patients with early recurrent miscarriage.
Searching for new biomarkers of ERSA is helpful for the diagnosis of the disease and for safety and efficacy evaluation. Advances in proteomics have made this effort more efficient. However, no previous study has identified serum RSA biomarkers using parallel reaction monitoring (PRM). Therefore, in this study, in addition to iTRAQ and bioinformatics analyses such as protein-protein interaction (PPI) network analysis, GO, and KEGG, we used PRM to identify reliable biomarkers for the prediction of RSA.

Patients and controls
From October 2017 to December 2017, we recruited 30 patients who had had a previous abortion into the case group. Our inclusion criteria were based on the consensus of the Practice Committee of the American Society for Reproductive Medicine (ASRM) and the RCOG [1,2]. All the patients were diagnosed with ERSA after exclusion of chromosomal abnormalities, anatomical abnormalities of the genital tract, endocrine diseases, infections, immunologic diseases, trauma, and internal diseases. Transvaginal ultrasound showed gestational sacs without a fetal heartbeat.
In the control group, we recruited 30 women who terminated their pregnancy and did not have a history of abortion. The inclusion criteria were as follows: women who underwent pregnancy termination at a gestational age of 6-10 weeks and had no history of recurrent spontaneous abortion, chromosomal abnormalities, anatomical abnormalities of the genital tract, endocrine diseases, infections, immunologic diseases, trauma, internal diseases, or intake of any chemical agent before pregnancy termination [15].
The characteristics of the participants are summarized in Table 1.

Ethical approval and sample collection
The study was reviewed and approved by the Ethics Committee of Jiangxi Provincial Maternal and Child Health Hospital. All the participants signed the informed consent form after the study had been properly explained. Blood samples were collected from each participant. In the case group, blood was collected 1 to 2 months after the abortion. Following centrifugation, the collected serum was stored at −80 °C until proteomic analysis. In the control group, the sera of the 30 participants were pooled into three samples, numbered 113, 114, and 115. In the case group, the sera of the 30 participants were likewise pooled into three samples, numbered 116, 117, and 118.

Protein extraction and peptide enzymatic hydrolysis
Serum pools were depleted of their most abundant proteins using the Agilent Human 14/Mouse 3 Multiple Affinity Removal System column (Agilent Technologies) following the manufacturer's protocol [16][17][18]. The supernatant was quantified with the BCA Protein Assay Kit (Bio-Rad, USA). Twenty micrograms of protein from each sample were mixed with 5× loading buffer and boiled for 5 min. The proteins were separated on a 12.5% SDS-PAGE gel (constant current 14 mA, 90 min). Protein bands were visualized with Coomassie Blue R-250 staining. A moderate amount of protein was taken from each sample, and tryptic digestion was performed using the filter-aided sample preparation (FASP) method; the digested peptides were then desalted using

iTRAQ labeling
One hundred micrograms of the peptide mixture from each sample was labeled with iTRAQ reagent according to the manufacturer's instructions (Applied Biosystems) [19].

Peptide fractionation with strong cation exchange (SCX) chromatography
The iTRAQ labeled peptides were fractionated with SCX chromatography using the AKTA Purifier system (GE Healthcare

LC-MS/MS analysis
Each fraction was injected into the nano-LC-MS/MS system for analysis. The peptide mixture was loaded onto a reversed-phase trap column (Thermo Scientific Acclaim PepMap100, 100 μm × 2 cm, nanoViper C18) connected to a C18 reversed-phase analytical column (Thermo Scientific Easy Column, 10 cm long, 75 μm inner diameter, 3 μm resin) in buffer A (0.1% formic acid) and separated with a linear gradient of buffer B (84% acetonitrile and 0.1% formic acid) at a flow rate of 300 nl/min controlled by IntelliFlow technology. LC-MS/MS analysis was performed on a Q Exactive mass spectrometer (Thermo Scientific) coupled to an Easy nLC. The mass spectrometer was operated in positive ion mode. MS data were acquired using a data-dependent top 10 method that chooses the most abundant precursor ions from the survey scan (300-1800 m/z) for HCD fragmentation. The automatic gain control (AGC) target was set to 1e6 and the maximum injection time to 10 ms. The dynamic exclusion duration was 40.0 s. Survey scans were acquired at a resolution of 70,000 at m/z 200, the resolution for HCD spectra was set to 17,500 at m/z 200, and the isolation width was 2 m/z. The normalized collision energy was 30 eV, and the underfill ratio, which specifies the minimum percentage of the target value likely to be reached at the maximum fill time, was defined as 0.1%. The instrument was operated with peptide recognition mode enabled. MS/MS spectra were searched using the MASCOT engine (Matrix Science, London, UK; version 2.2) embedded in Proteome Discoverer 1.4 [20]. Proteins were identified at a false discovery rate (FDR) of less than 0.01, and differentially expressed proteins were screened using a fold change greater than 1.2 (iTRAQ labeling) and a P-value less than 0.05.
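The screening rule stated above can be illustrated with a minimal sketch. It assumes that each protein already has a case/control ratio and a P-value. The function name `classify_protein`, the example protein names, and all numeric values are hypothetical and are not taken from the study data. The down-regulation cutoff is assumed to be the reciprocal of 1.2 (about 0.83).

```python
def classify_protein(fold_change, p_value, fc_cutoff=1.2, alpha=0.05):
    """Label a protein 'up', 'down', or 'ns' (not significant).

    fold_change is the case/control ratio; the down-regulation cutoff
    is assumed to be 1 / fc_cutoff (about 0.83 for fc_cutoff = 1.2).
    """
    if p_value >= alpha:
        return "ns"
    if fold_change > fc_cutoff:
        return "up"
    if fold_change < 1.0 / fc_cutoff:
        return "down"
    return "ns"


# Hypothetical ratios and P-values, for illustration only.
examples = {
    "protein_A": (1.35, 0.010),
    "protein_B": (0.70, 0.030),
    "protein_C": (1.50, 0.200),
    "protein_D": (1.05, 0.001),
}
labels = {name: classify_protein(fc, p) for name, (fc, p) in examples.items()}
print(labels)
```

Both conditions must hold at the same time: a large fold change with a non-significant P-value, or a significant P-value with a small fold change, leaves the protein in the non-differential group.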

PRM
Proteins identified by the above mass spectrometry analysis were verified by PRM. The experimental procedure was as follows (Fig. 1). First, a PRM method was established using the original samples, and the experiment was performed once the method had been determined to be stable and reliable. The same amount of peptides was taken from each sample and mixed with an appropriate amount of stable isotope-labeled internal standard peptide. The target proteins in each sample were then detected by LC-PRM/MS using the PRM method established in the preliminary experiment. The PRM mass spectrometry results were analyzed quantitatively with Skyline. After correction against the internal standard peptide signal, the expression level of the target protein in each sample was obtained. The expression levels of the target proteins in the different groups of samples were compared using Student's t-test.
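The internal standard correction described above can be sketched as follows. This is only an illustration under stated assumptions: each sample is assumed to provide one peak area for the spiked internal standard peptide and one peak area per target peptide, and the protein level is assumed to be the mean of the normalized peptide signals. The function names and all peak areas are hypothetical; the sketch does not reproduce Skyline's own calculations.

```python
from statistics import mean


def protein_level(target_areas, standard_area):
    """Mean of the target peptide peak areas after dividing each one
    by the peak area of the internal standard peptide in that sample."""
    if standard_area <= 0:
        raise ValueError("internal standard peak area must be positive")
    return mean([area / standard_area for area in target_areas])


def fold_change(case_levels, control_levels):
    """Ratio of the mean protein level in the case group to that in the control group."""
    return mean(case_levels) / mean(control_levels)


# Hypothetical peak areas: two target peptides per protein, two samples per group.
case = [protein_level([200.0, 400.0], 100.0), protein_level([150.0, 450.0], 100.0)]
control = [protein_level([300.0, 500.0], 100.0), protein_level([350.0, 450.0], 100.0)]
print(fold_change(case, control))
```

Dividing by the internal standard signal removes sample-to-sample differences in injection and ionization before the groups are compared, which is why the correction is applied before the fold change is calculated.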

Statistics
Clinical data were expressed as mean ± standard error of the mean (S.E.M.). Statistical analysis was performed with SPSS software version 20.0 (SPSS, Inc., Chicago, IL, USA). Student's t-test was applied for comparisons of quantitative data between the two groups, with P < 0.05 indicating a significant difference. Multidimensional statistical tests were performed to estimate whether protein expression can predict the type of sample. Fisher's exact test was used for categorical analysis. Data from the iTRAQ experiment and the original PRM test were deposited at ProteomeXchange (http://proteomecentral.proteomexchange.org/cgi/GetDataset). Their IDs are 318751 and 318759, respectively.
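For readers who wish to see the two summary statistics in explicit form, the following sketch computes the mean ± S.E.M. and the two-sample Student's t statistic with pooled variance. It is not the SPSS procedure used in the study; the function names and the example values are hypothetical. The P-value would be obtained from a t distribution with the returned degrees of freedom and is not computed here.

```python
from math import sqrt
from statistics import mean, stdev, variance


def mean_sem(values):
    """Return (mean, standard error of the mean) using the sample standard deviation."""
    return mean(values), stdev(values) / sqrt(len(values))


def student_t(a, b):
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom); equal variances are assumed.
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1.0 / na + 1.0 / nb))
    return t, df


# Hypothetical measurements for two groups.
group_a = [1.0, 2.0, 3.0]
group_b = [2.0, 3.0, 4.0]
print(mean_sem(group_a))
print(student_t(group_a, group_b))
```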

Differential protein identification results
In this project, differential proteomic analysis of serum from women with recurrent spontaneous abortion and women with normal pregnancy was performed using iTRAQ. The samples of the case group and the control group were labeled (six labels were used), and differential proteomic detection was performed after labeling (Fig. 2).
A total of 977 proteins with unique peptides or polypeptide segments and a total of 40,855 characteristic peaks were identified (Table 2). Compared with the control group, 47 proteins were significantly down-regulated and 31 proteins were significantly up-regulated in the case group (Table 3).
The clustering heat map of the differentially expressed proteins (Fig. 3) shows good reproducibility among the biological replicates in the control and case groups, with consistent trends in protein levels. Both up- and down-regulated proteins were observed in the two groups; the screening criteria were a fold change greater than 1.2 (up-regulation greater than 1.2 or down-regulation less than 0.83) and a P-value less than 0.05. The differentially expressed proteins were visualized in a volcano plot (Fig. 4). Black represents non-differentially expressed proteins, and red represents differentially expressed proteins. The arrows indicate the PRM-validated proteins. Four target proteins are labeled in the figure, of which B4DTF1, P11464, and B4DF70 were selected for PRM.
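As an illustration of how volcano plot coordinates are typically derived, the following sketch converts a fold change and a P-value into the log2 and negative log10 values used on the two axes. The function name and the example inputs are hypothetical; only the cutoffs (1.2, 0.83, and 0.05) come from the text.

```python
from math import log2, log10


def volcano_point(fold_change, p_value):
    """Return (x, y) = (log2 fold change, -log10 P-value) for one protein."""
    return log2(fold_change), -log10(p_value)


# Positions of the cutoff lines implied by the screening criteria in the text.
X_UP = log2(1.2)       # vertical line for up-regulation
X_DOWN = log2(0.83)    # vertical line for down-regulation
Y_SIG = -log10(0.05)   # horizontal line for P = 0.05

# Hypothetical protein: twofold increase with P = 0.01.
x, y = volcano_point(2.0, 0.01)
print(round(x, 3), round(y, 3))
```

On these axes, up-regulated proteins fall to the right of `X_UP`, down-regulated proteins fall to the left of `X_DOWN`, and significant proteins lie above `Y_SIG`.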

Gene Ontology (GO) functional annotation and enrichment analysis of differentially expressed proteins
The screened differentially expressed proteins underwent GO functional annotation using Blast2GO software (https://www.blast2go.com/). Based on the second-level (Level 2) results, these differentially expressed proteins are primarily involved in cellular process, biological regulation, response to stimulus, regulation of biological process, and metabolic process. The differentially expressed proteins might have molecular functions such as binding, catalytic activity, molecular function regulator, signal transducer activity, or molecular transducer activity (Fig. 5).
As shown in Fig. 6, GO functional enrichment analysis of the differentially expressed proteins using Fisher's exact test showed that these proteins were involved in critical biological processes, such as kidney epithelium development, nephron development, muscle adaptation, the insulin-like growth factor receptor signaling pathway, and regulation of the insulin-like growth factor receptor signaling pathway. Significant changes occurred in molecular functions such as SH3 domain binding, growth factor activity, and insulin-like growth factor binding, and in proteins localized to cellular components such as cell division site part, cell surface furrow, cleavage furrow, cell division site, and centrosome.
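The enrichment test can be sketched as a one-sided Fisher's exact test, which is equivalent to the upper tail of a hypergeometric distribution. The text does not state whether a one-sided or two-sided test was used, so the one-sided form here is an assumption. The function names and the term counts in the example are hypothetical; only the totals of 977 identified and 78 differentially expressed proteins come from the results above. The rich factor follows the definition given in the figure captions (differentially expressed proteins in a category divided by all identified proteins in that category).

```python
from math import comb


def enrichment_p_value(k, n, K, N):
    """One-sided Fisher's exact test for over-representation.

    N: all identified proteins; K: identified proteins annotated to the term;
    n: differentially expressed proteins; k: differentially expressed proteins
    annotated to the term. Returns P(X >= k) for a hypergeometric X.
    """
    total = comb(N, n)
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def rich_factor(k, K):
    """Share of a term's identified proteins that are differentially expressed."""
    return k / K


# Hypothetical term: 20 identified proteins annotated, 6 of them differential.
p = enrichment_p_value(k=6, n=78, K=20, N=977)
print(p, rich_factor(6, 20))
```

A small P-value indicates that the term contains more differentially expressed proteins than expected by chance given its size among all identified proteins.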

KEGG pathway annotation and enrichment analysis of differentially expressed proteins
KEGG pathway analysis indicated that the differentially expressed proteins are located in important pathways such as cell adhesion molecules, Fc gamma R-mediated phagocytosis, the PI3K-Akt signaling pathway, cytokine-cytokine receptor interaction, and regulation of actin cytoskeleton (Fig. 7). KEGG pathway enrichment analysis of the differentially expressed proteins with Fisher's exact test revealed that significant changes occurred in some important pathways, such as Fc gamma R-mediated phagocytosis, choline metabolism in cancer, taurine, and cell adhesion molecules (CAMs) (Fig. 8).
For example, in Fc gamma R-mediated phagocytosis, the target proteins with noticeable differences include PTPRC (CD45), Gelsolin, and WASP.

Fig. 3 Cluster analysis of differentially expressed proteins in case_vs_control. Hierarchical clustering results are shown as a tree heat map. Each row represents a protein and each column represents a set of samples. Protein levels in the different samples (Log2Expression) are shown in different colors: red represents significantly increased proteins, green represents significantly decreased proteins, and gray indicates proteins without quantitative information.

Protein-protein interaction (PPI) network analysis
The study of interactions between proteins and of the resulting interaction network is of great significance for revealing protein function. In a network, the number of proteins that interact directly with a given protein is called the connectivity of that protein. In general, the greater the connectivity of a protein, the greater the disturbance to the whole system when that protein changes. Such a protein might be key to maintaining the balance and stability of the system and is a candidate for subsequent studies. Querying the differentially expressed proteins against the STRING database showed that known proteins, such as ITIH4, PSGs, PLG, IGFBPs, FGB, APCS, and CD45, carry a large weight in the network (Fig. 9).
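The notion of connectivity defined above corresponds to the degree of a node in an undirected graph. A minimal sketch, assuming the interaction network is available as a list of protein pairs, is given below. The function name and the placeholder protein names (P1 to P4) are hypothetical and do not represent interactions reported in STRING.

```python
from collections import defaultdict


def connectivity(edges):
    """Count the distinct direct interaction partners of each protein.

    edges is an iterable of (protein_a, protein_b) pairs; duplicate pairs,
    reversed pairs, and self-interactions are not counted twice.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {protein: len(partners) for protein, partners in neighbors.items()}


# Placeholder network, for illustration only.
edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P1"), ("P1", "P4")]
degrees = connectivity(edges)
for protein, degree in sorted(degrees.items(), key=lambda item: -item[1]):
    print(protein, degree)
```

Sorting proteins by this count gives a simple ranking of hub candidates; other centrality measures could be substituted without changing the overall approach.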

PRM results
Four proteins related to abortion were found for PRM analysis. In the experiment, PRM quantitative analysis was performed on five peptides of three target proteins in 12 human serum samples, and quantitative information on the target peptides was obtained in all 12 samples. The isotope-labeled peptides were used to normalize the quantitative information, and relative quantitative analysis of the target peptides and target proteins was then performed. The fold changes and t-test results showed that the levels of the three target proteins differed between the two conditions, which was consistent with the results of the proteomic analysis. The levels of B4DTF1 and P11464 were down-regulated, while the level of B4DF70 was up-regulated (Table 4). B4DTF1 and P11464 are functionally similar to each other and to pregnancy-specific beta-1-glycoprotein 1 (PSG1), while B4DF70 is similar to peroxiredoxin-2 (Prdx2), which might be associated with oxidative damage.

Discussion
The use of proteomics to identify key proteins associated with abortion can provide insight into the mechanisms of ERSA. In this study, we found 78 differentially expressed proteins using iTRAQ, and bioinformatics analysis demonstrated that these proteins were implicated in several biological processes and molecular functions. Six significantly different proteins matched PSG1, Prdx2, CD45, ITI-H4, IGFBP, and INHBE in the database. A total of three proteins were selected for PRM.

Fig. 4 Case_vs_control group volcano plot. The fold change and the P-value obtained by t-test were used to draw the volcano plot to show the significant differences between the two groups. The abscissa is the fold change (log2-transformed) and the ordinate is the significance of the difference (P-value, log10-transformed). Red dots are proteins with a significant difference (P < 0.05), and black dots are proteins with no difference.

ITI-H4 and IGFBP have been reported to be associated with miscarriage [12,13], which is consistent with our results. INHBE was identified as a novel putative hepatokine whose hepatic gene expression correlated positively with insulin resistance and body mass index in humans [21].
Other similar studies identified many differentially expressed proteins that differed from those in our results. For example, in 2014, Ni et al. [22] extracted all the proteins in the placental villous tissue of 5 patients with early spontaneous abortion and 5 patients with normal pregnancy requiring therapeutic abortion. Fifty-one differentially expressed proteins were identified using HPLC-MS. Bioinformatics analysis suggested that 12 of these differential proteins might be involved in the biological process of spontaneous abortion; among them, NES, P4HA2, PBXIP1, and GSTM2 are involved in the ability of trophoblasts to infiltrate the endometrium. The difference from our results might be related to the different proteomics techniques, specimens, and selected cases.
Moreover, our results indicated that Fc gamma R-mediated phagocytosis might play an essential role in the mechanism of ERSA. CD45 was significantly down-regulated in this pathway. Similarly, Lorenzi et al. [23] found that fetal CD100, CD72, and CD45 were expressed in the placenta and exhibited different mRNA and protein levels in normal pregnancy and miscarriage, with CD45 down-regulated in miscarriage.
According to the PRM results, PSG1 and Prdx2 were considered biomarkers of ERSA. In the field of obstetrics, especially regarding abortion, there are more reports on PSG1 but few on Prdx2. Human PSGs are detected in maternal serum 3 days after the fertilized egg has implanted, consistent with the time when the blastocyst adheres to the uterine wall [24]. PSGs are abundant in maternal serum, can induce transforming growth factor β1 (TGFβ-1), inhibit T-cell function, and promote angiogenesis [25]. It is important to have a better understanding of the molecules that control angiogenesis and trophoblast-mediated vascular remodeling during pregnancy, because disorders of blood flow and vascular development in the placenta could affect fetal growth [26]. Angiogenesis occurs at various stages of pregnancy to ensure that the embryo receives adequate nutrients and oxygen [27,28]. First, PSG1 can induce TGFβ-1 and VEGFA in different cell types, including monocytes, macrophages, and natural killer cells [29,30]. Second, PSG1 has the ability to interact with endothelial cells, induce angiogenesis, and enhance angiogenic processes [31][32][33]. To the best of our knowledge, the association between PSG1 and RSA has not been reported previously. PSG1 was originally described in the early 1970s, and further research will likely help to demonstrate its importance for a successful pregnancy [34].
Prdx2 is an antioxidant protein that uses its redox-sensitive cysteine group to reduce hydrogen peroxide molecules and protect cells from oxidative damage caused by reactive oxygen species (ROS). Its role in the trophoblast at the maternal-fetal interface has not been elucidated. A recent study showed that the expression of Prdx2 in the trophoblast of patients with RSA within the first 3 months of pregnancy is significantly lower than in the healthy control group [35]. Using proteomic technology, a 2D-PAGE and MALDI-TOF MS study showed that Prdx2 was down-regulated in placental trophoblasts from patients with preeclampsia [36,37]. In our study, however, we found that the level of Prdx2 was up-regulated in the case group. We consider that this discrepancy could be related to differences in the specimens.

Figure caption (GO enrichment analysis): The color of the bars represents the significance of the enriched GO functional classification, that is, the P-value calculated with Fisher's exact test. The color gradient, from orange to red, represents the size of the P-value. The label at the top of each bar shows the enrichment factor (richFactor ≤ 1), which is the ratio of the number of differentially expressed proteins annotated to a GO functional category to the number of all identified proteins annotated to that category.

Figure caption (KEGG pathway enrichment analysis): The abscissa represents the number of differentially expressed proteins contained in each KEGG pathway. The color of the bars represents the significance of the enriched KEGG pathways, with the P-value calculated using Fisher's exact test; the color gradient represents the size of the P-value. The label at the top of each bar shows the enrichment factor (richFactor ≤ 1), which is the ratio of the number of differentially expressed proteins involved in a KEGG pathway to the number of all identified proteins involved in that pathway.

Our study has some limitations. For example, the sample size was relatively small, and the consistency between the serum levels of the target proteins and their expression in decidual tissues was not assessed.

Conclusions
In conclusion, the present study used iTRAQ and PRM-based quantitative proteomics to find three biomarkers of ERSA. Compared with other similar studies, this study shows an improvement in detection techniques, and the method can be more effective and accurate in investigating alterations in protein profiles. Furthermore, we identified PSG1, Prdx2, and CD45 as new serum biomarkers of ERSA; their potential role at the maternal-fetal interface will require further study. Larger-scale studies will be required to confirm the diagnostic value of these markers.