Proteomics-based diagnostic peptide discovery for severe fever with thrombocytopenia syndrome virus in patients

Severe fever with thrombocytopenia syndrome (SFTS) virus is an emerging infectious virus that causes severe hemorrhage, thrombocytopenia, and leukopenia, with a high fatality rate. Since there are no approved therapeutics or vaccines for SFTS, early diagnosis is essential to manage this infectious disease. Here, we sought to detect SFTS virus in serum samples from SFTS patients by proteomic analysis. First, to obtain reference MS/MS spectral data for SFTS virus, medium from infected Vero cell culture was subjected to shotgun proteomic analysis. Then, tryptic peptides in sera from SFTS patients were confirmed by comparative analysis with the reference MS/MS spectral data of SFTS virus. Proteomic analysis of the culture medium identified tryptic peptides from all five antigen proteins of SFTS virus. Comparative spectral analysis of sera from SFTS patients revealed that the N-terminal tryptic peptide of the nucleocapsid (N) protein is the major epitope of SFTS virus detected in the patient samples. The prevalence of the peptides was strongly correlated with the viral load in the clinical samples. Proteomic analysis of SFTS patient samples revealed that the N protein is the major antigen protein in sera of SFTS patients and that the N-terminal tryptic peptide of the N protein might be a useful proteomic target for direct detection of SFTS virus. These findings suggest that proteomic analysis could be an alternative tool for detection of pathogens in clinical samples and diagnosis of infectious diseases.


Introduction
Severe fever with thrombocytopenia syndrome (SFTS) virus (SFTSV) is the causative agent of SFTS, an emerging infectious disease with no approved therapeutics or vaccines and a high mortality rate (more than 30%) [1]. The major clinical features of SFTS are myalgia, high fever, fatigue, abdominal pain, and nausea/vomiting [2,3]. The SFTS virus has a three-segmented genome: the L segment encodes the RNA-dependent RNA polymerase (RdRP), the M segment encodes the glycoproteins (Gn and Gc), and the S segment encodes the nucleocapsid (N) and nonstructural (NS) proteins. Gn and Gc form a heterodimer on the surface of the virus, and the N proteins function as a scaffold to facilitate packing of virus particles.
Early diagnosis is essential to manage SFTS since the lethality of SFTS is relatively high and the clinical manifestations are non-specific [4]. Several methods are used to diagnose SFTS. In general, molecular diagnostic approaches are preferred due to their high sensitivity and selectivity [5]. Real-time RT-PCR and RT-LAMP tests have been developed to detect SFTSV directly [6][7][8][9]. Serological tests are also important diagnostic tools; these tests detect immunoglobulins (IgM and IgG) that target antigenic SFTSV proteins in human serum [10,11]. Recently, monoclonal antibodies specific for the N protein of SFTSV were developed for use in SFTS antigen detection tests [3,12]. Additionally, direct observation of SFTSV by electron microscopy has been reported as an alternative diagnostic method [13,14]. However, direct detection of SFTSV in patients by targeted proteomics has not yet been reported.
In general, direct detection of pathogenic viruses in patients using proteomics has not been frequently reported. This is due mainly to the technical and clinical difficulty of overcoming the limit of detection (LOD): the concentration of pathogenic viruses is very low, and the period during which virus is detectable in host cells is very short. In addition, abundant host cell proteins increase the complexity of clinical samples and hinder specific detection of target viruses. However, direct detection of pathogenic viruses is important for understanding the mechanism of infection and for screening of diagnostic antigens. Therefore, improved proteomic approaches for direct detection of pathogenic viruses are required [15,16]. The information provided by proteomics-based approaches will be valuable for planning strategies regarding selection of target proteins and generation of diagnostic antibodies specific for target peptides or proteins. Recently, many researchers have been paying close attention to methods that enable direct detection of SARS-CoV-2; indeed, SARS-CoV-2 proteins can now be detected in gargle solutions, nasal swabs, and scrapings of the epithelium of COVID-19 patients [17][18][19][20].
Here, we performed proteomic analysis of serum specimens from SFTS patients to detect SFTS virus directly. For this purpose, we used proteomic data derived from analysis of virus culture medium as a reference for comparative spectral analyses. To the best of our knowledge, this is the first report describing a proteomic assay for direct detection of SFTSV in patient serum.

Sample preparation of cultured cells and patient sera
An African green monkey kidney cell line (Vero E6, ATCC CRL-1586) was used for the amplification of SFTSV. Cells were cultured in complete medium (DMEM with 10% fetal bovine serum [FBS] and 1× penicillin-streptomycin) at 37 ℃ with 5% CO2. Human-origin SFTSV (KADGH; NCCP43261) was donated by the KCDC. SFTS virus was inoculated onto a monolayer of Vero E6 cells, which was incubated in inoculation medium (DMEM with 1× penicillin-streptomycin) for 60 min. The culture dish was gently shaken at 15 min intervals to increase the efficiency of inoculation. Inoculated cells were transferred into fresh culture medium (DMEM with 2% FBS and 1× penicillin-streptomycin) and cultured for 5 days. The supernatant was centrifuged at 15,000 rpm (20,000g) for 10 min at room temperature, and the precipitates were used for the next step of sample preparation. Patient sera were treated by the same procedure. Samples (precipitates from cell culture and sera) were mixed with an equal volume of lysis buffer (25 mM ammonium bicarbonate, 4% sodium dodecyl sulfate) and boiled for 10 min. The supernatant of the reaction solution was recovered by centrifugation at 15,000 rpm (20,000g) for 10 min. As the final step of sample preparation, an albumin depletion kit (85160, Thermo Scientific, USA) was used for albumin removal, and each sample was then used for proteomic analysis of SFTS virus and cultured cells.

Proteomic analysis and bioinformatic analysis
Prepared sample proteins were separated by 12.5% SDS-PAGE and subjected to in-gel tryptic digestion by previously reported methods [21]. MS/MS analysis was performed using a Q Exactive Plus mass spectrometer (Thermo Scientific, USA). MS/MS data of cultured SFTS virus were analyzed using MASCOT 2.4 with an integrated database constructed from the UniProt human proteome database and an integrated SFTS virus database. The integrated SFTSV database was constructed by combining the protein sequences downloaded from ViPR (https://www.viprbrc.org/) with SFTS virus KACNH3 (NCBI accession Nos. KP663743-KP663745), isolated in Korea, as a reference. In general, proteomic detection of pathogens such as viruses or bacteria in patient serum has several technical hurdles to overcome. First, pathogens are present at very low concentrations in patient specimens. Second, the presence of abundant serum proteins (such as albumins, globulins, and fibrinogen) can hinder detection of low-copy-number proteins originating from pathogens. Last, but not least, is the lack of a suitable proteomics database and/or informatics tools for precise identification of pathogenic viruses or bacteria. To minimize these difficulties, we used patient serum samples containing different concentrations of virus to identify the optimal virus concentration for clinical proteomics. We also used an albumin depletion kit to reduce the complexity of samples, as well as spectral analysis programs and an in-house database of SFTS viruses. Finally, we used MS/MS spectral data derived from analysis of SFTSV cultured in Vero E6 cells as a reference; this enabled us to obtain accurate spectral data for SFTSV in clinical samples. The peptides of SFTS virus from patients were identified by COSS 1.0, using a spectral library of SFTSV cultured in Vero E6 cells [22]. The spectral library for COSS analysis was constructed using the results of PeptideProphet and SpectraST of the Trans-Proteomic Pipeline (TPP) [23]. The identified spectral peaks of SFTS virus proteins were confirmed by PEAKS Studio 7.0 (Bioinformatics Solutions Inc., ON, Canada).
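The tryptic digestion step that underlies both the database search and the peptide numbering used in this study can be illustrated with a minimal in silico sketch. It assumes the commonly used trypsin rule (cleavage C-terminal to K or R, except when the next residue is P); the function name `trypsin_digest`, its parameters, and the toy sequence are hypothetical and are not taken from the SFTSV N protein or from the search software used here.

```python
import re


def trypsin_digest(sequence, missed_cleavages=0, min_length=6):
    """Return (start, end, peptide) tuples with 1-based inclusive positions.

    Assumed rule: cleave after K or R unless the next residue is P.
    """
    # End offsets of every cleavable K/R residue (0-based, exclusive).
    sites = [m.end() for m in re.finditer(r"[KR](?!P)", sequence)]
    bounds = [0] + sites
    if bounds[-1] != len(sequence):
        bounds.append(len(sequence))  # C-terminal fragment without K/R

    peptides = []
    for i in range(len(bounds) - 1):
        # Join up to `missed_cleavages` additional adjacent fragments.
        last = min(i + 1 + missed_cleavages, len(bounds) - 1)
        for j in range(i + 1, last + 1):
            peptide = sequence[bounds[i]:bounds[j]]
            if len(peptide) >= min_length:
                peptides.append((bounds[i] + 1, bounds[j], peptide))
    return peptides


# Hypothetical toy sequence that embeds the two PRM target peptides.
TOY_SEQUENCE = "MSAK" + "ELAYEGLDPALIIK" + "GILGPDGVPSR" + "AGKPLLER"

if __name__ == "__main__":
    for start, end, peptide in trypsin_digest(TOY_SEQUENCE):
        print(f"{start}-{end}\t{peptide}")
```

In the toy sequence, the K in `AGKPLLER` is followed by P and is therefore not cleaved, which is why that fragment is reported as a single peptide.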

LC-parallel reaction monitoring (PRM) MS/MS analysis
Additional peptide identification was performed using the same instrument in PRM mode. First, 11 peptides of the N protein (NP) were selected as targets (based on uniqueness) for the PRM method. Two target peptides (ELAYEGLDPALIIK, aa 27-40; and GILGPDGVPSR, aa 223-233) were finally selected for LC-PRM MS/MS analysis based on the spectral count, length, hydropathy, reactive residues, and modification motifs, as previously discussed [24]. Their stable isotope-labeled counterparts (heavy peptides, Lysine-13C(6)15N(2) and Arginine-13C(6)15N(4)) were purchased from AnyGen (Gwangju, Korea) to develop the PRM assay. Second, synthetic peptides (1 pmol/μL each) were analyzed to optimize parameters. Target peptides and a list of transitions were selected using the Skyline platform, version 21.2 (MacCoss Lab Software; https://skyline.ms). These results were also used to generate spectral libraries. The standard samples were separated using 0.1% formic acid in water and 0.1% formic acid in acetonitrile, with a 35 min gradient at a flow rate of 250 nL/min. The maximum acquisition time and automatic gain control were set at 100 ms and 5 × 10^4, respectively. Third, the standard method was applied for correct peptide identification, as heavy peptides co-elute with the peptides of interest. A blank solvent (0.1% formic acid in water) was injected between samples to prevent sample carryover. The mProphet method was used for peak picking, FDR estimation with a reverse database (filtered using a q-value < 0.01), and data validation [25]. The ratio of the endogenous (light) peak area to the heavy peak area was used for quantification.
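The light-to-heavy peak area ratio used for quantification can be sketched as follows. This is a minimal illustration that assumes single-point calibration, i.e. that the light and heavy forms of a peptide give the same signal per mole and that a known amount of heavy peptide was loaded. The function names and the peak areas in the example are hypothetical and are not measured values from this study.

```python
def light_to_heavy_ratio(light_area, heavy_area):
    """Ratio of the endogenous (light) peak area to the heavy-standard peak area."""
    if heavy_area <= 0:
        raise ValueError("heavy peak area must be positive")
    if light_area < 0:
        raise ValueError("light peak area cannot be negative")
    return light_area / heavy_area


def estimate_light_amount(light_area, heavy_area, heavy_amount_pmol):
    """Estimate the endogenous peptide amount (pmol) by single-point calibration.

    Assumes equal response of light and heavy peptides (an assumption of this
    sketch, not a statement about the assay described in the text).
    """
    return light_to_heavy_ratio(light_area, heavy_area) * heavy_amount_pmol


if __name__ == "__main__":
    # Hypothetical integrated peak areas for one peptide in one sample.
    amount = estimate_light_amount(light_area=2.0e5, heavy_area=1.0e6,
                                   heavy_amount_pmol=1.0)
    print(f"estimated endogenous amount: {amount:.3f} pmol")
```

A multi-point calibration curve would normally be preferred when absolute concentrations are needed; the ratio alone supports only relative comparison between samples.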

Structural analysis of tryptic peptides in overall structure of SFTSV N protein
The 3D structures of SFTSV N protein have been previously reported by Zhou et al. and Jiao et al. [26,27]. We analyzed where the sequences of the patient-derived

Proteomic analysis of culture medium from Vero E6 cells infected with SFTSV
The initial step of proteomic identification of SFTSV was to use culture medium from infected Vero E6 cells as a reference to obtain spectral data. Precipitates of culture medium were subjected to tryptic digestion prior to shotgun proteomic analysis using a Q Exactive Plus mass spectrometer. Sequence coverage of each protein component of SFTS virus was in the range of 61-92%. In particular, the sequence coverage of the N protein was high (92%) (Fig. 1). Considering that each protein component has a unique copy number in SFTS virus, these semi-quantitative proteomic data reveal the abundance of each protein, as well as which proteins are likely to be the best diagnostic markers. Based on our proteomic results, we suggest that the N protein is a strong candidate for SFTS diagnosis. An interesting point is that each tryptic peptide derived from the N protein showed a different peptide-spectrum match (PSM) score. In particular, the N-terminal tryptic peptide (aa 7-26) had the highest score. Next, we selected six SFTS patients and used their sera for proteomic analysis. The clinical characteristics of the six patients and virus Ct values are summarized in Table 1 and Additional file 1: Table S1, respectively. The estimated virus copy number for each patient ranged from 2.18 × 10^4 to 3.03 × 10^1 copies/mL, based on Ct values for the M segment. Each tryptic peptide derived from SFTSV proteins detected using the proteomics method is summarized in Table 2. Identified peptides were confirmed by comparative spectral analysis of cell culture medium and patient serum (Additional file 2: Fig. S1). The results showed that tryptic peptides derived from SFTSV were detected mainly in two SFTS patients (SFTS-007 and SFTS-024) with a high viral load. However, few if any tryptic peptides were detected in the other four patients (SFTS-032, SFTS-033, SFTS-034, and SFTS-041). Among the identified tryptic peptides, the N-terminal tryptic peptide (aa 7-26) derived from the N protein was detected most frequently (Table 2).
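Sequence coverage, as reported above for each viral protein, is the percentage of residues in a protein that fall within at least one identified peptide. The following minimal sketch shows that calculation. The function name is hypothetical, the example spans are the residue ranges of three N-protein peptides named in the text, and the protein length of 245 residues is an assumption made only for illustration.

```python
def sequence_coverage(protein_length, peptide_spans):
    """Percentage of residues covered by at least one peptide.

    peptide_spans: iterable of (start, end) pairs, 1-based and inclusive.
    Overlapping peptides are counted only once per residue.
    """
    if protein_length <= 0:
        raise ValueError("protein_length must be positive")
    covered = [False] * protein_length
    for start, end in peptide_spans:
        if not 1 <= start <= end <= protein_length:
            raise ValueError(f"span {start}-{end} is outside the protein")
        for position in range(start - 1, end):
            covered[position] = True
    return 100.0 * sum(covered) / protein_length


if __name__ == "__main__":
    ASSUMED_LENGTH = 245  # assumption for illustration only
    spans = [(7, 26), (27, 40), (223, 233)]
    print(f"coverage: {sequence_coverage(ASSUMED_LENGTH, spans):.1f}%")
```

Coverage computed this way reflects only which residues were observed, not how often; spectral counts or PSM scores are needed to compare peptide abundance.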

Validation of SFTS NP in serum using LC-PRM MS/MS
The previously identified peptides of the N protein (NP) were validated by PRM-MS analysis in thirteen SFTS patients and two normal subjects, including the six SFTS patients analyzed by shotgun proteomics. Among the NP-derived peptides identified, we selected two proteotypic peptides (ELAYEGLDPALIIK and GILGPDGVPSR) for further investigation. Details about each peptide are provided in Additional file 3: Table S2. We estimated the concentration of targets in patient serum using stable isotope-labeled peptides (each at 1 pmol/μL). The chromatograms, retention times, and transition rank order of the detected peptides matched well with those of the heavy peptides (Additional file 4: Fig. S2). Also, all dot product (dotp) values were 0.96 or higher, suggesting that the identified peptides were derived from SFTS virus in serum (Fig. 2a). Thus, the two peptides derived from the N protein were identified in all 13 SFTS patients. The peak area ratio to the heavy peptide is shown in Fig. 2b. Patients 7, 13, and 24 showed higher values than the others, which corresponds with the previous results.
Table 1 Demographic, clinical characteristics and laboratory findings of SFTS patients. SFTS, severe fever with thrombocytopenia syndrome; SD, standard deviation; IQR, interquartile range; WBC, white blood cell; aPTT, activated partial thromboplastin time; AST, aspartate aminotransferase; ALT, alanine aminotransferase; LD, lactate dehydrogenase; CRP, C-reactive protein. a Includes myocardial infarction, congestive heart failure, and peripheral vascular disease. b Chronic obstructive pulmonary disease, asthma.
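The dot product (dotp) used in this section to compare measured transitions with the spectral library can be illustrated as a normalized dot product (cosine similarity) between two intensity vectors. This is a minimal sketch under that assumption; the exact formula applied by the analysis software may differ (for example, by applying an additional transformation), and the function name and intensity values are hypothetical.

```python
import math


def normalized_dot_product(observed, library):
    """Cosine similarity between observed and library transition intensities.

    Both inputs list intensities for the same transitions in the same order.
    Returns 1.0 when the relative intensity patterns are identical.
    """
    if len(observed) != len(library) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    dot = sum(o * l for o, l in zip(observed, library))
    norm = math.sqrt(sum(o * o for o in observed)) * \
        math.sqrt(sum(l * l for l in library))
    if norm == 0:
        raise ValueError("intensity vectors must not be all zero")
    return dot / norm


if __name__ == "__main__":
    # Hypothetical peak areas for four transitions of one peptide.
    observed_areas = [9.5e4, 8.1e4, 4.2e4, 0.9e4]
    library_intensities = [1.00, 0.80, 0.40, 0.10]
    score = normalized_dot_product(observed_areas, library_intensities)
    print(f"dotp = {score:.3f}")
```

Because the score depends only on the relative pattern of transition intensities, it is insensitive to the absolute signal level, which is why it can be compared across samples with very different viral loads.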

Discussion
Here, we performed proteomic analysis of serum from SFTS patients. Culture medium from Vero cells infected with SFTSV was used as a reference for comparative analysis of mass spectra. Serum samples from six SFTS patients were used to identify pathogen-derived tryptic peptides. We found that the N-terminal tryptic peptide (aa 7-26) of the N protein was the major SFTSV-derived peptide detected in serum samples from two patients (SFTS-007 and SFTS-024). Analysis of these two patient samples suggested that the LOD of label-free LC-MS/MS shotgun proteomics ranges from 10^2 to 10^3 copies/mL. However, the results suggest that consistent and reliable detection of SFTSV using label-free LC-MS/MS shotgun proteomics requires at least 10^4 copies/mL. Thus, we performed LC-PRM MS/MS to validate N proteins in patients with SFTS. We found that two tryptic peptides (ELAYEGLDPALIIK, aa 27-40; and GILGPDGVPSR, aa 223-233) were detected in all 13 SFTS patients in the PRM analysis of sera from 13 SFTS patients and two normal subjects. Although the shotgun proteomics method is less sensitive than PRM MS/MS, the peptides identified by the shotgun method were consistent with those identified by PRM MS/MS. Our approach to identifying pathogen-derived peptides therefore appears to be suitable for label-free LC-MS/MS shotgun proteomics.
Fig. 2 Qualitative characteristics of the PRM assay. a All PRM transitions for target peptides were compared with their corresponding spectral library transitions. Each color bar represents one transition ion and its relative intensity among the others. The dot product (dotp) annotated above the bar graph is the normalized dot product of the light transition peak areas with the corresponding intensity in the library. b Relative quantification of target peptides in serum specimens of 13 SFTS patients and two normal subjects. The values were calculated based on each heavy peptide (1 pmol) peak area. The red bars denote ELAYEGLDPALIIK peptides and the blue bars denote GILGPDGVPSR peptides. N.D., not detected.
However, the quantitative results were inconsistent with respect to Ct values and detection rates of SFTS viral peptides in SFTS patients. Therefore, far more patient samples will be needed to obtain more accurate data regarding the LOD of MS/MS for SFTSV. The proteomics data presented herein raise the question of why tryptic peptides (particularly the N-terminal tryptic peptide) of the N protein are detected more easily than those of other SFTSV proteins. The most likely explanation is the 3D structure of the N protein. We examined the predicted 3D structure of the N protein using the PyMOL program (Additional file 5: Fig. S3). The results suggest that the N-terminal region of the N protein, which comprises two tryptic peptides (aa 7-26 and aa 27-40), is exposed on the outside of the structure, making it more prone to denaturation and tryptic digestion. Thus, these two tryptic peptides would be more readily detectable by MS/MS analysis. In addition, the N-terminal region of the N protein may be highly immunogenic to the host immune system. Yu et al. used a phage library approach to generate monoclonal antibodies (mAbs) specific for the N protein of SFTSV; they found that the N-terminal region of the N protein was the major binding site for most of the mAbs generated [12]. This result supports our assumption that the N-terminal region of the N protein is more immunogenic and is an important binding site for mAbs against the N protein. Therefore, mAbs specific for the N-terminal region of the N protein will be useful in lateral flow tests or antibody/antigen-based biosensors for detection of SFTSV.
In general, the LOD of lateral flow tests and antibody/antigen-based biosensors is around 10^2-10^6 copies/mL, depending on the performance of the antibody-antigen reaction [28,29]. Therefore, the data presented herein suggest that sensitive proteomics analysis approaches are an alternative tool for detection and diagnosis of SFTSV in clinical samples. Additionally, the results suggest that LC-based proteomics analysis is a useful tool for screening diagnostic peptides derived from pathogenic viruses.

Conclusions
Proteomic analysis of SFTS patient samples revealed that the nucleocapsid (N) protein is the major antigen protein in sera of SFTS patients and that the N-terminal tryptic peptide of the N protein might be a useful proteomic target for direct detection of SFTS virus. These findings suggest that proteomic analysis could be an alternative tool for detection of pathogens in clinical samples and diagnosis of infectious diseases.