Patients
Thirty female patients diagnosed with pSS and 15 sex- and age-matched healthy persons were recruited from the Second Affiliated Hospital of Zhejiang University. The enrolled pSS patients met the diagnostic criteria of the 2016 American College of Rheumatology (ACR)–European League against Rheumatism (EULAR). The severity of dryness was evaluated using the patient-reported visual analogue scale (VAS). Results of laboratory tests and salivary gland biopsy were collected, as well as systemic involvement of the patients. The disease activity of pSS patients was evaluated using ESSDAI. The enrolled patients did not have immunosuppressive treatment prior to blood specimen collection. Peripheral venous blood samples (5 mL) were extracted from each participant using BD Vacutainer K2EDTA tubes, and PBMCs were immediately isolated by Ficoll-Hypaque density gradient separation. Trizol was added to PBMC samples immediately, then cells were lysed and subjected to total RNA extraction. This mixture was stored in a − 80 °C refrigerator until RNA extraction. The discovery cohort was composed of 4 pSS patients and 4 healthy controls, and the independent validation cohort was composed of 30 pSS patients and 15 healthy controls.
Prior to the blood sample collection, written informed consent was obtained from the participants. The protocols used in this study were performed following the principles of the Helsinki declaration and were approved by the Ethical Committee.
RNA extraction and purification
Total RNA was extrated from PBMCs using the miRNeasy Mini Kit (Qiagen, Germany) following the manufacturer’s instructions. High-purity RNA was obtained using the RNAClean XP Kit (Beckman Coulter, USA) and RNase-Free DNase Set (Qiagen). RIN number of RNA samples was detected on an Agilent Bioanalyzer 2100 (Agilent Technologies, USA) and NanoDrop ND-2000 spectrophotometer to evaluate RNA integrity. Samples with qualified measurements (2100 RIN ≥7.0, 28S/18S ≥0.7) were included in the subsequent transcriptome sequencing.
Transcriptome sequencing
Eight RNA samples from the discovery cohort were used for transcrptome sequencing. Sequencing RNA sample library construction was performed through a series of steps, including rRNA removal, fragmenting, first-strand cDNA synthesis, second-strand cDNA synthesis, terminal repair, addition of 3′ terminal A overhang, connection, and enrichment. The concentration was determined using a Qubit®2.0 Fluorometer and library size on Agilent 4200.
Cluster generation and first-to-sequencing primer hybridization were conducted on the cBot with the Illumina sequencer following the procedures acquired from the cBot instruction. Sequencing reagents were prepared according to the Illumina user guide, and flow cells carrying clusters were used for double-ended sequencing by the paired-end program. Data collection software of Illumina was controlled and analyzed in real-time the sequencing process.
High throughput sequencing (Illumina Hiseq 2000/2500 and Miseq) was used to complete the cDNA sequencing. Raw sequencing reads were filtered to obtain clean reads. The spliced mapping algorithm of HISAT2 (version 2.0.4) was used for genome mapping of the clean reads, using a genome version of hg38 (ftp://ftp.ensembl.org/pub/release-90/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.toplevel.fa.gz). The number of reads was converted to Fragments Per Kilobase of transcript per Million mapped reads (FPKM) values using Perl scripts after fragment counting (StringTie, version 1. 3.0) and normalization (trimmed mean of M values, TMM). Transcriptome sequencing read depths and mapping efficiency was shown in Table S1. Finally, 38,096 lncRNAs and 50,869 mRNAs were investigated.
After eliminating the low-expressed genes in the data (genes whose raw count was 0 in more than 75% samples), genes with a different expression between pSS samples and health samples were analyzed by edgeR. Meanwhile, we calculated the differential expression multiple (fold-change) based on the FPKM value. LncRNA and mRNA expression with a fold-change > 2 and a p-value less than 0.05 were considered to be differentially expressed genes.
Bioinformatics analysis
GO and KEGG functional enrichment analysis of mRNAs with different expression levels was accomplished using the DAVID online tool (https://david.ncifcrf.gov/; version 6.8). The results include Biological Process, Cellular Component, and Molecular Function. The Fisher test was used, and the enrichment threshold was set at p-value < 0.05.
The PPI of differentially expressed genes was obtained using the STRING (http://www.string-db.org/; version 10.0) database. PPI networks of differentially expressed genes were constructed using Cytoscape software (version 3.2.0). In the PPI networks, functional modules were identified by the MCODE plug-in of Cytoscape software, and finally, biologically meaningful protein complexes were obtained. Parameters were as follows:include loops: false degree cutoff: 10, node score cutoff: 0.2, cut: true, FF: false, K-Core: 2, max. Depth from seed: 100. GO and KEGG pathway enrichment analyses of genes in the modules were carried out.
lncRNA target genes were predicted using Trans and Cis regulation. Cytoscape software was used to construct the regulation network of top 15 up-regulation and down-regulation DE lncRNAs. MiRNAs and mRNAs that have interaction relationship with these DE lncRNAs were predicted in the StarBase Database (http://starbase.sysu.edu.cn/; version 2.0). The miRNA targets of DE mRNAs were predicted using the MiRDB Database (http://www.mirdb.org/). PSS related miRNAs were obtained from the HMDD Database (http://www.cuilab.cn/hmdd; version 3.2). Combined the interaction relationship acquired from the above two databases and the 21 pSS related miRNAs, the lncRNA-miRNA-mRNA network, or the ceRNA network, was constructed by cytoscape software. Then, the co-expression relationship between the key lncRNAs in the ceRNA network and the 15 most up-regulated and down-regulated DE mRNAs was analyzed by Spearman’s test.
Real-time PCR
The lncRNAs screened for abnormal expression were validated using real-time PCR in all the 45 extracted RNA samples. The lncRNAs we screened were based on the following criteria: (1) fold change ≥4, (2) average FPKM of the up-regulated groop ≥5, (3) without repeated sequences of mRNA, and (4) lncRNAs without information in databases were excluded.
The purified total RNA was reverse transcribed into cDNA using the ReverTra Ace qPCR Kit (TOYOBO, FSQ-101). The obtained cDNA sample was subjected to real-time quantitative PCR using the Power SYBR Green PCR Master Mix (ABI, USA). A QuantStudio 5 Real-Time PCR System (ABI) was used for the PCR procedures. LncRNA expression was measured and normalized to the mean expression of the housekeeping gene: GAPDH (Table S2). The relative fold change of each sample was calculated in relation to the ∆Ct of a random unstimulated sample (reference) in the health control group according to the formula: fold change = 2−∆∆Ct, where ∆∆Ct = ∆Ct sample-∆Ct reference.
Statistical analyses
GraphPad Prism 7.0 version was used for statistical analysis of the data. Data are represented as fold changes relative to healthy controls. Correlation analysis was performed by Spearman’s test. An analysis based on the two-tailed unpaired t-test or Mann-Whitney U test was performed to evaluate differential expression of genes between groups. The significant difference was the p-value of less than 0.05. False discovery correction was not used to adjust p-value because of the small number of DE lncRNAs and mRNAs.