- Open Access
Genome-wide identification of RNA modification-related single nucleotide polymorphisms associated with rheumatoid arthritis
BMC Genomics volume 24, Article number: 153 (2023)
RNA modification plays important roles in many biological processes, such as gene expression control. The aim of this study was to identify single nucleotide polymorphisms related to RNA modification (RNAm-SNPs) for rheumatoid arthritis (RA) as putative functional variants.
We examined the association of RNAm-SNPs with RA in summary data from a genome-wide association study of 19,234 RA cases and 61,565 controls. We performed eQTL and pQTL analyses for the RNAm-SNPs to find associated gene expression and protein levels. Furthermore, we examined the associations of gene expression and circulating protein levels with RA using two-sample Mendelian randomization analysis methods.
A total of 160 RNAm-SNPs related to m6A, m1A, A-to-I, m7G, m5C, m5U and m6Am modifications were identified to be significantly associated with RA. These RNAm-SNPs were located in 62 protein-coding genes, which were significantly enriched in immune-related pathways. RNAm-SNPs in important RA susceptibility genes, such as PADI2, SPRED2, PLCL2, HLA-A, HLA-B, HLA-DRB1, HLA-DPB1, TRAF1 and TXNDC11, were identified. Most of these RNAm-SNPs showed eQTL effects, and the expression levels of 26 of the modifiable genes (e.g., PADI2, TRAF1, HLA-A, HLA-DRB1, HLA-DPB1 and HLA-B) in blood cells were associated with RA. Circulating protein levels, such as CFB, GZMA, HLA-DQA2, IL21, LRPAP1 and TFF3, were affected by RNAm-SNPs and were associated with RA.
The present study identified RNAm-SNPs in the reported RA susceptibility genes and suggested that RNAm-SNPs may affect RA risk by affecting the expression levels of corresponding genes and proteins.
Rheumatoid arthritis (RA) is a highly prevalent inflammatory arthritis, with an average global prevalence estimated at 0.5–1.0%, mostly in women [1, 2]. RA is a chronic destructive autoimmune arthritis characterized by chronic inflammation of the synovium, especially in the small joints, which usually results in the destruction of juxta-articular bone and articular cartilage and significantly reduces people’s quality of life . It is usually accompanied by systemic manifestations, including osteoporosis, fatigue and anemia. RA is also associated with a 2–3-fold increase in the incidence of cardiovascular disease and shows higher morbidity and mortality in the affected population .
RA is caused by both genetic and environmental factors. Genetic factors account for approximately 60% of the risk of developing RA [4, 5]. Genome-wide association studies (GWASs) have identified more than 100 RA susceptibility loci in different populations [6,7,8,9]. A major issue in the post-GWAS era is the identification of functional (causal) variants in RA susceptibility loci. Sequencing experiments have attempted to identify missense mutations as functional variants for RA [10, 11]. In addition, some studies have focused on genetic variants altering splice sites , variants that are involved in RNA-binding protein-mediated regulation , and variants associated with RNA editing variability .
It is becoming increasingly important to study the epigenetic factors and mechanisms related to the progression and treatment response of RA [15, 16]. RNA modification is modifiable and involved in the regulation of different biological processes in living cells . With the development of sufficiently sensitive high-resolution transcriptomic techniques, more than 170 chemical modification types in RNA molecules have been identified. Some types of RNA modifications have been extensively studied, including m6A (N6-adenosine methylation), m6Am (N6,2′-O-dimethyladenosine), m5C (5-methylcytidin), m5U (5-methyluridine), m7G (N7-methylguanosine), m1A (N1-adenosine methylation), A-to-I RNA editing, Nm (2′-O-ribose-methylation) and pseudouridine. Among these modification types, m6A methylation is the first example. It is a type of reversible and conservative RNA methylation in eukaryotes and is known to us since it is important in the regulation of gene expression . The role of m6A methylation in immunity and RA has been characterized [19, 20].
Genetic variants can affect RNA modifications by changing the modifiable nucleotides or RNA sequences around the target sites [21, 22]. RNA modification-related SNPs (RNAm-SNPs) may disturb the regulation of gene expression by affecting RNA modifications and therefore may be important functional variants. m6A-related SNPs (m6A-SNPs) have been shown to be potential functional variants in RA susceptibility genes . However, the relationships between other types of RNAm-SNPs and RA remain unknown.
Therefore, this study will evaluate the effect of whole-genome RNAm-SNPs on RA for the first time. Then, the impacts of RNAm-SNPs on gene expression were evaluated in quantitative trait locus (QTL) studies, including RNA expression QTL (eQTL) and circulating protein level QTL (pQTL), to support the functionality of the RNAm-SNPs. By applying Mendelian randomization (MR) analysis methods, the associations between gene expression and circulating protein levels and RA were examined, and thus, potential novel risk factors underlying the associations between genetic variants and RA were identified (Fig. 1).
A total of 160 RNAm-SNPs that were significantly associated with RA at P < 5.0 × 10− 8 were identified (Supplementary Table S2), including 135 m6A-, 9 m1A-, 9 A-to-I-, 6 m7G-, 1 m5C-, 1 m5U- and 1 m6Am-related SNPs. Among these RNAm-SNPs, 119 mapped to 62 protein-coding genes, and 41 mapped to lncRNAs or pseudogenes. The 62 protein-coding genes were significantly enriched in immune-related pathways (Fig. 2A) and GO terms of biological processes (Fig. 2B). Most of the RA susceptibility genes contain only one RNAm-SNP, and 23 genes contain two or more RNAm-SNPs. Notably, HLA-DQA1, HLA-DQB1, AHNAK2, HLA-B and HLA-A contain 13, 12, 9, 7 and 5 RNAm-SNPs, respectively.
A total of 135 RA-associated m6A-SNPs were found, 96 of which were located in protein-coding genes (n = 48). Thirty-four (25.2%) of them were functional gain, while 101 (74.8%) were functional loss m6A-SNPs (Fig. 2C). These m6A-SNPs were of three confidence levels: 16 (11.9%) were high confidence, 47 (34.8%) were medium confidence and 72 (55.3%) were low confidence m6A-SNPs (Fig. 2D). Among the 96 m6A-SNPs located in protein-coding genes, 39 (40.6%) were exonic, 27 (28.1%) were in the 3′-UTR, 1 (1.0%) was in the 5′-UTR and 29 (30.2%) were intronic (Fig. 2E). For the exonic m6A-SNPs, 22 were missense and 17 were synonymous mutations.
Importantly, significant m6A-SNPs in well-known RA susceptibility genes were identified (Fig. 3), including rs2076595 in PADI2 (Fig. 4A); rs4836834 in TRAF1 (Fig. 5A); rs9985404 in PLCL2; rs9260149, rs1061235 rs79244404 and rs13488 in HLA-A; rs28367598, rs3177747, rs1057151, rs1056429 and rs709055 in HLA-B; and rs1042136, rs1042151 and rs9277410 in HLA-DPB1. In addition, rs10438246, rs12433815, rs12433837, rs12436986, rs2582511, rs2396457, rs2894636, rs76231332 and rs74090129 in AHNAK2 were identified.
We identified nine functional loss m1A-SNPs that were significantly associated with RA, and all of them belonged to the high or medium confidence categories (Table 1). rs12185577 in SPRED2, rs1061235 in HLA-A and rs41541519 (P = 7.90 × 10− 9) in HLA-B were identified. Three of the m1A-SNPs are exonic: rs2359173 in MAGI3 is a synonymous mutation, rs76018112 (stop codon deletion) in ABCF1 and rs41541519 in HLA-B (missense) are frameshift mutations. Nine functional loss A-to-I-SNPs belonging to the high confidence category were significantly associated with RA (Table 1). rs72850280 in HLA-DRB1 and rs1592572 in TXNDC11 were identified. Six functional loss m7G-SNPs belonging to the medium confidence category were significantly associated with RA (Table 1). The 3′-UTR SNP rs1051336 in HLA-DRA was strongly associated with RA (P = 6.74 × 10− 191); rs71563314 in the 3′-UTR of HLA-B was identified; rs5030798 in VARS1 is a missense mutation. In addition, rs10885 (missense) in PRRC2A is related to m5C modification; rs2074491 in the 5′-UTR of HLA-C is related to m6Am modification; and rs76864766 in tRNA TRY-GTA3–1 is related to m6Am modification (Table 1). The identified RNAm-SNPs are not in linkage disequilibrium with the HLA-DRB1 variant rs17878703  (Supplementary Table S3).
Gene expression associated with the RNAm-SNPs
The main role of RNA modification is to regulate gene expression and mRNA stability and homeostasis, so RNAm-SNPs may be associated with RNA expression levels. By using public data and our own data, we found that 134 (83.8%) of the 160 identified RA-associated RNAm-SNPs were associated with mRNA expression levels. Among the eQTLs, 51 were associated with the expression of their host genes in blood cells (Supplementary Table S4). Significant eQTL signals in well-known RA susceptibility genes were identified. We found that the m6A-SNP rs4836834 in TRAF1 was associated with TRAF1 mRNA levels (P = 8.97 × 10− 72); the m6A-SNPs rs79244404 and rs13488 in HLA-A were associated with HLA-A mRNA levels (P = 6.85 × 10− 16 and 2.00 × 10− 30, respectively); three m6A-SNPs (rs1042136, rs1042151 and rs9277410) in HLA-DPB1 were associated with HLA-DPB1 mRNA levels (P = 5.91 × 10− 18, 7.85 × 10− 16 and 7.71 × 10− 44, respectively); the m6A-SNP rs2076595 in PADI2 was associated with PADI2 mRNA levels (P = 2.19 × 10− 13); the m6A-SNP rs9985404 in PLCL2 was associated with PLCL2 mRNA levels (P = 2.19 × 10− 13); and the A-to-I-SNP rs1592572 in TXNDC11 was associated with TXNDC11 mRNA levels (P = 1.77 × 10− 9). According to our data, an association between the m6A-SNP rs2076595 and PADI2 mRNA levels in PBMCs was observed (Fig. 4B; P = 6.43 × 10− 6). The m6A-SNP rs9277410 in HLA-DPB1 was associated with HLA-DPB1 mRNA levels in PBMCs (Fig. 6A; P = 5.36 × 10− 13).
Gene expression associated with RA
In SMR analysis, we detected significant associations between gene expression in blood cells and RA by using data from three eQTL studies and two GWASs. A total of 74 significant associations for 26 genes in which RNAm-SNPs were identified were detected (PSMR < 5.0 × 10− 6), and most of the signals were replicated by using different datasets (Supplementary Table S5). The expression levels of six known RA susceptibility genes, PADI2 (Fig. 4C), TRAF1 (Fig. 5B), HLA-A, HLA-DRB1, HLA-DPB1 and HLA-B (Supplementary Table S5), in blood cells were significantly associated with RA. RNAm-SNPs were identified in these six genes and were strongly associated with the expression levels of their host genes. By applying the HEIDI test (P > 0.05), we found that rs9277410 in HLA-DPB2, rs4836834 in TRAF1, rs2952151 in PGAP3 and rs9303280 in GSDMB may be causal variants that affect both gene expression and RA (Table 2). Therefore, these RNAm-SNPs and the corresponding genes could be prioritized in follow-up functional studies.
For the 26 genes identified in SMR analysis, we compared their expression levels between RA cases and controls. In synovial tissues, HLA-DQB1 was differentially expressed between RA cases and controls according to GSE1919 data (P = 3.15 × 10− 4). In blood cells, DAXX, HLA-A, HLA-C, HLA-DPB1, HLA-DQA1, HLA-DQB1, PADI2, PHF19, RNASET2 and VARS2 were differentially expressed between RA cases and controls according to GSE15573 and GSE17755 data (P = 1.31 × 10− 9, 2.82 × 10− 7, 5.34 × 10− 6, 3.86 × 10− 13, 9.37 × 10− 11, 2.62 × 10− 25, 6.23 × 10− 20, 1.82 × 10− 4, 4.82 × 10− 5 and 1.09 × 10− 13, respectively). Differential expression of PADI2 (Fig. 4D), HLA-DPB1 (Fig. 6B), HLA-A (Fig. 7A), HSPA1A (Fig. 7B), MICB (Fig. 7C) and TRAF1 (Fig. 7D) in PBMCs between RA cases and controls was also found according to our in-house data (P = 3.21 × 10− 2, 1.42 × 10− 2, 9.83 × 10− 6, 3.40 × 10− 6, 1.94 × 10− 4 and 1.98 × 10− 2, respectively). In addition, the expression levels of HLA-A in PBMCs were associated with the RA GRS (P = 7.44 × 10− 4). In the data of 28 RA cases and 18 controls, we also found that the expression levels of PLCL2, which was not detected in SMR analysis, in PBMCs were differentially expressed (P = 2.89 × 10− 8) and were associated with the RA GRS (P = 7.87 × 10− 3).
Plasma proteins related to the RNAm-SNPs
We further tried to find plasma proteins that were related to the identified RNAm-SNPs. We found 602 pQTL signals (P < 5.0 × 10− 6) for 107 RNAm-SNPs that were significantly associated with RA (Supplementary Table S6). A total of 82 proteins were detected. The m6A-SNP rs7775397 in TSBP1 was associated with plasma levels of 23 proteins, and the m5C-SNP rs10885 in PRRC2A was associated with plasma levels of 20 proteins. The top signals were the associations between rs1130142 and rs1130144 in HLA-DQA1 and circulating levels of HLA-DQA2. Indeed, 39 RNAm-SNPs that were significantly associated with RA were significantly associated with circulating levels of HLA-DQA2. In addition, more than 20 RNAm-SNPs were significantly associated with circulating levels of C4A, MICB, PRSS3, GRIA4, PDE4D, RACGAP1, LRPAP1, IL21 and KIR2DS2.
Six RNAm-SNPs inside known RA susceptibility genes were associated with circulating protein levels, including the m6A-SNPs rs1057151, rs28367598 and rs3177747 in HLA-B and rs2076595 in PADI2, m1A-SNP rs41541519 in HLA-B and the A-to-I-SNP rs72850280 in HLA-DRB1. In total, these six RNAm-SNPs were associated with circulating levels of 16 proteins, including MICB, USP25, MFAP2, PLA2G10, C4A, TFF3, CFB, IL21, IGHE, GZMB, DEFB119, GFRA2, CREB3L4, MMP8, PRSS3 and HLA-DQA2. We tested whether these 16 proteins were genetically associated with RA using several MR methods. We found that the associations between circulating levels of nine proteins and RA were significant in weighted median, IVW, MR-Egger or MR-PRESSO analyses (Table 3). The associations between circulating levels of HLA-DQA2 and PRSS3 and RA were significant in the analyses of all four methods. We further examined the potential causal associations between these nine proteins and RA using 2021 GWAS data. The associations between circulating levels of six proteins, including CFB, CREB3L4, HLA-DQA2, LRPAP1, TFF3 and USP25, and RA were significant in weighted median, IVW, MR-Egger or MR-PRESSO analyses (Table 3). Therefore, the associations between circulating levels of these six proteins and RA were strengthened.
This study examined the associations between RNAm-SNPs and RA and showed that many SNPs in important RA susceptibility genes were related to the RNA modification types of m6A, m1A, A-to-I, m7G, m5C, m5U and m6Am. These RNAm-SNPs showed cis-acting eQTL effects in blood cells, and some of them were found to be associated with circulating protein levels. Moreover, the affected gene expression and protein levels were found to be associated with RA. By applying this study strategy, we identified the relationships among genetic variants, gene expression and RA, i.e., the RNAm-SNPs may affect RNA modification, which controls gene expression, and the altered RNA expression or protein levels result in RA.
Although hundreds of RA-related genomic loci have been identified by GWASs, many of the SNPs inside the loci may not be causal variants affecting RA. The causal variants are as yet undiscovered. Previous sequencing experimental studies have detected potential functional variations that can alter amino acid sequences [10, 11]. However, it is much more than that. RNAm-SNPs in the modification target sites may interrupt the modification functions (gain or loss) and interfere with gene expression regulation . RNA modification plays a critical role in immune cell development [26, 27] and is associated with the occurrence of RA [19, 28]. Therefore, RNAm-SNPs are potential functional variants for RA [21, 22]. In this study, we identified many RA-related RNAm-SNPs and showed that RNAm-SNPs affect genes associated with specific biological functions that are highly associated with RA. Not only were m6A-SNPs identified, but many SNPs related to m1A, A-to-I, m7G, m5C, m5U and m6Am modification types were also identified. More importantly, RNAm-SNPs in important RA susceptibility genes were identified. Therefore, this study showed that RA-related genomic loci contain RNAm-SNPs and showed that the identification of RNAm-SNPs in RA susceptibility genes is a way to determine causal variants and therefore helps to explain the findings of GWASs.
The RNAm-SNPs could interfere with the modifications of the RNA molecules and then change their expression levels, thus affecting the risk of RA. However, further evidence is needed to prove that the gene expression affected by these RNAm-SNPs is associated with RA. One method for linking an associated risk variant to a causal gene is to look at its correlation with gene expression. In our study, further eQTL analysis, SMR analysis and differential expression analysis confirmed that some RNAm-SNPs were associated with gene expression levels in blood cells and that the gene expression levels were associated with RA, including the expression levels of PADI2, TRAF1, HLA-A, HLA-DRB1, HLA-DPB1 and HLA-B. The HLA region has long been known to be a genetic contributor to RA susceptibility . Peptidylarginine deiminases (PADs) play a role in the onset and progression of RA owing to their ability to generate the citrullinated protein targets of anti-citrullinated protein antibodies. Anti-PAD antibodies are possible biomarkers for RA diagnosis and prognosis . Among the PAD enzyme isoforms, PAD2 and PAD4 are most strongly implicated in RA. TRAF1encodes TNF receptor-associated factor 1, which regulates the activation of NF-kappa-B and JNK . The association between genetic polymorphisms in TRAF1 and RA has been widely studied. The serum concentration of TRAF1 in RA patients was higher than that in healthy controls and is associated with autoantibodies and the disease activity of RA . m6A-modified TRAF1 has recently been shown to promote sunitinib resistance by modulating apoptotic and angiogenic pathways . Therefore, the findings of this study showed that RNAm-SNPs in GWAS-identified RA loci may be functional variants and that RNAm-SNPs may affect RA risk by altering RNA expression levels.
In addition, pQTL analysis also found that these RNAm-SNPs affected circulating levels of proteins, such as CFB, GZMA, HLA-DQA2, IL21, LRPAP1 and TFF3, that were related to RA. Take TFF3 as an example. The pQTL analysis showed that seven RNAm-SNPs were associated with circulating levels of TFF3, including a m7G-SNP rs2263318 in HCP5 and six m6A-SNPs, rs1051790 in MICA, rs28366151 in PRRC2A, rs28367598 in HLA-B, rs3176007 in HLA-C, rs7774954 in HLA-DQB2 and rs9266689 in ZDHHC20P2. Meanwhile, circulating levels of TFF3 were associated with RA in our MR analyses. TFF peptides are important for the maintenance and repair of intestinal mucosa  and are involved in the immune response . A study showed that TFF3 protein levels in RA samples of synovial fluid were significantly lower than those in healthy samples . In addition, circulating levels of TFF3 were significantly increased in patients with Sjögren’s syndrome secondary to RA compared with healthy controls . In addition to TFF3, increased levels of soluble GZMA in both the plasma and synovial fluid of RA patients have been reported . An increase in serum IL21 levels is associated with markers of B-cell activation and radiographic progression in patients with RA . In summary, the findings of our study indicated that RNAm-SNPs may also be involved in the pathogenesis of RA by changing the circulating levels of proteins that are critical in RA.
The present study has some potential limitations. First, we did not test whether the identified RNAm-SNPs functionally affected the RNA modifications experimentally. RNA modifications themselves may not be the true and independent causative mechanism of RA. Second, the relationships between protein molecules and RA have not been verified experimentally. Although the relationships between several proteins and RA have been reported, further studies are needed to find evidence to support the functional relevance of the molecules in RA.
In summary, this study identified RNAm-SNPs in many reported RA susceptibility genes (e.g., PADI2, SPRED2, PLCL2, HLA-A, HLA-B, HLA-DRB1, HLA-DPB1, TRAF1 and TXNDC11) and elucidated the relationships between RNAm-SNPs, gene expression and protein levels and RA. The findings helped with the translation of GWAS signals into causal mechanisms and clinical applications. The results also indicated that RNA modification may play important roles in RA. Except for m6A methylation, no previous study has shown the relationships between RNA modifications (e.g., m1A, A-to-I, m7G, m5C, m5U and m6Am) and RA. Therefore, this study may add new clues for further understanding the functional mechanism underlying the development of RA.
Determination of RNAm-SNPs for RA
In this study, we used new RNA modification annotations to obtain functional explanations for the results of the RA GWAS . The summary statistics of associations between 6.6 million SNPs and RA can be downloaded at http://plaza.umin.ac.jp/~yokada/datasource/software.htm. This GWAS included 19,234 cases of RA and 61,565 controls. Among them, 43,923 controls and 14,361 RA cases were from 18 European studies, and 17,642 controls and 4873 RA cases were from 4 Asian studies. In addition, data from the newest, largest-ever trans-ancestral meta-analysis GWAS of RA were also obtained . In this GWAS, genome-wide RA association summary statistics in three large case–control collections consisting of 311,292 individuals of Korean, Japanese and European populations were used in an inverse-variance-weighted fixed-effects meta-analysis. Summary statistics of 13,810,675 SNPs were available for this GWAS.
We obtained information on RNAm-SNPs in the RMVar database (https://rmvar.renlab.org/), which contains 1,678,126 RNAm-SNPs for nine RNA modification types . RNAM-SNPs in the RMVar database are classified into three confidence levels: high, medium and low confidence levels. The RNAm-SNPs derived from single base resolution experiments were classified into high confidence levels. Samples with medium confidence levels were obtained from MeRIP-Seq and m6A-Seal-seq experiments. The m6A-related variants predicted by the statistical model were defined as having low confidence. Based on the annotation of the RNAm-SNP sets, we labeled the genome-wide SNPs with RNA modification types in the GWAS summary datasets, and then, RNAm-SNPs significantly associated with RA were selected (P < 5.0 × 10− 8). Functional enrichments of the modifiable genes were tested by using the DAVID analysis tool , and a false positive rate less than 0.05 was considered significant.
eQTL analysis for the RNAm-SNPs
eQTL analysis is an effective method to describe correlations between genetic variants and gene expression at a genome-wide scale [41,42,43]. RA-associated RNAm-SNPs may regulate gene expression and lead to variations in mRNA levels. We performed cis-acting eQTL analysis in peripheral blood cells to obtain functional evidence for the identified RNAm-SNPs. The eQTL analysis was performed by searching data in the HaploReg browser (http://archive.broadinstitute.org/mammals/haploreg/haploreg.php) . The results from three eQTL studies were obtained. Westra et al. performed the largest eQTL meta-analysis thus far in peripheral blood samples of 5311 healthy European individuals . The genetic architecture of gene expression (GAGE) study detected eQTLs in peripheral blood in 2765 European individuals . The cis-eQTL summary data from the GTEx whole blood cells  were also used.
We attempted to determine whether the interference of RNAm-SNPs on gene expression affects RA. We conducted a summary data–based Mendelian randomization (SMR)  study to identify pleiotropic associations between gene expression levels and RA. The eQTL and RA GWAS datasets used in the SMR analysis are described above. The files containing eQTL summary data in binary format for the three eQTL studies can be found at http://cnsgenomics.com/software/smr/#DataResource. Using the genotype data of HapMap r23 CEU as the reference panel, we calculated the linkage disequilibrium association matrix. The parameters are left as the default setting in the analysis. The significance threshold in SMR analysis was set to 5.0 × 10− 6. We further conducted the heterogeneity in dependent instruments (HEIDI) test to examine whether the identified gene expression and RA are affected by the same underlying causal variant (i.e., RNAm-SNP). The HEIDI test uses multiple SNPs in a cis-eQTL region to distinguish pleiotropy from linkage . To achieve this purpose, we restricted the SNPs to the RNAm-SNPs in SMR analysis by applying the “--target-snp” option of the SMR program. PHEIDI > 0.05 indicated that the RNAm-SNP is the causal variant that affects the corresponding gene expression and RA.
Differential expression analysis
We further examined the differential expression of the identified genes in peripheral blood mononuclear cells (PBMCs) in our in-house dataset of 28 RA patients and 18 controls. The basic characteristics of the study subjects have been described in a previous study . Genome-wide RNA expression was profiled using lncRNA&mRNA Human Gene Expression Microarray V4.0 (CapitalBio Corp, Beijing, China) according to the manufacturer’s instructions. Differential expression of a total of 21,323 mRNA probes between the RA cases and controls was assessed by t tests.
Affymetrix Genome-Wide Human SNP Array 6.0 chips were employed for SNP genotyping. A weighted genetic risk score (GRS) was created based on genome-wide significant (P < 5.0 × 10− 8) independent SNPs identified in the RA GWAS . The effect estimates of 67 SNPs were used in GRS construction (Supplementary Table S1). The variants in each SNP were harmonized for consistent directions of association, and each of them in the GRS was weighted by its relative effect size in the GWAS, with effects combined in an additive model. The association between RNA expression and the RA GRS was examined.
In addition, we also detected differential expression based on the expression profile data available in public databases. Three gene expression datasets, GSE15573 , GSE17755  and GSE1919 , were downloaded from the GEO database (http://www.ncbi.nlm.nih.gov/geo). The average gene expression signals of cases and controls were compared by t test to assess the differential expression.
pQTL analysis for the RA-associated RNAm-SNPs
RNAm-SNPs may also affect RA by regulating gene expression at the protein level. Circulating proteins play important roles in many biological processes and are important therapeutic targets [53, 54]. Therefore, pQTL analysis was further applied to identify circulating proteins associated with the identified RNAm-SNPs. The data used for pQTL analysis were collected from the INTERVAL pQTL study . This study enrolled 3301 individuals of European descent and examined the associations between 10.6 million imputed autosomal variants and circulating levels of 2994 proteins (http://www.phpc.cam.ac.uk/ceu/proteins/).
MR analysis of proteins
To obtain further evidence to support the proteins identified in pQTL analysis, we used four MR methods, including inverse-variance weighted (IVW) , weighted median , MR-Egger  and MR pleiotropy residual sum and outlier (MR-PRESSO) , to test the causal relationships between circulating protein levels and RA. We used the “MendelianRandomization” R package to perform weighted median, IVW and MR-Egger analyses . We applied the MR-PRESSO (https://github.com/rondolab/MR-PRESSO) program to examine the causal estimates of outlier correction and horizontal multiplicity . The default parameters are used in the MR-PRESSO analysis. Data used in these MR analyses are from the GWAS and pQTL studies described above. In the pQTL summary data, SNPs with P values less than 5.0 × 10− 6 were selected as potential instrumental variables. We used the “clump_data” function in the “TwoSampleMR” R package to clump SNPs (linkage disequilibrium r2 < 0.01 in the range of 10,000 kb) according to the data of the Europeans 1000 Genomes project to select independent instrumental variables . The effect allele of each SNP in the RA GWAS and pQTL studies was manually checked for consistency, as we previously reported [62, 63]. All methods were carried out in accordance with relevant guidelines.
Availability of data and materials
The 2014 RA GWAS dataset is available at http://plaza.umin.ac.jp/~yokada/datasource/software.htm. The 2021 RA GWAS dataset is available at https://datadryad.org/stash/dataset/doi:10.5061/dryad.ns1rn8pr0. The eQTL datasets for SMR analysis are available at http://cnsgenomics.com/software/smr/#DataResource. Three gene expression datasets (GSE15573, GSE17755 and GSE1919) for differential expression analysis were downloaded from the GEO database (http://www.ncbi.nlm.nih.gov/geo). The INTERVAL pQTL dataset is available at http://www.phpc.cam.ac.uk/ceu/proteins/. The expression data in PBMCs have not yet been deposited and are available from the corresponding author on reasonable request.
Aletaha D, Smolen JS. Diagnosis and Management of Rheumatoid Arthritis: a review. JAMA. 2018;320(13):1360–72.
Shapira Y, Agmon-Levin N, Shoenfeld Y. Geoepidemiology of autoimmune rheumatic diseases. Nat Rev Rheumatol. 2010;6(8):468–76.
Smolen JS, Aletaha D, Barton A, Burmester GR, Emery P, Firestein GS, et al. Rheumatoid arthritis. Nat Rev Dis Primers. 2018;4:18001.
Viatte S, Plant D, Raychaudhuri S. Genetics and epigenetics of rheumatoid arthritis. Nat Rev Rheumatol. 2013;9(3):141–53.
Felson DT, Klareskog L. The genetics of rheumatoid arthritis: new insights and implications. JAMA. 2015;313(16):1623–4.
Stahl EA, Raychaudhuri S, Remmers EF, Xie G, Eyre S, Thomson BP, et al. Genome-wide association study meta-analysis identifies seven new rheumatoid arthritis risk loci. Nat Genet. 2010;42(6):508–14.
Okada Y, Terao C, Ikari K, Kochi Y, Ohmura K, Suzuki A, et al. Meta-analysis identifies nine new loci associated with rheumatoid arthritis in the Japanese population. Nat Genet. 2012;44(5):511–6.
Eyre S, Bowes J, Diogo D, Lee A, Barton A, Martin P, et al. High-density genetic mapping identifies new susceptibility loci for rheumatoid arthritis. Nat Genet. 2012;44(12):1336–40.
Okada Y, Wu D, Trynka G, Raj T, Terao C, Ikari K, et al. Genetics of rheumatoid arthritis contributes to biology and drug discovery. Nature. 2014;506(7488):376–81.
Diogo D, Kurreeman F, Stahl EA, Liao KP, Gupta N, Greenberg JD, et al. Rare, low-frequency, and common variants in the protein-coding sequence of biological candidate genes from GWASs contribute to risk of rheumatoid arthritis. Am J Hum Genet. 2013;92(1):15–27.
Bang SY, Na YJ, Kim K, Joo YB, Park Y, Lee J, et al. Targeted exon sequencing fails to identify rare coding variants with large effect in rheumatoid arthritis. Arthritis Res Ther. 2014;16(5):447.
Wu X, Hurst LD. Determinants of the usage of splice-associated cis-motifs predict the distribution of human pathogenic SNPs. Mol Biol Evol. 2016;33(2):518–29.
Mao F, Xiao L, Li X, Liang J, Teng H, Cai W, et al. RBP-Var: a database of functional variants involved in regulation mediated by RNA-binding proteins. Nucleic Acids Res. 2016;44(D1):D154–63.
Ramaswami G, Deng P, Zhang R, Anna Carbone M, Mackay TFC, Billy Li J. Genetic mapping uncovers cis-regulatory landscape of RNA editing. Nat Commun. 2015;6:8194.
Ai R, Laragione T, Hammaker D, Boyle DL, Wildberg A, Maeshima K, et al. Comprehensive epigenetic landscape of rheumatoid arthritis fibroblast-like synoviocytes. Nat Commun. 2018;9(1):1921.
Ham S, Bae JB, Lee S, Kim BJ, Han BG, Kwok SK, et al. Epigenetic analysis in rheumatoid arthritis synoviocytes. Exp Mol Med. 2019;51(2):1–13.
Malbec L, Zhang T, Chen Y-S, Zhang Y, Sun B-F, Shi B-Y, et al. Dynamic methylome of internal mRNA N7-methylguanosine and its regulatory role in translation. Cell Res. 2019;29(11):927–41.
Meyer KD, Jaffrey SR. The dynamic epitranscriptome: N6-methyladenosine and gene expression control. Nat Rev Mol Cell Biol. 2014;15(5):313–26.
Wu S, Li XF, Wu YY, Yin SQ, Huang C, Li J. N (6) -Methyladenosine and rheumatoid arthritis: a comprehensive review. Front Immunol. 2021;12:731842.
Boulias K, Greer EL. Biological roles of adenine methylation in RNA. Nat Rev Genet. 2023;24(3):143-60.
Zheng Y, Nie P, Peng D, He Z, Liu M, Xie Y, et al. m6AVar: a database of functional variants involved in m6A modification. Nucleic Acids Res. 2018;46(D1):D139–45.
Luo X, Li H, Liang J, Zhao Q, Xie Y, Ren J, et al. RMVar: an updated database of functional variants involved in RNA modifications. Nucleic Acids Res. 2021;49(D1):D1405–12.
Mo XB, Zhang YH, Lei SF. Genome-wide identification of N (6)-Methyladenosine (m (6) a) SNPs associated with rheumatoid arthritis. Front Genet. 2018;9:299.
Raychaudhuri S, Sandor C, Stahl EA, Freudenberg J, Lee HS, Jia X, et al. Five amino acids in three HLA proteins explain most of the association between MHC and seropositive rheumatoid arthritis. Nat Genet. 2012;44(3):291–6.
Lan Q, Liu PY, Bell JL, Wang JY, Hüttelmaier S, Zhang XD, et al. The emerging roles of RNA m (6) a methylation and demethylation as critical regulators of tumorigenesis, drug sensitivity, and resistance. Cancer Res. 2021;81(13):3431–40.
Zheng Z, Zhang L, Cui XL, Yu X, Hsu PJ, Lyu R, et al. Control of early B cell development by the RNA N (6)-Methyladenosine methylation. Cell Rep. 2020;31(13):107819.
Shulman Z, Stern-Ginossar N. The RNA modification N (6)-methyladenosine as a novel regulator of the immune system. Nat Immunol. 2020;21(5):501–12.
Vlachogiannis NI, Gatsiou A, Silvestris DA, Stamatelopoulos K, Tektonidou MG, Gallo A, et al. Increased adenosine-to-inosine RNA editing in rheumatoid arthritis. J Autoimmun. 2020;106:102329.
Curran AM, Naik P, Giles JT, Darrah E. PAD enzymes in rheumatoid arthritis: pathogenic effectors and autoimmune targets. Nat Rev Rheumatol. 2020;16(6):301–15.
Kato T Jr, Gotoh Y, Hoffmann A, Ono Y. Negative regulation of constitutive NF-kappaB and JNK signaling by PKN1-mediated phosphorylation of TRAF1. Genes Cells. 2008;13(5):509–20.
Cheng T, Sun X, Wu J, Wang M, Eisenberg RA, Chen Z. Increased serum levels of tumor necrosis factor receptor-associated factor 1 (TRAF1) correlate with disease activity and autoantibodies in rheumatoid arthritis. Clin Chim Acta. 2016;462:103–6.
Chen Y, Lu Z, Qi C, Yu C, Li Y, Huan W, et al. N (6)-methyladenosine-modified TRAF1 promotes sunitinib resistance by regulating apoptosis and angiogenesis in a METTL14-dependent manner in renal cell carcinoma. Mol Cancer. 2022;21(1):111.
Hoffmann W. Trefoil factor family (TFF) peptides: regulators of mucosal regeneration and repair, and more. Peptides. 2004;25(5):727–30.
Cook GA, Familari M, Thim L, Giraud AS. The trefoil peptides TFF2 and TFF3 are expressed in rat lymphoid tissues and participate in the immune response. FEBS Lett. 1999;456(1):155–9.
Popp J, Schicht M, Garreis F, Klinger P, Gelse K, Sesselmann S, et al. Human synovia contains trefoil factor family (TFF) peptides 1-3 although synovial membrane only produces TFF3: implications in osteoarthritis and rheumatoid arthritis. Int J Mol Sci. 2019;20(23):6015.
Zinkevičienė A, Dumalakienė I, Mieliauskaitė D, Vilienė R, Narkevičiūtė I, Girkontaitė I. sICAM-1 as potential additional parameter in the discrimination of the Sjögren syndrome and non-autoimmune sicca syndrome: a pilot study. Clin Rheumatol. 2019;38(10):2803–9.
Tak PP, Spaeny-Dekking L, Kraan MC, Breedveld FC, Froelich CJ, Hack CE. The levels of soluble granzyme a and B are elevated in plasma and synovial fluid of patients with rheumatoid arthritis (RA). Clin Exp Immunol. 1999;116(2):366–70.
Gottenberg JE, Dayer JM, Lukas C, Ducot B, Chiocchia G, Cantagrel A, et al. Serum IL-6 and IL-21 are associated with markers of B cell activation and structural progression in early rheumatoid arthritis: results from the ESPOIR cohort. Ann Rheum Dis. 2012;71(7):1243–8.
Ha E, Bae SC, Kim K. Large-scale meta-analysis across east Asian and European populations updated genetic architecture and variant-driven biology of rheumatoid arthritis, identifying 11 novel susceptibility loci. Ann Rheum Dis. 2021;80(5):558–65.
Huang da W, Sherman BT, Lempicki RA. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc. 2009;4(1):44–57.
Schadt EE, Monks SA, Drake TA, Lusis AJ, Che N, Colinayo V, et al. Genetics of gene expression surveyed in maize, mouse and man. Nature. 2003;422(6929):297–302.
Huan T, Rong J, Liu C, Zhang X, Tanriverdi K, Joehanes R, et al. Genome-wide identification of microRNA expression quantitative trait loci. Nat Commun. 2015;6:6601.
Kumar V, Westra HJ, Karjalainen J, Zhernakova DV, Esko T, Hrdlickova B, et al. Human disease-associated genetic variation impacts large intergenic non-coding RNA expression. PLoS Genet. 2013;9(1):e1003201.
Ward LD, Kellis M. HaploReg: a resource for exploring chromatin states, conservation, and regulatory motif alterations within sets of genetically linked variants. Nucleic Acids Res. 2012;40(Database issue):D930–4.
Westra HJ, Peters MJ, Esko T, Yaghootkar H, Schurmann C, Kettunen J, et al. Systematic identification of trans eQTLs as putative drivers of known disease associations. Nat Genet. 2013;45(10):1238–43.
Lloyd-Jones LR, Holloway A, McRae A, Yang J, Small K, Zhao J, et al. The genetic architecture of gene expression in peripheral blood. Am J Hum Genet. 2017;100(2):228–37.
Battle A, Brown CD, Engelhardt BE, Montgomery SB. Genetic effects on gene expression across human tissues. Nature. 2017;550(7675):204–13.
Zhu Z, Zhang F, Hu H, Bakshi A, Robinson MR, Powell JE, et al. Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat Genet. 2016;48(5):481–7.
Mo XB, Wu LF, Zhu XW, Xia W, Wang L, He P, et al. Identification and evaluation of lncRNA and mRNA integrative modules in human peripheral blood mononuclear cells. Epigenomics. 2017;9(7):943–54.
Teixeira VH, Olaso R, Martin-Magniette ML, Lasbleiz S, Jacq L, Oliveira CR, et al. Transcriptome analysis describing new immunity and defense genes in peripheral blood mononuclear cells of rheumatoid arthritis patients. PLoS One. 2009;4(8):e6803.
Lee HM, Sugino H, Aoki C, Nishimoto N. Underexpression of mitochondrial-DNA encoded ATP synthesis-related genes and DNA repair genes in systemic lupus erythematosus. Arthritis Res Ther. 2011;13(2):R63.
Ungethuem U, Haeupl T, Witt H, Koczan D, Krenn V, Huber H, et al. Molecular signatures and new candidates to target the pathogenesis of rheumatoid arthritis. Physiol Genomics. 2010;42A(4):267–82.
Imming P, Sinning C, Meyer A. Drugs, their targets and the nature and number of drug targets. Nat Rev Drug Discov. 2006;5(10):821–34.
Santos R, Ursu O, Gaulton A, Bento AP, Donadi RS, Bologa CG, et al. A comprehensive map of molecular drug targets. Nat Rev Drug Discov. 2017;16(1):19–34.
Sun BB, Maranville JC, Peters JE, Stacey D, Staley JR, Blackshaw J, et al. Genomic atlas of the human plasma proteome. Nature. 2018;558(7708):73–9.
Burgess S, Butterworth A, Thompson SG. Mendelian randomization analysis with multiple genetic variants using summarized data. Genet Epidemiol. 2013;37(7):658–65.
Bowden J, Davey Smith G, Haycock PC, Burgess S. Consistent estimation in Mendelian randomization with some invalid instruments using a weighted median estimator. Genet Epidemiol. 2016;40(4):304–14.
Bowden J, Davey Smith G, Burgess S. Mendelian randomization with invalid instruments: effect estimation and bias detection through egger regression. Int J Epidemiol. 2015;44(2):512–25.
Verbanck M, Chen C-Y, Neale B, Do R. Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat Genet. 2018;50(5):693–8.
Yavorska OO, Burgess S. MendelianRandomization: an R package for performing Mendelian randomization analyses using summarized data. Int J Epidemiol. 2017;46(6):1734–9.
Hemani G, Zheng J, Elsworth B, Wade KH, Haberland V, Baird D, et al. The MR-base platform supports systematic causal inference across the human phenome. Elife. 2018;7:e34408.
Wu J, Wang M, Han L, Zhang H, Lei S, Zhang Y, et al. RNA modification-related variants in genomic loci associated with body mass index. Hum Genomics. 2022;16(1):25.
Li R, Zhang H, Tang F, Duan C, Liu D, Wu N, et al. Coronary artery disease risk factors affected by RNA modification-related genetic variants. Front Cardiovasc Med. 2022;9:985121.
The study was supported by the National Natural Science Foundation of China (82173597, 82073636, 81773508), the Startup Fund from Soochow University (Q413900313, Q413900412), and a Project of the Priority Academic Program Development of Jiangsu Higher Education Institutions.
Ethics approval and consent to participate
All methods were carried out in accordance with relevant guidelines and regulations. Human participants were involved in the study. Written informed consent was obtained from all participants. The research described herein received approval from the Soochow University Institutional Review Board.
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Additional file 1: Supplementary Table S1.
Summary statistics of SNPs used in GRS construction. Supplementary Table S2. RNAm_SNPs identified for RA. Supplementary Table S3. The linkage disequilibrium of the identified RNAm-SNPs with the HLA-DRB1 SNP. Supplementary Table S4. Associations between RNAm-SNPs and gene expressions in blood cells. Supplementary Table S5. Associations between gene expressions in blood cells and RA identified in SMR analysis. Supplementary Table S6. Associations between RA-associated RNAm-SNPs and plasma protein levels.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Wang, M., Wu, J., Lei, S. et al. Genome-wide identification of RNA modification-related single nucleotide polymorphisms associated with rheumatoid arthritis. BMC Genomics 24, 153 (2023). https://doi.org/10.1186/s12864-023-09227-2
- Rheumatoid arthritis
- RNA modification
- Genome-wide association study
- Gene expression