Through deep learning and clustering analysis of real-world clinical data, we identified four distinct RA phenotypes at baseline: foot-dominant (JIP-foot), seropositive oligoarticular (JIP-oligo), seronegative hand-dominant (JIP-hand), and polyarticular (JIP-poly) disease. While our hypothesis-free approach enabled detection of novel non-linear clinical signatures, we mitigated the risk of spurious correlations through rigorous validation, including stability testing, clinical outcome validation, independent cohort replication, and synovial histological correlation.
These results point to promising directions for advancing RA management. While not yet clinically applicable, the identified clusters show potential value as prognostic markers, with hand-foot differentiation proving particularly informative for predicting clinical outcomes. Both foot-dominant clusters (JIP-foot, JIP-poly) showed higher MTX failure and lower remission rates compared to JIP-hand, independent of baseline joint involvement, symptom duration, or treatment timing. Notably, the impact of foot involvement on treatment failure rivaled that of ACPA-positivity as an independent risk factor, indicating that a more comprehensive joint assessment should be considered in clinical practice.
While this research underscores the importance of foot joints, current joint-specific treatment research often relies on 28-joint counts27, overlooking the feet despite their common involvement in RA28,29,30. Notably, our findings validate previous cross-sectional observations of poor prognosis in feet/ankle-involved disease28 and support Ciurea et al.'s observation that foot involvement appears more persistent compared to other anatomical regions31. However, this study uniquely demonstrates this association in treatment-naïve patients at clinical presentation.
Nevertheless, dedicated trials are needed to determine whether tailoring therapy to specific patient groups—such as those with foot involvement—can improve outcomes compared to standard care. Recent studies demonstrate that synovial fibroblasts exhibit distinct transcriptomic and epigenomic profiles based on anatomical location, potentially explaining the biological underpinnings responsible for these joint-specific treatment outcomes32,33,34.
Contrary to common assumptions, we did not observe a clear ACPA dichotomy. This finding aligns with previous baseline studies demonstrating that, despite known differences in risk factors and prognosis between ACPA-positive and ACPA-negative patients, the clinical phenotype at initial diagnosis is similar for both groups25,35,36. ACPA prevalence was lowest in typical RA clusters (JIP-hand, JIP-poly) and highest in the oligoarticular cluster (JIP-oligo). While this pattern might partially reflect classification criteria37, our use of 1-year diagnosis validation and physician diagnosis in sets A and C minimizes misclassification bias. The high ACPA positivity in JIP-foot corroborates recent findings of increased foot involvement in ACPA-positive patients38.
Although seronegative patients have traditionally been considered to have a milder disease course39, our study shows this is not the case for the JIP-poly cluster, which is associated with poorer outcomes. In contrast, the seronegative JIP-hand cluster aligns with the expected pattern, showing a milder disease course. Recent literature also challenges the historical assumption of a uniformly mild prognosis in seronegative patients. For example, Duong et al. identified ACPA-positivity as a predictor of better treatment response40, and the ARCTIC trial by Haavardsholm et al. found that ACPA-negative patients exhibited slower treatment responses41. While our findings suggest that ACPA status may still play a role in disease progression, the relationship appears to be more nuanced and warrants further investigation.
The treatment response disparity between hand and foot clusters was most pronounced in ACPA-positive patients. Across replication sets, we consistently observed significant differences between JIP-hand and JIP-poly, with two of three datasets also showing increased response in JIP-hand versus JIP-foot. For remission, we found that the cluster-associated differences in outcome vanished in ACPA-positive patients, possibly reflecting the impact of targeted therapy intensification protocols, particularly for ACPA-positive patients.
Patients with hand-dominant joint inflammation (JIP-hand) showed surprisingly good outcomes, prompting us to explore several possible explanations. We found that these positive results could not be explained solely by symptom duration, disease activity at baseline, or the lower prevalence of ACPA in that cluster. Additionally, we found no increased parvovirus positivity in the JIP-hand group compared to other clusters, indicating that misdiagnosis of RA due to self-limiting reactive arthritis was unlikely42,43. One possible explanation is that these patients represent a phenotype of seronegative RA characterized by particular hand and wrist involvement, as reported by Burns et al.44. Alternatively, hand-dominant presentations may appear more responsive due to the widespread use of simplified disease indexes in trials (DAS28, CDAI) excluding foot and ankle joints28,30, potentially limiting the generalizability to atypical presentations.
Our clusters captured previously described age-related subsets, including elderly-onset RA (EORA) in JIP-hand, characterized by higher inflammation markers, lower female prevalence, and reduced autoantibody positivity10,11. However, our analysis revealed more granular subtypes beyond the EORA/YORA dichotomy.
Further analysis of synovial tissue samples revealed distinct histological differences. Both the JIP-poly and JIP-hand groups exhibited severe synovitis. Specifically, JIP-poly was characterized by increased lining hyperplasia and sublining leukocytic infiltration, whereas JIP-hand showed greater stromal density. In contrast, the JIP-oligo group predominantly displayed mild inflammation. These histopathological differences were statistically significant. Notably, after adjusting for disease activity levels, the difference in stromal density was no longer significant, while the differences in inflammatory infiltrate and lining hyperplasia remained significant.
The striking histological differences between our patient clusters further support the notion that these may represent distinct disease subtypes. The aggressive inflammatory patterns observed in the JIP-poly and JIP-hand groups contrast sharply with the mild inflammation seen in JIP-oligo, suggesting different pathological mechanisms.
However, to fully understand the biological basis of these differences, research must extend beyond histopathological features alone. Molecular profiling of synovial tissue offers the potential for deeper mechanistic insights14. This approach has proven successful in previous work by Rivellese et al.45, who identified distinct molecular pathotypes that predicted differential responses to tocilizumab and rituximab.
A limitation of our study is that we defined MTX success based on changes in medication, which could include switches due to side effects, though this likely underestimates rather than overestimates the observed associations. Center-specific therapeutic approaches varied but did not affect cluster-outcome associations. While temporal cluster stability was not directly assessed, previous evidence of consistent joint involvement patterns and cross-sectional associations supports stability over time46.
Due to the real-world nature of the data, we could not account for all possible factors of influence. Most notably, missing BMI data may have limited insights into disease heterogeneity. Overweight RA patients show higher disease activity scores and worse treatment outcomes but less radiographic damage47,48, suggesting adiposity-related inflammatory pathways and cytokine profiles47,48,49,50. However, obesity may also inflate subjective clinical measures independent of actual inflammatory activity48,51, warranting future investigation.
Another limitation is that we solely focused on knee biopsies due to insufficient data for other joint locations. This prevented us from exploring and comparing different tissue environments even though they might be crucial for understanding the different phenotypes. Additionally, knee biopsies may reflect synovial features from concurrent non-inflammatory conditions such as osteoarthritis, which commonly affects this joint. However, a previous study by Alivernini et al.3 demonstrates that Krenn synovitis scores are significantly lower in osteoarthritis biopsies (1.70 ± 0.15), suggesting minimal confounding impact on our assessment of inflammation.
It is important to underline that our identified clusters are not set in stone. Although the clusters were highly robust, patients lie on a gradient (Supplementary Fig. 4) and do not segregate into clearly separable modules. The cluster structure that we identified could also be summarized into more or fewer clusters, and the clusters might become clearer when more layers of information are added. Such information could include genetics, gene expression patterns and molecular profiles from blood14,52. Despite these limitations, our study demonstrates the value of unsupervised, data-driven approaches in uncovering hidden disease patterns, with joint involvement patterns emerging as a major axis of variation.
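As an illustration of how the agreement between two cluster assignments can be quantified when patients lie on a gradient, the sketch below computes the adjusted Rand index (ARI) between two labelings of the same patients. This is a minimal example under stated assumptions: the text does not specify which stability metric was used, the ARI is substituted here as one common choice, and the function name and the toy labels are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two partitions of the same items.

    1.0 means identical partitions (up to relabeling); values near 0 mean
    agreement no better than chance; negative values mean worse than chance.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same patients")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs to compare; treat as trivially identical

    # Count patients in each (cluster_a, cluster_b) cell and in each margin.
    cell_counts = Counter(zip(labels_a, labels_b))
    a_counts = Counter(labels_a)
    b_counts = Counter(labels_b)

    # Number of patient pairs placed together in both, in A, and in B.
    together_both = sum(comb(c, 2) for c in cell_counts.values())
    together_a = sum(comb(c, 2) for c in a_counts.values())
    together_b = sum(comb(c, 2) for c in b_counts.values())

    expected = together_a * together_b / comb(n, 2)
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. every patient in one cluster
    return (together_both - expected) / (maximum - expected)


# Hypothetical toy labels: the same four patients clustered twice,
# e.g. on the full data and on a resampled subset.
run_1 = ["JIP-hand", "JIP-hand", "JIP-foot", "JIP-foot"]
run_2 = ["B", "B", "A", "A"]  # same grouping, different names
print(adjusted_rand_index(run_1, run_2))
```

Because the ARI ignores cluster names, it can compare solutions with different numbers of clusters, which is relevant when a structure could reasonably be summarized into more or fewer groups.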
In conclusion, our clustering analysis identified four baseline RA phenotypes, each defined by distinct patterns of hand and foot involvement that predict 1-year clinical outcomes and correspond to histological differences. This data-driven approach offers greater granularity than conventional age- or ACPA-based stratifications, suggesting the existence of distinct underlying etiologies that merit further biological investigation.
