Introduction
Total knee arthroplasty (TKA) has long been considered a cost-effective and reliable treatment for end-stage knee osteoarthritis (Daigle et al. 2012; Konopka et al. 2018; Evans et al. 2019). Patient satisfaction after TKA ranges from 75% to 92%, and implant survival is reportedly 82% at 25 years (Choi and Ra 2016; Evans et al. 2019). The number of procedures performed each year in the United States continues to increase (S. Kurtz et al. 2007; S. M. Kurtz et al. 2014), and patient-reported outcome measures (PROMs) are increasingly used to evaluate TKA satisfaction and gauge overall procedural success.
PROMs are used in clinical and research settings to understand the effect of treatments on patient function and satisfaction (Swiontkowski et al. 1999; Poolman et al. 2009; Jackowski and Guyatt 2003). PROMs must demonstrate reliability, validity, and responsiveness (Jackowski and Guyatt 2003; Swiontkowski et al. 1999; Smith et al. 2012; Roach 2006). The Knee Injury and Osteoarthritis Outcome Score Joint Replacement (KOOS JR) was developed as a short-form survey of the full Knee Injury and Osteoarthritis Outcome Score (KOOS) PROM while retaining the validity of the latter (Lyman et al. 2016). The KOOS JR has been validated as a questionnaire and is commonly used to evaluate patients before and after TKA (Lyman et al. 2016).
PROMs can also be used to determine changes in patient function and satisfaction over time. While statistical significance using p-values is often reported when comparing PROMs among groups of patients, it does not always equate to clinical significance (Maltenfort 2017). The minimum clinically important difference (MCID) is a statistical measure that can help determine the utility of PROMs in clinical practice (Maltenfort 2017). Two commonly used ways to determine the MCID are distribution-based and anchor-based methods, but neither has gained universal acceptance (Wyrwich et al. 2005; Revicki et al. 2008). Distribution-based methodology relies on standard error of measurement that is inherent to the testing instrument, and it assumes a normal response distribution among survey respondents (Berliner et al. 2016; Chesworth et al. 2008; Norman, Sloan, and Wyrwich 2003; Quintana et al. 2005; Wyrwich et al. 2005). Anchor-based methodology uses an external criterion (an anchor question or survey result) to determine MCID (Maltenfort 2017).
Large variations have been observed in calculating the KOOS JR MCID depending on the sample population and calculation methodology (Hung et al. 2018; Lyman et al. 2018). The aim of this study is thus to calculate and compare the KOOS JR MCID at various time points and to compare methodologies (anchoring vs distribution) for calculating KOOS JR MCID.
Methods
Data were prospectively collected from a patient-reported outcome database at a large healthcare system in a major metropolitan area for patients undergoing TKA between 2017 and 2018. Patients were included if they 1) completed post-operative clinic follow-up at 3 months and 1 year and 2) completed a general Patient-Reported Outcomes Measurement Information System (PROMIS10) survey and a KOOS JR survey at the preoperative baseline, 3-month post-operative, and 1-year post-operative time points. Patients with an American Society of Anesthesiologists (ASA) score of four or greater, or who did not complete follow-up and surveys at each time point, were excluded. The institutional review board at the institution of record determined this investigation to be exempt.
The MCID in this study was calculated for two intervals, baseline to 3 months and baseline to 1 year, using both anchoring and distribution methods. The distribution-based MCID was calculated as half the standard deviation of the change in KOOS JR score from the preoperative baseline to the designated follow-up time point. The KOOS JR consists of 7 items and is scored from 0 to 100, where a score of 0 indicates the worst level of pain and functioning (Nilsdotter et al. 2003; Lyman et al. 2016). Standard deviation (SD) and KOOS JR scores were calculated according to predetermined scoring algorithms for the KOOS JR outcome instrument as previously defined in the literature (Nilsdotter et al. 2003; Lyman et al. 2016).
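The half-standard-deviation rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the Stata code used in the study: the function name and the example scores are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not state which estimator was applied.

```python
import statistics


def distribution_mcid(baseline_scores, follow_up_scores):
    """Return half the SD of per-patient change scores.

    Assumption: the sample (n - 1) standard deviation is used;
    the source text does not specify the estimator.
    """
    if len(baseline_scores) != len(follow_up_scores):
        raise ValueError("baseline and follow-up lists must be paired")
    # Change score for each patient: follow-up minus baseline.
    changes = [fu - base for base, fu in zip(baseline_scores, follow_up_scores)]
    return 0.5 * statistics.stdev(changes)


# Hypothetical KOOS JR scores for four patients (not study data).
example_baseline = [40.0, 50.0, 60.0, 55.0]
example_follow_up = [60.0, 62.0, 85.0, 58.0]
example_mcid = distribution_mcid(example_baseline, example_follow_up)
```

Because the threshold depends only on the spread of change scores, it moves with the variability of the sample rather than with any patient-reported judgment of improvement.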
The anchoring method for MCID calculation utilizes an anchor question that distinguishes patients with and without a change in their overall health state at post-operative follow-up. Two anchor questions were selected from the PROMIS10 quality-of-life instrument. The PROMIS10 was chosen to provide anchoring questions as it has been shown to reliably measure patient-reported physical health and quality-of-life outcomes (Hays et al. 2009; Cella et al. 2007, 2010; Fidai et al. 2018). The first question queried a patient’s assessment of their overall physical health: “In general, how would you rate your physical health?” rated on a 5-point Likert scale (Poor, Fair, Good, Very Good, or Excellent). The second question queried a patient’s assessment of their overall quality-of-life: “In general, would you say your quality of life is:” rated on the same 5-point Likert scale. For this study, the anchoring MCID was calculated from patients who reported a one- or two-point increase on the respective anchoring question (Tubach, Ravaud, et al. 2005; Tubach, Wells, et al. 2005). The anchoring MCID was subsequently averaged for each time interval: baseline to 3 months and baseline to 1 year. We report the average MCID with standard deviation. All analyses were conducted using STATA (SE version 15.0; StataCorp, College Station, TX, USA).
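As a rough illustration of the anchoring approach, the sketch below selects patients whose anchor response rose by one or two Likert points and reports the mean and SD of their KOOS JR change. It is a hypothetical Python sketch rather than the study's Stata analysis: the numeric coding of the Likert responses (Poor = 1 through Excellent = 5), the record layout, the function name, and the example values are all assumptions made for illustration.

```python
import statistics

# Assumed numeric coding of the 5-point Likert responses.
LIKERT = {"Poor": 1, "Fair": 2, "Good": 3, "Very Good": 4, "Excellent": 5}


def anchor_mcid(records):
    """Mean and sample SD of KOOS JR change among anchor responders.

    Each record is a tuple:
    (baseline_score, follow_up_score, baseline_anchor, follow_up_anchor).
    Only patients with a one- or two-point anchor increase are kept.
    """
    changes = [
        follow_up - baseline
        for baseline, follow_up, anchor_pre, anchor_post in records
        if LIKERT[anchor_post] - LIKERT[anchor_pre] in (1, 2)
    ]
    if len(changes) < 2:
        raise ValueError("need at least two responders to compute an SD")
    return statistics.mean(changes), statistics.stdev(changes)


# Hypothetical patients (not study data).
example_records = [
    (45.0, 70.0, "Fair", "Good"),       # +1 point: included, change 25
    (50.0, 65.0, "Good", "Excellent"),  # +2 points: included, change 15
    (55.0, 60.0, "Good", "Good"),       # no change: excluded
    (40.0, 75.0, "Poor", "Very Good"),  # +3 points: excluded
    (60.0, 80.0, "Fair", "Very Good"),  # +2 points: included, change 20
]
example_mean, example_sd = anchor_mcid(example_records)
```

Running the same function once per anchor question and per time interval would yield the four anchor-based values reported in this study, one for each combination.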
Results
A total of 956 patients were included in the study (Table 1); 590 (61.7%) were female. The average age was 66 years. Average KOOS JR scores at preoperative, 3-month, and 1-year follow-up were 51.7 ± 11.7, 69.2 ± 12.0, and 76.3 ± 14.5, respectively. This reflects a 3-month and 1-year change in KOOS JR scores of 17.5 and 24.6, respectively. For the physical health anchoring question, a total of 293 (30.6%) and 309 (32.3%) patients reported a one- or two-point increase at the 3-month and 1-year time intervals, respectively. For the quality-of-life anchoring question, a total of 369 (38.6%) and 375 (39.2%) patients reported a one- or two-point increase at the 3-month and 1-year time intervals, respectively.
Based on the PROMIS10 physical health anchoring method, the KOOS JR MCID was 21.5 ± 14.9 and 27.9 ± 16.0 for the 3-month and 1-year time intervals, respectively. Similarly, for the PROMIS10 quality-of-life anchoring method, the KOOS JR MCID was 21.2 ± 14.9 and 28.9 ± 15.8 for the 3-month and 1-year time intervals, respectively. The distribution-based MCIDs were 7.4 at 3 months and 8.2 at 1 year.
At 3 months, 726 (75.9%) patients achieved the distribution-based MCID (≥7.4 points), while only 325 (34%) patients achieved the physical health anchor-based MCID (≥21.5 points), and 334 (34.9%) patients achieved the quality-of-life anchor-based MCID (≥21.2 points) (Figure 1). At 1 year, 801 (83.8%) patients achieved the distribution-based MCID (≥8.2 points), 383 (40.1%) patients achieved the physical health anchor-based MCID (≥27.9 points), and 354 (37%) patients achieved the quality-of-life anchor-based MCID (≥28.9 points).
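The proportions above follow directly from the reported counts and the cohort size of 956. The Python sketch below reproduces that arithmetic and shows how achievers would be counted from a list of change scores. The counts and thresholds are taken from the text; the function names, dictionary labels, and the short list of change scores in the usage note are hypothetical.

```python
N_PATIENTS = 956

# Counts of patients meeting each MCID threshold, as reported in the text.
REPORTED_COUNTS = {
    "3-month distribution": 726,
    "3-month physical health anchor": 325,
    "3-month quality-of-life anchor": 334,
    "1-year distribution": 801,
    "1-year physical health anchor": 383,
    "1-year quality-of-life anchor": 354,
}


def percent_achieving(count, total):
    """Percentage of the cohort meeting a threshold, to one decimal place."""
    return round(100.0 * count / total, 1)


def count_achieving(change_scores, mcid):
    """Number of patients whose change score is at or above the MCID."""
    return sum(1 for change in change_scores if change >= mcid)


percentages = {
    label: percent_achieving(count, N_PATIENTS)
    for label, count in REPORTED_COUNTS.items()
}
```

For example, `count_achieving([5.0, 7.4, 21.5, 30.0], 7.4)` counts three of four hypothetical patients as achieving the 3-month distribution-based threshold, whereas only two of them would meet the 21.5-point physical health anchor threshold.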
Discussion
TKA is the gold standard treatment for end-stage osteoarthritis. However, PROM interpretation for this procedure requires further investigation, partially due to variability in MCID values (Berliner et al. 2017). Our study calculated the KOOS JR MCID at the 3-month and 1-year time intervals to evaluate MCIDs in relation to both time interval and calculation methodology. The distribution-based MCIDs were 7.4 and 8.2 at 3 months and 1 year, respectively. The anchor-based MCIDs were 21.5 ± 14.9 and 27.9 ± 16.0 at 3 months and 1 year, respectively, for the PROMIS10 physical health anchor, and 21.2 ± 14.9 and 28.9 ± 15.8 at 3 months and 1 year, respectively, for the PROMIS10 quality-of-life anchor. Patients more frequently achieved the distribution-based KOOS JR MCID than either of the anchor-based MCIDs at both 3 months and 1 year.
Various KOOS JR MCID values have been reported in the literature. Consistent with our findings, MCID values calculated using the distribution method are consistently smaller than those calculated using anchor-based methods (Kuo et al. 2020). Lyman et al. derived a distribution-based KOOS JR MCID value of 6 and an anchor-based value of 14 at two-year follow-up in 2630 patients undergoing TKA at a large, tertiary care center (Lyman et al. 2018). Our distribution-based MCID values of 7.4 and 8.2 (at 3 months and 1 year postoperatively) were similar to those of that study, while our anchor-based values, albeit at different time points and calculated using different anchor questions, were one and a half to two times the value obtained by Lyman et al. Our study obtained anchor questions from the PROMIS10 to derive the anchor-based MCID; other studies have used anchor questions from the Hospital for Special Surgery (HSS) Satisfaction Survey (Lyman et al. 2018; Maratt et al. 2015) and the Self-Administered Patient Satisfaction Scale (SAPS) (Kuo et al. 2020), but none of these studies evaluated how MCID may vary at different time intervals. Prior literature has suggested that patient perspective varies over time and that analytical method may influence MCID (Mills et al. 2016; de Vet et al. 2007; McCreary et al. 2020). Our study adds to the literature by incorporating anchor- and distribution-based methods to calculate MCID at multiple time intervals.
Additionally, MCID is reported in the literature in various ways. Wright et al. identified nine methods of calculating MCID in the literature and grouped these methods into two categories, anchor-based or distribution-based (Wright et al. 2012). Distribution-based MCID values appear to be less responsive to changes in patient MCID over time. MCIDs calculated using this methodology assume a normal response distribution among survey respondents (Berliner et al. 2016; Chesworth et al. 2008; Norman, Sloan, and Wyrwich 2003; Quintana et al. 2005; Wyrwich et al. 2005) and are not connected to a quality-of-life measure. MCID values calculated via the distribution method are typically lower than anchor method values, which lowers the threshold needed for a patient to reach the MCID (Goodman et al. 2020; Lyman et al. 2018; Berliner et al. 2017; Kuo et al. 2020; Copay et al. 2018). Our study supports this finding in that more patients achieved the distribution-based MCID value at each time point compared to the anchor-based MCID values. More than 75% of patients achieved the KOOS JR MCID calculated via the distribution method, while 30% to 40% of patients reached the MCID value for the anchor method. An artificially low MCID could lead to patients being falsely identified as having achieved a clinically meaningful change, akin to a type I statistical error.
The anchor-based method is advantageous compared to the distribution method because it relies on a PROM. PROMs are increasingly popular, and a method reliant on a PROM may be able to capture MCID most accurately for individual patients at different time points. However, the anchor-based method is dependent on an anchoring question chosen by the surgeon or the assessor, which may introduce a source of bias or inaccuracy. Furthermore, both MCID methodologies currently report MCID values at single time points, and it is unknown if they can accurately determine clinical improvement over time.
Assessment of MCID as a function of time has not been extensively reported in the literature. McCreary et al. calculated the Patient-Rated Wrist Evaluation (PRWE) MCID at 6 weeks and 12 weeks post-operatively using distribution- and anchor-based methods in a population of 197 patients undergoing treatment of a distal radius fracture (McCreary et al. 2020). MCID values in their study increased over time from 26.8 ± 24.7 at 6 weeks to 42.6 ± 23.2 at 12 weeks (McCreary et al. 2020). Mills et al. examined the KOOS MCID and found that MCID varied between the 26-week and 52-week time points on the pain and quality-of-life KOOS subscales (Mills et al. 2016). Our study supports these findings that MCID values may change or increase with the time interval from surgical intervention. This may be due to changes in patient perception of their recovery over time. Patients may perceive a greater improvement early in the recovery process. As time from surgery increases, a greater MCID value may be necessary for patients to perceive a clinically meaningful difference.
Our study identifies that both time and calculation modality affect MCID values, but these values may still fail to fully capture the complex patient parameters dictating the success of surgical interventions for osteoarthritis. PROMs are designed to be patient-specific constructs, yet MCIDs are computed on a population level. Patients with higher baseline PROM scores, due to their higher pre-operative functionality or quality-of-life, may lack the mathematical possibility of achieving the numerical changes necessary to reach the MCID calculated for a population. However, higher preoperative functionality should not exclude these patients from surgical intervention, even though they may be unable to meet the determined MCID. The one-size-fits-all approach to MCID, configured on a population level, is inconsistent with a patient-specific evaluation. For instance, the MCID for individuals with mental health illnesses may require unique considerations compared to a group of patients with diabetes or cardiovascular risks. Beyond the temporal components, it would be clinically relevant to adjust the MCID for patient-specific factors: BMI, history of mental illness, medical comorbidities, and the patient’s historical physical activity. Patients could subsequently be stratified to calculate patient-specific MCID values to accurately interpret KOOS JR or other PROMs.
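The ceiling effect described above is simple arithmetic on a 0 to 100 scale: the largest possible improvement is 100 minus the baseline score. The Python sketch below illustrates this using the 1-year MCID values reported in this study. The function names are hypothetical, and treating the KOOS JR as a continuous scale is a simplifying assumption made for illustration.

```python
SCALE_MAX = 100.0  # Upper bound of the KOOS JR score range.


def max_possible_change(baseline):
    """Largest improvement a patient can record given the scale ceiling."""
    return SCALE_MAX - baseline


def can_reach_mcid(baseline, mcid):
    """True if the scale leaves enough room to improve by the MCID."""
    return max_possible_change(baseline) >= mcid


def highest_eligible_baseline(mcid):
    """Highest baseline score from which the MCID is still attainable."""
    return SCALE_MAX - mcid


# 1-year thresholds reported in this study.
ANCHOR_QOL_MCID_1Y = 28.9
DISTRIBUTION_MCID_1Y = 8.2

anchor_ceiling = highest_eligible_baseline(ANCHOR_QOL_MCID_1Y)
distribution_ceiling = highest_eligible_baseline(DISTRIBUTION_MCID_1Y)
```

Under these assumptions, a patient with a baseline score of 75 could still reach the 8.2-point distribution-based threshold but could not reach the 28.9-point quality-of-life anchor threshold, regardless of how well the procedure went.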
Though MCID has been mostly limited to research thus far, there has been increasing focus on understanding its potential and limitations in the context of real-world clinical practice. One clinically relevant example that could benefit from MCID is the use of PROMs in pre-operative evaluation to promote shared decision-making. In this scenario, providers can present MCID data to their patients with the goal of creating an opportunity to discuss patient goals of treatment as well as expectations throughout care. Because pre-operative PROM scores will vary from patient to patient, it is important to individualize MCID data as much as possible. Determination of MCID at the patient level can allow surgeons to cater to the pre-operative functional baseline of the patient and can encourage patients to actively participate in their decision-making. Another way that MCID can influence clinical practice is via post-operative follow-up. Patient perception and expectations of recovery are fluid over time. Therefore, a more granular understanding of MCID and how it varies over time can help surgeons identify patients who may require closer follow-up during their recovery from surgery, similar to a growth chart in pediatrics. This can help surgeons engage more effectively with their patients throughout the entire cycle of care, possibly leading to improved outcomes. Despite the aforementioned potential, this study highlights that more work needs to be done to elucidate the relationship between MCIDs, PROMs, and clinical practice. As we continue to see policies such as prior authorization use PROMs to prevent certain patients from undergoing TKA without a clear understanding of these concepts, it is our duty as surgeons to further investigate how we can use these tools for the benefit of our patients.
There are multiple limitations of this study. Data were obtained from a single institution and only represent the patient population in one metropolitan area. We also excluded patients with an ASA score of 4 or greater. Our findings may thus lack generalizability to other institutions and patient populations. Also, the data were collected from patients receiving TKA from multiple different surgeons. Outcomes may vary by surgeon, and a single-surgeon study could allow for better assessment of patient outcomes. As mentioned, there are numerous limitations of MCID calculation. A single best practice for choosing an anchoring tool and question for the purpose of calculating an anchor-based MCID does not exist at present. Our chosen anchor-based methodology uses reliable, valid, and responsive general physical health and quality-of-life PROMs, in contrast to other more disease-specific scales that have been reported in the literature. In addition, we do not report other covariates of interest that may influence MCID calculations, such as mental health and medical comorbidities. Lastly, this study did not collect information on rehabilitation protocols, which may result in variable patient recovery. Nevertheless, we evaluated a large study sample and determined KOOS JR MCID over time using two different methodologies.
Conclusion
This study calculated the KOOS JR MCID using both the distribution- and anchor-based analytical methods at 3 months and 1 year after TKA. Nearly a three-fold difference exists in KOOS JR MCID values when comparing the anchor and distribution methods. Additionally, there is variability in the KOOS JR MCID over time. An improved strategy for calculating or standardizing MCID is required to better guide use of KOOS JR and other PROMs in clinical decision-making.