Introduction
Self-reported questionnaires known as Patient Reported Outcome Measures (PROMs) allow clinicians to address the shortcomings of traditional measures, such as fusion rate and neurological deficits, which often do not accurately represent a patient’s current health status (McCormick, Werner, and Shimer 2013; Hung et al. 2014). As with any self-reported questionnaire, PROMs are vulnerable to low completion rates. For example, Pink et al. reported that only 30% of responders in a study of patients treated for whiplash injuries completed the 12-Item Short Form Survey Physical Component Scale (SF-12 PCS) at all assessed time points (Pink et al. 2014). “Legacy PROMs” have become a constant in clinical care, with the Visual Analogue Scale (VAS), SF-12 and Neck Disability Index (NDI) being used to assess pain, general health status, and disability, respectively (Vaishnav et al. 2020; MacDowall et al. 2018). While these are valid and reliable measures for assessment of postoperative improvement, collectively, they may be cumbersome to complete, especially when multiple surveys are sequentially administered. Additionally, these legacy PROMs may be vulnerable to disease bias and have been reported to demonstrate significant floor and ceiling effects (Hung et al. 2014; Patel et al. 2018).
To address these shortcomings, the Patient-Reported Outcomes Measurement Information System (PROMIS) was initiated by the National Institutes of Health (NIH) in 2004. The PROMIS system incorporates computer adaptive testing (CAT), a method that uses a patient’s response to the previous question to determine an appropriate subsequent question (Patel et al. 2018). Thus, while the PROMIS Physical Function (PF) measure includes over 170 possible question items, only 4-12 questions are typically selected as most appropriate and administered to a given patient to assess factors such as coordination, mobility, strength, and dexterity (“PROMIS - Physical Function,” n.d.). This allows for increased efficiency and shorter completion times as compared to traditional PROMs (Haws et al. 2019; Boody et al. 2018a). Previous studies have reported a mean time of 1.1 minutes to complete the PROMIS Physical Function (PROMIS-PF) compared to 3.4 minutes for the NDI and 4.1 minutes for the SF-12 (Boody et al. 2018b). Within the same study, PROMIS demonstrated responsiveness and validity comparable to legacy PROMs with minimal floor and ceiling effects (Boody et al. 2018b). As healthcare providers and researchers continue to look for ways to increase the quality of care, PROMIS has the potential to become an important aspect of assessing patient reported outcomes while decreasing patient burden. However, it is critical to look closely at the completion rates of these questionnaires. Parrish et al. found significant demographic and perioperative variables to be associated with completion rates of PROMIS surveys (Parrish et al. 2020), emphasizing an important demographic imbalance associated with low response rates.
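To make the idea of adaptive item selection concrete, the following is a deliberately simplified Python sketch. It is not the PROMIS algorithm: the item bank, the difficulty values, the yes/no response format, the step-halving update, and the stopping threshold are all assumptions chosen for illustration, and the actual PROMIS item bank, response model, and scoring rules are not described in this article. The sketch only shows the general principle that each answer shifts the estimate of physical function, which in turn determines the next question, with between 4 and 12 items administered.

```python
import math


def item_information(theta, difficulty):
    """Fisher information of a Rasch (one-parameter logistic) item at ability theta."""
    p = 1.0 / (1.0 + math.exp(-(theta - difficulty)))
    return p * (1.0 - p)


def run_cat(item_bank, answer_fn, min_items=4, max_items=12, se_target=0.7):
    """Administer items adaptively and return (ability estimate, item ids asked).

    item_bank: dict mapping item id -> difficulty (hypothetical values).
    answer_fn: callable taking an item id and returning True if the patient
               reports being able to do the activity.
    """
    theta = 0.0   # start at the assumed population average
    step = 1.0    # size of the next adjustment to theta
    remaining = dict(item_bank)
    asked, asked_difficulties = [], []

    while remaining and len(asked) < max_items:
        # Choose the unused item that is most informative at the current estimate.
        item = max(remaining, key=lambda i: item_information(theta, remaining[i]))
        difficulty = remaining.pop(item)
        asked.append(item)
        asked_difficulties.append(difficulty)

        # Crude update (an assumption, not maximum likelihood): move up after a
        # "yes", down after a "no", halving the step each time.
        theta += step if answer_fn(item) else -step
        step /= 2.0

        # Stop once enough items are asked and the estimate is precise enough.
        info = sum(item_information(theta, d) for d in asked_difficulties)
        if len(asked) >= min_items and 1.0 / math.sqrt(info) <= se_target:
            break

    return theta, asked


# Hypothetical bank of 25 items with difficulties from -3.0 to 3.0.
bank = {f"item_{k:02d}": -3.0 + 0.25 * k for k in range(25)}
# Simulated patient who endorses every activity with difficulty <= 1.0.
theta, asked = run_cat(bank, lambda item: bank[item] <= 1.0)
print(len(asked), round(theta, 3))
```

In this toy run, only a fraction of the 25-item bank is administered, and the items chosen cluster around the simulated patient's level of function, which is the source of the shorter completion times reported for CAT-based measures.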
As the number of questionnaires used in the clinical setting continues to grow, it is important to acknowledge the response burden associated with an increasing number of questions to complete (U.S. Department of Health and Human Services FDA Center for Drug Evaluation and Research, U.S. Department of Health and Human Services FDA Center for Biologics Evaluation and Research, and U.S. Department of Health and Human Services FDA Center for Devices and Radiological Health 2006). This burden may result in lower rates of completion, potentially introducing non-response bias to research and clinical care. A past study examining response rates in hip arthroplasty patients determined that non-responders tended to have worse outcomes, suggesting that satisfied patients were overrepresented in their data set (Imam et al. 2014). Low response rates and non-response bias can lead to poorer quality of research, less power for data analyses, and less representative samples. Additionally, limited questionnaire data makes preoperative counseling and postoperative tracking of patients challenging. As clinicians and researchers decide which questionnaires are best to administer to their patients, studies are needed to compare questionnaire completion rates, especially as PROMIS becomes increasingly popular across various disciplines. Therefore, our study aims to compare completion rates for PROMIS-PF with rates of legacy PROM completion following cervical spine procedures. We hypothesize that because of PROMIS’s usage of CAT and thus increased efficiency compared to legacy PROMs, PROMIS-PF will demonstrate higher completion rates than NDI, VAS, and SF-12.
Methods
Patient Population
Institutional review board approval (ORA #14051301) and written, informed patient consent were obtained prior to study onset. A retrospective review was conducted using a prospectively maintained surgical registry for primary elective cervical spine procedures performed from May 2015 to June 2020. Exclusion criteria were revision procedures and procedures indicated for trauma, infection, or malignancy. All procedures were performed by a single attending spine surgeon at the same academic institution. Patients were divided into PROMIS-PF (administered via CAT for all patients) and legacy PROM groups. Patients were not randomly assigned, and the process was not blinded; rather, this was a retrospective cohort study that grouped subjects by the survey completed at the time of administration (legacy group versus PROMIS-PF).
Data Collection
Patient demographics were recorded, including age, gender, body mass index (BMI), smoking status, diabetic status, American Society of Anesthesiologists classification (ASA, classified as ≤ 2 or >2), Charlson Comorbidity Index (CCI), ethnicity, and insurance/payment type received. Preoperative spinal pathology was classified as herniated nucleus pulposus, central stenosis, and/or myeloradiculopathy. Preoperative duration of symptoms was characterized in terms of the number of months from self-reported date of symptom onset (assessed at the preoperative visit) to the day of surgery. Perioperative characteristics were recorded in terms of operative duration (in minutes), estimated blood loss (EBL, in mL), and postoperative length of stay (in hours).
PROMs including VAS neck, VAS arm, NDI, SF-12 PCS (legacy PROMs), and PROMIS-PF were administered at the preoperative and 6-week, 12-week, 6-month, 1-year, and 2-year postoperative timepoints. PROMs and PROMIS were administered through a secure, online portal (OBERD, Columbia, MO) for all patients and completed by patients using either a handheld electronic tablet in the clinic or a personal device at home via a link sent by email. Patients were asked to complete PROMs by clinical staff at their appointments and received both telephone and email reminders from research staff at regular intervals reminding them of the requested PROMs. No individual PROM was specifically referenced or requested throughout any of these communications. All PROMs were assigned and became accessible to patients simultaneously at each timepoint and the order of completion was at the patients’ discretion. Rates of completion were determined for each PROM at each timepoint based on the number of completed surveys out of the total consented patients to whom the survey had been assigned.
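The completion rate defined above is a simple ratio computed separately for each PROM at each timepoint: completed surveys divided by assigned surveys. A minimal Python sketch follows, assuming a hypothetical record layout (patient ID, PROM name, timepoint, completed flag) that is not taken from the study registry or the OBERD portal; the records shown are invented for illustration and are not study data.

```python
from collections import defaultdict


def completion_rates(records):
    """Return {(prom, timepoint): completed / assigned}.

    records: iterable of (patient_id, prom, timepoint, completed) tuples,
             one per survey assigned to a consented patient.
    """
    assigned = defaultdict(int)
    completed = defaultdict(int)
    for _patient_id, prom, timepoint, done in records:
        key = (prom, timepoint)
        assigned[key] += 1          # denominator: every assigned survey
        if done:
            completed[key] += 1     # numerator: only completed surveys
    return {key: completed[key] / assigned[key] for key in assigned}


# Invented example: four patients, two PROMs, one timepoint.
records = [
    (1, "NDI", "preop", True),
    (2, "NDI", "preop", True),
    (3, "NDI", "preop", False),
    (4, "NDI", "preop", True),
    (1, "PROMIS-PF", "preop", True),
    (2, "PROMIS-PF", "preop", False),
    (3, "PROMIS-PF", "preop", False),
    (4, "PROMIS-PF", "preop", True),
]
rates = completion_rates(records)
for (prom, timepoint), rate in sorted(rates.items()):
    print(f"{prom} at {timepoint}: {rate:.1%}")
```

Because the denominator counts only surveys that were actually assigned, a patient who had not yet reached a given timepoint would not lower the rate for that timepoint.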
Statistical Analysis
All calculations and statistical tests were performed using StataMP 16.1 (StataCorp, College Station, TX). Descriptive statistics were performed for patient demographics, baseline PROM scores, preoperative spinal diagnoses, and perioperative variables. These variables were compared between patients with follow-up at the latest timepoint and patients who were lost to follow-up using Student’s t-test and chi-square analysis for continuous and categorical variables, respectively. Descriptive statistics were also performed for all PROM scores, and postoperative improvement from preoperative baseline was assessed at each timepoint using a paired Student’s t-test. McNemar’s test was used to compare completion rates for each legacy PROM with rates for PROMIS-PF at each timepoint. The overall percentages of surveys completed by each patient from preoperative to 2-year timepoints for PROMIS-PF were compared with those for each legacy PROM using a paired Student’s t-test. Statistical significance was set to a value of p≤0.050 for all tests.
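The two paired comparisons of completion described above can be illustrated with a short Python sketch. This is not the authors' Stata code. The text does not state whether the exact or the asymptotic form of McNemar's test was used, so the exact binomial form is assumed here; the function names and the toy data are hypothetical. McNemar's test uses only the discordant pairs, that is, patients who completed one survey but not the other at a given timepoint. The paired t statistic is computed on per-patient differences in overall completion percentage.

```python
from math import comb, sqrt
from statistics import mean, stdev


def mcnemar_exact(pairs):
    """Exact two-sided McNemar test.

    pairs: list of (legacy_completed, promis_completed) booleans, one per patient.
    Returns (b, c, p) where b = legacy only, c = PROMIS only.
    """
    b = sum(1 for legacy, promis in pairs if legacy and not promis)
    c = sum(1 for legacy, promis in pairs if promis and not legacy)
    n = b + c
    if n == 0:
        return b, c, 1.0  # no discordant pairs, no evidence of a difference
    # Under the null hypothesis, b follows Binomial(n, 0.5).
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


def paired_t_statistic(x, y):
    """Paired t statistic for x - y; compare against t with len(x) - 1 df."""
    diffs = [xi - yi for xi, yi in zip(x, y)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Toy data for one timepoint: 8 legacy-only, 1 PROMIS-only, 13 concordant.
pairs = ([(True, False)] * 8 + [(False, True)] * 1
         + [(True, True)] * 10 + [(False, False)] * 3)
b, c, p = mcnemar_exact(pairs)

# Toy per-patient overall completion percentages (PROMIS-PF vs one legacy PROM).
promis_pct = [50.0, 0.0, 100.0, 50.0]
legacy_pct = [100.0, 50.0, 100.0, 50.0]
t = paired_t_statistic(promis_pct, legacy_pct)
print(b, c, p, t)
```

Both tests rely on every patient having been assigned both surveys at the same timepoint, so each patient serves as their own control.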
Results
A total of 302 patients were included. The study cohort had a mean age of 50.0 years, 36.8% were female, 42.4% were obese (BMI ≥30 kg/m²), 11.3% were smokers, and 9.3% were diabetic (Table 1). Overall, 85.3% had an ASA ≤2 and the mean CCI was 1.4. A majority of patients were of Caucasian ethnicity (75.4%) and provided payment through private insurance (73.2%). No demographic characteristic was significantly associated with patient follow-up (all p>0.05). Mean baseline PROM scores were as follows: 39.9 (PROMIS-PF), 6.1 (VAS Neck), 5.8 (VAS Arm), 38.7 (NDI), and 34.8 (SF-12 PCS). Baseline PROMIS-PF was significantly lower for patients lost to follow-up (38.8 versus 40.8) (p=0.040). Overall, 59.2% of patients underwent anterior cervical discectomy and fusion, 29.4% underwent cervical disc arthroplasty, and 10.0% underwent posterior cervical laminoplasty. The remainder of patients underwent posterior cervical fusion and/or decompression (less than 1% each). Surgery type was not significantly associated with patient follow-up (p=0.225). The most common spinal pathology was myeloradiculopathy at 88.3% of the cohort. Mean operative duration was 67.4 minutes, mean EBL was 46.9 mL, and mean length of stay was 15.3 hours (Table 2). No preoperative spinal diagnosis or perioperative characteristic was significantly associated with patient follow-up (all p>0.05).
Mean postoperative PROM scores ranged from 2.7 to 3.4 for VAS neck, 2.7 to 3.3 for VAS arm, 20.2 to 30.1 for NDI, 36.4 to 43.1 for SF-12 PCS, and 42.2 to 47.7 for PROMIS-PF. The absolute value of skewness exceeded 1.0 only for NDI from 6 months through 2 years, demonstrating a relatively normal distribution for all other scores. Statistically significant improvement from preoperative baseline was demonstrated for all PROMs at all timepoints (all p≤0.005) (Table 3). From the preoperative timepoint through 6 months, completion rates for VAS neck, VAS arm, NDI, and SF-12 (preoperative rates of 88.4%, 87.8%, 87.1%, and 82.8%, respectively) were significantly greater than those for PROMIS-PF (all p<0.001) (Table 4). Completion rates did not significantly differ between PROMIS-PF and any legacy PROM at 1 year (all p>0.05). At 2 years, completion rates were greater for PROMIS-PF (38.3%) than for VAS neck (18.1%), VAS arm (19.1%), NDI (19.1%), and SF-12 PCS (29.0%) (p<0.001).
Overall completion percentages for PROMIS-PF (53.0% ± 32.0%) were significantly lower than those of VAS neck (63.0% ± 23.5%), VAS arm (62.1% ± 24.4%), NDI (61.9% ± 24.5%), and SF-12 PCS (57.6% ± 27.2%) (all p<0.001) (Table 5). In total, 31 patients (10.3%) completed no VAS neck questionnaires, 33 (10.9%) completed no VAS arm questionnaires, 34 (11.3%) completed no NDI questionnaires, 45 (14.9%) completed no SF-12 PCS questionnaires, and 73 (24.2%) completed no PROMIS-PF questionnaires. There was substantial overlap among the patients who did not complete each of the questionnaires.
Discussion
Over the last several decades, PROMs have played an increasingly important role in quantifying the long-term success of spine surgery. More recently, PROMIS surveys have gained favor due, in part, to their focused, efficient administration of questions through the use of CAT. With the growing reliance on PROMs, poor completion rates can have significant negative consequences for both research and clinical practice. Low response rates can increase the risk of non-response bias and cause studies to be underpowered. In the clinical setting, non-response can cause physicians to lose touch with their patients and have less accurate information upon which to base their decisions. As patients become increasingly familiar with PROMIS surveys, our hope is that their greater efficiency and relevance of questions will lead to increased completion rates (Boody et al. 2018a; Brodke, Saltzman, and Brodke 2016).
The PROMIS system offers a number of advantages over more traditional “legacy” PROMs. Most notably, the use of CAT can allow for more timely and relevant question administration so that a greater degree of pertinent information can be collected in a shorter period of time. A previous study involving lumbar decompression patients demonstrated that the PROMIS-PF survey can be completed in a significantly shorter period of time compared to SF-12 PCS (Cha et al. 2021). In spite of this shorter time required for completion, the clinical validity of PROMIS-PF has been verified in comparison to other measures of physical function in a variety of spine surgery populations (Boody et al. 2018a; Pennings et al. 2020; Parrish et al. 2021; Jenkins et al. 2020). Additionally, beyond the study of physical function, a wide range of PROMIS metrics exist to explore outcomes for pain, depression, anxiety, and sleep interference, though these measures have not been as thoroughly validated in the spine surgery population (“Intro to PROMIS,” n.d.). Therefore, it is likely that PROMIS could be used to effectively replace several of the legacy measures of physical function, such as SF-12 PCS, and perhaps even other PROMs such as VAS and NDI.
As is often the case in studies of surgical patients, we observed a relatively steady decline in survey participation as patients progressed from the preoperative timepoint. Preoperatively, completion rates were quite high, particularly for VAS neck, VAS arm, and NDI, which demonstrated rates of 88.4%, 87.8%, and 87.1%, respectively. Legacy PROMs maintained completion rates above (VAS neck, VAS arm, and NDI) or close to (SF-12) 50% through 6 months postoperatively, while PROMIS maintained ≥ 50% completion only through 12 weeks. A study by Makhni et al. demonstrated a similar decline in response rates over time following arthroscopic shoulder surgery, reporting preoperative completion rates of 76% that fell to 57% at 6 months, and 45% at 12 months (Makhni et al. 2017). In their study of 5300 patients undergoing joint reconstruction procedures, Pronk et al. similarly demonstrated significantly lower rates of survey response at 12-months follow-up (53%) compared to the preoperative timepoint (86%) using an automated, electronic PROM collection system (Pronk et al. 2019).
Although the primary focus of our study was the comparison of completion rates for PROMIS versus legacy PROMs, we felt it was important to identify any characteristics associated with overall patient survey follow-up. Interestingly, preoperative PROMIS-PF scores were the only factor that varied significantly on the basis of subsequent patient follow-up. Our analysis demonstrated that patients who were lost to follow-up tended to have significantly poorer preoperative PROMIS-PF scores. This finding suggests that patients with lower preoperative physical function may be at increased risk for loss to follow-up.
At the preoperative and short-term postoperative timepoints, patients are likely to be quite engaged with their treatment process considering that they have decided to seek out and schedule an elective surgical procedure, which is no small undertaking even in the best of circumstances. In the short-term postoperative period, patients may experience a substantial amount of anxiety or even excitement regarding their surgical outcome and may continue to feel eager to stay engaged with their clinical team. However, as patients continue to recover from surgery and feel their postoperative progress begin to plateau or perhaps feel they have fully recovered, they may believe it is less necessary to remain engaged with their spine surgeon. Lower completion rates at these more long-term postoperative timepoints may be explained by a decreased sense of urgency on the part of patients, or perhaps by a greater temporal distancing from their procedure as it becomes an increasingly distant part of their past.
Parrish et al. previously explored which demographic and perioperative factors were associated with completion rates for PROMIS-PF surveys and reported several important findings (Parrish et al. 2020). First, they documented significant differences in survey completion on the basis of ethnicity, with African American and Hispanic patients demonstrating significantly lower rates of PROMIS completion. In terms of operative characteristics, they found that patients undergoing surgery in outpatient settings and those with high preoperative extremity pain were less likely to complete PROMIS surveys, while greater narcotic consumption in the immediate postoperative period was associated with greater rates of PROMIS survey completion. Interestingly, they also observed that patients demonstrating depressive symptoms were more likely to complete PROMIS surveys.
As an increasing number of questionnaires and surveys are assigned to many patients undergoing surgical procedures, the risk of response burden becomes a consideration. Literature regarding response burden or “questionnaire burnout” is relatively scarce in populations undergoing spine surgery. A study conducted among oncology patients completing a large battery of PROMs indicated that while most patients did not feel they were assigned too many questionnaires, a substantial portion of the cohort indicated that they felt some questions were repetitive or unimportant (Atkinson et al. 2019). PROMIS surveys may be particularly well-suited to address this issue, as the use of CAT can allow for a more appropriate, personalized array of questions to be posed to each patient (Brodke, Saltzman, and Brodke 2016; Bhatt et al. 2019).
At preoperative and short- to mid-term postoperative timepoints, completion rates were higher for legacy PROMs than for PROMIS. At these earlier timepoints, it is possible that patients were more likely to complete legacy PROMs because they were more familiar with these measures, possibly having completed them as part of previous medical encounters. Legacy PROMs such as VAS, NDI, and SF-12 have been used for several decades in spine surgery and in other patient populations, so the likelihood that a patient may have previously encountered one or more of these surveys is ostensibly higher.
Interestingly, by the 1-year timepoint, completion rates equalized between PROMIS and legacy PROMs, and by 2 years, the trend in completion rates reversed, with PROMIS demonstrating a significantly higher rate of completion than any of the 4 legacy PROMs. Before data collection began, we hypothesized that completion rates for PROMIS would be greater because its use of CAT allows it to assess patients using fewer and less redundant questions. For example, patients may find it frustrating to continually be asked how their spine condition limits them in various ways if they have already indicated that their spinal condition does not generally limit them. PROMIS addresses this issue by tailoring each subsequent question based on the patient’s previous response(s). However, given that it is a relatively newer survey, patients may not have been familiar with PROMIS prior to their preoperative appointment and may not have fully appreciated its benefits until having completed it alongside several legacy PROMs at earlier timepoints. Once they had become more familiar with PROMIS and completed it several times, patients may have increasingly favored this survey over legacy PROMs, as reflected in the (relatively) greater completion rates at 2 years. Completion rates were generally quite low by the 2-year timepoint, with a rate of 38.3% for PROMIS and rates ranging from 18.1% to 29.0% for legacy PROMs. Given that patients were relatively parsimonious in terms of their completion of long-term surveys, they may have chosen to limit their participation to what they perceived as a briefer, more efficient measure.
When comparing the overall percentage of surveys completed by each patient from preoperative to 2-year postoperative timepoints, significantly fewer PROMIS surveys were completed compared to all included legacy PROMs. Although PROMIS completion rates may become more favorable at long-term follow-up, the lower rates of completion at earlier timepoints may substantially limit the data provided by patients. With this in mind, increased efforts should be made to familiarize patients with PROMIS at earlier timepoints in order to educate them as to its benefits over more traditional outcome measures.
The relatively low long-term response rates demonstrated by our study and others highlight the need to improve longitudinal survey compliance in clinical research. A variety of methods have been proposed to increase completion of PROMs throughout long-term timepoints. Direct monetary incentives for survey completion have been demonstrated to significantly increase response rates (Gates et al. 2009; Brealey et al. 2007). Interestingly, however, Warwick et al. observed that “social incentives” in the form of a donation made on behalf of the patient were not effective in increasing survey completion (Warwick et al. 2019). In addition to incentivization, more personalized “outreach” has also been demonstrated to improve rates of survey completion. Pronk et al. were able to significantly increase both short-term and long-term response rates by reaching out to non-responders in person and/or by mail with paper-based PROM forms (Pronk et al. 2019). Along similar lines, Makhni et al. demonstrated that reminders and outreach performed by research assistants increased survey compliance by approximately 20% (Makhni et al. 2017). Given the importance of PROM data for both research and clinical practice, it will be important to find ways to effectively increase completion rates without unreasonably increasing overhead costs or placing undue burden on patients.
Limitations
Several distinct limitations were inherent to this study. All procedures were performed by a single attending spine surgeon at the same academic institution. This may limit the generalizability of our results to other patient populations. Furthermore, a variety of demographic and perioperative characteristics have been demonstrated to affect survey completion rates. Although our study methodology lent itself to the use of internal controls, the wide variety of significant predictors of survey completion reported by previous studies prevented us from implementing a straightforward stratification to assess whether these various factors would affect our findings. Additionally, while all surveys were acquired from May 2015 to June 2020, PROMIS surveys were introduced more recently than legacy PROM surveys, potentially adding bias to our findings. Also, survey compliance in patients who completed the online survey via tablet in clinic versus at home was not evaluated. However, this would be a suitable avenue for future studies and may provide greater insight into whether the setting of survey completion influences compliance. Finally, we were unable to systematically assess the order of survey completion and what, if any, effect that had on completion rates. We do not believe this to be a significant source of bias, as patients had the opportunity to complete surveys in any order of their choosing and were not prompted in a specific order. However, future studies should take steps to record the order of survey completion in order to document any related effects.
Conclusion
Completion rates for both legacy PROMs and PROMIS surveys consistently decreased as patients progressed from the preoperative timepoint. At preoperative and short-term postoperative timepoints, legacy PROMs were completed at significantly greater rates than PROMIS surveys. However, this difference equalized at the 1-year timepoint, and by the 2-year timepoint, although completion rates were universally rather low, a significantly greater proportion of patients completed PROMIS surveys than legacy PROMs. Overall completion percentage was significantly lower for PROMIS surveys than for any of the included legacy PROMs. While patients were more likely to complete legacy PROMs at earlier timepoints, as they become increasingly familiar with PROMIS surveys at longer-term postoperative
[Remaining text missing from source]