Original Research

Development and validation of the tool to assess inpatient satisfaction with care from hospitalists

Abstract

OBJECTIVES

To develop and validate a new inpatient satisfaction metric to assess patients' perceptions of hospitalist performance.

PATIENTS AND METHODS

We developed the Tool to Assess Inpatient Satisfaction with Care from Hospitalists (TAISCH) by building upon the theoretical underpinnings of the quality of care measures that the Society of Hospital Medicine endorses. TAISCH was completed by inpatients at an academic institution between September 2012 and December 2012 after they had been cared for by the same hospitalist provider for at least 2 consecutive days. Content, internal structure, and convergent/discriminant validity evidence were assessed for TAISCH.

RESULTS

A total of 203 patients each rated 1 of our 29 hospitalists (patient response rate: 88%). Factor analyses resulted in a single factor with 15 items. Reliability of TAISCH was good (Cronbach's α = .88). The hospitalists' average TAISCH score ranged from 3.25 to 4.28 (mean [standard deviation] = 3.82 [0.24]; possible score range: 1–5). The relationship between TAISCH with a validated empathy scale and a global provider satisfaction question revealed significant positive associations (β = 12.2, and β = 11.2 respectively, both P < 0.001). At the provider level, no significant correlation was noted between the Press Ganey Physician score and TAISCH (r = 0.91, P = 0.51).

CONCLUSION

TAISCH collects patient satisfaction data that are attributable to specific hospitalist providers. The timeliness of the TAISCH data collection also makes real‐time service recovery possible, which is unachievable with other commonly used patient satisfaction metrics. Journal of Hospital Medicine 2014;9:553–558. © 2014 Society of Hospital Medicine

© 2014 Society of Hospital Medicine

Patient satisfaction scores are being reported publicly and will affect hospital reimbursement rates under Hospital Value Based Purchasing.[1] Patient satisfaction scores are currently obtained through metrics such as Hospital Consumer Assessment of Healthcare Providers and Systems (HCAHPS)[2] and Press Ganey (PG)[3] surveys. Such surveys are mailed to a variable proportion of patients following their discharge from the hospital, and ask patients about the quality of care they received during their admission. Domains assessed regarding the patients' inpatient experiences range from room cleanliness to the amount of time the physician spent with them.

The Society of Hospital Medicine (SHM), the largest professional medical society representing hospitalists, encourages the use of patient satisfaction surveys to measure hospitalist providers' quality of patient care.[4] Ideally, accurate information would be delivered as feedback to individual providers in a timely manner in hopes of improving performance; however, the current methodology has shortcomings that limit its usefulness. First, several hospitalists and consultants may be involved in the care of 1 patient during the hospital stay, but the score can only be tied to a single physician. Current survey methods attribute all responses to that particular doctor, usually the attending of record, although patients may very well be thinking of other physicians when responding to questions. Second, only a few questions on the surveys ask about doctors' performance. Aforementioned surveys have 3 to 8 questions about doctors' care, which limits the ability to assess physician performance comprehensively. Finally, the surveys are mailed approximately 1 week after the patient's discharge, usually without a name or photograph of the physician to facilitate patient/caregiver recall. This time lag and lack of information to prompt patient recall likely lead to impreciseness in assessment. In addition, the response rates to these surveys are typically low, around 25% (personal oral communication with our division's service excellence stakeholder Dr. L.P. in September 2013). These deficiencies limit the usefulness of such data in coaching individual providers about their performance because they cannot be delivered in a timely fashion, and the reliability of the attribution is suspect.

With these considerations in mind, we developed and validated a new survey metric, the Tool to Assess Inpatient Satisfaction with Care from Hospitalists (TAISCH). We hypothesized that the results would be different from those collected using conventional methodologies.

PATIENTS AND METHODS

Study Design and Subjects

Our cross‐sectional study surveyed inpatients under the care of hospitalist physicians working without the support of trainees or allied health professionals (such as nurse practitioners or physician assistants). The subjects were hospitalized at a 560‐bed academic medical center on a general medical floor between September 2012 and December 2012. All participating hospitalist physicians were members of a division of hospital medicine.

TAISCH Development

Several steps were taken to establish content validity evidence.[5] We developed TAISCH by building upon the theoretical underpinnings of the quality of care measures that are endorsed by the SHM Membership Committee Guidelines for Hospitalists Patient Satisfaction.[4] This directive recommends that patient satisfaction with hospitalist care should be assessed across 6 domains: physician availability, physician concern for patients, physician communication skills, physician courteousness, physician clinical skills, and physician involvement of patients' families. Other existing validated measures tied to the quality of patient care were reviewed, and items related to the physician's care were considered for inclusion to further substantiate content validity.[6, 7, 8, 9, 10, 11, 12] Input from colleagues with expertise in clinical excellence and service excellence was also solicited. This included the director of Hopkins' Miller Coulson Academy of Clinical Excellence and the grant review committee members of the Johns Hopkins Osler Center for Clinical Excellence (who funded this study).[13, 14]

The preliminary instrument contained 17 items, including 2 conditional questions, and was first pilot tested on 5 hospitalized patients. We assessed the time it took to administer the surveys as well as patients' comments and questions about each survey item. This resulted in minor wording changes for clarification and changes in the order of the questions. We then pursued a second phase of piloting using the revised survey, which was administered to >20 patients. There were no further adjustments as patients reported that TAISCH was clear and concise.

From interviews with patients after pilot testing, it became clear that respondents were carefully reflecting on the quality of care and performance of their treating physician, thereby generating response process validity evidence.[5]

Data Collection

To ensure that patients had perspective upon which to base their assessment, they were only asked to appraise physicians after being cared for by the same hospitalist provider for at least 2 consecutive days. Patients who were on isolation, those who were non‐English speaking, and those with impaired decision‐making capacity (such as mental status change or dementia) were excluded. Patients were enrolled only if they could correctly name their doctor or at least identify a photograph of their hospitalist provider on a page that included pictures of all division members. Those patients who were able to name the provider or correctly select the provider from the page of photographs were considered to have correctly identified their provider. In order to ensure the confidentiality of the patients and their responses, all data collections were performed by a trained research assistant who had no patient‐care responsibilities. The survey was confidential, did not include any patient identifiers, and patients were assured that providers would never see their individual responses. The patients were given options to complete TAISCH either by verbally responding to the research assistant's questions, filling out the paper survey, or completing the survey online using an iPad at the bedside. TAISCH specifically asked the patients to rate their hospitalist provider's performance along several domains: communication skills, clinical skills, availability, empathy, courteousness, and discharge planning; 5‐point Likert scales were used exclusively.

In addition to the TAISCH questions, we asked patients (1) an overall satisfaction question, I would recommend Dr. X to my loved ones should he or she need hospitalization in the future (response options: strongly disagree, disagree, neutral, agree, strongly agree), (2) their pain level using the Wong‐Baker pain scale,[15] and (3) the Jefferson Scale of Patient's Perceptions of Physician Empathy (JSPPPE).[16, 17] Associations between TAISCH and these variables (as well as PG data) would be examined to ascertain relations to other variables validity evidence.[5] Specifically, we sought to ascertain discriminant and convergent validity where the TAISCH is associated positively with constructs where we expect positive associations (convergent) and negatively with those we expect negative associations (discriminant).[18] The Wong‐Baker pain scale is a recommended pain‐assessment tool by the Joint Commission on Accreditation of Healthcare Organizations, and is widely used in hospitals and various healthcare settings.[19] The scale has a range from 0 to 10 (0 for no pain and 10 indicating the worst pain). The hypothesis was that the patients' pain levels would adversely affect their perception of the physician's performance (discriminant validity). JSPPPE is a 5‐item validated scale developed to measure patients' perceptions of their physicians' empathic engagement. It has significant correlations with the American Board of Internal Medicine's patient rating surveys, and it is used in standardized patient examinations for medical students.[20] The hypothesis was that patient perception about the quality of physician care would correlate positively with their assessment of the physician's empathy (convergent validity).

Although all of the hospitalist providers in the division consented to participate in this study, only hospitalist providers for whom at least 4 patient surveys were collected were included in the analysis. The study was approved by our institutional review board.

Data Analysis

All data were analyzed using Stata 11 (StataCorp, College Station, TX). Data were analyzed to determine the potential for a single comprehensive assessment of physician performance with confirmatory factor analysis (CFA) using maximum likelihood extraction. Additional factor analyses examined the potential for a multiple factor solution using exploratory factor analysis (EFA) with principle component factor analysis and varimax rotation. Examination of scree plots, factor loadings for individual items greater than 0.40, eigenvalues greater than 1.0, and substantive meaning of the factors were all taken into consideration when determining the number of factors to retain from factor analytic models.[21] Cronbach's s were calculated for each factor to assess reliability. These data provided internal structure validity evidence (demonstrated by acceptable reliability and factor structure) to TAISCH.[5]

After arriving at the final TAISCH scale, composite TAISCH scores were computed. Associations between composite TAISCH scores with the Wong‐Baker pain scale, the JSPPPE, and the overall satisfaction question were assessed using linear regression with the svy command in Stata to account for the nested design of having each patient report on a single hospitalist provider. Correlation between composite TAISCH score and PG physician care scores (comprised of 5 questions: time physician spent with you, physician concern with questions/worries, physician kept you informed, friendliness/courtesy of physician, and skill of physician) were assessed at the provider level when both data were available.

RESULTS

A total of 330 patients were considered to be eligible through medical record screening. Of those patients, 73 (22%) were already discharged by the time the research assistant attempted to enroll them after 2 days of care by a single physician. Of 257 inpatients approached, 30 patients (12%) refused to participate. Among the 227 consented patients, 24 (9%) were excluded as they were unable to correctly identify their hospitalist provider. A total of 203 patients were enrolled, and each patient rated a single hospitalist; a total of 29 unique hospitalists were assessed by these patients. The patients' mean age was 60 years, 114 (56%) were female, and 61 (30%) were of nonwhite race (Table 1). The hospitalist physicians' demographic information is also shown in Table 1. Two hospitalists with fewer than 4 surveys collected were excluded from the analysis. Thus, final analysis included 200 unique patients assessing 1 of the 27 hospitalists (mean=7.4 surveys per hospitalist).

Characteristics of the 203 Patients and 29 Hospitalist Physicians Studied
CharacteristicsValue
  • NOTE: Abbreviations: SD, standard deviation.

Patients, N=203
Age, y, mean (SD)60.0 (17.2)
Female, n (%)114 (56.1)
Nonwhite race, n (%)61 (30.5)
Observation stay, n (%)45 (22.1)
How are you feeling today? n (%)
Very poor11 (5.5)
Poor14 (7.0)
Fair67 (33.5)
Good71 (35.5)
Very good33 (16.5)
Excellent4 (2.0)
Hospitalists, N=29
Age, n (%)
2630 years7 (24.1)
3135 years8 (27.6)
3640 years12 (41.4)
4145 years2 (6.9)
Female, n (%)11 (37.9)
International medical graduate, n (%)18 (62.1)
Years in current practice, n (%)
<19 (31.0)
127 (24.1)
346 (20.7)
565 (17.2)
7 or more2 (6.9)
Race, n (%)
Caucasian4 (13.8)
Asian19 (65.5)
African/African American5 (17.2)
Other1 (3.4)
Academic rank, n (%)
Assistant professor9 (31.0)
Clinical instructor10 (34.5)
Clinical associate/nonfaculty10 (34.5)
Percentage of clinical effort, n (%)
" indentlevel="2">>70%6 (20.7)
50%70%19 (65.5)
<50%4 (13.8)

Validation of TAISCH

On the 17‐item TAISCH administered, the 2 conditional questions (When I asked to see Dr. X, s/he came within a reasonable amount of time. and If Dr. X interacted with your family, how well did s/he deal with them?) were applicable to fewer than 40% of patients. As such, they were not included in the analysis.

Internal Structure Validity Evidence

Results from factor analyses are shown in Table 2. The CFA modeling of a single factor solution with 15 items explained 42% of the total variance. The 27 hospitalists' average 15‐item TAISCH score ranged from 3.25 to 4.28 (mean [standard deviation]=3.82 [0.24]; possible score range: 15). Reliability of the 15‐item TAISCH was appropriate (Cronbach's =0.88).

Factor Loadings for 15‐Item TAISCH Measure Based on Confirmatory Factor Analysis
TAISCH (Cronbach's =0.88)Factor Loading
  • NOTE: Abbreviations: TAISCH, Tool to Assess Inpatient Satisfaction with Care from Hospitalists. *Response category: below average, average, above average, top 10% of all doctors, the very best of any doctor I have come across. Response category: none, a little, some, a lot, tremendously. Response category: strongly disagree, disagree, neutral, agree, strongly agree. Response category: poor, fair, good, very good, excellent. Response category: never, rarely, sometimes, most of the time, every single time.

Compared to all other physicians that you know, how do you rate Dr. X's compassion, empathy, and concern for you?*0.91
Compared to all other physicians that you know, how do you rate Dr. X's ability to communicate with you?*0.88
Compared to all other physicians that you know, how do you rate Dr. X's skill in diagnosing and treating your medical conditions?*0.88
Compared to all other physicians that you know, how do you rate Dr. X's fund of knowledge?*0.80
How much confidence do you have in Dr. X's plan for your care?0.71
Dr. X kept me informed of the plans for my care.0.69
Effectively preparing patients for discharge is an important part of what doctors in the hospital do. How well has Dr. X done in getting you ready to be discharged from the hospital?0.67
Dr. X let me talk without interrupting.0.60
Dr. X encouraged me to ask questions.0.59
Dr. X checks to be sure I understood everything.0.55
I sensed Dr. X was in a rush when s/he was with me. (reverse coded)0.55
Dr. X showed interest in my views and opinions about my health.0.54
Dr. X discusses options with me and involves me in decision making.0.47
Dr. X asked permission to enter the room and waited for an answer.0.25
Dr. X sat down when s/he visited my bedside.0.14

As shown in Table 2, 2 variables had factor loadings below the minimum threshold of 0.40 in the CFA for the 15‐item TAISCH when modeling a single factor solution. Both items were related to physician etiquette: Dr. X asked permission to enter the room and waited for an answer. and Dr. X sat down when he/she visited my bedside.

When CFA was executed again, as a single factor omitting the 2 items that demonstrated lower factor loadings, the 13‐item single factor solution explained 47% of the total variance, and the Cronbach's was 0.92.

EFA models were also explored for potential alternate solutions. These analyses resulted in lesser reliability (low Cronbach's ), weak construct operationalization, and poor face validity (as judged by the research team).

Both the 13‐ and 15‐item single factor solutions were examined further to determine whether associations with criterion variables (pain, empathy) differed substantively. Given that results were similar across both solutions, subsequent analyses were completed with the 15‐item single factor solution, which included the etiquette‐related variables.

Relationship to Other Variables Validity Evidence

The association between the 15‐item TAISCH and JSPPPE was significantly positive (=12.2, P<0.001). Additionally, there was a positive and significant association between TAISCH and the overall satisfaction question: I would recommend Dr. X to my loved ones should they need hospitalization in the future. (=11.2, P<0.001). This overall satisfaction question was also associated positively with JSPPPE (=13.2, P<0.001). There was a statistically significant negative association between TAISCH and Wong‐Baker pain scale (=2.42, P<0.05).

The PG data from the same period were available for 24 out of 27 hospitalists. The number of PG surveys collected per provider ranged from 5 to 30 (mean=14). At the provider level, there was not a statistically significant correlation between PG and the 15‐item TAISCH (P=0.51). Of note, PG was also not significantly correlated with the overall satisfaction question, JSPPPE, or the Wong‐Baker pain scale (all P>0.10).

DISCUSSION

Our new metric, TAISCH, was found to be a reliable and valid measurement tool to assess patient satisfaction with the hospitalist physician's care. Because we only surveyed patients who could correctly identify their hospitalist physicians after interacting for at least 2 consecutive days, the attribution of the data to the individual hospitalist is almost certainly correct. The high participation rate indicates that the patients were not hesitant about rating their hospitalist provider's quality of care, even when asked while they were still in the hospital.

The majority of the patients approached were able to correctly identify their hospitalist provider. This rate (91%) was much higher than the rate previously reported in the literature where a picture card was used to improve provider recognition.[22] It is also likely that 1 physician, rather than a team of physicians, taking care of patients make it easier for patients to recall the name and recognize the face of their inpatient provider.

The CFA of TAISCH showed good fit but suggests that 2 variables, both from Kahn's etiquette‐based medicine (EtBM) checklist,[9] may not load in the same way as the other items. Tackett and colleagues reported that hospitalists who performed more EtBM behaviors scored higher on PG evaluations.[23] Such results, along with the comparable explanation of variance and reliability, convinced us to retain these 2 items in the final 15‐item TAISCH as dictated by the CFA. Although the literature supports the fact that physician etiquette is related to perception of high‐quality care, it is possible that these 2 questions were answered differently (and thereby failed to load the same way), because environmental limitations may be preventing physicians' ability to perform them consistently. We prefer the 15‐item version of TAISCH and future studies may provide additional information about its performance as compared to the 13‐item adaptation.

The significantly negative association between the Wong‐Baker pain scale and TAISCH stresses the importance of adequately addressing and treating the patient's pain. Hanna et al. showed that the patients' perceptions of pain control was associated with their overall satisfaction score measured by HCAHPS.[24] The association seen in our study was not unexpected, because TAISCH is administered while the patients are acutely ill in the hospital, when pain is likely more prevalent and severe than it is during the postdischarge settings (when the HCAHPS or PG surveys are administered). Interestingly, Hanna et al. discovered that the team's attention to controlling pain was more strongly correlated with overall satisfaction than was the actual pain control.[24] These data, now confirmed by our study, should serve to remind us that a hospitalist's concern and effort to relieve pain may augment patient satisfaction with the quality of care, even when eliminating the pain may be difficult or impossible.

TAISCH was found not to be correlated with PG scores. Several explanations for this deserve consideration. First, the postdischarge PG survey that is used for our institution does not list the name of the specific hospitalist providers for the patients to evaluate. Because patients encounter multiple physicians during their hospital stay (eg, emergency department physicians, hospitalist providers, consultants), it is possible that patients are not reflecting on the named doctor when assessing the the attending of record on the PG mailed questionnaire. Second, the representation of patients who responded to TAISCH and PG were different; almost all patients completed TAISCH as opposed to a small minority who decide to respond to the PG survey. Third, TAISCH measures the physicians' performance more comprehensively with a larger number of variables. Last, it is possible that we were underpowered to detect significant correlation, because there were only 24 providers who had data from both TAISCH and PG. However, our results endorse using caution in interpreting PG scores for individual hospitalist's performance, particularly for high‐stakes consequences (including the provision of incentives to high performer and the insistence on remediation for low performers).

Several limitations of this study should be considered. First, only hospitalist providers from a single division were assessed. This may limit the generalizability of our findings. Second, although patients were assured about confidentiality of their responses, they might have provided more favorable answers, because they may have felt uncomfortable rating their physician poorly. One review article of the measurement of healthcare satisfaction indicated that impersonal (mailed) methods result in more criticism and lower satisfaction than assessments made in person using interviews. As the trade‐off, the mailed surveys yield lower response rates that may introduce other forms of bias.[25] Even on the HCHAPS survey report for the same period from our institution, 78% of patients gave top box ratings for our doctors' communication skills, which is at the state average.[26] Similarly, a study that used postdischarge telephone interviews to collect patients' satisfaction with hospitalists' care quality reported an average score of 4.20 out of 5.[27] These findings confirm that highly skewed ratings are common for these types of surveys, irrespective of how or when the data are collected.

Despite the aforementioned limitations, TAISCH use need not be limited to hospitalist physicians. It may also be used to assess allied health professionals or trainees performance, which cannot be assessed by HCHAPS or PG. Applying TAISCH in different hospital settings (eg, emergency department or critical care units), assessing hospitalists' reactions to TAISCH, learning whether TAISCH leads to hospitalists' behavior changes or appraising whether performance can improve in response to coaching interventions for those performing poorly are all research questions that merit additional consideration.

CONCLUSION

TAISCH allows for obtaining patient satisfaction data that are highly attributable to specific hospitalist providers. The data collection method also permits high response rates so that input comes from almost all patients. The timeliness of the TAISCH assessments also makes it possible for real‐time service recovery, which is impossible with other commonly used metrics assessing patient satisfaction. Our next step will include testing the most effective way to provide feedback to providers and to coach these individuals so as to improve performance.

Acknowledgements

The authors would like to thank Po‐Han Chen at the BEAD Core for his statistical analysis support.

Disclosures: This study was supported by the Johns Hopkins Osler Center for Clinical Excellence. Dr. Wright is a Miller‐Coulson Family Scholar and is supported through the Johns Hopkins Center for Innovative Medicine. The authors report no conflicts of interest.

References

Online-Only Materials

   Comments ()