Using Video to Validate Handoff Quality
Over the last decade, there has been an unprecedented focus on physician handoffs in US hospitals. One major reason is the reduction in residency duty hours mandated by the Accreditation Council for Graduate Medical Education (ACGME), first in 2003 and subsequently revised in 2011.[1, 2] As residents work fewer hours, experts believe that potential safety gains from reduced fatigue are countered by an increase in the number of handoffs, each of which carries a risk of miscommunication. Prior studies show that critical patient information is often lost or altered during this transfer of clinical information and professional responsibility, which can result in patient harm.[3, 4] As a result of these concerns, the ACGME now requires residency programs to ensure and monitor effective, structured hand-over processes to facilitate both continuity of care and patient safety. Programs must ensure that residents are competent in communicating with team members in the hand-over process.[2] Moreover, handoffs have also been a major improvement focus for organizations with broader scope than teaching hospitals, including the World Health Organization, the Joint Commission, and the Society of Hospital Medicine (SHM).[5, 6, 7]
Despite this focus, monitoring the quality of handoffs has proven challenging because no reliable, validated tool to measure handoff quality has been available. More recently, the ACGME's introduction of the Next Accreditation System, with its focus on direct observation of clinical skills to achieve milestones, makes it crucial for residency educators to have valid tools to measure competence in handoffs. As a result, it is critical that instruments to measure handoff performance are not only created but also validated.[8]
To help fill this gap, we previously reported on the development of a 9-item Handoff Clinical Evaluation Exercise (CEX) assessment tool. The Handoff CEX, designed for use by those participating in the handoff or by a third-party observer, rates the quality of patient handoffs in domains such as professionalism and communication skills between the receiver and sender of patient information.[9, 10] Despite prior demonstration of its feasibility, the initial tool was perceived as lengthy and redundant. In addition, although the tool has been shown to discriminate between the performance of novice and expert nurses, its construct validity has not been established.[11] Establishing construct validity is important to ensure that the tool measures the construct in question, namely whether it detects those who are actually competent to perform handoffs safely and effectively. We present here the development of a shorter Handoff Mini-CEX, along with the formal establishment of its construct validity, namely its ability to distinguish between levels of performance in 3 domains of handoff quality.
METHODS
Adaptation of the Handoff CEX and Development of the Abbreviated Tool
The 9‐item Handoff CEX is a paper‐based instrument that was created by the investigators (L.I.H., J.M.F., V.M.A.) to evaluate either the sender or the receiver of handoff communications and has been used in prior studies (see Supporting Information, Appendix 1, in the online version of this article).[9, 10] The evaluation may be conducted by either an observer or by a handoff participant. The instrument includes 6 domains: (1) setting, (2) organization and efficiency, (3) communication skills, (4) content, (5) clinical judgment, and (6) humanistic skills/professionalism. Each domain is graded on a 9‐point rating scale, modeled on the widely used Mini‐CEX (Clinical Evaluation Exercise) for real‐time observation of clinical history and exam skills in internal medicine clerkships and residencies (1–3 = unsatisfactory, 4–6 = marginal/satisfactory, 7–9 = superior).[12] This familiar 9‐point scale is utilized in graduate medical education evaluation of the ACGME core competencies.
To standardize the evaluation, the instrument uses performance‐based anchors for evaluating both the sender and the receiver of the handoff information. The anchors are derived from functional evaluation of the roles of senders and receivers in our preliminary work at both the University of Chicago and Yale University, best practices in other high‐reliability industries, guidelines from the Joint Commission and the SHM, and prior studies of effective communication in clinical systems.[5, 6, 13]
After piloting the Handoff CEX with the University of Chicago's internal medicine residency program (n=280 handoff evaluations), a strong correlation was noted among the measures of content (medical knowledge), patient care, clinical judgment, organization/efficiency, and communication skills. Moreover, the Handoff CEX's Cronbach α, a measure of internal reliability and consistency, was very high (α=0.95). Given the potential for redundant items, and to increase the ease of use of the instrument, factor analysis was used to reduce the instrument to a shorter 3-item tool, the Handoff Mini-CEX, that assessed 3 of the initial items: setting, communication skills, and professionalism. Overall, performance on these 3 items was responsible for 82% of the variance in overall sign-out quality (see Supporting Information, Appendix 2, in the online version of this article).
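To make this item-reduction step concrete, the sketch below recomputes Cronbach's α and a first-component variance decomposition on synthetic data shaped like the pilot (280 evaluations of 9 items on a 9-point scale). The data are invented, and principal components are used here as a simple stand-in for the factor analysis the study actually performed; nothing below reproduces the study dataset.

```python
# Minimal sketch, assuming synthetic data: Cronbach's alpha plus a
# first-principal-component variance check, illustrating (not reproducing)
# the item-reduction analysis described above.
import numpy as np

rng = np.random.default_rng(0)
n_evals, n_items = 280, 9
# Synthetic 9-point ratings driven by one shared "overall quality" factor,
# which is what produces high inter-item correlation and a high alpha.
quality = rng.normal(6, 1.5, size=(n_evals, 1))
ratings = np.clip(np.round(quality + rng.normal(0, 1, (n_evals, n_items))), 1, 9)

def cronbach_alpha(x: np.ndarray) -> float:
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = x.shape[1]
    item_var = x.var(axis=0, ddof=1).sum()
    total_var = x.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

print(f"Cronbach's alpha: {cronbach_alpha(ratings):.2f}")

# First-component loadings via SVD as a stand-in for factor analysis:
# items loading heavily on the dominant component carry most of the
# shared variance and are candidates for a shortened tool.
centered = ratings - ratings.mean(axis=0)
_, s, vt = np.linalg.svd(centered, full_matrices=False)
explained = s**2 / (s**2).sum()
print(f"Variance explained by first component: {explained[0]:.0%}")
print("Loadings on first component:", np.round(vt[0], 2))
```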
Establishing Construct Validity of the Handoff Mini‐CEX
To establish the construct validity of the Handoff Mini-CEX, we adapted a protocol used by Holmboe and colleagues, which is based on the development and use of video scenarios depicting varying levels of clinical performance.[14] A clinical scenario script, based on prior observational work, was developed to represent an internal medicine resident (the sender) signing out 3 different patients to colleagues (an intern [postgraduate year 1] and a resident). The scenario was designed to explicitly include observable components of professionalism, communication, and setting. Three levels of performance (superior, satisfactory, and unsatisfactory) were defined and described for the 3 domains, and separate scripts were then written to demonstrate each level of performance in each domain of interest, using the descriptive anchors of the Handoff Mini-CEX.
After constructing the superior, or gold standard, script, which showcased superior communication, professionalism, and setting, individual domains of performance were changed (eg, to satisfactory or unsatisfactory) while holding the other 2 constant at the superior level. For example, superior communication requires that the sender provide anticipatory guidance and include clinical rationale, whereas unsatisfactory communication includes vague language about overnight events and a disorganized presentation of patients. Superior professionalism requires no inappropriate comments by the sender about patients, family, and staff, as well as a presentation focused on the most urgent patients. Unsatisfactory professionalism is shown by a hurried and inattentive sign-out, with inappropriate comments about patients, family, and staff. Finally, a superior setting is one in which the receiver listens attentively and discourages interruptions, whereas an unsatisfactory setting finds the sender or receiver answering pages during the handoff, surrounded by background noise. We omitted the satisfactory level for setting because of the difficulty of depicting subtle gradations in the environment.
Permutations of each of these domains resulted in 6 scripts depicting different levels of sender performance, as enumerated in the sketch below (see also Supporting Information, Appendix 3, in the online version of this article). Only the performance level of the sender was changed; the receivers' performance remained consistent, following best practices for receivers, such as attentive listening, asking questions, reading back, and taking notes during the handoff. The scripts were developed by 2 investigators (V.M.A., S.B.), then reviewed and edited independently by other investigators (J.M.F., P.S.) to achieve consensus. Actors were recruited to perform the video scenarios and were trained by the physician investigators (J.M.F., V.M.A.). The part of the sender was played by a study investigator (P.S.) who had prior acting experience and had accrued over 40 hours of handoff observation, enabling the depiction of varying levels of handoff performance. The digital video recordings ranged in length from 2.00 to 4.08 minutes. All videos were recorded using a Sony XDCAM PMW-EX3 HD camcorder (Sony Corp., Tokyo, Japan).
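For an at-a-glance view, the 6 scripted conditions can be restated as a simple lookup. The script-to-condition mapping follows Table 1; this is a restatement of the design above, not new data.

```python
# The 6 scripted conditions: one gold-standard script, then one domain
# degraded at a time while the other two are held at "superior".
# (No satisfactory-setting script was filmed; mapping taken from Table 1.)
SCRIPTS = {
    1: {"communication": "superior",       "professionalism": "superior",       "setting": "superior"},
    2: {"communication": "satisfactory",   "professionalism": "superior",       "setting": "superior"},
    3: {"communication": "unsatisfactory", "professionalism": "superior",       "setting": "superior"},
    4: {"communication": "superior",       "professionalism": "satisfactory",   "setting": "superior"},
    5: {"communication": "superior",       "professionalism": "unsatisfactory", "setting": "superior"},
    6: {"communication": "superior",       "professionalism": "superior",       "setting": "unsatisfactory"},
}
```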
Participants
Faculty from the University of Chicago Medical Center and Yale University were included. At the University of Chicago, faculty were recruited via an email sent by the study investigators to the Research in Medical Education (RIME) listhost, which includes program directors, clerkship directors, and medical educators. Two sessions were offered and administered. Continuing medical education (CME) credit was provided for participation, as this workshop was given in conjunction with the RIME CME conference. Evaluations were deidentified using a unique identifier for each rater. At Yale University, the workshop on handoffs was offered as part of 2 seminars for program directors and chief residents from all specialties. During these seminars, program directors and chief residents used anonymous evaluation rating forms that did not capture rater identifiers. No other incentive was provided for participation. Although faculty at neither the University of Chicago nor Yale University received formal training on handoff evaluation, they did receive a short introduction to the importance of handoffs and the goals of the workshop. The protocol was deemed exempt by the institutional review board at the University of Chicago.
Workshop Protocol
After a brief introduction, faculty viewed the videos in random order on a projected screen. Participants were instructed to use the Handoff Mini-CEX to rate whichever element(s) of handoff quality they believed they could suitably evaluate while watching. The videos were rated on the Handoff Mini-CEX form, and participants completed the forms independently and anonymously, without any contact with other participants. The lead investigators proctored all sessions. At the University of Chicago, participants viewed and rated all 6 videos over the course of an hour. At Yale University, due to time constraints in the program director and chief resident seminars, participants reviewed 1 video in the first seminar (unsatisfactory professionalism) and 2 in the second (unsatisfactory communication, unsatisfactory professionalism) (Table 1).
Table 1. Video Scripts by Domain and Performance Level, With Rating Anchors and Sample Quotations

| Domain | Unsatisfactory | Satisfactory | Superior |
| --- | --- | --- | --- |
| Communication | Script 3 (n=36)a | Script 2 (n=13) | Script 1 (n=13) |
| | Uses vague language about overnight events, missing critical patient information, disorganized. | Insufficient level of clinical detail, directions are not as thorough, handoff is generally on task and sufficient. | Anticipatory guidance provided, rationale explained; important information is included, highlights sick patients. |
| | "Look in the record; I'm sure it's in there. And oh yeah, I need you to check enzymes and finish ruling her out." | "So the only thing to do is to check labs; you know, check CBC and cardiac enzymes." | "So for today, I need you to check post-transfusion hemoglobin to make sure it's back to the baseline of 10. If it's under 10, then transfuse her 2 units, but hopefully it will be bumped up. Also continue to check cardiac enzymes; the next set is coming at 2 pm, and we need to continue the rule out. If her enzymes are positive or she has other ECG changes, definitely call the cardio fellow, since they'll want to take her to the CCU." |
| Professionalism | Script 5 (n=39)a | Script 4 (n=22)a | Script 1 |
| | Hurried, inattentive, rushing to leave, inappropriate comments (re: patients, family, staff). | Some tangential comments (re: patients, family, staff). | Appropriate comments (re: patients, family, staff), focused on task. |
| | "[D]efinitely call the cards fellow, since they'll want to take her to the CCU. And let me tell you, if you don't call her, she'll rip you a new one." | "Let's breeze through them quickly so I can get out of here, I've had a rough day. I'll start with the sickest first, and oh my God she's a train wreck!" | |
| Setting | Script 6 (n=13) | | Script 1 |
| | Answering pages during handoff, interruptions (people entering room, phone ringing). | | Attentive listening, no interruptions, pager silenced. |
Data Collection and Statistical Analysis
Using combined data from the University of Chicago and Yale University, descriptive statistics were reported as raw scores on the Handoff Mini-CEX. To assess the internal consistency of the tool, Cronbach α was used. To assess inter-rater reliability of the attending physician ratings, we performed a Kendall coefficient of concordance analysis after collapsing the ratings into 3 categories (unsatisfactory, satisfactory, superior). We also calculated intraclass correlation coefficients for each item using the raw data, and used generalizability analysis to calculate the number of raters that would be needed to achieve a desired reliability of 0.95. To ascertain whether faculty were able to detect the varying levels of performance depicted in the videos, an ordinal test of trend was performed on the communication, professionalism, and setting scores.
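The following sketch illustrates the core of this reliability pipeline on invented ratings: collapsing the 9-point scale into the 3 Mini-CEX bands, computing Kendall's coefficient of concordance (W) across raters, and projecting reliability for additional raters. The Spearman-Brown projection is used here as a simple stand-in for the full generalizability analysis (which the paper does not detail), and the sample ICC of 0.84 is taken from the single-rater values reported in the Results.

```python
# Minimal sketch, assuming hypothetical scores: band collapse, Kendall's W
# (without tie correction), and a Spearman-Brown rater projection.
import numpy as np
from scipy.stats import rankdata

def collapse(score: int) -> int:
    """Map the 9-point scale to 1=unsatisfactory, 2=satisfactory, 3=superior."""
    return 1 if score <= 3 else (2 if score <= 6 else 3)

# rows = raters, columns = the 6 videos (hypothetical raw scores)
raw = np.array([
    [2, 7, 8, 3, 5, 9],
    [1, 8, 8, 2, 6, 9],
    [3, 6, 9, 2, 4, 8],
])
banded = np.vectorize(collapse)(raw)

def kendalls_w(ratings: np.ndarray) -> float:
    """Kendall's W = 12*S / (m^2 * (n^3 - n)), m raters ranking n objects."""
    m, n = ratings.shape
    ranks = np.vstack([rankdata(row) for row in ratings])  # rank within each rater
    rank_sums = ranks.sum(axis=0)
    s = ((rank_sums - rank_sums.mean()) ** 2).sum()
    return 12 * s / (m**2 * (n**3 - n))

print(f"Kendall's W: {kendalls_w(banded):.2f}")

def spearman_brown(single_rater_icc: float, k: int) -> float:
    """Projected reliability of the mean score of k raters."""
    return k * single_rater_icc / (1 + (k - 1) * single_rater_icc)

# How many raters until projected reliability reaches 0.95?
icc = 0.84  # e.g., a single-rater ICC in the range reported in the Results
k = 1
while spearman_brown(icc, k) < 0.95:
    k += 1
print(f"Raters needed for 0.95 reliability at ICC {icc}: {k}")
```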
To assess for rater bias, we used the rater identifiers in the University of Chicago data to perform a 2-way analysis of variance (ANOVA), testing whether faculty scores were associated with performance level after controlling for faculty rater. The faculty rater coefficients and their P values in the 2-way ANOVA were also examined for any evidence of rater bias. All calculations were performed in Stata 11.0 (StataCorp, College Station, TX), with statistical significance defined as P<0.05.
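A sketch of an equivalent rater-bias check is shown below. The study used Stata; this statsmodels version is only an illustrative parallel, and the file name and column names (ratings.csv, score, level, rater) are hypothetical.

```python
# Minimal sketch, assuming a long-format table of ratings with one row per
# (rater, video) pair: a two-way main-effects ANOVA of score on scripted
# performance level and rater identifier.
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

df = pd.read_csv("ratings.csv")  # assumed columns: rater, level, score

# If the rater term is non-significant after controlling for performance
# level, there is no evidence of systematic rater bias.
model = ols("score ~ C(level) + C(rater)", data=df).fit()
anova_table = sm.stats.anova_lm(model, typ=2)
print(anova_table)  # inspect the P value on the C(rater) row
```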
RESULTS
Forty-seven faculty members (14 at the University of Chicago, 33 at Yale University) participated in the validation workshops (2 at each site), which were held in August 2011 and September 2011, providing a total of 172 observations of a possible 191 (90%).
The overall handoff quality ratings for the superior, gold standard video (superior communication, professionalism, and setting) ranged from 7 to 9 with a mean of 8.5 (standard deviation [SD] 0.7). The overall ratings for the video depicting satisfactory communication (satisfactory communication, superior professionalism and setting) ranged from 5 to 9 with a mean of 7.3 (SD 1.1). The overall ratings for the unsatisfactory communication video (unsatisfactory communication, superior professionalism and setting) ranged from 1 to 7 with a mean of 2.6 (SD 1.2). The overall ratings for the satisfactory professionalism video (satisfactory professionalism, superior communication and setting) ranged from 4 to 8 with a mean of 5.7 (SD 1.3). The overall ratings for the unsatisfactory professionalism video (unsatisfactory professionalism, superior communication and setting) ranged from 2 to 5 with a mean of 2.4 (SD 1.03). Finally, the overall ratings for the unsatisfactory setting video (unsatisfactory setting, superior communication and professionalism) ranged from 1 to 8 with a mean of 3.1 (SD 1.7).
Figure 1 demonstrates that, for the domain of communication, raters were able to discern unsatisfactory performance but had difficulty reliably distinguishing between superior and satisfactory performance. Figure 2 illustrates that, for the domain of professionalism, raters were able to detect the videos' changing levels of performance at the extremes of behavior, with unsatisfactory and superior displays more readily identified. Figure 3 shows that, for the domain of setting, raters were able to distinguish the unsatisfactory from the superior setting. Of note, we also found a moderate, statistically significant correlation between ratings of professionalism and communication (r=0.47, P<0.001).
The Cronbach α for the Handoff Mini-CEX (3 items plus overall) was 0.77, indicating high internal reliability and consistency. Using data from the University of Chicago, where raters were labeled with a unique identifier, the Kendall coefficient of concordance was calculated to be 0.79, demonstrating high inter-rater reliability among the faculty raters. High inter-rater reliability was also seen in the intraclass correlation coefficients for each domain: communication (0.84), professionalism (0.68), setting (0.83), and overall (0.89). Using generalizability analysis, the average reliability was determined to be above 0.9 for all domains (0.99 for overall).
Last, the 2‐way ANOVA (n=75 observations from 13 raters) revealed no evidence of rater bias when examining the coefficient for attending rater (P=0.55 for professionalism, P=0.45 for communication, P=0.92 for setting). The range of scores for each video, however, was broad (Table 2).
Table 2. Handoff Mini-CEX Domain Ratings by Scripted Performance Level

| Domain | Unsatisfactory, Mean / Median (Range) | Satisfactory, Mean / Median (Range) | Superior, Mean / Median (Range) | Pb |
| --- | --- | --- | --- | --- |
| Professionalism | 2.3 / 2 (1–4) | 4.4 / 4 (3–8) | 7.0 / 7 (3–9) | 0.026 |
| Communication | 2.8 / 3 (1–6) | 7.0 / 8 (5–9) | 6.6 / 7 (1–9) | 0.005 |
| Setting | 3.1 / 3 (1–8) | | 7.5 / 8 (2–9) | 0.005 |
DISCUSSION
This study demonstrates that valid conclusions about handoff performance can be drawn using the Handoff Mini-CEX to rate handoff quality. Using standardized videos depicting varying levels of performance in communication, professionalism, and setting, the Handoff Mini-CEX demonstrated the ability to discern between levels of performance, providing evidence for the construct validity of the instrument.
We observed that faculty could readily detect unsatisfactory professionalism, and that there was a distinct correlation between faculty ratings and the internally set levels of performance displayed in the videos. This trend demonstrates that faculty were able to discern different levels of professionalism using the Handoff Mini-CEX. It became more difficult, however, for faculty to detect superior professionalism when the domain of communication was permuted. If the sender of the handoff was professional but the information delivered was disorganized, inaccurate, and missing crucial pieces of information, faculty perceived this ineffective communication as unprofessional. Prior literature on professionalism has found that communication is a necessary component of professional behavior; consequently, being a competent communicator is necessary to fulfill one's duty as a professional physician.[15, 16]
This is noteworthy because we did find a moderate, statistically significant correlation between ratings of professionalism and communication. It is possible that this distinction would be made clearer with formal rater training prior to any future evaluations. However, it is also possible that professionalism and communication, given the synergy between the 2 domains, cannot be separated. If so, it would be important to educate clinicians to present patients in a concise, clear, and accurate way with a professional demeanor. Acknowledging professional responsibility as an integral piece of patient care is also critical to effectively communicating patient information.[5]
We also noted that faculty could detect unsatisfactory communication consistently; however, they could not reliably differentiate between satisfactory and superior communication. Because the unsatisfactory professionalism, unsatisfactory setting, and satisfactory professionalism videos all demonstrated superior communication, we believe that faculty penalized communication when distractions, in the form of interruptions and rude behavior by the resident giving the handoff, disrupted the flow of the handoff. Thus, the wide ranges in scores for some videos may be attributed to this interaction among the Handoff Mini-CEX domains. In the future, clearer definitions of the anchors, particularly at the middle of the performance spectrum, along with rater training, may improve raters' ability to distinguish performance in each domain.
The overall value of the Handoff Mini-CEX lies in its ease of use, due in part to its brevity, as well as the evidence for its validity in distinguishing between varying levels of performance. Given the emphasis on monitoring handoff quality and performance, the Handoff Mini-CEX provides a standard foundation from which baseline handoff performance can be easily measured and improved. Moreover, it can be used to give individual clinicians feedback on their handoff practices and an opportunity to improve. This is particularly important given current recommendations by the Joint Commission that handoffs be standardized and by the ACGME that residents be competent in handoff skills. Furthermore, given the SHM's handoff recommendations and the designation of handoffs as a core competency for hospitalists, the tool enables hospitalist programs to assess their handoff practices as baseline measurements for any quality improvement activities that may take place.
Faculty were able to discern the superior and unsatisfactory levels of setting with ease. After watching and rating the videos, participants said that the chaotic scene of the unsatisfactory setting video was highly authentic, and that they themselves were constantly interrupted during handoffs by pages, phone calls, and people entering the handoff space. System-level fixes, such as protected time and dedicated space for handoffs, and discouraging pages during the designated handoff time, could mitigate unsatisfactory settings in practice.[17, 18]
Our study has several limitations. First, although the study was conducted at 2 sites, it included a small number of faculty, which may limit the generalizability of our findings. Implementation also varied between Yale University and the University of Chicago, preventing the use of all data for all analyses. Furthermore, institutional culture may affect faculty raters' perceptions, so future work aims to repeat our protocol at partner institutions, increasing both the number and diversity of participants. Finally, we were unable to compare the new, shorter Handoff Mini-CEX with the larger 9-item Handoff CEX in this study.
Despite these limitations, we believe that the Handoff Mini-CEX has future potential as an instrument with which to make valid and reliable conclusions about handoff quality, and that it could be used both to evaluate handoff quality and to educate trainees and faculty about effective handoff communication.
Disclosures
This work was supported by the National Institute on Aging Short‐Term Aging‐Related Research Program (5T35AG029795), Agency for Healthcare Research and Quality (1 R03HS018278‐01), and the University of Chicago Department of Medicine Excellence in Medical Education Award. Dr. Horwitz is supported by the National Institute on Aging (K08 AG038336) and by the American Federation for Aging Research through the Paul B. Beeson Career Development Award Program. Dr. Arora is funded by National Institute on Aging Career Development Award K23AG033763. Prior presentations of these data include the 2011 Association of American Medical Colleges meeting in Denver, Colorado, the 2012 Association of Program Directors of Internal Medicine meeting in Atlanta, Georgia, and the 2012 Society of General Internal Medicine Meeting in Orlando, Florida.
References

1. The new recommendations on duty hours from the ACGME task force. N Engl J Med. 2010;363(2):e3.
2. ACGME common program requirements. Effective July 1, 2011. Available at: http://www.acgme.org/acgmeweb/Portals/0/PDFs/Common_Program_Requirements_07012011[2].pdf. Accessed February 8, 2014.
3. Consequences of inadequate sign-out for patient care. Arch Intern Med. 2008;168(16):1755–1760.
4. Communication failures in patient sign-out and suggestions for improvement: a critical incident analysis. Qual Saf Health Care. 2005;14(6):401–407.
5. Hospitalist handoffs: a systematic review and task force recommendations. J Hosp Med. 2009;4(7):433–440.
6. A model for building a standardized hand-off protocol. Jt Comm J Qual Patient Saf. 2006;32(11):646–655.
7. World Health Organization Collaborating Centre for Patient Safety. Solutions on communication during patient hand-overs. 2007; Volume 1, Solution 1. Available at: http://www.who.int/patientsafety/solutions/patientsafety/PS‐Solution3.pdf. Accessed February 8, 2014.
8. Patient handoffs: standardized and reliable measurement tools remain elusive. Jt Comm J Qual Patient Saf. 2010;36(2):52–61.
9. Development of a handoff evaluation tool for shift-to-shift physician handoffs: the Handoff CEX. J Hosp Med. 2013;8(4):191–200.
10. Hand-off education and evaluation: piloting the observed simulated hand-off experience (OSHE). J Gen Intern Med. 2010;25(2):129–134.
11. Validation of a handoff tool: the Handoff CEX. J Clin Nurs. 2013;22(9-10):1477–1486.
12. The mini-CEX: a method for assessing clinical skills. Ann Intern Med. 2003;138(6):476–481.
13. Handoff strategies in settings with high consequences for failure: lessons for health care operations. Int J Qual Health Care. 2004;16(2):125–132.
14. Construct validity of the miniclinical evaluation exercise (miniCEX). Acad Med. 2003;78(8):826–830.
15. Third-year medical students' participation in and perceptions of unprofessional behaviors. Acad Med. 2007;82(10 suppl):S35–S39.
16. Professionalism—the next wave. N Engl J Med. 2006;355(20):2151–2152.
17. Interns overestimate the effectiveness of their hand-off communication. Pediatrics. 2010;125(3):491–496.
18. Characterising physician listening behaviour during hospitalist handoffs using the HEAR checklist. BMJ Qual Saf. 2013;22(3):203–209.
Over the last decade, there has been an unprecedented focus on physician handoffs in US hospitals. One major reason for this are the reductions in residency duty hours that have been mandated by the American Council for Graduate Medical Education (ACGME), first in 2003 and subsequently revised in 2011.[1, 2] As residents work fewer hours, experts believe that potential safety gains from reduced fatigue are countered by an increase in the number of handoffs, which represent a risk due to the potential miscommunication. Prior studies show that critical patient information is often lost or altered during this transfer of clinical information and professional responsibility, which can result in patient harm.[3, 4] As a result of these concerns, the ACGME now requires residency programs to ensure and monitor effective, structured hand‐over processes to facilitate both continuity of care and patient safety. Programs must ensure that residents are competent in communicating with team members in the hand‐over process.[2] Moreover, handoffs have also been a major improvement focus for organizations with broader scope than teaching hospitals, including the World Health Organization, Joint Commission, and the Society for Hospital Medicine (SHM).[5, 6, 7]
Despite this focus on handoffs, monitoring quality of handoffs has proven challenging due to lack of a reliable and validated tool to measure handoff quality. More recently, the Accreditation Council of Graduate Medical Education's introduction of the Next Accreditation System, with its focus on direct observation of clinical skills to achieve milestones, makes it crucial for residency educators to have valid tools to measure competence in handoffs. As a result, it is critical that instruments to measure handoff performance are not only created but also validated.[8]
To help fill this gap, we previously reported on the development of a 9‐item Handoff Clinical Examination Exercise (CEX) assessment tool. The Handoff CEX, designed for use by those participating in the handoff or by a third‐party observer, can be used to rate the quality of patient handoffs in domains such as professionalism and communication skills between the receiver and sender of patient information.[9, 10] Despite prior demonstration of feasibility of use, the initial tool was perceived as lengthy and redundant. In addition, although the tool has been shown to discriminate between performance of novice and expert nurses, the construct validity of this tool has not been established.[11] Establishing construct validity is important to ensuring that the tool can measure the construct in question, namely whether it detects those who are actually competent to perform handoffs safely and effectively. We present here the results of the development of a shorter Handoff Mini‐CEX, along with the formal establishment of its construct validity, namely its ability to distinguish between levels of performance in 3 domains of handoff quality.
METHODS
Adaption of the Handoff CEX and Development of the Abbreviated Tool
The 9‐item Handoff CEX is a paper‐based instrument that was created by the investigators (L.I.H., J.M.F., V.M.A.) to evaluate either the sender or the receiver of handoff communications and has been used in prior studies (see Supporting Information, Appendix 1, in the online version of this article).[9, 10] The evaluation may be conducted by either an observer or by a handoff participant. The instrument includes 6 domains: (1) setting, (2) organization and efficiency, (3) communication skills, (4) content, (5) clinical judgment, and (6) humanistic skills/professionalism. Each domain is graded on a 9‐point rating scale, modeled on the widely used Mini‐CEX (Clinical Evaluation Exercise) for real‐time observation of clinical history and exam skills in internal medicine clerkships and residencies (13=unsatisfactory, 46=marginal/satisfactory, 79=superior).[12] This familiar 9‐point scale is utilized in graduate medical education evaluation of the ACGME core competencies.
To standardize the evaluation, the instrument uses performance‐based anchors for evaluating both the sender and the receiver of the handoff information. The anchors are derived from functional evaluation of the roles of senders and receivers in our preliminary work at both the University of Chicago and Yale University, best practices in other high‐reliability industries, guidelines from the Joint Commission and the SHM, and prior studies of effective communication in clinical systems.[5, 6, 13]
After piloting the Handoff CEX with the University of Chicago's internal medicine residency program (n=280 handoff evaluations), a strong correlation was noted between the measures of content (medical knowledge), patient care, clinical judgment, organization/efficiency, and communication skills. Moreover, the Handoff CEX's Cronbach , or measurement of internal reliability and consistency, was very high (=0.95). Given the potential of redundant items, and to increase ease of use of the instrument, factor analysis was used to reduce the instrument to yield a shorter 3‐item tool, the Handoff Mini‐CEX, that assessed 3 of the initial items: setting, communication skills, and professionalism. Overall, performance on these 3 items were responsible for 82% of the variance of overall sign‐out quality (see Supporting Information, Appendix 2, in the online version of this article).
Establishing Construct Validity of the Handoff Mini‐CEX
To establish construct validity of the Handoff Mini‐CEX, we adapted a protocol used by Holmboe and colleagues to report the construct validity of the Handoff Mini‐CEX, which is based on the development and use of video scenarios depicting varying levels of clinical performance.[14] A clinical scenario script, based on prior observational work, was developed, which represented an internal medicine resident (the sender) signing out 3 different patients to colleagues (intern [postgraduate year 1] and resident). This scenario was developed to explicitly include observable components of professionalism, communication, and setting. Three levels of performancesuperior, satisfactory, and unsatisfactorywere defined and described for the 3 domains. These levels were defined, and separate scripts were written using this information, demonstrating varying levels of performance in each of the domains of interest, using the descriptive anchors of the Handoff Mini‐CEX.
After constructing the superior, or gold standard, script that showcases superior communication, professionalism, and setting, individual domains of performance were changed (eg, to satisfactory or unsatisfactory), while holding the other 2 constant at the superior level of performance. For example, superior communication requires that the sender provides anticipatory guidance and includes clinical rationale, whereas unsatisfactory communication includes vague language about overnight events and a disorganized presentation of patients. Superior professionalism requires no inappropriate comments by the sender about patients, family, and staff as well as a presentation focused on the most urgent patients. Unsatisfactory professionalism is shown by a hurried and inattentive sign‐out, with inappropriate comments about patients, family, and staff. Finally, a superior setting is one in which the receiver is listening attentively and discourages interruptions, whereas an unsatisfactory setting finds the sender or receiver answering pages during the handoff surrounded by background noise. We omitted the satisfactory level for setting due to the difficulties in creating subtleties in the environment.
Permutations of each of these domains resulted in 6 scripts depicting different levels of sender performance (see Supporting Information, Appendix 3, in the online version of this article). Only the performance level of the sender was changed, and the receivers of the handoff performance remained consistent, using best practices for receivers, such as attentive listening, asking questions, reading back, and taking notes during the handoff. The scripts were developed by 2 investigators (V.M.A., S.B.), then reviewed and edited independently by other investigators (J.M.F., P.S.) to achieve consensus. Actors were recruited to perform the video scenarios and were trained by the physician investigators (J.M.F., V.M.A.). The part of the sender was played by a study investigator (P.S.) with prior acting experience, and who had accrued over 40 hours of experience observing handoffs to depict varying levels of handoff performance. The digital video recordings ranged in length from 2.00 minutes to 4.08 minutes. All digital videos were recorded using a Sony XDCAM PMW‐EX3 HD camcorder (Sony Corp., Tokyo, Japan.
Participants
Faculty from the University of Chicago Medical Center and Yale University were included. At the University of Chicago, faculty were recruited to participate via email by the study investigators to the Research in Medical Education (RIME) listhost, which includes program directors, clerkship directors, and medical educators. Two sessions were offered and administered. Continuing medical education (CME) credit was provided for participation, as this workshop was given in conjunction with the RIME CME conference. Evaluations were deidentified using a unique identifier for each rater. At Yale University, the workshop on handoffs was offered as part of 2 seminars for program directors and chief residents from all specialties. During these seminars, program directors and chief residents used anonymous evaluation rating forms that did not capture rater identifiers. No other incentive was provided for participation. Although neither faculty at the University of Chicago nor Yale University received any formal training on handoff evaluation, they did receive a short introduction to the importance of handoffs and the goals of the workshop. The protocol was deemed exempt by the institutional review board at the University of Chicago.
Workshop Protocol
After a brief introduction, faculty viewed the tapes in random order on a projected screen. Participants were instructed to use the Handoff Mini‐CEX to rate whichever element(s) of handoff quality they believed they could suitably evaluate while watching the tapes. The videos were rated on the Handoff Mini‐CEX form, and participants anonymously completed the forms independently without any contact with other participants. The lead investigators proctored all sessions. At University of Chicago, participants viewed and rated all 6 videos over the course of an hour. At Yale University, due to time constraints in the program director and chief resident seminars, participants reviewed 1 of the videos in seminar 1 (unsatisfactory professionalism) and 2 in the other seminar (unsatisfactory communication, unsatisfactory professionalism) (Table 1).
Unsatisfactory | Satisfactory | Superior | |
---|---|---|---|
| |||
Communication | Script 3 (n=36)a | Script 2 (n=13) | Script 1 (n=13) |
Uses vague language about overnight events, missing critical patient information, disorganized. | Insufficient level of clinical detail, directions are not as thorough, handoff is generally on task and sufficient. | Anticipatory guidance provided, rationale explained; important information is included, highlights sick patients. | |
Look in the record; I'm sure it's in there. And oh yeah, I need you to check enzymes and finish ruling her out. | So the only thing to do is to check labs; you know, check CBC and cardiac enzymes. | So for today, I need you to check post‐transfusion hemoglobin to make sure it's back to the baseline of 10. If it's under 10, then transfuse her 2 units, but hopefully it will be bumped up. Also continue to check cardiac enzymes; the next set is coming at 2 pm, and we need to continue the rule out. If her enzymes are positive or she has other ECG changes, definitely call the cardio fellow, since they'll want to take her to the CCU. | |
Professionalism | Script 5 (n=39)a | Script 4 (n=22)a | Script 1 |
Hurried, inattentive, rushing to leave, inappropriate comments (re: patients, family, staff). | Some tangential comments (re: patients, family, staff). | Appropriate comments (re: patients, family, staff), focused on task. | |
[D]efinitely call the cards fellow, since they'll want to take her to the CCU. And let me tell you, if you don't call her, she'll rip you a new one. | Let's breeze through them quickly so I can get out of here, I've had a rough day. I'll start with the sickest first, and oh my God she's a train wreck! | ||
Setting | Script 6 (n=13) | Script 1 | |
Answering pages during handoff, interruptions (people entering room, phone ringing). | Attentive listening, no interruptions, pager silenced. |
Data Collection and Statistical Analysis
Using combined data from University of Chicago and Yale University, descriptive statistics were reported as raw scores on the Handoff Mini‐CEX. To assess internal consistency of the tool, Cronbach was used. To assess inter‐rater reliability of these attending physician ratings on the tool, we performed a Kendall coefficient of concordance analysis after collapsing the ratings into 3 categories (unsatisfactory, satisfactory, superior). In addition, we also calculated intraclass correlation coefficients for each item using the raw data and generalizability analysis to calculate the number of raters that would be needed to achieve a desired reliability of 0.95. To ascertain if faculty were able to detect varying levels of performance depicted in the video, an ordinal test of trend on the communication, professionalism, and setting scores was performed.
To assess for rater bias, we were able to use the identifiers on the University of Chicago data to perform a 2‐way analysis of variance (ANOVA) to assess if faculty scores were associated with performance level after controlling for faculty. The results of the faculty rater coefficients and P values in the 2‐way ANOVA were also examined for any evidence of rater bias. All calculations were performed in Stata 11.0 (StataCorp, College Station, TX) with statistical significance defined as P<0.05.
RESULTS
Forty‐seven faculty members (14=site 1; 33=site 2) participated in the validation workshops (2 at the University of Chicago, and 2 at Yale University), which were held in August 2011 and September 2011, providing a total of 172 observations of a possible 191 (90%).
The overall handoff quality ratings for the superior, gold standard video (superior communication, professionalism, and communication) ranged from 7 to 9 with a mean of 8.5 (standard deviation [SD] 0.7). The overall ratings for the video depicting satisfactory communication (satisfactory communication, superior professionalism and setting) ranged from 5 to 9 with a mean of 7.3 (SD 1.1). The overall ratings for the unsatisfactory communication (unsatisfactory communication, superior professionalism and setting) video ranged from 1 to 7 with a mean of 2.6 (SD 1.2). The overall ratings for the satisfactory professionalism video (satisfactory professionalism, superior communication and setting) ranged from 4 to 8 with a mean of 5.7 (SD 1.3). The overall ratings for the unsatisfactory professionalism (unsatisfactory professionalism, superior communication and setting) video ranged from 2 to 5 with a mean of 2.4 (SD 1.03). Finally, the overall ratings for the unsatisfactory setting (unsatisfactory setting, superior communication and professionalism) video ranged from 1 to 8 with a mean of 3.1 (SD 1.7).
Figure 1 demonstrates that for the domain of communication, the raters were able to discern the unsatisfactory performance but had difficulty reliably distinguishing between superior and satisfactory performance. Figure 2 illustrates that for the domain of professionalism, raters were able to detect the videos' changing levels of performance at the extremes of behavior, with unsatisfactory and superior displays more readily identified. Figure 3 shows that for the domain of setting, the raters were able to discern the unsatisfactory versus superior level of the changing setting. Of note, we also found a moderate significant correlation between ratings of professionalism and communication (r=0.47, P<0.001).
The Cronbach , or measurement of internal reliability and consistency, for the Handoff Mini‐CEX (3 items plus overall) was 0.77, indicating high internal reliability and consistency. Using data from University of Chicago, where raters were labeled with a unique identifier, the Kendall coefficient of concordance was calculated to be 0.79, demonstrating high inter‐rater reliability of the faculty raters. High inter‐rater reliability was also seen using intraclass coefficients for each domain: communication (0.84), professionalism (0.68), setting (0.83), and overall (0.89). Using generalizability analysis, the average reliability was determined to be above 0.9 for all domains (0.99 for overall).
Last, the 2‐way ANOVA (n=75 observations from 13 raters) revealed no evidence of rater bias when examining the coefficient for attending rater (P=0.55 for professionalism, P=0.45 for communication, P=0.92 for setting). The range of scores for each video, however, was broad (Table 2).
Unsatisfactory | Satisfactory | Superior | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
Mean | Median | Range | Mean | Median | Range | Mean | Median | Range | Pb | |
| ||||||||||
Professionalism | 2.3 | 2 | 14 | 4.4 | 4 | 38 | 7.0 | 7 | 39 | 0.026 |
Communication | 2.8 | 3 | 16 | 7 | 8 | 59 | 6.6 | 7 | 19 | 0.005 |
Setting | 3.1 | 3 | 18 | 7.5 | 8 | 29 | 0.005 |
DISCUSSION
This study demonstrates that valid conclusions on handoff performance can be drawn using the Handoff CEX as the instrument to rate handoff quality. Utilizing standardized videos depicting varying levels of performance communication, professionalism, and setting, the Handoff Mini‐CEX has demonstrated potential to discern between increasing levels of performance, providing evidence for the construct validity of the instrument.
We observed that faculty could reliably detect unsatisfactory professionalism with ease, and that there was a distinct correlation between faculty ratings and the internally set levels of performance displayed in the videos. This trend demonstrated that faculty were able to discern different levels of professionalism using the Handoff Mini‐CEX. It became more difficult, however, for faculty to detect superior professionalism when the domain of communication was permuted. If the sender of the handoff was professional but the information delivered was disorganized, inaccurate, and missing crucial pieces of information, the faculty perceived this ineffective communication as unprofessional. Prior literature on professionalism has found that communication is a necessary component of professional behavior, and consequently, being a competent communicator is necessary to fulfill ones duty as a professional physician.[15, 16]
This is of note because we did find a moderate significant correlation between ratings of professionalism and communication. It is possible that this distinction would be made clearer with formal rater training in the future prior to any evaluations. However, it is also possible that professionalism and communication, due to a synergistic role between the 2 domains, cannot be separated. If this is the case, it would be important to educate clinicians to present patients in a concise, clear, and accurate way with a professional demeanor. Acknowledging professional responsibility as an integral piece of patient care is also critical in effectively communicating patient information.[5]
We also noted that faculty could detect unsatisfactory communication consistently; however, they were unable to differentiate between satisfactory and superior communication reliably or consistently. Because the unsatisfactory professionalism, unsatisfactory setting, and satisfactory professionalism videos all demonstrated superior communication, we believe that the faculty penalized communication when distractions, in the form of interruptions and rude behavior by the resident giving the handoff, interrupted the flow of the handoff. Thus, the wide ranges in scores observed by some raters may be attributed to this interaction between the Handoff Mini‐CEX domains. In the future, definitions of the anchors, including at the middle spectrum of performance, and rater training may improve the ability of raters to distinguish performance between each domain.
The overall value of the Handoff Mini‐CEX is in its ease of use, in part due to its brevity, as well as evidence for its validity in distinguishing between varying levels of performance. Given the emphasis on monitoring handoff quality and performance, the Handoff Mini‐CEX provides a standard foundation from which baseline handoff performance can be easily measured and improved. Moreover, it can also be used to give individual feedback to a specific practicing clinician on their practices and an opportunity to improve. This is particularly important given current recommendations by the Joint Commission, that handoffs are standardized, and by the ACGME, that residents are competent in handoff skills. Moreover, given the creation of the SHM's handoff recommendations and handoffs as a core competency for hospitalists, the tool provides the ability for hospitalist programs to actually assess their handoff practices as baseline measurements for any quality improvement activities that may take place.
Faculty were able to discern the superior and unsatisfactory levels of setting with ease. After watching and rating the videos, participants said that the chaotic scene of the unsatisfactory setting video had significant authenticity, and that they were constantly interrupted during their own handoffs by pages, phone calls, and people entering the handoff space. System‐level fixes, such as protected time and dedicated space for handoffs, and discouraging pages to be sent during the designated handoff time, could mitigate the reality of unsatisfactory settings.[17, 18]
Our study has several limitations. First, although this study was held at 2 sites, it included a small number of faculty, which can impact the generalizability of our findings. Implementation varied at Yale University and the University of Chicago, preventing use of all data for all analyses. Furthermore, institutional culture may also impact faculty raters' perceptions, so future work aims at repeating our protocol at partner institutions, increasing both the number and diversity of participants. We were also unable to compare the new shorter Handoff Mini‐CEX to the larger 9‐item Handoff CEX in this study.
Despite these limitations, we believe that the Handoff Mini‐CEX, has future potential as an instrument with which to make valid and reliable conclusions about handoff quality, and could be used to both evaluate handoff quality and as an educational tool for trainees and faculty on effective handoff communication.
Disclosures
This work was supported by the National Institute on Aging Short‐Term Aging‐Related Research Program (5T35AG029795), Agency for Healthcare Research and Quality (1 R03HS018278‐01), and the University of Chicago Department of Medicine Excellence in Medical Education Award. Dr. Horwitz is supported by the National Institute on Aging (K08 AG038336) and by the American Federation for Aging Research through the Paul B. Beeson Career Development Award Program. Dr. Arora is funded by National Institute on Aging Career Development Award K23AG033763. Prior presentations of these data include the 2011 Association of American Medical Colleges meeting in Denver, Colorado, the 2012 Association of Program Directors of Internal Medicine meeting in Atlanta, Georgia, and the 2012 Society of General Internal Medicine Meeting in Orlando, Florida.
Over the last decade, there has been an unprecedented focus on physician handoffs in US hospitals. One major reason for this are the reductions in residency duty hours that have been mandated by the American Council for Graduate Medical Education (ACGME), first in 2003 and subsequently revised in 2011.[1, 2] As residents work fewer hours, experts believe that potential safety gains from reduced fatigue are countered by an increase in the number of handoffs, which represent a risk due to the potential miscommunication. Prior studies show that critical patient information is often lost or altered during this transfer of clinical information and professional responsibility, which can result in patient harm.[3, 4] As a result of these concerns, the ACGME now requires residency programs to ensure and monitor effective, structured hand‐over processes to facilitate both continuity of care and patient safety. Programs must ensure that residents are competent in communicating with team members in the hand‐over process.[2] Moreover, handoffs have also been a major improvement focus for organizations with broader scope than teaching hospitals, including the World Health Organization, Joint Commission, and the Society for Hospital Medicine (SHM).[5, 6, 7]
Despite this focus on handoffs, monitoring quality of handoffs has proven challenging due to lack of a reliable and validated tool to measure handoff quality. More recently, the Accreditation Council of Graduate Medical Education's introduction of the Next Accreditation System, with its focus on direct observation of clinical skills to achieve milestones, makes it crucial for residency educators to have valid tools to measure competence in handoffs. As a result, it is critical that instruments to measure handoff performance are not only created but also validated.[8]
To help fill this gap, we previously reported on the development of a 9‐item Handoff Clinical Examination Exercise (CEX) assessment tool. The Handoff CEX, designed for use by those participating in the handoff or by a third‐party observer, can be used to rate the quality of patient handoffs in domains such as professionalism and communication skills between the receiver and sender of patient information.[9, 10] Despite prior demonstration of feasibility of use, the initial tool was perceived as lengthy and redundant. In addition, although the tool has been shown to discriminate between performance of novice and expert nurses, the construct validity of this tool has not been established.[11] Establishing construct validity is important to ensuring that the tool can measure the construct in question, namely whether it detects those who are actually competent to perform handoffs safely and effectively. We present here the results of the development of a shorter Handoff Mini‐CEX, along with the formal establishment of its construct validity, namely its ability to distinguish between levels of performance in 3 domains of handoff quality.
METHODS
Adaption of the Handoff CEX and Development of the Abbreviated Tool
The 9‐item Handoff CEX is a paper‐based instrument that was created by the investigators (L.I.H., J.M.F., V.M.A.) to evaluate either the sender or the receiver of handoff communications and has been used in prior studies (see Supporting Information, Appendix 1, in the online version of this article).[9, 10] The evaluation may be conducted by either an observer or by a handoff participant. The instrument includes 6 domains: (1) setting, (2) organization and efficiency, (3) communication skills, (4) content, (5) clinical judgment, and (6) humanistic skills/professionalism. Each domain is graded on a 9‐point rating scale, modeled on the widely used Mini‐CEX (Clinical Evaluation Exercise) for real‐time observation of clinical history and exam skills in internal medicine clerkships and residencies (13=unsatisfactory, 46=marginal/satisfactory, 79=superior).[12] This familiar 9‐point scale is utilized in graduate medical education evaluation of the ACGME core competencies.
To standardize the evaluation, the instrument uses performance‐based anchors for evaluating both the sender and the receiver of the handoff information. The anchors are derived from functional evaluation of the roles of senders and receivers in our preliminary work at both the University of Chicago and Yale University, best practices in other high‐reliability industries, guidelines from the Joint Commission and the SHM, and prior studies of effective communication in clinical systems.[5, 6, 13]
After piloting the Handoff CEX with the University of Chicago's internal medicine residency program (n=280 handoff evaluations), a strong correlation was noted between the measures of content (medical knowledge), patient care, clinical judgment, organization/efficiency, and communication skills. Moreover, the Handoff CEX's Cronbach , or measurement of internal reliability and consistency, was very high (=0.95). Given the potential of redundant items, and to increase ease of use of the instrument, factor analysis was used to reduce the instrument to yield a shorter 3‐item tool, the Handoff Mini‐CEX, that assessed 3 of the initial items: setting, communication skills, and professionalism. Overall, performance on these 3 items were responsible for 82% of the variance of overall sign‐out quality (see Supporting Information, Appendix 2, in the online version of this article).
Establishing Construct Validity of the Handoff Mini‐CEX
To establish construct validity of the Handoff Mini‐CEX, we adapted a protocol used by Holmboe and colleagues to report the construct validity of the Handoff Mini‐CEX, which is based on the development and use of video scenarios depicting varying levels of clinical performance.[14] A clinical scenario script, based on prior observational work, was developed, which represented an internal medicine resident (the sender) signing out 3 different patients to colleagues (intern [postgraduate year 1] and resident). This scenario was developed to explicitly include observable components of professionalism, communication, and setting. Three levels of performancesuperior, satisfactory, and unsatisfactorywere defined and described for the 3 domains. These levels were defined, and separate scripts were written using this information, demonstrating varying levels of performance in each of the domains of interest, using the descriptive anchors of the Handoff Mini‐CEX.
After constructing the superior, or gold standard, script that showcases superior communication, professionalism, and setting, individual domains of performance were changed (eg, to satisfactory or unsatisfactory), while holding the other 2 constant at the superior level of performance. For example, superior communication requires that the sender provides anticipatory guidance and includes clinical rationale, whereas unsatisfactory communication includes vague language about overnight events and a disorganized presentation of patients. Superior professionalism requires no inappropriate comments by the sender about patients, family, and staff as well as a presentation focused on the most urgent patients. Unsatisfactory professionalism is shown by a hurried and inattentive sign‐out, with inappropriate comments about patients, family, and staff. Finally, a superior setting is one in which the receiver is listening attentively and discourages interruptions, whereas an unsatisfactory setting finds the sender or receiver answering pages during the handoff surrounded by background noise. We omitted the satisfactory level for setting due to the difficulties in creating subtleties in the environment.
Permutations of each of these domains resulted in 6 scripts depicting different levels of sender performance (see Supporting Information, Appendix 3, in the online version of this article). Only the performance level of the sender was changed; the receivers of the handoff remained consistent, using best practices for receivers, such as attentive listening, asking questions, reading back, and taking notes during the handoff. The scripts were developed by 2 investigators (V.M.A., S.B.), then reviewed and edited independently by other investigators (J.M.F., P.S.) to achieve consensus. Actors were recruited to perform the video scenarios and were trained by the physician investigators (J.M.F., V.M.A.). The part of the sender was played by a study investigator (P.S.) with prior acting experience, who had accrued over 40 hours of experience observing handoffs and could therefore depict varying levels of handoff performance. The digital video recordings ranged in length from 2.00 minutes to 4.08 minutes. All digital videos were recorded using a Sony XDCAM PMW‐EX3 HD camcorder (Sony Corp., Tokyo, Japan).
Participants
Faculty from the University of Chicago Medical Center and Yale University were included. At the University of Chicago, faculty were recruited via an email sent by the study investigators to the Research in Medical Education (RIME) listhost, which includes program directors, clerkship directors, and medical educators. Two sessions were offered and administered. Continuing medical education (CME) credit was provided for participation, as this workshop was given in conjunction with the RIME CME conference. Evaluations were deidentified using a unique identifier for each rater. At Yale University, the workshop on handoffs was offered as part of 2 seminars for program directors and chief residents from all specialties. During these seminars, program directors and chief residents used anonymous evaluation rating forms that did not capture rater identifiers. No other incentive was provided for participation. Although neither the faculty at the University of Chicago nor those at Yale University received any formal training on handoff evaluation, they did receive a short introduction to the importance of handoffs and the goals of the workshop. The protocol was deemed exempt by the institutional review board at the University of Chicago.
Workshop Protocol
After a brief introduction, faculty viewed the tapes in random order on a projected screen. Participants were instructed to use the Handoff Mini‐CEX to rate whichever element(s) of handoff quality they believed they could suitably evaluate while watching the tapes. The videos were rated on the Handoff Mini‐CEX form, and participants completed the forms anonymously and independently, without any contact with other participants. The lead investigators proctored all sessions. At the University of Chicago, participants viewed and rated all 6 videos over the course of an hour. At Yale University, due to time constraints in the program director and chief resident seminars, participants reviewed 1 video in the first seminar (unsatisfactory professionalism) and 2 in the second (unsatisfactory communication, unsatisfactory professionalism) (Table 1).
Table 1. Scripts Depicting Varying Levels of Sender Performance

| Domain | Unsatisfactory | Satisfactory | Superior |
|---|---|---|---|
| Communication | Script 3 (n=36)a | Script 2 (n=13) | Script 1 (n=13) |
| | Uses vague language about overnight events, missing critical patient information, disorganized. | Insufficient level of clinical detail, directions are not as thorough, handoff is generally on task and sufficient. | Anticipatory guidance provided, rationale explained; important information is included, highlights sick patients. |
| | "Look in the record; I'm sure it's in there. And oh yeah, I need you to check enzymes and finish ruling her out." | "So the only thing to do is to check labs; you know, check CBC and cardiac enzymes." | "So for today, I need you to check post‐transfusion hemoglobin to make sure it's back to the baseline of 10. If it's under 10, then transfuse her 2 units, but hopefully it will be bumped up. Also continue to check cardiac enzymes; the next set is coming at 2 pm, and we need to continue the rule out. If her enzymes are positive or she has other ECG changes, definitely call the cardio fellow, since they'll want to take her to the CCU." |
| Professionalism | Script 5 (n=39)a | Script 4 (n=22)a | Script 1 |
| | Hurried, inattentive, rushing to leave, inappropriate comments (re: patients, family, staff). | Some tangential comments (re: patients, family, staff). | Appropriate comments (re: patients, family, staff), focused on task. |
| | "[D]efinitely call the cards fellow, since they'll want to take her to the CCU. And let me tell you, if you don't call her, she'll rip you a new one." | "Let's breeze through them quickly so I can get out of here, I've had a rough day. I'll start with the sickest first, and oh my God she's a train wreck!" | |
| Setting | Script 6 (n=13) | | Script 1 |
| | Answering pages during handoff, interruptions (people entering room, phone ringing). | | Attentive listening, no interruptions, pager silenced. |
Data Collection and Statistical Analysis
Using combined data from the University of Chicago and Yale University, descriptive statistics were reported as raw scores on the Handoff Mini‐CEX. To assess internal consistency of the tool, Cronbach's α was used. To assess inter‐rater reliability of the attending physician ratings, we performed a Kendall coefficient of concordance analysis after collapsing the ratings into 3 categories (unsatisfactory, satisfactory, superior). In addition, we calculated intraclass correlation coefficients for each item using the raw data and used generalizability analysis to calculate the number of raters that would be needed to achieve a desired reliability of 0.95. To ascertain whether faculty were able to detect the varying levels of performance depicted in the videos, an ordinal test of trend on the communication, professionalism, and setting scores was performed.
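A minimal sketch of two of these reliability calculations appears below: Kendall's coefficient of concordance (W) computed from its rank‐sum formula (the tie correction is omitted for brevity), and a Spearman‐Brown projection of how many raters a target reliability would require given a single‐rater intraclass correlation. All inputs are hypothetical; the matrix shape and the example ICC of 0.84 are illustrative only.

```python
import numpy as np
from scipy.stats import rankdata

def kendalls_w(scores: np.ndarray) -> float:
    """Kendall's coefficient of concordance for a (raters x subjects) matrix.

    Scores are converted to within-rater ranks; W = 12*S / (m^2 * (n^3 - n)),
    where S is the sum of squared deviations of the column rank sums.
    The tie correction is omitted for brevity.
    """
    m, n = scores.shape
    ranks = np.apply_along_axis(rankdata, 1, scores)  # rank within each rater
    rank_sums = ranks.sum(axis=0)
    s = ((rank_sums - rank_sums.mean()) ** 2).sum()
    return 12 * s / (m ** 2 * (n ** 3 - n))

def raters_needed(single_rater_icc: float, target: float) -> float:
    """Spearman-Brown projection: raters needed for the mean rating to reach `target`."""
    return target * (1 - single_rater_icc) / (single_rater_icc * (1 - target))

# Hypothetical data: 13 raters scoring 6 videos, collapsed to 3 categories.
rng = np.random.default_rng(1)
collapsed = rng.integers(1, 4, size=(13, 6)).astype(float)
print(f"Kendall's W = {kendalls_w(collapsed):.2f}")
print(f"Raters needed for reliability 0.95 at ICC 0.84: {raters_needed(0.84, 0.95):.1f}")
```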
To assess for rater bias, we used the identifiers on the University of Chicago data to perform a 2‐way analysis of variance (ANOVA), testing whether faculty scores were associated with performance level after controlling for faculty rater. The faculty rater coefficients and P values in the 2‐way ANOVA were also examined for any evidence of rater bias. All calculations were performed in Stata 11.0 (StataCorp, College Station, TX), with statistical significance defined as P<0.05.
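Although the analysis was run in Stata, the sketch below shows an equivalent two‐way ANOVA in Python. The toy data frame, column names, and rater labels are hypothetical; the point is the model structure (score explained by scripted performance level plus a rater term).

```python
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Hypothetical long-format data: one row per (rater, video) observation.
df = pd.DataFrame({
    "score": [8, 7, 2, 9, 6, 3, 8, 5, 2],
    "level": ["superior", "satisfactory", "unsatisfactory"] * 3,
    "rater": ["r1"] * 3 + ["r2"] * 3 + ["r3"] * 3,
})

# Two-way ANOVA: a significant `level` term means raters track the scripted
# performance; a significant `rater` term would suggest rater bias.
model = smf.ols("score ~ C(level) + C(rater)", data=df).fit()
print(anova_lm(model, typ=2))
```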
RESULTS
Forty‐seven faculty members (14 at site 1; 33 at site 2) participated in the validation workshops (2 at the University of Chicago and 2 at Yale University), which were held in August 2011 and September 2011, providing a total of 172 observations of a possible 191 (90%).
The overall handoff quality ratings for the superior, gold standard video (superior communication, professionalism, and setting) ranged from 7 to 9 with a mean of 8.5 (standard deviation [SD] 0.7). The overall ratings for the satisfactory communication video (satisfactory communication, superior professionalism and setting) ranged from 5 to 9 with a mean of 7.3 (SD 1.1). The overall ratings for the unsatisfactory communication video (unsatisfactory communication, superior professionalism and setting) ranged from 1 to 7 with a mean of 2.6 (SD 1.2). The overall ratings for the satisfactory professionalism video (satisfactory professionalism, superior communication and setting) ranged from 4 to 8 with a mean of 5.7 (SD 1.3). The overall ratings for the unsatisfactory professionalism video (unsatisfactory professionalism, superior communication and setting) ranged from 2 to 5 with a mean of 2.4 (SD 1.03). Finally, the overall ratings for the unsatisfactory setting video (unsatisfactory setting, superior communication and professionalism) ranged from 1 to 8 with a mean of 3.1 (SD 1.7).
Figure 1 demonstrates that for the domain of communication, the raters were able to discern the unsatisfactory performance but had difficulty reliably distinguishing between superior and satisfactory performance. Figure 2 illustrates that for the domain of professionalism, raters were able to detect the videos' changing levels of performance at the extremes of behavior, with unsatisfactory and superior displays more readily identified. Figure 3 shows that for the domain of setting, the raters were able to discern the unsatisfactory versus superior level of the changing setting. Of note, we also found a moderate significant correlation between ratings of professionalism and communication (r=0.47, P<0.001).
Cronbach's α, a measure of internal reliability and consistency, was 0.77 for the Handoff Mini‐CEX (3 items plus overall), indicating high internal reliability and consistency. Using data from the University of Chicago, where raters were labeled with a unique identifier, the Kendall coefficient of concordance was calculated to be 0.79, demonstrating high inter‐rater reliability of the faculty raters. High inter‐rater reliability was also seen using intraclass correlation coefficients for each domain: communication (0.84), professionalism (0.68), setting (0.83), and overall (0.89). Using generalizability analysis, the average reliability was determined to be above 0.9 for all domains (0.99 for overall).
Last, the 2‐way ANOVA (n=75 observations from 13 raters) revealed no evidence of rater bias when examining the coefficient for attending rater (P=0.55 for professionalism, P=0.45 for communication, P=0.92 for setting). The range of scores for each video, however, was broad (Table 2).
Table 2. Handoff Mini‐CEX Scores by Scripted Level of Performance

| | Unsatisfactory | | | Satisfactory | | | Superior | | | |
| Domain | Mean | Median | Range | Mean | Median | Range | Mean | Median | Range | P |
|---|---|---|---|---|---|---|---|---|---|---|
| Professionalism | 2.3 | 2 | 1–4 | 4.4 | 4 | 3–8 | 7.0 | 7 | 3–9 | 0.026 |
| Communication | 2.8 | 3 | 1–6 | 7 | 8 | 5–9 | 6.6 | 7 | 1–9 | 0.005 |
| Setting | 3.1 | 3 | 1–8 | | | | 7.5 | 8 | 2–9 | 0.005 |

P values are from the ordinal test of trend.
DISCUSSION
This study demonstrates that valid conclusions about handoff performance can be drawn using the Handoff Mini‐CEX to rate handoff quality. Using standardized videos depicting varying levels of performance in communication, professionalism, and setting, the Handoff Mini‐CEX demonstrated the ability to discern between increasing levels of performance, providing evidence for the construct validity of the instrument.
We observed that faculty could readily detect unsatisfactory professionalism and that there was a distinct correlation between faculty ratings and the internally set levels of performance displayed in the videos. This trend demonstrates that faculty were able to discern different levels of professionalism using the Handoff Mini‐CEX. It became more difficult, however, for faculty to detect superior professionalism when the domain of communication was permuted. If the sender of the handoff was professional but the information delivered was disorganized, inaccurate, and missing crucial pieces of information, faculty perceived this ineffective communication as unprofessional. Prior literature on professionalism has found that communication is a necessary component of professional behavior; consequently, being a competent communicator is necessary to fulfill one's duty as a professional physician.[15, 16]
This is of note because we did find a moderate significant correlation between ratings of professionalism and communication. It is possible that this distinction would be made clearer with formal rater training in the future prior to any evaluations. However, it is also possible that professionalism and communication, due to a synergistic role between the 2 domains, cannot be separated. If this is the case, it would be important to educate clinicians to present patients in a concise, clear, and accurate way with a professional demeanor. Acknowledging professional responsibility as an integral piece of patient care is also critical in effectively communicating patient information.[5]
We also noted that faculty could detect unsatisfactory communication consistently; however, they were unable to differentiate between satisfactory and superior communication reliably or consistently. Because the unsatisfactory professionalism, unsatisfactory setting, and satisfactory professionalism videos all demonstrated superior communication, we believe that the faculty penalized communication when distractions, in the form of interruptions and rude behavior by the resident giving the handoff, disrupted the flow of the handoff. Thus, the wide ranges in scores given by some raters may be attributable to this interaction between the Handoff Mini‐CEX domains. In the future, better definitions of the anchors, including at the middle of the performance spectrum, and rater training may improve the ability of raters to distinguish performance in each domain.
The overall value of the Handoff Mini‐CEX lies in its ease of use, due in part to its brevity, as well as the evidence for its validity in distinguishing between varying levels of performance. Given the emphasis on monitoring handoff quality and performance, the Handoff Mini‐CEX provides a standard foundation from which baseline handoff performance can be easily measured and improved. Moreover, it can be used to give individual clinicians feedback on their handoff practices and an opportunity to improve. This is particularly important given current recommendations by the Joint Commission that handoffs be standardized and by the ACGME that residents be competent in handoff skills. Moreover, given the SHM's handoff recommendations and the designation of handoffs as a core competency for hospitalists, the tool allows hospitalist programs to assess their handoff practices as baseline measurements for any quality improvement activities that may take place.
Faculty were able to discern the superior and unsatisfactory levels of setting with ease. After watching and rating the videos, participants remarked that the chaotic scene of the unsatisfactory setting video felt highly authentic, and that they themselves were constantly interrupted during handoffs by pages, phone calls, and people entering the handoff space. System‐level fixes, such as protected time and dedicated space for handoffs, and discouraging pages during the designated handoff time, could mitigate the reality of unsatisfactory settings.[17, 18]
Our study has several limitations. First, although this study was conducted at 2 sites, it included a small number of faculty, which may limit the generalizability of our findings. Implementation varied between Yale University and the University of Chicago, preventing use of all data for all analyses. Furthermore, institutional culture may also affect faculty raters' perceptions, so future work aims to repeat our protocol at partner institutions, increasing both the number and diversity of participants. We were also unable to compare the shorter Handoff Mini‐CEX to the larger 9‐item Handoff CEX in this study.
Despite these limitations, we believe that the Handoff Mini‐CEX has future potential as an instrument with which to draw valid and reliable conclusions about handoff quality, and that it could be used both to evaluate handoff quality and to educate trainees and faculty on effective handoff communication.
Disclosures
This work was supported by the National Institute on Aging Short‐Term Aging‐Related Research Program (5T35AG029795), Agency for Healthcare Research and Quality (1 R03HS018278‐01), and the University of Chicago Department of Medicine Excellence in Medical Education Award. Dr. Horwitz is supported by the National Institute on Aging (K08 AG038336) and by the American Federation for Aging Research through the Paul B. Beeson Career Development Award Program. Dr. Arora is funded by National Institute on Aging Career Development Award K23AG033763. Prior presentations of these data include the 2011 Association of American Medical Colleges meeting in Denver, Colorado, the 2012 Association of Program Directors of Internal Medicine meeting in Atlanta, Georgia, and the 2012 Society of General Internal Medicine Meeting in Orlando, Florida.
- The new recommendations on duty hours from the ACGME task force. N Engl J Med. 2010;363(2):e3.
- ACGME common program requirements. Effective July 1, 2011. Available at: http://www.acgme.org/acgmeweb/Portals/0/PDFs/Common_Program_Requirements_07012011[2].pdf. Accessed February 8, 2014.
- Consequences of inadequate sign‐out for patient care. Arch Intern Med. 2008;168(16):1755–1760.
- Communication failures in patient sign‐out and suggestions for improvement: a critical incident analysis. Qual Saf Health Care. 2005;14(6):401–407.
- Hospitalist handoffs: a systematic review and task force recommendations. J Hosp Med. 2009;4(7):433–440.
- A model for building a standardized hand‐off protocol. Jt Comm J Qual Patient Saf. 2006;32(11):646–655.
- World Health Organization Collaborating Centre for Patient Safety. Solutions on communication during patient hand‐overs. 2007; Volume 1, Solution 1. Available at: http://www.who.int/patientsafety/solutions/patientsafety/PS‐Solution3.pdf. Accessed February 8, 2014.
- Patient handoffs: standardized and reliable measurement tools remain elusive. Jt Comm J Qual Patient Saf. 2010;36(2):52–61.
- Development of a handoff evaluation tool for shift‐to‐shift physician handoffs: the handoff CEX. J Hosp Med. 2013;8(4):191–200.
- Hand‐off education and evaluation: piloting the observed simulated hand‐off experience (OSHE). J Gen Intern Med. 2010;25(2):129–134.
- Validation of a handoff tool: the Handoff CEX. J Clin Nurs. 2013;22(9‐10):1477–1486.
- The mini‐CEX: a method for assessing clinical skills. Ann Intern Med. 2003;138(6):476–481.
- Handoff strategies in settings with high consequences for failure: lessons for health care operations. Int J Qual Health Care. 2004;16(2):125–132.
- Construct validity of the miniclinical evaluation exercise (miniCEX). Acad Med. 2003;78(8):826–830.
- Third‐year medical students' participation in and perceptions of unprofessional behaviors. Acad Med. 2007;82(10 suppl):S35–S39.
- Professionalism—the next wave. N Engl J Med. 2006;355(20):2151–2152.
- Interns overestimate the effectiveness of their hand‐off communication. Pediatrics. 2010;125(3):491–496.
- Characterising physician listening behaviour during hospitalist handoffs using the HEAR checklist. BMJ Qual Saf. 2013;22(3):203–209.
Handoff CEX
Transfers among trainee physicians within the hospital typically occur at least twice a day and have been increasing among trainees as work hours have declined.[1] The 2011 Accreditation Council for Graduate Medical Education (ACGME) guidelines,[2] which restrict intern working hours to 16 hours from a previous maximum of 30, have likely increased the frequency of physician trainee handoffs even further. Similarly, transfers among hospitalist attendings occur at least twice a day, given typical shifts of 8 to 12 hours.
Given the frequency of transfers, and the potential for harm generated by failed transitions,[3, 4, 5, 6] the end‐of‐shift written and verbal handoffs have assumed increasingly greater importance in hospital care among both trainees and hospitalist attendings.
The ACGME now requires that programs assess the competency of trainees in handoff communication.[2] Yet, there are few tools for assessing the quality of sign‐out communication. Those that exist primarily focus on the written sign‐out, and are rarely validated.[7, 8, 9, 10, 11, 12] Furthermore, it is uncertain whether such assessments must be done by supervisors or whether peers can participate in the evaluation. In this prospective multi‐institutional study we assess the performance characteristics of a verbal sign‐out evaluation tool for internal medicine housestaff and hospitalist attendings, and examine whether it can be used by peers as well as by external evaluators. This tool has previously been found to effectively discriminate between experienced and inexperienced nurses conducting nursing handoffs.[13]
METHODS
Tool Design and Measures
The Handoff CEX (clinical evaluation exercise) is a structured assessment based on the format of the mini‐CEX, an instrument used to assess the quality of history and physical examination by trainees for which validation studies have previously been conducted.[14, 15, 16, 17] We developed the tool based on themes we identified from our own expertise,[1, 5, 6, 8, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29] the ACGME core competencies for trainees,[2] and the literature to maximize content validity. First, standardization has numerous demonstrable benefits for safety in general and handoffs in particular.[30, 31, 32] Consequently we created a domain for organization in which standardization was a characteristic of high performance.
Second, there is evidence that people engaged in conversation routinely overestimate peer comprehension,[27] and that explicit strategies to combat this overestimation, such as confirming understanding, explicitly assigning tasks rather than using open‐ended language, and using concrete language, are effective.[33] Accordingly we created a domain for communication skills, which is also an ACGME competency.
Third, although there were no formal guidelines for sign‐out content when we developed this tool, our own research had demonstrated that the content elements most often missing and felt to be important by stakeholders were related to clinical condition and explicating thinking processes,[5, 6] so we created a domain for content that highlighted these areas and met the ACGME competency of medical knowledge. In accordance with standards for evaluation of learners, we incorporated a domain for judgment to identify where trainees were in the RIME spectrum of reporter, interpreter, manager, and educator.
Next, we added a section for professionalism in accordance with the ACGME core competencies of professionalism and patient care.[34] To avoid the disinclination of peers to label each other unprofessional, we labeled the professionalism domain as patient‐focused on the tool.
Finally, we included a domain for setting because of an extensive literature demonstrating increased handoff failures in noisy or interruptive settings.[35, 36, 37] We then revised the tool slightly based on our experiences among nurses and students.[13, 38] The final tool included the 6 domains described above and an assessment of overall competency. Each domain was scored on a 9‐point scale and included descriptive anchors at the high and low ends of performance. We further divided the scale into 3 main sections: unsatisfactory (score 1–3), satisfactory (4–6), and superior (7–9). We designed 2 tools, 1 to assess the person providing the handoff and 1 to assess the handoff recipient, each with its own descriptive anchors. The recipient tool did not include a content domain (see Supporting Information, Appendix 1, in the online version of this article).
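The banding of the 9‐point scale is simple enough to express directly in code. The sketch below is a hypothetical helper illustrating the cut points described above; it is not part of the study instrument.

```python
def scale_section(score: int) -> str:
    """Map a 9-point Handoff CEX domain score to its scale section."""
    if not 1 <= score <= 9:
        raise ValueError("domain scores run from 1 to 9")
    if score <= 3:
        return "unsatisfactory"
    if score <= 6:
        return "satisfactory"
    return "superior"

# Quick checks of the three bands.
assert scale_section(2) == "unsatisfactory"
assert scale_section(5) == "satisfactory"
assert scale_section(8) == "superior"
```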
Setting and Subjects
We tested the tool in 2 different urban academic medical centers: the University of Chicago Medicine (UCM) and Yale‐New Haven Hospital (Yale). At UCM, we tested the tool among hospitalists, nurse practitioners, and physician assistants during the Monday and Tuesday morning and Friday evening sign‐out sessions. At Yale, we tested the tool among housestaff during the evening sign‐out session from the primary team to the on‐call covering team.
The UCM is a 550‐bed urban academic medical center in which the nonteaching hospitalist service cares for patients with liver disease or with end‐stage renal or lung disease awaiting transplant, as well as a small fraction of general medicine and oncology patients when the housestaff service exceeds its cap. No formal training on sign‐out is provided to attending or midlevel providers. The nonteaching hospitalist service operates separately from the housestaff service and consists of 38 hospitalist clinicians (hospitalist attendings, nurse practitioners, and physician assistants). There are 2 handoffs each day. In the morning, the departing night hospitalist hands off to the incoming daytime hospitalist or midlevel provider; these handoffs occur at 7:30 am in a dedicated room. In the evening, the daytime hospitalist or midlevel provider hands off to an incoming night hospitalist; this handoff occurs at 5:30 pm or 7:30 pm in a dedicated location. The written sign‐out is maintained in a Microsoft Word (Microsoft Corp., Redmond, WA) document on a password‐protected server and updated daily.
Yale is a 946‐bed urban academic medical center with a large internal medicine training program. Formal sign‐out education that covers the main domains of the tool is provided to new interns during the first 3 months of the year,[19] and a templated electronic medical record‐based electronic written handoff report is produced by the housestaff for all patients.[22] Approximately half of inpatient medicine patients are cared for by housestaff teams, which are entirely separate from the hospitalist service. Housestaff sign‐out occurs between 4 pm and 7 pm every night. At a minimum, the departing intern signs out to the incoming intern; this handoff is typically supervised by at least 1 second‐ or third‐year resident. All patients are signed out verbally; in addition, the written handoff report is provided to the incoming team. Most handoffs occur in a quiet charting room.
Data Collection
Data collection at UCM occurred between March and December 2010 on 3 days of each week: Mondays, Tuesdays, and Fridays. On Mondays and Tuesdays the morning handoffs were observed; on Fridays the evening handoffs were observed. Data collection at Yale occurred between March and May 2011. Only evening handoffs from the primary team to the overnight coverage were observed. At both sites, participants provided verbal informed consent prior to data collection. At the time of an eligible sign‐out session, a research assistant (D.R. at Yale, P.S. at UCM) provided the evaluation tools to all members of the incoming and outgoing teams, and observed the sign‐out session himself. Each person providing a handoff was asked to evaluate the recipient of the handoff; each person receiving a handoff was asked to evaluate the provider of the handoff. In addition, the trained third‐party observer (D.R., P.S.) evaluated both the provider and recipient of the handoff. The external evaluators were trained in principles of effective communication and the use of the tool, with specific review of anchors at each end of each domain. One evaluator had a DO degree and was completing an MPH degree. The second evaluator was an experienced clinical research assistant whose training consisted of supervised observation of 10 handoffs by a physician investigator. At Yale, if a resident was present, she or he was also asked to evaluate both the provider and recipient of the handoff. Consequently, every sign‐out session included at least 2 evaluations of each participant, 1 by a peer evaluator and 1 by a consistent external evaluator who did not know the patients. At Yale, many sign‐outs also included a third evaluation by a resident supervisor.
The study was approved by the institutional review boards at both UCM and Yale.
Statistical Analysis
We obtained the mean, median, and interquartile range of scores for each subdomain of the tool as well as for the overall assessment of handoff quality. We assessed convergent construct validity by examining performance of the tool in different contexts. To do so, we determined whether scores differed by type of participant (provider or recipient), by site, by training level of evaluatee, or by type of evaluator (external, resident supervisor, or peer) using Wilcoxon rank sum tests and Kruskal‐Wallis tests. For the assessment of differences in ratings by training level, we used evaluations of sign‐out providers only, because the 2 sites differed in scores for recipients. We also assessed construct validity by using Spearman rank correlation coefficients to describe the internal consistency of the tool in terms of the correlation between domains, and we conducted an exploratory factor analysis to gain insight into whether the subdomains of the tool were measuring the same construct. In conducting this analysis, we restricted the dataset to evaluations of sign‐out providers only, and used a principal components estimation method, a promax rotation, and squared multiple correlation communality priors. Finally, we conducted some preliminary studies of reliability by testing whether different types of evaluators provided similar assessments. We calculated a weighted kappa using Fleiss‐Cohen weights for external versus peer scores and again for supervising resident versus peer scores (Yale only). We were not able to assess test‐retest reliability given the nature of the sign‐out process. Statistical significance was defined by a P value of ≤0.05, and analyses were performed using SAS 9.2 (SAS Institute, Cary, NC).
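As an illustration of the agreement statistic, the sketch below computes a weighted kappa in Python; Fleiss‐Cohen weights correspond to the quadratic weighting offered by scikit‐learn. The paired ratings are hypothetical, and pinning the label set to the full 1–9 scale keeps the weight matrix anchored to the instrument's range even when some scores never occur in a sample.

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical paired ratings of the same handoffs on the 9-point scale.
external_scores = [7, 6, 8, 5, 7, 9, 6, 7]
peer_scores     = [8, 7, 8, 7, 8, 9, 7, 8]

# weights="quadratic" applies Fleiss-Cohen weights, so near-misses on the
# 9-point scale are penalized far less than large disagreements.
kappa = cohen_kappa_score(
    external_scores, peer_scores,
    weights="quadratic", labels=list(range(1, 10)),
)
print(f"weighted kappa = {kappa:.2f}")
```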
RESULTS
A total of 149 handoff sessions were observed: 89 at UCM and 60 at Yale. Each site conducted a similar total number of evaluations: 336 at UCM, 337 at Yale. These sessions involved 97 unique individuals, 34 at UCM and 63 at Yale. Overall scores were high at both sites, but a wide range of scores was applied (Table 1).
Table 1. Handoff CEX Scores for Providers and Recipients

| | Provider, N=343 | | | Recipient, N=330 | | | |
| Domain | Median (IQR) | Mean (SD) | Range | Median (IQR) | Mean (SD) | Range | P Value |
|---|---|---|---|---|---|---|---|
| Setting | 7 (6–9) | 7.0 (1.7) | 2–9 | 7 (6–9) | 7.3 (1.6) | 2–9 | 0.05 |
| Organization | 7 (6–8) | 7.2 (1.5) | 2–9 | 8 (6–9) | 7.4 (1.4) | 2–9 | 0.07 |
| Communication | 7 (6–9) | 7.2 (1.6) | 1–9 | 8 (7–9) | 7.4 (1.5) | 2–9 | 0.22 |
| Content | 7 (6–8) | 7.0 (1.6) | 2–9 | N/A | N/A | N/A | |
| Judgment | 8 (6–8) | 7.3 (1.4) | 3–9 | 8 (7–9) | 7.5 (1.4) | 3–9 | 0.06 |
| Professionalism | 8 (7–9) | 7.4 (1.5) | 2–9 | 8 (7–9) | 7.6 (1.4) | 3–9 | 0.23 |
| Overall | 7 (6–8) | 7.1 (1.5) | 2–9 | 7 (6–8) | 7.4 (1.4) | 2–9 | 0.02 |
Handoff Providers
A total of 343 evaluations of handoff providers were completed regarding 67 unique individuals. For each domain, scores spanned the full range from unsatisfactory to superior. The highest rated domain on the handoff provider evaluation tool was professionalism (median: 8; interquartile range [IQR]: 7–9). The lowest rated domain was content (median: 7; IQR: 6–8) (Table 1).
Handoff Recipients
A total of 330 evaluations of handoff recipients were completed regarding 58 unique individuals. For each domain, scores spanned the full range from unsatisfactory to superior. The highest rated domain on the handoff recipient evaluation tool was professionalism, with a median of 8 (IQR: 7–9). The lowest rated domain was setting, with a median score of 7 (IQR: 6–9) (Table 1).
Validity Testing
Comparing provider scores to recipient scores, recipients received significantly higher scores for overall assessment (Table 1). Scores at UCM and Yale were similar in all domains for providers but were slightly lower at UCM in several domains for recipients (see Supporting Information, Appendix 2, in the online version of this article). Scores did not differ significantly by training level (Table 2). Third‐party external evaluators consistently gave lower marks for the same handoff than peer evaluators did (Table 3).
Table 2. Provider Scores by Training Level, Median (Range)

| Domain | NP/PA, N=33 | Subintern or Intern, N=170 | Resident, N=44 | Hospitalist, N=95 | P Value |
|---|---|---|---|---|---|
| Setting | 7 (2–9) | 7 (3–9) | 7 (4–9) | 7 (2–9) | 0.89 |
| Organization | 8 (4–9) | 7 (2–9) | 7 (4–9) | 8 (3–9) | 0.11 |
| Communication | 8 (4–9) | 7 (2–9) | 7 (4–9) | 8 (1–9) | 0.72 |
| Content | 7 (3–9) | 7 (2–9) | 7 (4–9) | 7 (2–9) | 0.92 |
| Judgment | 8 (5–9) | 7 (3–9) | 8 (4–9) | 8 (4–9) | 0.09 |
| Professionalism | 8 (4–9) | 7 (2–9) | 8 (3–9) | 8 (4–9) | 0.82 |
| Overall | 7 (3–9) | 7 (2–9) | 8 (4–9) | 7 (2–9) | 0.28 |
Table 3. Scores by Evaluator Type, Median (Range)

| | Provider | | | | Recipient | | | |
| Domain | Peer, N=152 | Resident Supervisor, N=43 | External, N=147 | P Value | Peer, N=145 | Resident Supervisor, N=43 | External, N=142 | P Value |
|---|---|---|---|---|---|---|---|---|
| Setting | 8 (3–9) | 7 (3–9) | 7 (2–9) | 0.02 | 8 (2–9) | 7 (3–9) | 7 (2–9) | <0.001 |
| Organization | 8 (3–9) | 8 (3–9) | 7 (2–9) | 0.18 | 8 (3–9) | 8 (6–9) | 7 (2–9) | <0.001 |
| Communication | 8 (3–9) | 8 (3–9) | 7 (1–9) | <0.001 | 8 (3–9) | 8 (4–9) | 7 (2–9) | <0.001 |
| Content | 8 (3–9) | 8 (2–9) | 7 (2–9) | <0.001 | N/A | N/A | N/A | N/A |
| Judgment | 8 (4–9) | 8 (3–9) | 7 (3–9) | <0.001 | 8 (3–9) | 8 (4–9) | 7 (3–9) | <0.001 |
| Professionalism | 8 (3–9) | 8 (5–9) | 7 (2–9) | 0.02 | 8 (3–9) | 8 (6–9) | 7 (3–9) | <0.001 |
| Overall | 8 (3–9) | 8 (3–9) | 7 (2–9) | 0.001 | 8 (2–9) | 8 (4–9) | 7 (2–9) | <0.001 |
Spearman rank correlation coefficients among the CEX subdomains for provider scores ranged from 0.71 to 0.86, except for setting (Table 4). Setting was less well correlated with the other subdomains, with correlation coefficients ranging from 0.39 to 0.41. Correlations between individual domains and the overall rating ranged from 0.80 to 0.86, except setting, which had a correlation of 0.55. Every correlation was significant at P<0.001. Correlation coefficients for recipient scores were very similar to those for provider scores (see Supporting Information, Appendix 3, in the online version of this article).
Table 4. Spearman Correlation Coefficients Among Provider Subdomain Scores

| | Setting | Organization | Communication | Content | Judgment | Professionalism |
|---|---|---|---|---|---|---|
| Setting | 1.00 | 0.40 | 0.40 | 0.39 | 0.39 | 0.41 |
| Organization | 0.40 | 1.00 | 0.80 | 0.71 | 0.77 | 0.73 |
| Communication | 0.40 | 0.80 | 1.00 | 0.79 | 0.82 | 0.77 |
| Content | 0.39 | 0.71 | 0.79 | 1.00 | 0.80 | 0.74 |
| Judgment | 0.39 | 0.77 | 0.82 | 0.80 | 1.00 | 0.78 |
| Professionalism | 0.41 | 0.73 | 0.77 | 0.74 | 0.78 | 1.00 |
| Overall | 0.55 | 0.80 | 0.84 | 0.83 | 0.86 | 0.82 |
We analyzed 343 provider evaluations in the factor analysis; there were 6 missing values. The scree plot of eigenvalues did not support more than 1 factor; however, the rotated factor pattern for standardized regression coefficients for the first factor and the final communality estimates showed the setting component yielding smaller values than did other scale components (see Supporting Information, Appendix 4, in the online version of this article).
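A rough reconstruction of this kind of exploratory factor analysis is sketched below using the third‐party factor_analyzer package, as a hypothetical stand‐in for the SAS procedure actually used; the simulated data are placeholders. The scree decision comes from inspecting the eigenvalues, and the loadings table parallels Appendix D.

```python
import numpy as np
import pandas as pd
from factor_analyzer import FactorAnalyzer  # pip install factor-analyzer

# Hypothetical provider evaluations: 343 rows x 6 domains, scored 1-9.
rng = np.random.default_rng(2)
domains = ["setting", "organization", "communication",
           "content", "judgment", "professionalism"]
data = pd.DataFrame(rng.integers(1, 10, size=(343, 6)), columns=domains)

# Principal-components extraction with a promax rotation, mirroring the
# analysis described in the Methods.
fa = FactorAnalyzer(n_factors=2, method="principal", rotation="promax")
fa.fit(data)

eigenvalues, _ = fa.get_eigenvalues()  # inspect these for the scree decision
print("eigenvalues:", np.round(eigenvalues, 2))
print(pd.DataFrame(fa.loadings_, index=domains, columns=["Factor1", "Factor2"]))
```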
Reliability Testing
Weighted kappa scores for provider evaluations ranged from 0.28 (95% confidence interval [CI]: 0.01, 0.56) for setting to 0.59 (95% CI: 0.38, 0.80) for organization, and were generally higher for resident versus peer comparisons than for external versus peer comparisons. Weighted kappa scores for recipient evaluations were slightly lower for external versus peer evaluations, but agreement was no better than chance for resident versus peer evaluations (Table 5).
Table 5. Weighted Kappa Scores by Evaluator Pairing (95% CI)

| | Provider | | Recipient | |
| Domain | External vs Peer, N=144 | Resident vs Peer, N=42 | External vs Peer, N=134 | Resident vs Peer, N=43 |
|---|---|---|---|---|
| Setting | 0.39 (0.24, 0.54) | 0.28 (0.01, 0.56) | 0.34 (0.20, 0.48) | 0.48 (0.27, 0.69) |
| Organization | 0.43 (0.29, 0.58) | 0.59 (0.39, 0.80) | 0.39 (0.22, 0.55) | 0.03 (−0.23, 0.29) |
| Communication | 0.34 (0.19, 0.49) | 0.52 (0.37, 0.68) | 0.36 (0.22, 0.51) | 0.02 (−0.18, 0.23) |
| Content | 0.38 (0.25, 0.51) | 0.53 (0.27, 0.80) | N/A | N/A |
| Judgment | 0.36 (0.22, 0.49) | 0.54 (0.25, 0.83) | 0.28 (0.15, 0.42) | −0.12 (−0.34, 0.09) |
| Professionalism | 0.47 (0.32, 0.63) | 0.47 (0.23, 0.72) | 0.35 (0.18, 0.51) | −0.01 (−0.29, 0.26) |
| Overall | 0.50 (0.36, 0.64) | 0.45 (0.24, 0.67) | 0.31 (0.16, 0.48) | 0.07 (−0.20, 0.34) |
DISCUSSION
In this study we found that an evaluation tool for direct observation of housestaff and hospitalists generated a range of scores and was well validated in the sense of performing similarly across 2 different institutions and among both trainees and attendings, while having high internal consistency. However, external evaluators gave consistently lower marks than peer evaluators at both sites, resulting in low reliability when comparing these 2 groups of raters.
It has traditionally been difficult to conduct direct evaluations of handoffs, because they may occur at haphazard times, in variable locations, and without very much advance notice. For this reason, several attempts have been made to incorporate peers in evaluations of handoff practices.[5, 39, 40] Using peers to conduct evaluations also has the advantage that peers are more likely to be familiar with the patients being handed off and might recognize handoff flaws that external evaluators would miss. Nonetheless, peer evaluations have some important liabilities. Peers may be unwilling or unable to provide honest critiques of their colleagues given that they must work closely together for years. Trainee peers may also lack sufficient clinical expertise or experience to accurately assess competence. In our study, we found that peers gave consistently higher marks to their colleagues than did external evaluators, suggesting they may have found it difficult to criticize their colleagues. We conclude that peer evaluation alone is likely an insufficient means of evaluating handoff quality.
Supervising residents gave marks very similar to those of intern peers, suggesting that they too are unwilling to criticize or are insufficiently experienced to evaluate, or alternatively, that the peer evaluations were reasonable. We suspect the latter is unlikely, given that external evaluator scores were consistently lower than peer scores. One would expect the external evaluators to be biased toward higher scores, given that they were not familiar with the patients and were not able to comment on inaccuracies or omissions in the sign‐out.
The tool appeared to perform less well in most cases for recipients than for providers, with a narrower range of scores and low‐weighted kappa scores. Although recipients play a key role in ensuring a high‐quality sign‐out by paying close attention, ensuring it is a bidirectional conversation, asking appropriate questions, and reading back key information, it may be that evaluators were unable to place these activities within the same domains that were used for the provider evaluation. An altogether different recipient evaluation approach may be necessary.[41]
In general, scores were clustered at the top of the score range, as is typical for evaluations. One strategy to spread out scores further would be to refine the tool by adding anchors for satisfactory performance, not just at the extremes. A second approach might be to reduce the grading scale to only 3 points (unsatisfactory, satisfactory, superior) to force more scores to the middle. However, this approach might limit the discrimination ability of the tool.
We have previously studied the use of this tool among nurses. In that study, we also found consistently higher scores from peers than from external evaluators. We did, however, find a positive effect of experience, in which more experienced nurses received higher scores on average. We did not observe a similar training effect in this study. There are several possible explanations. First, the types of handoffs assessed may have played a role. At UCM, some assessed handoffs were night staff to day staff, which might be of lower quality than day staff to night staff handoffs, whereas at Yale, all handoffs were from day to night teams. Thus, average scores at UCM (primarily hospitalists) might have been lowered by the type of handoff provided. Second, given that hospitalist evaluations were conducted exclusively at UCM and housestaff evaluations exclusively at Yale, the lack of difference between hospitalists and housestaff may have been related to differences in evaluation practice or handoff practice at the 2 sites rather than to training level. Third, in our experience, attending physicians provide briefer, less comprehensive sign‐outs than trainees, particularly when communicating with equally experienced attendings; these sign‐outs may appropriately be scored lower on the tool. Fourth, the great majority of the hospitalists at UCM were within 5 years of residency and therefore not much more experienced than the trainees. Finally, it is possible that skills do not improve over time, given the widespread lack of observation and feedback on this important skill during training years.
The high internal consistency of most of the subdomains and the loading of all subdomains except setting onto 1 factor are evidence of convergent construct validity, but also suggest that evaluators have difficulty distinguishing among components of sign‐out quality. Internal consistency may also reflect a halo effect, in which scores on different domains are all influenced by a common overall judgment.[42] We are currently testing a shorter version of the tool including domains only for content, professionalism, and setting in addition to overall score. The fact that setting did not correlate as well with the other domains suggests that sign‐out practitioners may not have or exercise control over their surroundings. Consequently, it may ultimately be reasonable to drop this domain from the tool, or alternatively, to refocus on the need to ensure a quiet setting during sign‐out skills training.
There are several limitations to this study. External evaluations were conducted by personnel who were not familiar with the patients, and they may therefore have overestimated the quality of sign‐out. Studying different types of physicians at different sites might have limited our ability to identify differences by training level. As is commonly seen in evaluation studies, scores were skewed to the high end, although we did observe some use of the full range of the tool. Finally, we were limited in our ability to test inter‐rater reliability because of the multiple sources of variability in the data (numerous different raters, with different backgrounds at different settings, rating different individuals).
In summary, we developed a handoff evaluation tool that was easily completed by housestaff and attendings without training, that performed similarly in a variety of different settings at 2 institutions, and that can in principle be used either for peer evaluations or for external evaluations, although peer evaluations may be positively biased. Further work will be done to refine and simplify the tool.
ACKNOWLEDGMENTS
Disclosures: Development and evaluation of the sign‐out CEX was supported by a grant from the Agency for Healthcare Research and Quality (1R03HS018278‐01). Dr. Arora is supported by a National Institute on Aging career development award (K23 AG033763). Dr. Horwitz is supported by the National Institute on Aging (K08 AG038336) and by the American Federation for Aging Research through the Paul B. Beeson Career Development Award Program. Dr. Horwitz is also a Pepper Scholar with support from the Claude D. Pepper Older Americans Independence Center at Yale University School of Medicine (P30AG021342 NIH/NIA). No funding source had any role in the study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the article for publication. The content is solely the responsibility of the authors and does not necessarily represent the official views of the Agency for Healthcare Research and Quality, the National Institute on Aging, the National Institutes of Health, or the American Federation for Aging Research. Dr. Horwitz had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. An earlier version of this work was presented as a poster at the Society of General Internal Medicine Annual Meeting in Orlando, Florida on May 9, 2012. Dr. Rand is now with the Department of Medicine, University of Vermont College of Medicine, Burlington, Vermont. Mr. Staisiunas is now with the Law School, Marquette University, Milwaukee, Wisconsin. The authors declare they have no conflicts of interest.
Appendix A
PROVIDER HAND‐OFF CEX TOOL
RECIPIENT HAND‐OFF CEX TOOL
Appendix B
Handoff CEX scores by site of evaluation
| | Provider | | | Recipient | | |
| Domain | UC, N=172 | Yale, N=170 | P Value | UC, N=163 | Yale, N=167 | P Value |
|---|---|---|---|---|---|---|
| Setting | 7 (2–9) | 7 (3–9) | 0.32 | 7 (2–9) | 7 (3–9) | 0.36 |
| Organization | 8 (2–9) | 7 (3–9) | 0.30 | 7 (2–9) | 8 (5–9) | 0.001 |
| Communication | 7 (1–9) | 7 (3–9) | 0.67 | 7 (2–9) | 8 (4–9) | 0.03 |
| Content | 7 (2–9) | 7 (2–9) | | N/A | N/A | N/A |
| Judgment | 8 (3–9) | 7 (3–9) | 0.60 | 7 (3–9) | 8 (4–9) | 0.001 |
| Professionalism | 8 (2–9) | 8 (3–9) | 0.67 | 8 (3–9) | 8 (4–9) | 0.35 |
| Overall | 7 (2–9) | 7 (3–9) | 0.41 | 7 (2–9) | 8 (4–9) | 0.005 |

Values are median (range).
Appendix C
Spearman correlation, recipients (N=330)
| | Setting | Organization | Communication | Judgment | Professionalism |
|---|---|---|---|---|---|
| Setting | 1.00 | 0.46 | 0.48 | 0.47 | 0.40 |
| Organization | 0.46 | 1.00 | 0.78 | 0.75 | 0.75 |
| Communication | 0.48 | 0.78 | 1.00 | 0.85 | 0.77 |
| Judgment | 0.47 | 0.75 | 0.85 | 1.00 | 0.74 |
| Professionalism | 0.40 | 0.75 | 0.77 | 0.74 | 1.00 |
| Overall | 0.60 | 0.77 | 0.84 | 0.82 | 0.77 |
All p values <0.0001
Appendix D
Factor analysis results for provider evaluations
Rotated factor pattern (standardized regression coefficients), N=336

| | Factor 1 | Factor 2 |
|---|---|---|
| Organization | 0.64 | 0.27 |
| Communication | 0.79 | 0.16 |
| Content | 0.82 | 0.06 |
| Judgment | 0.86 | 0.06 |
| Professionalism | 0.66 | 0.23 |
| Setting | 0.18 | 0.29 |
- Transfers of patient care between house staff on internal medicine wards: a national survey. Arch Intern Med. 2006;166(11):1173–1177.
- Accreditation Council for Graduate Medical Education. Common program requirements. 2011; http://www.acgme‐2010standards.org/pdf/Common_Program_Requirements_07012011.pdf. Accessed August 23, 2011.
- Does housestaff discontinuity of care increase the risk for preventable adverse events? Ann Intern Med. 1994;121(11):866–872.
- Communication failures: an insidious contributor to medical mishaps. Acad Med. 2004;79(2):186–194.
- Communication failures in patient sign‐out and suggestions for improvement: a critical incident analysis. Qual Saf Health Care. 2005;14(6):401–407.
- Consequences of inadequate sign‐out for patient care. Arch Intern Med. 2008;168(16):1755–1760.
- Adequacy of information transferred at resident sign‐out (in‐hospital handover of care): a prospective survey. Qual Saf Health Care. 2008;17(1):6–10.
- What are covering doctors told about their patients? Analysis of sign‐out among internal medicine house staff. Qual Saf Health Care. 2009;18(4):248–255.
- Using direct observation, formal evaluation, and an interactive curriculum to improve the sign‐out practices of internal medicine interns. Acad Med. 2010;85(7):1182–1188.
- Doctors' handovers in hospitals: a literature review. Qual Saf Health Care. 2011;20(2):128–133.
- Resident sign‐out and patient hand‐offs: opportunities for improvement. Teach Learn Med. 2011;23(2):105–111.
- Use of an appreciative inquiry approach to improve resident sign‐out in an era of multiple shift changes. J Gen Intern Med. 2012;27(3):287–291.
- Validation of a handoff assessment tool: the Handoff CEX [published online ahead of print June 7, 2012]. J Clin Nurs. doi:10.1111/j.1365-2702.2012.04131.x.
- The mini‐CEX (clinical evaluation exercise): a preliminary investigation. Ann Intern Med. 1995;123(10):795–799.
- Examiner differences in the mini‐CEX. Adv Health Sci Educ Theory Pract. 1997;2(1):27–33.
- Assessing the reliability and validity of the mini‐clinical evaluation exercise for internal medicine residency training. Acad Med. 2002;77(9):900–904.
- Construct validity of the miniclinical evaluation exercise (miniCEX). Acad Med. 2003;78(8):826–830.
- Dropping the baton: a qualitative analysis of failures during the transition from emergency department to inpatient care. Ann Emerg Med. 2009;53(6):701–710.e4.
- Development and implementation of an oral sign‐out skills curriculum. J Gen Intern Med. 2007;22(10):1470–1474.
- Mixed methods evaluation of oral sign‐out practices. J Gen Intern Med. 2007;22(S1):S114.
- Evaluation of an asynchronous physician voicemail sign‐out for emergency department admissions. Ann Emerg Med. 2009;54(3):368–378.
- An institution‐wide handoff task force to standardise and improve physician handoffs. BMJ Qual Saf. 2012;21(10):863–871.
- A model for building a standardized hand‐off protocol. Jt Comm J Qual Patient Saf. 2006;32(11):646–655.
- Medication discrepancies in resident sign‐outs and their potential to harm. J Gen Intern Med. 2007;22(12):1751–1755.
- A theoretical framework and competency‐based approach to improving handoffs. Qual Saf Health Care. 2008;17(1):11–14.
- Hospitalist handoffs: a systematic review and task force recommendations. J Hosp Med. 2009;4(7):433–440.
- Interns overestimate the effectiveness of their hand‐off communication. Pediatrics. 2010;125(3):491–496.
- Improving clinical handovers: creating local solutions for a global problem. Qual Saf Health Care. 2009;18(4):244–245.
- Managing discontinuity in academic medical centers: strategies for a safe and effective resident sign‐out. J Hosp Med. 2006;1(4):257–266.
- Standardized sign‐out reduces intern perception of medical errors on the general internal medicine ward. Teach Learn Med. 2009;21(2):121–126.
- SBAR: a shared mental model for improving communication between clinicians. Jt Comm J Qual Patient Saf. 2006;32(3):167–175.
- Structuring flexibility: the potential good, bad and ugly in standardisation of handovers. Qual Saf Health Care. 2008;17(1):4–5.
- Handoff strategies in settings with high consequences for failure: lessons for health care operations. Int J Qual Health Care. 2004;16(2):125–132.
- Residents' perceptions of professionalism in training and practice: barriers, promoters, and duty hour requirements. J Gen Intern Med. 2006;21(7):758–763.
- Communication behaviours in a hospital setting: an observational study. BMJ. 1998;316(7132):673–676.
- Communication loads on clinical staff in the emergency department. Med J Aust. 2002;176(9):415–418.
- A systematic review of failures in handoff communication during intrahospital transfers. Jt Comm J Qual Patient Saf. 2011;37(6):274–284.
- Hand‐off education and evaluation: piloting the observed simulated hand‐off experience (OSHE). J Gen Intern Med. 2010;25(2):129–134.
- Handoffs causing patient harm: a survey of medical and surgical house staff. Jt Comm J Qual Patient Saf. 2008;34(10):563–570.
- A prospective observational study of physician handoff for intensive‐care‐unit‐to‐ward patient transfers. Am J Med. 2011;124(9):860–867.
- Characterizing physician listening behavior during hospitalist handoffs using the HEAR checklist [published online ahead of print December 20, 2012]. BMJ Qual Saf. doi:10.1136/bmjqs‐2012‐001138.
- A constant error in psychological ratings. J Appl Psychol. 1920;4(1):25.
Transfers among trainee physicians within the hospital typically occur at least twice a day and have been increasing among trainees as work hours have declined.[1] The 2011 Accreditation Council for Graduate Medical Education (ACGME) guidelines,[2] which restrict intern working hours to 16 hours from a previous maximum of 30, have likely increased the frequency of physician trainee handoffs even further. Similarly, transfers among hospitalist attendings occur at least twice a day, given typical shifts of 8 to 12 hours.
Given the frequency of transfers, and the potential for harm generated by failed transitions,[3, 4, 5, 6] the end‐of‐shift written and verbal handoffs have assumed increasingly greater importance in hospital care among both trainees and hospitalist attendings.
The ACGME now requires that programs assess the competency of trainees in handoff communication.[2] Yet, there are few tools for assessing the quality of sign‐out communication. Those that exist primarily focus on the written sign‐out, and are rarely validated.[7, 8, 9, 10, 11, 12] Furthermore, it is uncertain whether such assessments must be done by supervisors or whether peers can participate in the evaluation. In this prospective multi‐institutional study we assess the performance characteristics of a verbal sign‐out evaluation tool for internal medicine housestaff and hospitalist attendings, and examine whether it can be used by peers as well as by external evaluators. This tool has previously been found to effectively discriminate between experienced and inexperienced nurses conducting nursing handoffs.[13]
METHODS
Tool Design and Measures
The Handoff CEX (clinical evaluation exercise) is a structured assessment based on the format of the mini‐CEX, an instrument used to assess the quality of trainees' history and physical examination skills, for which validation studies have previously been conducted.[14, 15, 16, 17] To maximize content validity, we developed the tool based on themes we identified from our own expertise,[1, 5, 6, 8, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29] the ACGME core competencies for trainees,[2] and the literature. First, standardization has numerous demonstrable benefits for safety in general and handoffs in particular.[30, 31, 32] Consequently, we created a domain for organization in which standardization was a characteristic of high performance.
Second, there is evidence that people engaged in conversation routinely overestimate peer comprehension,[27] and that explicit strategies to combat this overestimation, such as confirming understanding, explicitly assigning tasks rather than using open‐ended language, and using concrete language, are effective.[33] Accordingly, we created a domain for communication skills, which is also an ACGME competency.
Third, although there were no formal guidelines for sign‐out content when we developed this tool, our own research had demonstrated that the content elements most often missing, and felt by stakeholders to be important, related to clinical condition and to making thinking processes explicit,[5, 6] so we created a domain for content that highlighted these areas and met the ACGME competency of medical knowledge. In accordance with standards for evaluation of learners, we incorporated a domain for judgment to identify where trainees were in the RIME spectrum of reporter, interpreter, manager, and educator.
Next, we added a section for professionalism in accordance with the ACGME core competencies of professionalism and patient care.[34] To avoid the disinclination of peers to label each other unprofessional, we labeled the professionalism domain "patient‐focused" on the tool.
Finally, we included a domain for setting because of an extensive literature demonstrating increased handoff failures in noisy or interruptive settings.[35, 36, 37] We then revised the tool slightly based on our experiences among nurses and students.[13, 38] The final tool included the 6 domains described above and an assessment of overall competency. Each domain was scored on a 9‐point scale and included descriptive anchors at the high and low ends of performance. We further divided the scale into 3 main sections: unsatisfactory (score 1–3), satisfactory (4–6), and superior (7–9). We designed 2 tools, 1 to assess the person providing the handoff and 1 to assess the handoff recipient, each with its own descriptive anchors. The recipient tool did not include a content domain (see Supporting Information, Appendix 1, in the online version of this article).
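For readers who want to operationalize the instrument, the sketch below (ours, not part of the published tool) shows one way to encode the domains and the 3 scoring bands in code; the domain identifiers and the `band` helper are illustrative assumptions, not artifacts of the study.

```python
# Minimal sketch (ours, not the authors') of the Handoff CEX structure:
# six provider domains plus an overall rating, each scored 1-9 and
# bucketed into the tool's three descriptive bands.

PROVIDER_DOMAINS = ["setting", "organization", "communication",
                    "content", "judgment", "professionalism", "overall"]
# The recipient tool omits the content domain.
RECIPIENT_DOMAINS = [d for d in PROVIDER_DOMAINS if d != "content"]

def band(score: int) -> str:
    """Map a 1-9 rating to unsatisfactory / satisfactory / superior."""
    if not 1 <= score <= 9:
        raise ValueError("Handoff CEX ratings run from 1 to 9")
    if score <= 3:
        return "unsatisfactory"
    if score <= 6:
        return "satisfactory"
    return "superior"

# Example: one hypothetical provider evaluation.
evaluation = {"setting": 7, "organization": 8, "communication": 7,
              "content": 6, "judgment": 8, "professionalism": 8, "overall": 7}
print({domain: band(score) for domain, score in evaluation.items()})
```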
Setting and Subjects
We tested the tool in 2 different urban academic medical centers: the University of Chicago Medicine (UCM) and Yale‐New Haven Hospital (Yale). At UCM, we tested the tool among hospitalists, nurse practitioners, and physician assistants during the Monday and Tuesday morning and Friday evening sign‐out sessions. At Yale, we tested the tool among housestaff during the evening sign‐out session from the primary team to the on‐call covering team.
UCM is a 550‐bed urban academic medical center in which the nonteaching hospitalist service cares for patients with liver disease or with end‐stage renal or lung disease who are awaiting transplant, as well as a small fraction of general medicine and oncology patients when the housestaff service exceeds its cap. No formal training on sign‐out is provided to attending or midlevel providers. The nonteaching hospitalist service operates separately from the housestaff service and consists of 38 hospitalist clinicians (hospitalist attendings, nurse practitioners, and physician assistants). There are 2 handoffs each day. In the morning, the departing night hospitalist hands off to the incoming daytime hospitalist or midlevel provider; these handoffs occur at 7:30 am in a dedicated room. In the evening, the daytime hospitalist or midlevel provider hands off to an incoming night hospitalist; this handoff occurs at 5:30 pm or 7:30 pm in a dedicated location. The written sign‐out is maintained in a Microsoft Word (Microsoft Corp., Redmond, WA) document on a password‐protected server and updated daily.
Yale is a 946‐bed urban academic medical center with a large internal medicine training program. Formal sign‐out education covering the main domains of the tool is provided to new interns during the first 3 months of the year,[19] and a templated, electronic medical record‐based written handoff report is produced by the housestaff for all patients.[22] Approximately half of inpatient medicine patients are cared for by housestaff teams, which are entirely separate from the hospitalist service. Housestaff sign‐out occurs between 4 pm and 7 pm every night. At a minimum, the departing intern signs out to the incoming intern; this handoff is typically supervised by at least 1 second‐ or third‐year resident. All patients are signed out verbally; in addition, the written handoff report is provided to the incoming team. Most handoffs occur in a quiet charting room.
Data Collection
Data collection at UCM occurred between March and December 2010 on 3 days of each week: Mondays, Tuesdays, and Fridays. On Mondays and Tuesdays the morning handoffs were observed; on Fridays the evening handoffs were observed. Data collection at Yale occurred between March and May 2011. Only evening handoffs from the primary team to the overnight coverage were observed. At both sites, participants provided verbal informed consent prior to data collection. At the time of an eligible sign‐out session, a research assistant (D.R. at Yale, P.S. at UCM) provided the evaluation tools to all members of the incoming and outgoing teams, and observed the sign‐out session himself. Each person providing a handoff was asked to evaluate the recipient of the handoff; each person receiving a handoff was asked to evaluate the provider of the handoff. In addition, the trained third‐party observer (D.R., P.S.) evaluated both the provider and recipient of the handoff. The external evaluators were trained in principles of effective communication and the use of the tool, with specific review of anchors at each end of each domain. One evaluator had a DO degree and was completing an MPH degree. The second evaluator was an experienced clinical research assistant whose training consisted of supervised observation of 10 handoffs by a physician investigator. At Yale, if a resident was present, she or he was also asked to evaluate both the provider and recipient of the handoff. Consequently, every sign‐out session included at least 2 evaluations of each participant, 1 by a peer evaluator and 1 by a consistent external evaluator who did not know the patients. At Yale, many sign‐outs also included a third evaluation by a resident supervisor.
The study was approved by the institutional review boards at both UCM and Yale.
Statistical Analysis
We calculated the mean, median, and interquartile range of scores for each subdomain of the tool, as well as for the overall assessment of handoff quality. We assessed convergent construct validity by examining the performance of the tool in different contexts. To do so, we determined whether scores differed by type of participant (provider or recipient), by site, by training level of the person evaluated, or by type of evaluator (external, resident supervisor, or peer), using Wilcoxon rank sum tests and Kruskal‐Wallis tests. For the assessment of differences in ratings by training level, we used evaluations of sign‐out providers only, because the 2 sites differed in scores for recipients. We also assessed construct validity by using Spearman rank correlation coefficients to describe the internal consistency of the tool in terms of the correlations between its domains, and we conducted an exploratory factor analysis to gain insight into whether the subdomains of the tool were measuring the same construct. In conducting this analysis, we restricted the dataset to evaluations of sign‐out providers only, and we used a principal components estimation method, a promax rotation, and squared multiple correlation communality priors. Finally, we conducted preliminary studies of reliability by testing whether different types of evaluators provided similar assessments. We calculated a weighted kappa using Fleiss‐Cohen weights for external versus peer scores and again for supervising resident versus peer scores (Yale only). We were not able to assess test‐retest reliability, given the nature of the sign‐out process. Statistical significance was defined as a P value <0.05, and analyses were performed using SAS 9.2 (SAS Institute Inc., Cary, NC).
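To make the group comparisons concrete, the sketch below applies the same two nonparametric tests in Python (the study itself used SAS 9.2); the rating arrays are fabricated placeholders, not study data.

```python
# Illustrative Python analogue of the group comparisons (study used SAS 9.2).
# The rating arrays below are random placeholders, not study data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
peer = rng.integers(5, 10, size=150)      # hypothetical 1-9 overall ratings
resident = rng.integers(5, 10, size=45)
external = rng.integers(4, 9, size=150)

# Two independent groups (e.g., peer vs external scores):
# Wilcoxon rank sum test.
stat2, p2 = stats.ranksums(peer, external)

# Three or more groups (e.g., peer vs resident supervisor vs external):
# Kruskal-Wallis test.
stat3, p3 = stats.kruskal(peer, resident, external)

print(f"rank sum P = {p2:.3f}; Kruskal-Wallis P = {p3:.3f}")
```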
RESULTS
A total of 149 handoff sessions were observed: 89 at UCM and 60 at Yale. Each site conducted a similar total number of evaluations: 336 at UCM, 337 at Yale. These sessions involved 97 unique individuals, 34 at UCM and 63 at Yale. Overall scores were high at both sites, but a wide range of scores was applied (Table 1).
Table 1. Handoff CEX scores (provider evaluations, N=343; recipient evaluations, N=330)

| Domain | Provider Median (IQR) | Provider Mean (SD) | Provider Range | Recipient Median (IQR) | Recipient Mean (SD) | Recipient Range | P Value |
|---|---|---|---|---|---|---|---|
| Setting | 7 (6–9) | 7.0 (1.7) | 2–9 | 7 (6–9) | 7.3 (1.6) | 2–9 | 0.05 |
| Organization | 7 (6–8) | 7.2 (1.5) | 2–9 | 8 (6–9) | 7.4 (1.4) | 2–9 | 0.07 |
| Communication | 7 (6–9) | 7.2 (1.6) | 1–9 | 8 (7–9) | 7.4 (1.5) | 2–9 | 0.22 |
| Content | 7 (6–8) | 7.0 (1.6) | 2–9 | N/A | N/A | N/A | N/A |
| Judgment | 8 (6–8) | 7.3 (1.4) | 3–9 | 8 (7–9) | 7.5 (1.4) | 3–9 | 0.06 |
| Professionalism | 8 (7–9) | 7.4 (1.5) | 2–9 | 8 (7–9) | 7.6 (1.4) | 3–9 | 0.23 |
| Overall | 7 (6–8) | 7.1 (1.5) | 2–9 | 7 (6–8) | 7.4 (1.4) | 2–9 | 0.02 |
Handoff Providers
A total of 343 evaluations of handoff providers were completed regarding 67 unique individuals. For each domain, scores spanned the full range from unsatisfactory to superior. The highest rated domain on the handoff provider evaluation tool was professionalism (median: 8; interquartile range [IQR]: 7–9). The lowest rated domain was content (median: 7; IQR: 6–8) (Table 1).
Handoff Recipients
A total of 330 evaluations of handoff recipients were completed regarding 58 unique individuals. For each domain, scores spanned the full range from unsatisfactory to superior. The highest rated domain on the handoff recipient evaluation tool was professionalism, with a median of 8 (IQR: 7–9). The lowest rated domain was setting, with a median score of 7 (IQR: 6–9) (Table 1).
Validity Testing
Comparing provider scores to recipient scores, recipients received significantly higher scores for overall assessment (Table 1). Scores at UCM and Yale were similar in all domains for providers but were slightly lower at UCM in several domains for recipients (see Supporting Information, Appendix 2, in the online version of this article). Scores did not differ significantly by training level (Table 2). Third‐party external evaluators consistently gave lower marks for the same handoff than peer evaluators did (Table 3).
Table 2. Handoff CEX provider scores by training level of the person evaluated, median (range)

| Domain | NP/PA, N=33 | Subintern or Intern, N=170 | Resident, N=44 | Hospitalist, N=95 | P Value |
|---|---|---|---|---|---|
| Setting | 7 (2–9) | 7 (3–9) | 7 (4–9) | 7 (2–9) | 0.89 |
| Organization | 8 (4–9) | 7 (2–9) | 7 (4–9) | 8 (3–9) | 0.11 |
| Communication | 8 (4–9) | 7 (2–9) | 7 (4–9) | 8 (1–9) | 0.72 |
| Content | 7 (3–9) | 7 (2–9) | 7 (4–9) | 7 (2–9) | 0.92 |
| Judgment | 8 (5–9) | 7 (3–9) | 8 (4–9) | 8 (4–9) | 0.09 |
| Professionalism | 8 (4–9) | 7 (2–9) | 8 (3–9) | 8 (4–9) | 0.82 |
| Overall | 7 (3–9) | 7 (2–9) | 8 (4–9) | 7 (2–9) | 0.28 |
Table 3. Handoff CEX scores by type of evaluator, median (range)

| Domain | Provider: Peer, N=152 | Provider: Resident Supervisor, N=43 | Provider: External, N=147 | Provider P Value | Recipient: Peer, N=145 | Recipient: Resident Supervisor, N=43 | Recipient: External, N=142 | Recipient P Value |
|---|---|---|---|---|---|---|---|---|
| Setting | 8 (3–9) | 7 (3–9) | 7 (2–9) | 0.02 | 8 (2–9) | 7 (3–9) | 7 (2–9) | <0.001 |
| Organization | 8 (3–9) | 8 (3–9) | 7 (2–9) | 0.18 | 8 (3–9) | 8 (6–9) | 7 (2–9) | <0.001 |
| Communication | 8 (3–9) | 8 (3–9) | 7 (1–9) | <0.001 | 8 (3–9) | 8 (4–9) | 7 (2–9) | <0.001 |
| Content | 8 (3–9) | 8 (2–9) | 7 (2–9) | <0.001 | N/A | N/A | N/A | N/A |
| Judgment | 8 (4–9) | 8 (3–9) | 7 (3–9) | <0.001 | 8 (3–9) | 8 (4–9) | 7 (3–9) | <0.001 |
| Professionalism | 8 (3–9) | 8 (5–9) | 7 (2–9) | 0.02 | 8 (3–9) | 8 (6–9) | 7 (3–9) | <0.001 |
| Overall | 8 (3–9) | 8 (3–9) | 7 (2–9) | 0.001 | 8 (2–9) | 8 (4–9) | 7 (2–9) | <0.001 |
Spearman rank correlation coefficients among the CEX subdomains for provider scores ranged from 0.71 to 0.86, except for setting (Table 4). Setting was less well correlated with the other subdomains, with correlation coefficients ranging from 0.39 to 0.41. Correlations between individual domains and the overall rating ranged from 0.80 to 0.86, except setting, which had a correlation of 0.55. Every correlation was significant at P<0.001. Correlation coefficients for recipient scores were very similar to those for provider scores (see Supporting Information, Appendix 3, in the online version of this article).
Table 4. Spearman correlation coefficients among provider evaluation domains

| | Setting | Organization | Communication | Content | Judgment | Professionalism |
|---|---|---|---|---|---|---|
| Setting | 1.00 | 0.40 | 0.40 | 0.39 | 0.39 | 0.41 |
| Organization | 0.40 | 1.00 | 0.80 | 0.71 | 0.77 | 0.73 |
| Communication | 0.40 | 0.80 | 1.00 | 0.79 | 0.82 | 0.77 |
| Content | 0.39 | 0.71 | 0.79 | 1.00 | 0.80 | 0.74 |
| Judgment | 0.39 | 0.77 | 0.82 | 0.80 | 1.00 | 0.78 |
| Professionalism | 0.41 | 0.73 | 0.77 | 0.74 | 0.78 | 1.00 |
| Overall | 0.55 | 0.80 | 0.84 | 0.83 | 0.86 | 0.82 |
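A matrix like Table 4 can be reproduced mechanically; below is a brief sketch (ours), with a fabricated DataFrame standing in for the 343 provider evaluations.

```python
# Sketch of the Table 4 computation: pairwise Spearman correlations
# across domain scores. `evals` is fabricated, one row per evaluation.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
cols = ["setting", "organization", "communication",
        "content", "judgment", "professionalism", "overall"]
evals = pd.DataFrame(rng.integers(1, 10, size=(343, len(cols))), columns=cols)

print(evals.corr(method="spearman").round(2))
```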
We analyzed 343 provider evaluations in the factor analysis; there were 6 missing values. The scree plot of eigenvalues did not support more than 1 factor; however, the rotated factor pattern for standardized regression coefficients for the first factor and the final communality estimates showed the setting component yielding smaller values than did other scale components (see Supporting Information, Appendix 4, in the online version of this article).
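An analogous exploratory factor analysis could be run in Python with the `factor_analyzer` package; this is our assumption for illustration (the study itself used SAS), using principal-axis extraction with a promax rotation on fabricated data.

```python
# Rough Python analogue (ours) of the exploratory factor analysis;
# the study itself used SAS. Requires `pip install factor_analyzer`.
import numpy as np
import pandas as pd
from factor_analyzer import FactorAnalyzer

rng = np.random.default_rng(2)
cols = ["setting", "organization", "communication",
        "content", "judgment", "professionalism"]
provider_evals = pd.DataFrame(rng.integers(1, 10, size=(343, 6)), columns=cols)

fa = FactorAnalyzer(n_factors=2, method="principal", rotation="promax")
fa.fit(provider_evals)

print(pd.DataFrame(fa.loadings_, index=cols,
                   columns=["Factor 1", "Factor 2"]))  # cf. Appendix D
print(fa.get_eigenvalues()[0])  # eigenvalues to inspect, as in a scree plot
```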
Reliability Testing
Weighted kappa scores for provider evaluations ranged from 0.28 (95% confidence interval [CI]: 0.01, 0.56) for setting to 0.59 (95% CI: 0.38, 0.80) for organization, and they were generally higher for resident versus peer comparisons than for external versus peer comparisons. Weighted kappa scores for recipient evaluations were slightly lower for external versus peer comparisons, and agreement was no better than chance for resident versus peer comparisons (Table 5).
Table 5. Weighted kappa scores by type of evaluator, kappa (95% CI)

| Domain | Provider: External vs Peer, N=144 | Provider: Resident vs Peer, N=42 | Recipient: External vs Peer, N=134 | Recipient: Resident vs Peer, N=43 |
|---|---|---|---|---|
| Setting | 0.39 (0.24, 0.54) | 0.28 (0.01, 0.56) | 0.34 (0.20, 0.48) | 0.48 (0.27, 0.69) |
| Organization | 0.43 (0.29, 0.58) | 0.59 (0.39, 0.80) | 0.39 (0.22, 0.55) | 0.03 (−0.23, 0.29) |
| Communication | 0.34 (0.19, 0.49) | 0.52 (0.37, 0.68) | 0.36 (0.22, 0.51) | 0.02 (−0.18, 0.23) |
| Content | 0.38 (0.25, 0.51) | 0.53 (0.27, 0.80) | N/A | N/A |
| Judgment | 0.36 (0.22, 0.49) | 0.54 (0.25, 0.83) | 0.28 (0.15, 0.42) | −0.12 (−0.34, 0.09) |
| Professionalism | 0.47 (0.32, 0.63) | 0.47 (0.23, 0.72) | 0.35 (0.18, 0.51) | −0.01 (−0.29, 0.26) |
| Overall | 0.50 (0.36, 0.64) | 0.45 (0.24, 0.67) | 0.31 (0.16, 0.48) | 0.07 (−0.20, 0.34) |
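Fleiss-Cohen weights are the quadratic weighting scheme for Cohen's kappa, so the statistics in Table 5 correspond to a quadratic-weighted kappa; a sketch with made-up paired ratings follows.

```python
# Sketch of the agreement statistic: Fleiss-Cohen weights correspond to
# quadratic weighting in scikit-learn's cohen_kappa_score. The paired
# ratings below are fabricated, not study data.
import numpy as np
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(3)
peer = rng.integers(5, 10, size=144)                           # peer ratings, 1-9
external = np.clip(peer - rng.integers(0, 3, size=144), 1, 9)  # externals rate lower

kappa = cohen_kappa_score(peer, external,
                          weights="quadratic", labels=list(range(1, 10)))
print(f"weighted kappa = {kappa:.2f}")
```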
DISCUSSION
In this study we found that an evaluation tool for direct observation of housestaff and hospitalists generated a range of scores and performed well in validity testing, behaving similarly across 2 different institutions and among both trainees and attendings while showing high internal consistency. However, external evaluators gave consistently lower marks than peer evaluators at both sites, resulting in low reliability when comparing these 2 groups of raters.
It has traditionally been difficult to conduct direct evaluations of handoffs, because they may occur at haphazard times, in variable locations, and without very much advance notice. For this reason, several attempts have been made to incorporate peers in evaluations of handoff practices.[5, 39, 40] Using peers to conduct evaluations also has the advantage that peers are more likely to be familiar with the patients being handed off and might recognize handoff flaws that external evaluators would miss. Nonetheless, peer evaluations have some important liabilities. Peers may be unwilling or unable to provide honest critiques of their colleagues given that they must work closely together for years. Trainee peers may also lack sufficient clinical expertise or experience to accurately assess competence. In our study, we found that peers gave consistently higher marks to their colleagues than did external evaluators, suggesting they may have found it difficult to criticize their colleagues. We conclude that peer evaluation alone is likely an insufficient means of evaluating handoff quality.
Supervising residents gave marks very similar to those of intern peers, suggesting that they too were unwilling to criticize or insufficiently experienced to evaluate, or alternatively that the peer evaluations were reasonable. We suspect the last explanation is unlikely, given that external evaluator scores were consistently lower than peers' scores. If anything, one would expect the external evaluators to be biased toward higher scores, because they were not familiar with the patients and therefore could not comment on inaccuracies or omissions in the sign‐out.
The tool appeared to perform less well in most cases for recipients than for providers, with a narrower range of scores and low‐weighted kappa scores. Although recipients play a key role in ensuring a high‐quality sign‐out by paying close attention, ensuring it is a bidirectional conversation, asking appropriate questions, and reading back key information, it may be that evaluators were unable to place these activities within the same domains that were used for the provider evaluation. An altogether different recipient evaluation approach may be necessary.[41]
In general, scores were clustered at the top of the score range, as is typical for evaluations. One strategy to spread out scores further would be to refine the tool by adding descriptive anchors for satisfactory performance, not just at the extremes. A second approach might be to reduce the grading scale to only 3 points (unsatisfactory, satisfactory, superior) to force more scores toward the middle. However, this approach might limit the discriminative ability of the tool.
We have previously studied the use of this tool among nurses. In that study, we also found consistently higher scores from peers than from external evaluators. We did, however, find a positive effect of experience, in which more experienced nurses received higher scores on average. We did not observe a similar effect of training level in this study, and there are several possible explanations. First, the types of handoffs assessed may have played a role: at UCM, some of the assessed handoffs were from night staff to day staff, which might be of lower quality than day‐to‐night handoffs, whereas at Yale all observed handoffs were from day teams to night teams. Average scores at UCM (primarily hospitalists) might therefore have been lowered by the type of handoff provided. Second, because hospitalist evaluations were conducted exclusively at UCM and housestaff evaluations exclusively at Yale, the lack of difference between hospitalists and housestaff may also reflect differences in evaluation or handoff practice at the 2 sites rather than training level. Third, in our experience, attending physicians provide briefer, less comprehensive sign‐outs than trainees, particularly when communicating with equally experienced attendings; these sign‐outs may appropriately be scored lower on the tool. Fourth, the great majority of the hospitalists at UCM were within 5 years of residency and therefore not much more experienced than the trainees. Finally, it is possible that skills do not improve over time, given the widespread lack of observation of and feedback on this important skill during training.
The high internal consistency of most of the subdomains, and the loading of all subdomains except setting onto 1 factor, are evidence of convergent construct validity, but they also suggest that evaluators have difficulty distinguishing among components of sign‐out quality. The internal consistency may also reflect a halo effect, in which scores on different domains are all influenced by a common overall judgment.[42] We are currently testing a shorter version of the tool that includes domains only for content, professionalism, and setting, in addition to an overall score. The fact that setting did not correlate as well with the other domains suggests that sign‐out practitioners may not have, or may not exercise, control over their surroundings. Consequently, it may ultimately be reasonable to drop this domain from the tool or, alternatively, to refocus on the need to ensure a quiet setting during sign‐out skills training.
There are several limitations to this study. External evaluations were conducted by personnel who were not familiar with the patients, and they may therefore have overestimated the quality of sign‐out. Studying different types of physicians at different sites might have limited our ability to identify differences by training level. As is commonly seen in evaluation studies, scores were skewed to the high end, although we did observe some use of the full range of the tool. Finally, we were limited in our ability to test inter‐rater reliability because of the multiple sources of variability in the data (numerous different raters, with different backgrounds at different settings, rating different individuals).
In summary, we developed a handoff evaluation tool that was easily completed by housestaff and attendings without training, that performed similarly in a variety of different settings at 2 institutions, and that can in principle be used either for peer evaluations or for external evaluations, although peer evaluations may be positively biased. Further work will be done to refine and simplify the tool.
ACKNOWLEDGMENTS
Disclosures: Development and evaluation of the sign‐out CEX was supported by a grant from the Agency for Healthcare Research and Quality (1R03HS018278‐01). Dr. Arora is supported by the National Institute on Aging (K23 AG033763). Dr. Horwitz is supported by the National Institute on Aging (K08 AG038336) and by the American Federation for Aging Research through the Paul B. Beeson Career Development Award Program. Dr. Horwitz is also a Pepper Scholar with support from the Claude D. Pepper Older Americans Independence Center at Yale University School of Medicine (P30AG021342 NIH/NIA). No funding source had any role in the study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the article for publication. The content is solely the responsibility of the authors and does not necessarily represent the official views of the Agency for Healthcare Research and Quality, the National Institute on Aging, the National Institutes of Health, or the American Federation for Aging Research. Dr. Horwitz had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. An earlier version of this work was presented as a poster at the Society of General Internal Medicine Annual Meeting in Orlando, Florida, on May 9, 2012. Dr. Rand is now with the Department of Medicine, University of Vermont College of Medicine, Burlington, Vermont. Mr. Staisiunas is now with the Law School, Marquette University, Milwaukee, Wisconsin. The authors declare they have no conflicts of interest.
Appendix A

PROVIDER HAND‐OFF CEX TOOL

RECIPIENT HAND‐OFF CEX TOOL
Appendix B

Handoff CEX scores by site of evaluation, median (range)

| Domain | Provider: UCM, N=172 | Provider: Yale, N=170 | Provider P Value | Recipient: UCM, N=163 | Recipient: Yale, N=167 | Recipient P Value |
|---|---|---|---|---|---|---|
| Setting | 7 (2–9) | 7 (3–9) | 0.32 | 7 (2–9) | 7 (3–9) | 0.36 |
| Organization | 8 (2–9) | 7 (3–9) | 0.30 | 7 (2–9) | 8 (5–9) | 0.001 |
| Communication | 7 (1–9) | 7 (3–9) | 0.67 | 7 (2–9) | 8 (4–9) | 0.03 |
| Content | 7 (2–9) | 7 (2–9) | — | N/A | N/A | N/A |
| Judgment | 8 (3–9) | 7 (3–9) | 0.60 | 7 (3–9) | 8 (4–9) | 0.001 |
| Professionalism | 8 (2–9) | 8 (3–9) | 0.67 | 8 (3–9) | 8 (4–9) | 0.35 |
| Overall | 7 (2–9) | 7 (3–9) | 0.41 | 7 (2–9) | 8 (4–9) | 0.005 |
Appendix C

Spearman correlation coefficients, recipients (N=330)

| | Setting | Organization | Communication | Judgment | Professionalism |
|---|---|---|---|---|---|
| Setting | 1.00 | 0.46 | 0.48 | 0.47 | 0.40 |
| Organization | 0.46 | 1.00 | 0.78 | 0.75 | 0.75 |
| Communication | 0.48 | 0.78 | 1.00 | 0.85 | 0.77 |
| Judgment | 0.47 | 0.75 | 0.85 | 1.00 | 0.74 |
| Professionalism | 0.40 | 0.75 | 0.77 | 0.74 | 1.00 |
| Overall | 0.60 | 0.77 | 0.84 | 0.82 | 0.77 |

All P values <0.0001.
Appendix D

Factor analysis results for provider evaluations: rotated factor pattern (standardized regression coefficients), N=336

| Domain | Factor 1 | Factor 2 |
|---|---|---|
| Organization | 0.64 | 0.27 |
| Communication | 0.79 | 0.16 |
| Content | 0.82 | 0.06 |
| Judgment | 0.86 | 0.06 |
| Professionalism | 0.66 | 0.23 |
| Setting | 0.18 | 0.29 |
REFERENCES

1. Transfers of patient care between house staff on internal medicine wards: a national survey. Arch Intern Med. 2006;166(11):1173–1177.
2. Accreditation Council for Graduate Medical Education. Common program requirements. 2011. http://www.acgme‐2010standards.org/pdf/Common_Program_Requirements_07012011.pdf. Accessed August 23, 2011.
3. Does housestaff discontinuity of care increase the risk for preventable adverse events? Ann Intern Med. 1994;121(11):866–872.
4. Communication failures: an insidious contributor to medical mishaps. Acad Med. 2004;79(2):186–194.
5. Communication failures in patient sign‐out and suggestions for improvement: a critical incident analysis. Qual Saf Health Care. 2005;14(6):401–407.
6. Consequences of inadequate sign‐out for patient care. Arch Intern Med. 2008;168(16):1755–1760.
7. Adequacy of information transferred at resident sign‐out (in‐hospital handover of care): a prospective survey. Qual Saf Health Care. 2008;17(1):6–10.
8. What are covering doctors told about their patients? Analysis of sign‐out among internal medicine house staff. Qual Saf Health Care. 2009;18(4):248–255.
9. Using direct observation, formal evaluation, and an interactive curriculum to improve the sign‐out practices of internal medicine interns. Acad Med. 2010;85(7):1182–1188.
10. Doctors' handovers in hospitals: a literature review. Qual Saf Health Care. 2011;20(2):128–133.
11. Resident sign‐out and patient hand‐offs: opportunities for improvement. Teach Learn Med. 2011;23(2):105–111.
12. Use of an appreciative inquiry approach to improve resident sign‐out in an era of multiple shift changes. J Gen Intern Med. 2012;27(3):287–291.
13. Validation of a handoff assessment tool: the Handoff CEX [published online ahead of print June 7, 2012]. J Clin Nurs. doi:10.1111/j.1365‐2702.2012.04131.x.
14. The mini‐CEX (clinical evaluation exercise): a preliminary investigation. Ann Intern Med. 1995;123(10):795–799.
15. Examiner differences in the mini‐CEX. Adv Health Sci Educ Theory Pract. 1997;2(1):27–33.
16. Assessing the reliability and validity of the mini‐clinical evaluation exercise for internal medicine residency training. Acad Med. 2002;77(9):900–904.
17. Construct validity of the miniclinical evaluation exercise (miniCEX). Acad Med. 2003;78(8):826–830.
18. Dropping the baton: a qualitative analysis of failures during the transition from emergency department to inpatient care. Ann Emerg Med. 2009;53(6):701–710.e4.
19. Development and implementation of an oral sign‐out skills curriculum. J Gen Intern Med. 2007;22(10):1470–1474.
20. Mixed methods evaluation of oral sign‐out practices. J Gen Intern Med. 2007;22(S1):S114.
21. Evaluation of an asynchronous physician voicemail sign‐out for emergency department admissions. Ann Emerg Med. 2009;54(3):368–378.
22. An institution‐wide handoff task force to standardise and improve physician handoffs. BMJ Qual Saf. 2012;21(10):863–871.
23. A model for building a standardized hand‐off protocol. Jt Comm J Qual Patient Saf. 2006;32(11):646–655.
24. Medication discrepancies in resident sign‐outs and their potential to harm. J Gen Intern Med. 2007;22(12):1751–1755.
25. A theoretical framework and competency‐based approach to improving handoffs. Qual Saf Health Care. 2008;17(1):11–14.
26. Hospitalist handoffs: a systematic review and task force recommendations. J Hosp Med. 2009;4(7):433–440.
27. Interns overestimate the effectiveness of their hand‐off communication. Pediatrics. 2010;125(3):491–496.
28. Improving clinical handovers: creating local solutions for a global problem. Qual Saf Health Care. 2009;18(4):244–245.
29. Managing discontinuity in academic medical centers: strategies for a safe and effective resident sign‐out. J Hosp Med. 2006;1(4):257–266.
30. Standardized sign‐out reduces intern perception of medical errors on the general internal medicine ward. Teach Learn Med. 2009;21(2):121–126.
31. SBAR: a shared mental model for improving communication between clinicians. Jt Comm J Qual Patient Saf. 2006;32(3):167–175.
32. Structuring flexibility: the potential good, bad and ugly in standardisation of handovers. Qual Saf Health Care. 2008;17(1):4–5.
33. Handoff strategies in settings with high consequences for failure: lessons for health care operations. Int J Qual Health Care. 2004;16(2):125–132.
34. Residents' perceptions of professionalism in training and practice: barriers, promoters, and duty hour requirements. J Gen Intern Med. 2006;21(7):758–763.
35. Communication behaviours in a hospital setting: an observational study. BMJ. 1998;316(7132):673–676.
36. Communication loads on clinical staff in the emergency department. Med J Aust. 2002;176(9):415–418.
37. A systematic review of failures in handoff communication during intrahospital transfers. Jt Comm J Qual Patient Saf. 2011;37(6):274–284.
38. Hand‐off education and evaluation: piloting the observed simulated hand‐off experience (OSHE). J Gen Intern Med. 2010;25(2):129–134.
39. Handoffs causing patient harm: a survey of medical and surgical house staff. Jt Comm J Qual Patient Saf. 2008;34(10):563–570.
40. A prospective observational study of physician handoff for intensive‐care‐unit‐to‐ward patient transfers. Am J Med. 2011;124(9):860–867.
41. Characterizing physician listening behavior during hospitalist handoffs using the HEAR checklist [published online ahead of print December 20, 2012]. BMJ Qual Saf. doi:10.1136/bmjqs‐2012‐001138.
42. A constant error in psychological ratings. J Appl Psychol. 1920;4(1):25.
- Transfers of patient care between house staff on internal medicine wards: a national survey. Arch Intern Med. 2006;166(11):1173–1177.
- Accreditation Council for Graduate Medical Education. Common program requirements. 2011. http://www.acgme-2010standards.org/pdf/Common_Program_Requirements_07012011.pdf. Accessed August 23, 2011.
- Does housestaff discontinuity of care increase the risk for preventable adverse events? Ann Intern Med. 1994;121(11):866–872.
- Communication failures: an insidious contributor to medical mishaps. Acad Med. 2004;79(2):186–194.
- Communication failures in patient sign‐out and suggestions for improvement: a critical incident analysis. Qual Saf Health Care. 2005;14(6):401–407.
- Consequences of inadequate sign‐out for patient care. Arch Intern Med. 2008;168(16):1755–1760.
- Adequacy of information transferred at resident sign‐out (in‐hospital handover of care): a prospective survey. Qual Saf Health Care. 2008;17(1):6–10.
- What are covering doctors told about their patients? Analysis of sign‐out among internal medicine house staff. Qual Saf Health Care. 2009;18(4):248–255.
- Using direct observation, formal evaluation, and an interactive curriculum to improve the sign‐out practices of internal medicine interns. Acad Med. 2010;85(7):1182–1188.
- Doctors' handovers in hospitals: a literature review. Qual Saf Health Care. 2011;20(2):128–133.
- Resident sign‐out and patient hand‐offs: opportunities for improvement. Teach Learn Med. 2011;23(2):105–111.
- Use of an appreciative inquiry approach to improve resident sign‐out in an era of multiple shift changes. J Gen Intern Med. 2012;27(3):287–291.
- Validation of a handoff assessment tool: the Handoff CEX [published online ahead of print June 7, 2012]. J Clin Nurs. doi:10.1111/j.1365-2702.2012.04131.x.
- The mini‐CEX (clinical evaluation exercise): a preliminary investigation. Ann Intern Med. 1995;123(10):795–799.
- Examiner differences in the mini‐CEX. Adv Health Sci Educ Theory Pract. 1997;2(1):27–33.
- Assessing the reliability and validity of the mini‐clinical evaluation exercise for internal medicine residency training. Acad Med. 2002;77(9):900–904.
- Construct validity of the miniclinical evaluation exercise (miniCEX). Acad Med. 2003;78(8):826–830.
- Dropping the baton: a qualitative analysis of failures during the transition from emergency department to inpatient care. Ann Emerg Med. 2009;53(6):701–710.e4.
- Development and implementation of an oral sign‐out skills curriculum. J Gen Intern Med. 2007;22(10):1470–1474.
- Mixed methods evaluation of oral sign‐out practices. J Gen Intern Med. 2007;22(S1):S114.
- Evaluation of an asynchronous physician voicemail sign‐out for emergency department admissions. Ann Emerg Med. 2009;54(3):368–378.
- An institution‐wide handoff task force to standardise and improve physician handoffs. BMJ Qual Saf. 2012;21(10):863–871.
- A model for building a standardized hand‐off protocol. Jt Comm J Qual Patient Saf. 2006;32(11):646–655.
- Medication discrepancies in resident sign‐outs and their potential to harm. J Gen Intern Med. 2007;22(12):1751–1755.
- A theoretical framework and competency‐based approach to improving handoffs. Qual Saf Health Care. 2008;17(1):11–14.
- Hospitalist handoffs: a systematic review and task force recommendations. J Hosp Med. 2009;4(7):433–440.
- Interns overestimate the effectiveness of their hand‐off communication. Pediatrics. 2010;125(3):491–496.
- Improving clinical handovers: creating local solutions for a global problem. Qual Saf Health Care. 2009;18(4):244–245.
- Managing discontinuity in academic medical centers: strategies for a safe and effective resident sign‐out. J Hosp Med. 2006;1(4):257–266.
- Standardized sign‐out reduces intern perception of medical errors on the general internal medicine ward. Teach Learn Med. 2009;21(2):121–126.
- SBAR: a shared mental model for improving communication between clinicians. Jt Comm J Qual Patient Saf. 2006;32(3):167–175.
- Structuring flexibility: the potential good, bad and ugly in standardisation of handovers. Qual Saf Health Care. 2008;17(1):4–5.
- Handoff strategies in settings with high consequences for failure: lessons for health care operations. Int J Qual Health Care. 2004;16(2):125–132.
- Residents' perceptions of professionalism in training and practice: barriers, promoters, and duty hour requirements. J Gen Intern Med. 2006;21(7):758–763.
- Communication behaviours in a hospital setting: an observational study. BMJ. 1998;316(7132):673–676.
- Communication loads on clinical staff in the emergency department. Med J Aust. 2002;176(9):415–418.
- A systematic review of failures in handoff communication during intrahospital transfers. Jt Comm J Qual Patient Saf. 2011;37(6):274–284.
- Hand‐off education and evaluation: piloting the observed simulated hand‐off experience (OSHE). J Gen Intern Med. 2010;25(2):129–134.
- Handoffs causing patient harm: a survey of medical and surgical house staff. Jt Comm J Qual Patient Saf. 2008;34(10):563–570.
- A prospective observational study of physician handoff for intensive‐care‐unit‐to‐ward patient transfers. Am J Med. 2011;124(9):860–867.
- Characterizing physician listening behavior during hospitalist handoffs using the HEAR checklist [published online ahead of print December 20, 2012]. BMJ Qual Saf. doi:10.1136/bmjqs-2012-001138.
- A constant error in psychological ratings. J Appl Psychol. 1920;4(1):25.
Copyright © 2013 Society of Hospital Medicine