KeyLIME Podcast #183: Assessing Portfolios: Turning complex data into pass/fail

————————————————————————–

Reference:

Andrea Oudkerk Pool, Marjan J. B. Govaerts, Debbie A. D. C. Jaarsma, Erik W. Driessen. From aggregation to interpretation: how assessors judge complex data in a competency-based portfolio. Adv in Health Sci Educ (2018) 23:275–287.

Reviewer: Jonathan Sherbino (@sherbino)

Background

Competency committees are all the rage. The promise of CBME is that group decision making and local accountability will lead to a more rigorous and valid summative assessment of a learner. After all, group decision making is the backbone of our judicial system (e.g., jury) so it should transfer to medical education, right?

However, like many principles that undergird the CBME construct, the theory has yet to be tested against the lived experience. This paper tackles an intermediate step crucial to the functioning of a competency committee – the review and assessment of the learner portfolio. An assumption in this system is that an assessor can select, interpret and integrate all of the relevant evidence in the portfolio to reach a global decision that is presented to the competency committee. Of course, we’ve talked many times about the challenges of rater assessments and bias on KeyLIME. So, if you’re interested in how an individual assessor comes to a global judgement about the aggregated data in a portfolio, read on. (Psst… it’s not a simple process).

Purpose

“The purpose of the present research is therefore to further our understanding of how assessors form judgments when interpreting the complex data included in a competency-based portfolio.”

Key Points on Method

The investigators adopted constructivist grounded theory to inform the thematic analysis. REB approval was granted.

Senior medical students (Master’s in Medicine program, Maastricht University) are assigned a mentor-assessor. The mentor-assessor monitors competency development (aligned with CanMEDS) and guides self-assessment and establishment of learning goals. Student/mentor meetings occur quarterly. The mentor-assessor must send a summative assessment to the competency committee for a final pass/fail decision.

Three mock portfolios were developed, containing narrative and numerical data from self-assessments, WBAs, progress tests and a CV:

  • Positive feedback for Medical Expert plus critical and positive feedback in Manager and Communicator.
  • Critical and positive feedback for Medical Expert plus positive feedback for all Intrinsic Roles.
  • Positive feedback for all Roles.

Eighteen mentor-assessors were purposively sampled to represent a variety of clinical disciplines, with six assessors assigned to each mock portfolio. A think-aloud protocol was used to capture conscious reasoning processes. After each assessor provided a summary assessment (insufficient, sufficient, good), a semi-structured interview was completed.


Key Outcomes

Sufficiency was achieved after 12 think-aloud/interview sessions.

1. Assessors follow a cyclical process. First, key and credible information is identified to ground the initial global assessment of performance. Second, additional information is reviewed and compared to the key data to confirm or refute the global assessment. Finally, this process continues until a pattern is confirmed. However, the initial judgement was hard to change, even when disconfirming evidence was discovered. Final judgements were more elaborate than preliminary judgements in their supporting evidence, but they were not substantially different.

2. There was no consistent approach to reviewing portfolio data. Some assessors reviewed all of the data; others emphasized self-assessment, while others emphasized WBAs.

Some assessors valued physician data, while others valued peer or nursing data. This difference was influenced by assessors' beliefs about performance and about assessment.

3. Narrative data was more influential in guiding an overall assessment because of the richness of the data. Numerical data was used to support opinions, not develop opinions.

4. Operational definitions of competence varied. Most assessors emphasized Medical Expert, Manager and Communicator, only scanning the other Intrinsic Roles. Students with data related to these other Roles were often deemed “above average”, as such data was uncommon and often hard to collect.

5. There was inconsistency in the global assessment of performance based on idiosyncratic interpretation of what the data meant. For example, non-medical extracurricular activities (from the CV) influenced some assessors' interpretation of portfolio data. One assessor might score a learner with many hobbies more favourably than a better-performing learner without any hobbies, while another assessor would not.

Of note, this study does not describe how the reflexivity of the researchers was addressed. Nor is there evidence of a member check of the data. Finally, the transferability of the results is limited, as portfolios are often inherently unique to an institution and the process for assessing the portfolio is equally institution-specific.

Key Conclusions

“Assessors were able to form a judgment based on the portfolio evidence alone. Although they reached the same overall judgments, they differed in the way they processed the evidence and in the reasoning behind their judgments. Differences sprung from assessors’ divergent assessment beliefs, performance theories, and inferences acting in concert. These findings support the notion that portfolios should be judged by multiple assessors who should, moreover, thoroughly substantiate their judgments. Also, assessors should receive training that provides insight into factors influencing their own decision making process and group decisions.”

Spare Keys - other take-home points for clinician educators

Maastricht employs an unusual process, where a faculty mentor is also a portfolio assessor. I am concerned about this internal conflict between responsibility for advocacy for the learner and responsibility to the program to maintain an appropriate standard.

CEs need to look widely for the best place to publish before they think about tweeting about their paper!