KeyLIME Podcast #216: “I know kung fu”: Using gaze-tracking tech to identify expertise.

This week’s discussion is around a study of the information-gathering techniques of a group of emergency medicine residents. Using a mobile gaze-tracking device, the researchers reported on how the visual fixation patterns of the residents correlated with their objective performance. Listen to the hosts discussing the article here (or on iTunes).


KeyLIME Session 216:

Listen to the podcast.



Szulewski A, et al. A new way to look at simulation-based assessment: the relationship between gaze-tracking and exam performance. CJEM. 2019 Jan;21(1):129-137. Epub 2018 Jun 21.


Jonathan Sherbino (@sherbino)


As physicians, our clinical world is complex.  Think back to your first day in the OR, in the clinic, on rounds.  I was in simple brain stem survival mode.  My cognitive bandwidth was overwhelmed by extraneous load.  I couldn’t process key information from a blood gas because of all the alarms, and noise, and distracting movements of other healthcare professionals entering and exiting clinical areas. I tried to determine if this data was important, but the extraneous load was so overwhelming that I had no bandwidth for intrinsic load (to actually solve the acid/base problem), let alone to monitor my learning and other actions (i.e. germane load).  I hope I’m an expert now... although many of my patients and colleagues may disagree with this statement.  Now, as I walk into an emergency department, the cries of an agitated and delirious patient, the sounding alarms, and the near-constant interruptions to interpret an ECG or provide an order for patient care don’t overwhelm my cognitive bandwidth.  I have learned micro automations to preserve my bandwidth and focus in on patterns that seem abnormal.  Does this resonate with you?

How do we get from novice to expert?  And how do we diagnose the expert?  Is there an assessment hack that can help us trend performance?  This study may be a solution.


“This study explored the information-gathering techniques of residents by analyzing their initial visual fixation patterns in a simulated resuscitation environment.”

Key Points on the Methods

A convenience sample of emergency medicine residents from one residency training program was recruited during two separate sessions.  Each session consisted of two 10-minute simulated scenarios in a “high-fidelity simulation lab that included robotic mannequins and confederates (e.g. nurse and respiratory therapist).”  The four scenarios were:

  • DKA
  • Symptomatic β-blocker overdose
  • Ruptured AAA
  • Toxic alcohol ingestion

The scenarios were previously demonstrated to be highly reliable.

The residents were fitted with a mobile gaze-tracking device, calibrated to each participant, that records the participant’s first-person point of view and superimposes a gaze indicator on it (i.e. a final video that shows what is in front of the participant and where they are looking).

Using software and two human raters, gaze-tracking variables were measured (e.g. areas in the environment where scanning stopped and dilation occurred, time to fixation, frequency of fixation etc.) during the first 60 seconds of each scenario.  A Pearson correlation coefficient was calculated with a priori areas of interest, determined by subject matter experts. Intra-class correlation coefficients were calculated to determine inter-rater reliability.
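
For readers less familiar with the statistic, here is a minimal sketch of how a Pearson correlation coefficient can be computed between a gaze-tracking variable and a performance score. This is not the authors' analysis code. The function name `pearson_r`, the variable names, and all of the numbers are assumptions made up purely for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means (numerator)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (denominator terms)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical values, NOT data from the study:
# number of fixations on one area of interest in the first 60 seconds,
# and the same residents' scores on a performance instrument.
fixation_counts = [3, 5, 2, 8, 6, 4]
performance_scores = [52, 61, 48, 79, 70, 58]

r = pearson_r(fixation_counts, performance_scores)
print(f"r = {r:.2f}")
```

A value of r near +1 or -1 indicates a strong linear association and a value near 0 indicates little linear association. As the authors themselves stress, a correlation of this kind describes a pattern and does not establish cause.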

Blinded experts assessed participant performance in managing the scenario using a previously validated direct observation instrument.

Analysis of variance was performed to determine whether gaze-tracking data discriminated quality of performance separately from postgraduate year of training (i.e. a surrogate of experience).

Key Outcomes

There were 42 participants across both sessions (with 5 participants in both sessions). 78 scenarios were analyzed.

Inter-rater reliability was acceptable for number of fixations but not for time to fixation.  Expert assessment of performance was acceptable.

Analysis of variance was inconclusive because of limitations of recruitment and distribution of experience between scenarios.

Within scenarios there was an association between specific a priori gaze patterns and objective performance (e.g. fixation on task-relevant information and ignoring task-irrelevant information) that was not predicted by postgraduate year of training.

Key Conclusions

The authors conclude…

“Residents’ gaze-tracking patterns were found to be significantly correlated with objective performance in the simulation-based resuscitation examinations. Because of the number of participants and array of data, correlation was used to elucidate patterns and associations. Given that this was a pioneering study on the use of gaze analytics to establish patterns associated with performance, these results have not been put forward as definitive findings; rather, they lay the groundwork for future study in this novel area.”

Spare Keys – other take home points for clinician educators

This study suggests the need to pilot novel education innovations before embarking on a trial.  There is a sense in the discussion that the complexity of the intervention (e.g. strongly contrasting scenarios, scoring schema that did not align with a priori priority areas for gaze fixation, lack of rater training etc.) may have contributed to a type II error.  Sometimes in a program of research a simple first study is required for proof of concept before starting the robust trial.  Fixing a study post-trial with new analyses or the addition of study components not in the original design can make for an inelegant project.  Believe me, I know from a lot of painful experience.  My latest version of this has taken me > 2 years to get from original study to analysis.

Access KeyLIME podcast archives here

Check us out on iTunes