The Key Literature in Medical Education podcast this week looks at the hot topic of milestones. As a teaser, the Royal College Milestones (and the CanMEDS 2015 Framework) will be released next month at the International Conference on Residency Education (ICRE), so this topic is certainly timely. Details about how a pediatrics program used the ACGME milestones are in the abstract below. To get a better sense of how this paper from JGME can help you as a Clinician Educator, subscribe to the podcast here.


Bartlett KW, Whicker SA, Bookman J, Narayan AP, Staples BB, Hering H, McGann KA. Milestone-Based Assessments Are Superior to Likert-Type Assessments in Illustrating Trainee Progression. Journal of Graduate Medical Education. 2015 Mar;7(1):75-80.

Reviewer: Jason Frank

Worldwide, there is growing concern about and dissatisfaction with the assessment systems we have been using in the health professions: they are ad hoc, unreliable, and often not valid in any sense.

In response, there has been a move to new educational technologies and approaches: “programmatic assessment” [see the work of Schuwirth & van der Vleuten, such as Med Ed 2005] and competency-based medical education in the form of milestones and EPAs (Entrustable Professional Activities). However, little evidence is available about the role of milestones or EPAs in assessment.

Fast out of the gate arrives Bartlett’s group from Duke in the US. They set out to “determine if milestone-based assessments better stratify trainees than Likert-type assessments”.

Type of paper
Research: Observational Database study

Key Points on the Method
This study took place in a Pediatrics training program in the US. I found the authors’ description of their methods complex and difficult to understand. In short, they decided that 3 of the new milestones deployed as part of the American Board of Pediatrics’ response to the new ACGME system were essentially identical to 3 of their “old” Likert-based assessment forms. The 3 domains involved data gathering at the bedside, MD-patient communication skills, and plan formulation from a clinical encounter. They then looked at 2 research questions:

1) For the first 7 months of rotations in these 3 domains, how did the mean scores of the new (2013) PGY1 “milestone” cohort compare to the previous (2012) cohort in PGY1?

2) Comparing all the mean assessment scores achieved by all the residents in 2012 (using Likert tools) and all the residents in 2013 (using milestones tools), which were more discriminating by PGY?

They also took pains to compare various characteristics of the 2 cohort years to show that they were similar in other ways. Notably, the “milestones” phase involved intensive faculty development in assessment.

Key Conclusions
The authors conclude that “milestones are better”, because they show progression of competence.

Unfortunately, the study has some major threats to validity and generalizability, including:

  1. Single site, single program
  2. Only 7 months of data after launch of a major change
  3. One cohort had faculty development that may account for any signal
  4. The biggie: the “milestones”, as used here, were essentially a Likert scale themselves, so this may be an odd comparison. My impression is that this is really an observation of some mean scores on 2 different Likert-based instruments.

Having said this, full kudos for being scholarly about the change to a new system. There may be some signal here about progression (validity evidence for the concept of milestones), but more study is needed.

