Generative AI Matches Clinician Accuracy in Evaluating Medical Interviews While Cutting Assessment Time by More Than Half

Japanese researchers find AI-based scoring of medical interviews is as accurate as clinician evaluations while reducing feedback time by more than 50%.

By: AXL Media

Published: Apr 14, 2026, 11:49 AM EDT

Source: Information for this report was sourced from EurekAlert!


Addressing the Assessment Burden in Medical Training

Clinical interviewing is the cornerstone of medical diagnosis, yet training students in this skill is notoriously resource-intensive. Traditionally, evaluating a student's performance requires experienced clinicians to spend hours observing sessions or reviewing transcripts to provide detailed feedback. To address this bottleneck, researchers at Juntendo University Faculty of Medicine in Japan explored whether advanced generative AI could perform these evaluations with the same level of nuance as human instructors. Their findings, published in February 2026, indicate that AI scoring matched clinician accuracy while taking less time and producing more consistent results across repeated evaluations.

The Study: ABA vs. HBA

The research team, led by Dr. Hiromizu Takahashi and Professor Toshio Naito, compared AI-based assessment (ABA) against traditional human-based assessment (HBA). Seven participants, ranging from medical students to attending physicians, conducted interviews with an AI-simulated patient. These interactions were transcribed and scored using the Master Interview Rating Scale, a standard tool for measuring empathy, information gathering, and organization. The AI models (including GPT-5 Pro) scored the same transcripts as five independent clinical instructors. The results showed strong agreement between the AI and human scores, with the AI demonstrating superior consistency across repeated evaluations of the same text.

Efficiency and Scalability

One of the most compelling outcomes of the study was the gain in efficiency. The AI system reduced the time required to evaluate each interview transcript by more than half. That speed makes possible a near-real-time feedback loop that most overstretched medical programs cannot currently offer. "Students could interview an AI-simulated patient and receive feedback almost immediately instead of waiting days or weeks," noted Prof. Naito. This scalability allows for repeated, self-directed practice, so students can refine their communication skills at their own pace without adding to their instructors' workload.
