CLASSIFICATION OF CAROTID DOPPLER REPORTS BY LARGE LANGUAGE MODELS: A BRIEF OBSERVATION
Dear Editor,
Large language models (LLMs) are increasingly important in clinical decision support systems and medical education because of their ability to analyse medical texts (1). This brief observation evaluates the performance of four LLMs, namely ChatGPT-4o (OpenAI), Claude 3.7 Sonnet (Anthropic), Gemini 1.5 Pro (Google DeepMind), and Grok-3 (xAI), in classifying internal carotid artery (ICA) stenosis according to the Society of Radiologists in Ultrasound (SRU) criteria from the velocity parameters given in carotid Doppler ultrasonography (USG) reports (2). A total of 40 USG reports were used, presenting the same velocity data in two distinct formats of 20 reports each. Each report included the peak systolic velocity (PSV), end diastolic velocity (EDV), and ICA/common carotid artery (CCA) PSV ratio for both the right and left ICA. The first 20 reports contained only non-directive descriptive statements. The remaining 20 retained the same velocity values but added directive phrases such as “plaques not causing significant stenosis” and “no haemodynamically significant stenosis detected”.
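For readers unfamiliar with the classification task, the velocity-based grading asked of the models can be sketched as below. This is a minimal illustration only, not the procedure used in this observation. The cut-off values are the commonly cited SRU consensus thresholds and are an assumption here, since they are not stated in this letter; the names `IcaVelocities` and `classify_ica` are hypothetical. PSV is treated as the primary parameter, and EDV and the ICA/CCA ratio are checked only for agreement. Near occlusion and total occlusion are not handled, because they cannot be graded from velocities alone.

```python
from dataclasses import dataclass

# ASSUMPTION: cut-offs follow the commonly cited SRU consensus table;
# they are not given in this letter and should be checked against reference (2).
PSV_CUTS = (125.0, 230.0)   # ICA peak systolic velocity, cm/s
EDV_CUTS = (40.0, 100.0)    # ICA end diastolic velocity, cm/s
RATIO_CUTS = (2.0, 4.0)     # ICA/CCA PSV ratio

# Below the first cut-off, "normal" and "<50%" differ only by visible plaque,
# which velocities cannot show, so both are reported here as "<50%".
GRADES = ("<50%", "50-69%", ">=70%")


@dataclass
class IcaVelocities:
    """Velocity parameters for one ICA (hypothetical container)."""
    psv: float
    edv: float
    ica_cca_ratio: float


def _grade(value, cuts):
    """Map one parameter onto a stenosis grade using its two cut-offs."""
    low, high = cuts
    if value < low:
        return GRADES[0]
    if value <= high:
        return GRADES[1]
    return GRADES[2]


def classify_ica(v):
    """Return (grade from PSV, list of secondary parameters that disagree)."""
    primary = _grade(v.psv, PSV_CUTS)
    secondary = {
        "edv": _grade(v.edv, EDV_CUTS),
        "ratio": _grade(v.ica_cca_ratio, RATIO_CUTS),
    }
    discordant = [name for name, g in secondary.items() if g != primary]
    return primary, discordant


if __name__ == "__main__":
    # Illustrative values only, not taken from the study reports.
    print(classify_ica(IcaVelocities(psv=180.0, edv=60.0, ica_cca_ratio=3.0)))
```

Under these assumed thresholds, a grading based on velocities is independent of any accompanying free-text statement, which is the property the two report formats are designed to probe.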