Acute Ischemic Stroke Prognosis Improved by LLM

04/21/2026
At the AAN 2026 conference, investigators presented a reasoning-enhanced large language model for acute ischemic stroke prognosis.
In a single-center cohort of 464 patients, the system used routine discharge summaries to predict 90-day modified Rankin Scale (mRS) outcomes. Its mean absolute error was 1.00 mRS points, the headline figure from the presentation. On the reported primary measures, performance matched GPT-4.1 and exceeded Clinical BERT and a variable-based support vector machine. The findings suggest that routine discharge documentation carries prognostic signal, although they come from an early single-center report.
Investigators asked whether narrative discharge summaries could support 90-day outcome prediction when structured records did not capture all relevant detail. They analyzed 464 acute ischemic stroke patients treated at a single center between 2010 and 2023. All had discharge summaries and 90-day mRS scores available for model development and evaluation. They called the system COPE, or Chain of thought Outcome Prediction Engine, and described it as a two-stage large language model framework. The first model generated clinical reasoning, and the second used that reasoning to predict functional outcome on the mRS scale.
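The two-stage design described above can be pictured as a simple composition of two model calls. The sketch below is only illustrative: the function names, the stub models, and the toy summary are hypothetical and are not taken from the presentation, and it assumes the second stage sees only the generated reasoning, which the report does not specify.

```python
from typing import Callable


def two_stage_predict(
    summary: str,
    reason: Callable[[str], str],
    score: Callable[[str], int],
) -> int:
    """Stage 1 writes clinical reasoning; stage 2 maps that reasoning to an mRS score."""
    reasoning = reason(summary)   # first model: free-text reasoning
    mrs = score(reasoning)        # second model: functional outcome from the reasoning
    if not 0 <= mrs <= 6:         # the mRS scale runs from 0 to 6
        raise ValueError(f"mRS out of range: {mrs}")
    return mrs


# Hypothetical stand-ins for the two language models (placeholders, not real LLM calls).
def stub_reasoner(summary: str) -> str:
    if "independent" in summary.lower():
        return "Ambulating independently at discharge; likely good recovery."
    return "Significant residual deficits noted; likely dependent."


def stub_scorer(reasoning: str) -> int:
    return 1 if "good recovery" in reasoning else 4


# Toy input, invented for illustration only.
predicted = two_stage_predict(
    "Discharged home, walking independently.", stub_reasoner, stub_scorer
)
```

Keeping the two stages as separate callables makes it easy to swap in a direct, reasoning-free predictor for comparison.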
In the comparison presented at the meeting, 75% of COPE predictions fell within 1 mRS point of the observed outcome, and 33% matched it exactly. Mean absolute error was 1.00 for COPE versus 1.28 for both Clinical BERT, the alternative text model, and the variable-based support vector machine, the structured-variable benchmark. Both comparators also had lower overall accuracy. Overall, the results favored the reasoning-based text model over both alternatives.
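For readers less familiar with these measures, the three reported metrics can be computed from paired observed and predicted mRS scores roughly as follows. This is a minimal sketch, not the investigators' evaluation code; the function name and the toy scores are made up for illustration and are not study data.

```python
from typing import Dict, Sequence


def mrs_metrics(observed: Sequence[int], predicted: Sequence[int]) -> Dict[str, float]:
    """Return mean absolute error, within-1-point accuracy, and exact accuracy."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    errors = [abs(o - p) for o, p in zip(observed, predicted)]
    n = len(errors)
    return {
        "mae": sum(errors) / n,                        # average distance in mRS points
        "within_1": sum(e <= 1 for e in errors) / n,   # share of predictions off by at most 1
        "exact": sum(e == 0 for e in errors) / n,      # share of predictions exactly right
    }


# Toy scores, invented for illustration only.
observed = [0, 2, 3, 6, 1, 4]
predicted = [1, 2, 5, 6, 1, 2]
metrics = mrs_metrics(observed, predicted)
```

A mean absolute error of 1.00 therefore means predictions were, on average, one mRS point away from the observed 90-day score.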
When investigators removed the reasoning component from COPE, exact accuracy fell from 33% to 23%. Section-level ablation identified the Medications section and the Discharge and Follow up Summary as the most informative parts of the discharge summaries: removing either one produced the largest performance drop.
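Section-level ablation of this kind can be sketched as a loop that drops one note section at a time and measures how much the error grows. The example below is a hedged illustration only: representing a note as a dictionary of sections, using mean absolute error as the performance measure, and the toy predictor are all assumptions, since the report does not describe how the ablation was implemented or scored.

```python
from typing import Callable, Dict, List

Note = Dict[str, str]  # hypothetical layout: section name -> section text


def mae(observed: List[int], predicted: List[int]) -> float:
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def section_ablation(
    notes: List[Note],
    observed: List[int],
    predict: Callable[[Note], int],
) -> Dict[str, float]:
    """Return the increase in MAE when each section is removed from every note."""
    baseline = mae(observed, [predict(n) for n in notes])
    section_names = sorted({name for n in notes for name in n})
    deltas: Dict[str, float] = {}
    for name in section_names:
        # Rebuild each note without the section under test.
        ablated = [{k: v for k, v in n.items() if k != name} for n in notes]
        deltas[name] = mae(observed, [predict(n) for n in ablated]) - baseline
    return deltas


# Toy predictor and notes, invented for illustration only.
def toy_predict(note: Note) -> int:
    return 2 if "Medications" in note else 3


notes = [
    {"History": "a", "Medications": "b"},
    {"History": "c", "Medications": "d"},
]
deltas = section_ablation(notes, [2, 2], toy_predict)
```

Sections with the largest positive delta are the ones the predictor depends on most.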
