Health

Language Models as Emergency Room Decision Support


**Summary:** A study published in *Science* found that large language models (LLMs) such as OpenAI's o1, o1-preview, and GPT-4o matched or outperformed physicians in diagnosing emergency cases, particularly in time-critical scenarios with limited patient data. Researchers at Beth Israel Deaconess Medical Center tested AI performance on real emergency department data and educational case studies; independent physician evaluations confirmed that the models' diagnostic and decision-making capabilities were often superior, especially at initial patient contact.

A study published in *Science* examined whether large language models (LLMs) could assist physicians in emergency room decision-making. Led by Peter Brodeur, MD, a resident at Beth Israel Deaconess Medical Center in Boston, the research team compared the diagnostic performance of OpenAI's o1, o1-preview, and GPT-4o against that of human doctors.

The study drew on a mix of educational case datasets, including reports from *NEJM* and the American College of Physicians' podcast, as well as real, unstructured emergency department data from the U.S. The LLMs were tasked with describing clinical cases, generating differential diagnoses, recommending tests, and justifying their decisions. Their outputs were evaluated against the original treatment plans by two physicians, with disputes resolved by a third reviewer. In a second phase, the researchers tested the models in real-time emergency scenarios, where the AI had access to varying amounts of patient information, from symptoms and vital signs to medical history.

The LLMs' diagnoses were rated at least helpful in most cases, and their suggested tests were deemed appropriate. Notably, the models outperformed physicians in diagnostic accuracy and decision-making, particularly during initial emergency department assessments.

Gitta Kutyniok, chair of Mathematical Foundations of Artificial Intelligence at Ludwig-Maximilians-Universität München, praised the study's methodology, calling it stronger than earlier AI benchmarks because of its use of real emergency cases and blinded evaluations. Felix Nensa, a senior physician at Essen University Hospital in Germany, also acknowledged the study's rigor but cautioned about its limitations: the research relied solely on text-based inputs, so the LLMs could not process imaging or laboratory results. Despite this, the findings suggest modern AI could serve as a valuable decision-support tool in emergency medicine, particularly in high-pressure situations where rapid, accurate assessments are critical.

This content was automatically generated and/or translated by AI. It may contain inaccuracies. Please refer to the original sources for verification.
