How good are ‘AI doctors’ — and will they take over medicine?

A study published in *Science* found that OpenAI’s AI model, o1, outperformed doctors in diagnosing emergency department cases, achieving 67% accuracy compared with 50–55% for physicians. Meanwhile, Google’s AI chatbot, AMIE, matched human doctors in diagnostic accuracy during text-based patient interactions but fell short in proposing practical treatment plans.
Researchers are exploring the potential of AI in medicine, and recent studies highlight both progress and limitations. In April, a study in *Science* reported that o1, an advanced large language model from San Francisco-based OpenAI, diagnosed emergency department cases more accurately than doctors at a Boston hospital. The AI achieved 67% accuracy, while human physicians scored 50–55%. The comparison used real-world patient data rather than simulated scenarios.
Another study, posted on arXiv in March, tested Google’s AI system, AMIE, which communicated with patients via text messages before their clinic visits. AMIE correctly identified the diagnosis in 75% of cases, and its top suggestion matched the final diagnosis in 56% of cases, a performance comparable to that of human doctors. However, human clinicians provided more practical and cost-effective treatment plans than AMIE.
Experts caution that AI cannot yet replace physicians, because medicine involves complex, unpredictable patient interactions. David Wu of Harvard Medical School noted that AI struggles with real-world variability, where patients often present with non-textbook symptoms. While AI excels at structured tasks such as note-taking and prescription renewals, its ability to handle nuanced medical scenarios remains unproven.
The studies mark progress in AI’s role in healthcare, but researchers emphasize the need for further development before widespread adoption. Tools such as o1 and AMIE show potential in diagnostics, yet integrating them into clinical workflows requires addressing gaps in patient interaction and treatment planning.