Research Reveals Limitations of ChatGPT and Similar Models
Artificial intelligence tools like ChatGPT are not reliable for medical diagnosis and perform no better than a simple online search, according to a new study published in Nature Medicine. The research, involving 1,300 participants in the UK, tested several AI models, including ChatGPT, Meta’s Llama, and Cohere’s Command R+.
Only One-Third of Diagnoses Were Correct
In the study, participants were given ten different sets of symptoms with established medical diagnoses. The AI models correctly identified the conditions only about one-third of the time—a rate equivalent to that achieved by a control group using standard internet searches.
“There is a lot of hype around AI, but they are simply not ready to replace a doctor,” said Rebecca Payne, a researcher at the University of Oxford and co-author of the study, in a statement.
The Gap Between Exams and Real-World Application
While previous studies have shown that AI models can pass medical exam questions, such as the multiple-choice tests designed for students, the new findings highlight a significant shortfall when these models interact with symptom descriptions from real people.
The study underscores that, despite advancements, human medical professionals remain essential for accurate diagnosis and patient care.