

ChatGPT's Performance Found Lacking on Cancer Treatment Recommendations
In an article published by the Dana-Farber Cancer Institute, one of its physicians relayed his experience using ChatGPT to provide statistics on a certain type of cancer. To his surprise, ChatGPT made up an equation and even gave it a name.
"It was an equation that does nothing, but it looked very convincing," said Benjamin Schlechter, M.D., who specializes in gastrointestinal cancers. "In a way, it's like talking to children: They start making up a story and continue the more you ask them about it. In this case, ChatGPT was adding detail after detail, none of it real, because I asked it to elaborate. It's very confident for a computer."
It turns out that ChatGPT has similar problems with accuracy when making cancer treatment recommendations, according to a recently published study.
Researchers from Mass General Brigham found that one-third of GPT-3.5's recommendations went at least partially against the 2021 National Comprehensive Cancer Network guidelines. "Clinicians should advise patients that large language model chatbots are not a reliable source of information," the study concluded.
The chatbot was most likely to mix incorrect recommendations in among correct ones, creating errors that are difficult even for experts to detect. The study evaluated only one model at a single point in time, but the findings point to areas of concern and future research needs.
Danielle Bitterman, M.D., of Mass General Brigham's department of radiation oncology and its artificial intelligence (AI) in medicine program, said in a statement: "ChatGPT responses can sound a lot like a human and can be quite convincing. But, when it comes to clinical decision-making, there are so many subtleties for every patient's unique situation. A right answer can be very nuanced, and not necessarily something ChatGPT or another large language model can provide."
The chatbot did not purport to be a medical device, and need not be held to such standards, the study said. Patients, however, likely will use technologies like this to educate themselves, which may affect shared decision-making in the doctor-patient relationship.
The investigators plan to explore how well patients and physicians can distinguish between medical advice written by a physician and advice generated by AI. They also plan to prompt ChatGPT with more detailed clinical cases to further evaluate its clinical knowledge.