A research team in Canada has concluded that advanced AI systems are still not good enough when it comes to diagnosing ailments. LLM stands for large language model, a type of AI system trained on large collections of data so that it can perform language tasks and generate information of a particular kind; that training is what drives its task performance.
Medical researchers at Western University's Schulich School of Medicine and Dentistry have found that, despite being trained to come up with diagnoses and treatment plans, generative AI is still not a completely reliable diagnostic tool. The study, published in the open-access journal PLOS ONE, evaluated the LLM on 150 cases taken from Medscape, a popular website created for and used by medical professionals. ChatGPT 3.5 was given data such as patient history, lab results, and office exam findings, and the research team assessed how close its answers came to the correct diagnosis. They also evaluated how well it explained its path to a diagnosis, including whether it offered citations, which are an important part of medical diagnostics. Averaged across all the case studies, the LLM gave a correct diagnosis 49% of the time. Many medical professionals have advised caution when seeking health advice from this or any other LLM.
That said, the research team acknowledged that the LLM did a good job of explaining how it reached a diagnosis, which they suggest might prove useful for medical students, perhaps in a problem-based learning (PBL) setting, and that it was also good at ruling out possible ailments. They ultimately concluded that the LLMs in use today are not yet fully ready for diagnostic use.
