Optimizing theranostics chatbots with context-augmented large language models
Publisher
Ivyspring International Publisher
Abstract
INTRODUCTION : Nuclear medicine theranostics is rapidly emerging as an interdisciplinary therapy option with multi-dimensional considerations. Healthcare professionals do not have the time to research every therapy option in depth, and personalized chatbots might help to educate them. Chatbots built on Large Language Models (LLMs), such as ChatGPT, are gaining interest as a way to address these challenges. However, chatbot performance often falls short in specialized domains, a critical shortcoming in healthcare applications.
METHODS : This study develops a framework for examining whether contextual augmentation improves the performance of medical theranostics chatbots, and uses it to create the first theranostics chatbot. Contextual augmentation involves providing additional relevant information to LLMs to improve their responses. We evaluate five state-of-the-art LLMs on questions translated into English and German. We compare answers generated with and without contextual augmentation, where the LLMs access pre-selected research papers via Retrieval Augmented Generation (RAG). We use two RAG techniques: Naïve RAG and Advanced RAG.
RESULTS : A user study and an LLM-based evaluation assess answer quality across different metrics. Results show that Advanced RAG techniques considerably enhance LLM performance. Among the models, the best-performing variants are CLAUDE 3 OPUS and GPT-4O. These models consistently achieve the highest scores, indicating robust integration and utilization of contextual information. The most notable improvements between Naïve RAG and Advanced RAG are observed in the GEMINI 1.5 and COMMAND R+ variants.
CONCLUSION : This study demonstrates that contextual augmentation addresses the complexities inherent in theranostics. Despite promising results, key limitations include the biased selection of questions, which focus primarily on PRRT, and the need for comprehensive context documents. Future research should include a broader range of theranostics questions, explore additional RAG methods, and compare human and LLM evaluations more directly to enhance LLM performance further.
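The contextual augmentation described under METHODS can be illustrated with a minimal sketch of the Naïve RAG pattern: retrieve the passages most similar to a question, then prepend them to the prompt sent to the LLM. The abstract does not specify the retrieval method, so the bag-of-words cosine similarity used here is only a stand-in (production systems typically use learned embeddings), and the function names, the prompt wording, and the placeholder passages are all hypothetical. Advanced RAG is not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, passages, k=2):
    """Return the k passages most similar to the question (assumed retrieval step)."""
    q = Counter(tokenize(question))
    ranked = sorted(
        passages,
        key=lambda p: cosine(q, Counter(tokenize(p))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, passages, k=2):
    """Prepend the retrieved passages to the question (the augmentation step)."""
    context = "\n".join(f"- {p}" for p in retrieve(question, passages, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical placeholder passages, not taken from the study's context documents.
PASSAGES = [
    "Passage A discusses patient eligibility criteria for radionuclide therapy.",
    "Passage B covers hospital parking and visiting hours.",
    "Passage C describes follow-up imaging after radionuclide therapy.",
]

if __name__ == "__main__":
    print(build_prompt("Who is eligible for radionuclide therapy?", PASSAGES))
```

The resulting prompt would then be passed to whichever LLM is under evaluation; the non-augmented baseline corresponds to sending the question alone.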
Keywords
Large language model (LLM), Contextual augmentation, Retrieval augmented generation (RAG), Nuclear medicine, Theranostics, Artificial intelligence (AI), Health care professional (HCP)
Sustainable Development Goals
SDG-03: Good health and well-being
SDG-09: Industry, innovation and infrastructure
Citation
Koller, P., Clement, C., Van Eijk, A. et al. 2025, 'Optimizing theranostics chatbots with context-augmented large language models', Theranostics, vol. 15, no. 12, pp. 5693-5704, doi: 10.7150/thno.107757.