Optimizing theranostics chatbots with context-augmented large language models

INTRODUCTION: Nuclear medicine theranostics is rapidly emerging as an interdisciplinary therapy option with multi-dimensional considerations. Healthcare professionals do not have the time to research every therapy option in depth, and personalized chatbots might help to educate them. Chatbots based on Large Language Models (LLMs), such as ChatGPT, are attracting interest as a way to address these challenges. However, chatbot performance often falls short in specialized domains, which is critical in healthcare applications.

METHODS: This study develops a framework for examining how contextual augmentation can improve the performance of medical theranostics chatbots, with the aim of creating the first theranostics chatbot. Contextual augmentation involves providing additional relevant information to LLMs to improve their responses. We evaluate five state-of-the-art LLMs on questions translated into English and German. We compare answers generated with and without contextual augmentation, where the LLMs access pre-selected research papers via Retrieval Augmented Generation (RAG). We use two RAG techniques: Naive RAG and Advanced RAG.

RESULTS: A user study and an LLM-based evaluation assess answer quality across different metrics. Results show that Advanced RAG techniques considerably enhance LLM performance. Among the models, the best-performing variants are Claude 3 Opus and GPT-4o. These models consistently achieve the highest scores, indicating robust integration and utilization of contextual information. The most notable improvements between Naive RAG and Advanced RAG are observed in the Gemini 1.5 and Command R+ variants.

CONCLUSION: This study demonstrates that contextual augmentation addresses the complexities inherent in theranostics. Despite promising results, key limitations include a biased selection of questions, which focus primarily on peptide receptor radionuclide therapy (PRRT), and the need for comprehensive context documents. Future research should include a broader range of theranostics questions, explore additional RAG methods, and compare human and LLM evaluations more directly to further enhance LLM performance.
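The Naive RAG idea described in the methods (retrieve relevant passages, then hand them to the LLM together with the question) can be illustrated with a minimal sketch. This is not the study's pipeline: the function names, the word-overlap similarity (a stand-in for the embedding-based retrieval a real system would likely use), the prompt wording, and the placeholder passages are all assumptions made for illustration, and no LLM is actually called.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question (hypothetical helper)."""
    q = Counter(tokenize(question))
    ranked = sorted(
        chunks,
        key=lambda c: cosine(q, Counter(tokenize(c))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, chunks, k=2):
    """Assemble a context-augmented prompt; the wording is an assumption."""
    context = "\n".join(f"- {c}" for c in retrieve(question, chunks, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Placeholder passages, not taken from the study's context documents.
chunks = [
    "PRRT is a form of radionuclide therapy discussed in paper A.",
    "Retrieval augmented generation supplies documents to a language model.",
    "Unrelated note about scheduling.",
]

prompt = build_prompt("What is PRRT therapy?", chunks, k=1)
print(prompt)
```

The resulting prompt string would then be sent to whichever LLM is being evaluated. Advanced RAG variants typically add steps around this core, such as query rewriting or re-ranking of the retrieved passages, which are not shown here.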

Keywords

Large language model (LLM), Contextual augmentation, Retrieval augmented generation (RAG), Nuclear medicine, Theranostics, Artificial intelligence (AI), Health care professional (HCP)

Sustainable Development Goals

SDG-03: Good health and well-being
SDG-09: Industry, innovation and infrastructure

Citation

Koller, P., Clement, C., Van Eijk, A. et al. 2025, 'Optimizing theranostics chatbots with context-augmented large language models', Theranostics, vol. 15, no. 12, pp. 5693-5704, doi: 10.7150/thno.107757.

URI

http://hdl.handle.net/2263/103308

Collections

Research Articles (Nuclear Medicine)
Research Articles (University of Pretoria)
