How Does ChatGPT Affect Post-Test Probability?

Original title: ChatGPT and post-test probability

Authors: Samuel J. Weisenthal

The article explores how ChatGPT, a large language model fine-tuned with reinforcement learning from human feedback, handles probabilistic diagnostic reasoning, a core task in healthcare. It examines ChatGPT's ability to convert pre-test probabilities into post-test probabilities, a step central to medical diagnosis. The authors queried ChatGPT with a range of prompts, from purely probabilistic language to medically framed scenarios, asking it to apply Bayes' rule. The study finds that ChatGPT makes more errors when prompted with medical terminology than with purely probabilistic prompts, and that introducing medical variables raises its error rate. Prompt engineering strategies, however, show promise in reducing these errors. The findings point toward ways to improve ChatGPT's diagnostic reasoning, address concerns around sensitivity and specificity, and guide future enhancements of large language models.
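For context, the pre-test to post-test update the paper probes is the standard Bayes'-rule calculation for a positive diagnostic test with known sensitivity and specificity. The sketch below is illustrative only and is not taken from the paper; the function name and example numbers are chosen for demonstration.

```python
def post_test_probability(pre_test: float, sensitivity: float, specificity: float) -> float:
    """Probability of disease after a POSITIVE test, via Bayes' rule.

    P(D | +) = sens * pre / (sens * pre + (1 - spec) * (1 - pre))
    """
    true_positive = sensitivity * pre_test          # P(+ | D) * P(D)
    false_positive = (1 - specificity) * (1 - pre_test)  # P(+ | not D) * P(not D)
    return true_positive / (true_positive + false_positive)


# Example: 10% pre-test probability, 90% sensitivity, 90% specificity.
# A positive test raises the probability to 50%.
print(post_test_probability(0.10, 0.90, 0.90))  # → 0.5
```

This is the kind of calculation the study asks ChatGPT to perform, where a seemingly large post-test jump (10% to 50%) illustrates why getting the arithmetic right matters clinically.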

Original article: https://arxiv.org/abs/2311.12188