Medicine today is imprecise. Among the top 20 drugs in the U.S., up to 80% of patients are non-responders. The goal of precision health is to provide the right intervention for the right people at the right time. The key to realizing this dream is to develop a data-driven learning system that can instantly incorporate new health information to optimize care delivery and accelerate biomedical discovery. In reality, however, the health ecosystem is mired in overwhelming unstructured data and excruciating manual processing. For example, in cancer, standard of care often fails, and clinical trials are the last hope. Yet fewer than 3% of patients can find a matching trial, whereas 40% of trial failures simply stem from insufficient recruitment. Discovery is painfully slow: a new drug may take billions of dollars and over a decade to develop.
In this tutorial, we will explore how large language models (LLMs) can serve as a universal structuring tool to democratize biomedical knowledge work and usher in an intelligence revolution in precision health. We first review the background for precision health and give a broad overview of the AI revolution that culminated in the development of large language models, highlighting key technical innovations and prominent trends such as the consolidation of AI methods across modalities. We then give an in-depth review of biomedical LLMs and precision health applications, with a particular focus on scaling real-world evidence generation and drug discovery. To conclude, we discuss key technical challenges (e.g., bias, hallucination, cost), societal ramifications (e.g., privacy, regulation), and exciting research frontiers such as prompt programming, knowledge distillation, multi-modal learning, and causal discovery.
As a resource, we provide a non-exhaustive list of the papers and other materials that we referred to during the tutorial. We group them into categories aligned with the structure of the tutorial: Precision Health, The Intelligence Revolution, LLMs for Precision Health, Application Challenges, and Research Frontiers.
Precision Health
LLMs for Precision Health
GPT-4 in Medicine
Biomedical LLMs
LLMs for Real-World Evidence
LLMs for Drug Discovery
Application Challenges
Bias
Hallucinations
Research Frontiers
Prompt Programming
Retrieval-Augmented Generation (RAG)
Knowledge Distillation
Multi-modal learning
Causal Discovery
@inproceedings{10.1145/3580305.3599568,
author = {Poon, Hoifung and Naumann, Tristan and Zhang, Sheng and Gonz\'{a}lez Hern\'{a}ndez, Javier},
title = {Precision Health in the Age of Large Language Models},
year = {2023},
isbn = {9798400701030},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3580305.3599568},
doi = {10.1145/3580305.3599568},
booktitle = {Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining},
pages = {5825–5826},
numpages = {2},
keywords = {artificial intelligence, large language model, precision health, machine learning},
location = {Long Beach, CA, USA},
series = {KDD '23}
}