Precision Health in the Age of LLMs

KDD 2023 Tutorial LS-21 | Thursday, August 10

Microsoft Research

Precision Health in the Age of Large Language Models (LLMs) was presented as tutorial LS-21 at KDD 2023 on Thursday, August 10, from 10am to 1pm. We provide the presented materials along with additional resources for those interested in this topic.


Medicine today is imprecise. Among the top 20 drugs in the U.S., up to 80% of patients are non-responders. The goal of precision health is to provide the right intervention for the right people at the right time. The key to realizing this goal is to develop a data-driven learning system that can instantly incorporate new health information to optimize care delivery and accelerate biomedical discovery. In reality, however, the health ecosystem is mired in overwhelming unstructured data and excruciating manual processing. For example, in cancer, standard of care often fails, and clinical trials are the last hope. Yet fewer than 3% of patients can find a matching trial, whereas 40% of trial failures simply stem from insufficient recruitment. Discovery is painfully slow, as a new drug may take billions of dollars and over a decade to develop.

In this tutorial, we will explore how large language models (LLMs) can serve as a universal structuring tool to democratize biomedical knowledge work and usher in an intelligence revolution in precision health. We first review the background of precision health and give a broad overview of the AI revolution that culminated in the development of large language models, highlighting key technical innovations and prominent trends such as the consolidation of AI methods across modalities. We then give an in-depth review of biomedical LLMs and precision health applications, with a particular focus on scaling real-world evidence generation and drug discovery. To conclude, we discuss key technical challenges (e.g., bias, hallucination, cost) and societal ramifications (e.g., privacy, regulation), as well as exciting research frontiers such as prompt programming, knowledge distillation, multi-modal learning, and causal discovery.


As a resource, we provide a non-exhaustive list of papers and other materials that we referred to during the tutorial. We group them into five categories aligned with the structure of the tutorial: Precision Health, The Intelligence Revolution, LLMs for Precision Health, Application Challenges, and Research Frontiers.

Precision Health

LLMs for Precision Health

GPT-4 in Medicine

Biomedical LLMs

LLMs for Real-World Evidence

LLMs for Drug Discovery

Application Challenges



Research Frontiers

Prompt Programming

Retrieval-Augmented Generation (RAG)

Knowledge Distillation

Multi-modal learning

Causal Discovery


  author = {Poon, Hoifung and Naumann, Tristan and Zhang, Sheng and Gonz\'{a}lez Hern\'{a}ndez, Javier},
  title = {Precision Health in the Age of Large Language Models},
  year = {2023},
  isbn = {9798400701030},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {},
  doi = {10.1145/3580305.3599568},
  booktitle = {Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining},
  pages = {5825–5826},
  numpages = {2},
  keywords = {artificial intelligence, large language model, precision health, machine learning},
  location = {Long Beach, CA, USA},
  series = {KDD '23}