Machine Learning in Medicine (MLxMed) Seminar: Yifan Peng, Assistant Professor, Cornell University

Talk title: LLMs and Evidence Summarization

Talk abstract: Generative AI, exemplified by large language models (LLMs), shows great promise in assisting medical evidence summarization. However, concerns have been raised about the quality of outputs generated by pre-trained LLMs, which could result in harmful misinformation. In this talk, I will first discuss our investigation into the capabilities and limitations of LLMs, specifically GPT-3.5 and ChatGPT, in performing zero-shot medical evidence summarization across six clinical domains. Our study demonstrates that LLMs can be susceptible to generating factually inconsistent summaries and making overly convincing or uncertain statements, leading to potential harm due to misinformation. Furthermore, we observe that automatic metrics often do not strongly correlate with the quality of summaries. I will then discuss our research on the impact of fine-tuning LLMs to enhance their performance in evidence summarization. We found that, compared to zero-shot learning, fine-tuning improved automatic evaluation metrics such as ROUGE, METEOR, CHRF, and PICO-F1. We also found that smaller fine-tuned models sometimes outperformed larger zero-shot models. These improvements were also reflected in both human and GPT-4-simulated evaluations. Our findings confirm the potential for LLMs to empower medical evidence summarization.

Bio: Peng is an assistant professor in the division of Health Sciences at Weill Cornell Medicine. His main research interests include BioNLP and medical image analysis, such as named entity recognition, information extraction, and disease diagnosis and prognosis. Before joining Weill Cornell Medicine, he was a research fellow at the National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH). He obtained his Ph.D. degree from the University of Delaware. During his doctoral training, he investigated applications of machine learning in biomedical relation extraction, with a focus on deep analysis of the linguistic structures of biomedical texts. He is the first awardee at the NCBI to receive the NIH K99/R00 grant, which supports his work on using NLP and ML to extract radiology-specific domain knowledge and build an automated reporting system.

When 2:00 pm to 3:00 pm on Friday, April 19, 2024
Location Zoom
Fees Free
Speakers Yifan Peng, Assistant Professor of Population Health Sciences, Population Health Sciences, Weill Cornell Medical College