PandemIQ Llama LLM code released for open-source use

By Cassandra Kennedy (BEACON)

The Biothreats Emergence, Analysis and Communications Network (BEACON) has announced the public availability of PandemIQ Llama, an open-source, domain-trained large language model designed specifically for pandemic surveillance and biothreat analysis. BEACON, based at Boston University’s Center on Emerging Infectious Diseases (CEID), is a web-based platform and accompanying public health program that leverages advanced AI and a global network of human subject matter experts to rapidly collect, analyze, and disseminate information on emerging disease threats.

Developed by data scientists and engineers at the Hariri Institute with input from infectious disease experts at CEID, PandemIQ Llama helps analysts sift through incoming disease signals from sources like HealthMap, public health partners, and global outbreak reporters. These signals are first verified by human outbreak specialists through digital investigation and field communication, then selected by editorial staff for reporting. The model is used to generate initial report drafts, which human experts refine with additional context and analysis before publication, illustrating a hybrid system where AI supports, but does not replace, expert judgment.

Trained on 5.7 billion tokens of epidemiological data across 31 languages and 16 priority pathogens, PandemIQ Llama addresses a gap left by general-purpose language models, which often lack deep coverage of infectious disease literature. BEACON leaders emphasize that the tool is meant to reduce manual workload and improve speed and consistency in outbreak reporting, while remaining grounded in human expertise.

PandemIQ Llama is available on GitHub and Hugging Face for open-source use.

Learn more about BEACON, PandemIQ Llama and how BEACON combines human expertise with AI in this BEACON story.