CISE Seminar: Rana Shahout, Postdoctoral Fellow, Harvard University

  • Starts: 3:00 pm on Friday, December 12, 2025
  • Ends: 4:00 pm on Friday, December 12, 2025

Prediction-Aware Algorithms for Efficient AI Systems

Large Language Models (LLMs) have transformed what machines can do—and how systems must be designed to serve them. These models are both computationally demanding and memory-bound, revealing the limits of traditional optimization methods that once sufficed for conventional systems. A central challenge in building LLM systems is achieving balance: minimizing computational and financial costs while ensuring response quality and meeting strict latency and throughput goals. Uniquely, LLMs also expose internal signals—predictions that guide their own execution. This talk introduces prediction-aware algorithms that leverage these signals to improve system performance. Specifically, it presents algorithms that use predictions to enhance scheduling and resource management across two key settings: standalone LLM inference and API-augmented LLMs that interact with external tools. We show how prediction-guided scheduling and memory handling can reduce latency and improve efficiency in diverse deployment environments.
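
To make the idea of prediction-guided scheduling concrete for attendees less familiar with it, here is a minimal illustrative sketch, not the algorithm presented in the talk: assuming each request carries a model-generated estimate of its output length, a scheduler can serve shorter predicted jobs first, which tends to lower mean waiting time compared with first-come-first-served. All names below (Request, predicted_tokens, schedule_by_prediction) are hypothetical.

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Request:
    # Hypothetical prediction signal: estimated number of output tokens.
    predicted_tokens: int
    request_id: str = field(compare=False)

def schedule_by_prediction(requests):
    """Serve requests in order of predicted output length (shortest first).

    A generic shortest-predicted-job-first policy, shown only to illustrate
    how a prediction signal can drive scheduling decisions.
    """
    heap = list(requests)
    heapq.heapify(heap)            # ordered by predicted_tokens
    order = []
    while heap:
        order.append(heapq.heappop(heap).request_id)
    return order

# Example: predictions reorder a FIFO arrival sequence.
pending = [Request(120, "a"), Request(8, "b"), Request(40, "c")]
print(schedule_by_prediction(pending))   # ['b', 'c', 'a']
```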

Rana Shahout is a Postdoctoral Fellow at Harvard University, working with Michael Mitzenmacher and Minlan Yu. She received her Ph.D. in Computer Science from the Technion and previously worked as a Senior Software Engineer at Mellanox (now NVIDIA). Her research combines machine learning, systems, and algorithmic theory to build adaptive, high-performance infrastructures. Rana is a recipient of the Eric and Wendy Schmidt Postdoctoral Award, the Zuckerman Postdoctoral Fellowship, the Weizmann Institute Women’s Postdoctoral Career Development Award, and the ACC Feder Family Award for Best Student Work in Communications.

Faculty Hosts: Ayse Coskun and Brian Kulis

Student Host: Beste Oztop