CISE Seminar: Nathan Kallus, Assistant Professor, Cornell University & Cornell Tech
Friday, May 13, 2022
3:00-4:00pm
Virtual Event
Smooth Contextual Bandits
Contextual bandit problems model the inherent tradeoff between exploration and exploitation in personalized decision making in marketing, healthcare, revenue management, and more. Specifically, the tradeoff is characterized by the optimal growth rate of regret in cumulative rewards relative to the optimal policy. Naturally, the optimal rate should depend on how complex the underlying supervised learning problem is, namely how much observing rewards in one context can tell us about mean rewards in another context. Curiously, this seemingly obvious relationship is obscured in current theory, which separately studies the easy, fully extrapolatable case and the hard, super-local case. To characterize the relationship more precisely, I study a nonparametric contextual bandit problem where expected reward functions are β-smooth (roughly meaning β-times differentiable). I will show how this interpolates between the two extremes previously studied in isolation: non-differentiable-response bandits and parametric-response bandits.
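For readers unfamiliar with the setup, the following minimal sketch (not the algorithm from the talk or the papers; all names and parameters are illustrative) simulates an epsilon-greedy contextual bandit on a binned one-dimensional context space and tracks cumulative regret against the policy that knows the true mean rewards. Running a separate bandit per context bin reflects the "super-local" approach mentioned in the abstract; the talk concerns how smoother reward functions allow extrapolation across such regions.

# A minimal sketch, assuming two arms and a 1-D context; not the paper's method.
import numpy as np

rng = np.random.default_rng(0)

# Smooth mean-reward functions of the context x in [0, 1]; unknown to the learner.
def mean_reward(arm, x):
    return np.sin(3 * x) if arm == 0 else 0.5 + 0.3 * np.cos(2 * x)

T, n_bins, eps = 20000, 20, 0.05
counts = np.zeros((n_bins, 2))   # pulls per (context bin, arm)
sums = np.zeros((n_bins, 2))     # reward totals per (context bin, arm)
regret = 0.0

for t in range(T):
    x = rng.random()                           # context arrives
    b = min(int(x * n_bins), n_bins - 1)       # its bin: a separate local bandit
    if rng.random() < eps or counts[b].min() == 0:
        a = int(rng.integers(2))               # explore
    else:
        a = int(np.argmax(sums[b] / counts[b]))  # exploit the local estimate
    r = mean_reward(a, x) + rng.normal(0, 0.1)   # observe a noisy reward
    counts[b, a] += 1
    sums[b, a] += r
    # Regret compares against the optimal policy, which knows the means.
    regret += max(mean_reward(0, x), mean_reward(1, x)) - mean_reward(a, x)

print(f"cumulative regret after T={T}: {regret:.1f}")

Because each bin learns in isolation, the sketch ignores smoothness entirely; the growth rate of the printed regret is exactly the quantity whose dependence on β the talk characterizes.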
This talk is based on the following papers: "Smooth Contextual Bandits: Bridging the Parametric and Non-differentiable Regret Regimes" and "Post-Contextual-Bandit Inference".
Nathan Kallus is an Assistant Professor in the School of Operations Research and Information Engineering and Cornell Tech at Cornell University. Nathan’s research interests include optimization, especially under uncertainty; causal inference; sequential decision making; and algorithmic fairness. He holds a PhD in Operations Research from MIT as well as a BA in Mathematics and a BS in Computer Science from UC Berkeley. Before coming to Cornell, Nathan was a Visiting Scholar at USC’s Department of Data Sciences and Operations and a Postdoctoral Associate at MIT’s Operations Research and Statistics group.
Faculty Host: Yannis Paschalidis
Student Host: Zili Wang