Distinguished CS Colloquium Talk by Prof. Kristen Grauman on Dec 4th

Title: First-person video understanding

Speaker: Prof. Kristen Grauman, University of Texas at Austin

When: Wednesday, December 4, 2024 @ 11am

Where: CDS 950

Abstract
First-person or “egocentric” perception requires understanding the video that streams to a wearable camera. The egocentric view offers a special window into the camera wearer’s attention, goals, and interactions, making it an exciting avenue for the future of augmented reality and robot learning. This talk will describe our AI research group’s recent explorations in multimodal perception, such as vision-language embeddings that inject semantics from text into powerful video representations, audio-visual video models that can anticipate the sounds of human actions or augment a user’s hearing in busy places, and models that can facilitate learning new skills from “how-to” videos. I’ll also give an overview of how we are advancing the frontier of egocentric perception for the broader community via large-scale open-sourced datasets called Ego4D and Ego-Exo4D: multi-year, multi-institutional efforts to capture the daily-life and skilled activity of people around the world.

Bio
Kristen Grauman is a Professor in the Department of Computer Science at the University of Texas at Austin and a Research Director in Meta’s Fundamental AI Research lab (FAIR). Her research in computer vision and machine learning focuses on visual recognition, video, and embodied perception. Before joining UT-Austin in 2007, she received her Ph.D. at MIT. She is an IEEE Fellow, AAAS Fellow, AAAI Fellow, Sloan Fellow, and a recipient of the 2013 Computers and Thought Award. She and her collaborators have been recognized with several Best Paper awards in computer vision, including a 2011 Marr Prize and a 2017 Helmholtz Prize (test-of-time award). She has served as Associate Editor-in-Chief for PAMI and Program Chair of CVPR 2015, NeurIPS 2018, and ICCV 2023.
http://www.cs.utexas.edu/~grauman/