- Starts: 11:00 am on Tuesday, March 4, 2025
- Ends: 12:30 pm on Tuesday, March 4, 2025
ECE/ME Seminar: Zhiwen Fan
Title: Building Spatially Intelligent Machines: Scalable 3D Reconstruction, Generation, and Perception
Abstract: Large-scale machine learning models have significantly advanced our ability to process single images and text, yet they remain limited in handling 3D data. In this talk, I will discuss how scaling 3D models in reconstruction, generation, and perception can address these shortcomings in autonomous systems and human-centered applications. To tackle the high computational cost and data scarcity in 3D reconstruction and generation, I will first demonstrate how multi-view stereo can be used to construct city-scale, high-definition maps and how 2D generative models can be adapted for large-scale 3D scene synthesis. Next, I will present an end-to-end 3D perception model that unifies 3D reconstruction and open-vocabulary understanding within a differentiable framework. These advances set the stage for my long-term goal: developing real-time, 3D-based models that understand, recreate, and interact with the physical world using spatial awareness and common sense.
Bio: Zhiwen ("Aaron") Fan is a Ph.D. candidate in the Department of Electrical and Computer Engineering at the University of Texas at Austin, advised by Prof. Zhangyang ("Atlas") Wang. He was the recipient of the Qualcomm Innovation Fellowship and interned at NVIDIA, Meta, and Google, where he developed 3D modeling tools for autonomous vehicles and the metaverse. Prior to his doctoral studies, he served as a senior research engineer at Alibaba Group, focusing on city-scale modeling.
- Location: PHO 339