AIR Seminar: Generating and Populating 3D Indoor Scenes
- Starts: 1:00 pm on Wednesday, October 2, 2024
Speaker: Huaizu Jiang
Abstract: Our lab's research aims to advance immersive 3D experiences by developing techniques for creating and populating virtual environments. Imagine taking a (virtual or physical) tour of your dream home, where you and your loved ones can redesign spaces and swap out furniture together, virtually in 3D, all before making an offer and at little cost. Picture stepping into a virtual room to observe Yo-Yo Ma's intricate fingering techniques on a challenging cello piece, witnessing his masterful movements from any angle. The core of our work lies in generating photorealistic 3D environments and synthesizing lifelike virtual humans capable of natural interactions with their surroundings and each other. This research has broad applications in augmented/virtual reality (AR/VR), simulation for embodied AI and robotics, gaming, filmmaking, healthcare, and e-commerce.
In this talk, I'll share our recent efforts toward this grand goal. In the first part, I'll present HouseCrafter, which converts floorplans to 3D scenes. Our key insight is to adapt a 2D diffusion model, trained on web-scale images, to generate consistent multi-view color (RGB) and depth (D) images across different locations in the scene. These RGBD images are then used to reconstruct the 3D scene. In the second part, I'll introduce OmniControl and HOI-Diff, where we use the guidance mechanism of diffusion models to incorporate context-aware constraints into human motion generation, so that generated humans interact naturally with their surroundings.
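The guidance mechanism referenced here follows the general recipe of gradient-based (classifier-style) guidance during diffusion sampling: at each denoising step, a differentiable constraint is evaluated on the model's prediction, and its gradient steers the sample while the diffusion prior keeps the motion natural. Below is a minimal sketch under assumed conventions; `guided_denoise_step`, `pelvis_on_path`, the tensor shapes, and the guidance scale are illustrative stand-ins, not the released OmniControl or HOI-Diff code.

```python
import torch

def guided_denoise_step(model, x_t, t, constraint_loss, guidance_scale=30.0):
    # One reverse-diffusion step with gradient guidance: differentiate a
    # constraint computed on the model's clean-sample prediction and nudge
    # the noisy sample x_t against that gradient before the usual update.
    x_t = x_t.detach().requires_grad_(True)
    x0_pred = model(x_t, t)  # assumption: model predicts the clean sample x_0
    grad = torch.autograd.grad(constraint_loss(x0_pred), x_t)[0]
    return (x_t - guidance_scale * grad).detach()

# Hypothetical spatial constraint: keep the pelvis (joint 0) on a target path.
def pelvis_on_path(x0_pred, target_path):
    # x0_pred: (batch, frames, joints, 3); target_path: (frames, 3)
    return ((x0_pred[:, :, 0, :] - target_path) ** 2).mean()
```

The guidance scale trades off constraint satisfaction against motion fidelity; in practice it is tuned per constraint and often annealed over the sampling schedule.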
Bio: Huaizu Jiang is an Assistant Professor in the Khoury College of Computer Sciences at Northeastern University. Prior to that, he was a Postdoc at Caltech and a Visiting Researcher at NVIDIA. He received his Ph.D. in Computer Science from the University of Massachusetts Amherst, and his bachelor's and master's degrees from Xi'an Jiaotong University. His long-term research aims to teach machines to develop visual intelligence in a manner analogous to humans by combining 3D and multi-modal cues. In the short term, his goal is to create smart tools that improve people's experiences with cameras. He received the Adobe Fellowship and the NVIDIA Graduate Fellowship, both in 2019. His research has been supported by NSF, MathWorks, Adobe, and Google.
- Location: CDS 1750