AIR Weekly Seminar

  • Starts: 2:00 pm on Wednesday, April 30, 2025
  • Ends: 3:00 pm on Wednesday, April 30, 2025

Speaker: Baifeng Shi (https://bfshi.github.io/), Fourth-year PhD student from Berkeley AI Research (BAIR) at UC Berkeley

Talk Title: “Scaling Vision Pre-Training to 4K Resolution”

Abstract: Modern vision models such as SigLIP and DINOv2 are all pre-trained at low resolutions (e.g., 384×384 pixels). Compared with human vision, which can easily achieve an effective resolution of 10,000×10,000, such low-res pre-training limits performance in many real-world scenarios where high resolution is required. In this talk, I will start from our previous work to briefly explain why scaling up image resolution is important compared to scaling along other dimensions such as the number of parameters. Then I will share our recent work that scales up vision pre-training to 4K resolution. Specifically, I will introduce PS3, a vision encoder that scales up pre-training resolution to 4K at near-constant cost via top-down patch selection; VILA-HD, a high-res multimodal LLM built on top of PS3 that achieves state-of-the-art performance and efficiency on various high-res benchmarks; and 4KPro, an image QA benchmark that strictly requires 4K-resolution perception. Finally, I will share some thoughts on the limitations of the current PS3 and VILA-HD models, as well as some future directions worth exploring.

Bio: Baifeng Shi (https://bfshi.github.io/) is a fourth-year PhD student at Berkeley AI Research (BAIR) at UC Berkeley, advised by Prof. Trevor Darrell. He is also a student researcher at NVIDIA. His research interests lie in learning general-purpose visual representations and building robust generalist robotic models. He has published and presented at top conferences such as CVPR, ECCV, ICCV, ICML, NeurIPS, ICLR, and CoRL. Previously, he graduated from Peking University with a B.S. degree in computer science.

Location: Duan Family Center for Computing & Data Sciences, 665 Commonwealth Avenue, CDS 1001, 10th floor

Registration: https://www.bu.edu/hic/4-30-air-weekly-seminar-baifeng-shi/