Blurring Structure and Learning for Adaptive Local Recognition

AIR Distinguished Speaker Series

Speaker: Evan Shelhamer, Research Scientist at Adobe, Visiting Researcher at MIT
When: Monday, April 6, 2020
Time: 11:00am-12:30pm

ZOOM Info: https://bostonu.zoom.us/j/815792639?pwd=cWxMNG1FbytWQzB6ZE5EbWtVbmR2QT09
Meeting ID: 815 792 639  Password: 457178


Abstract: The visual world is vast and varied, but there is nevertheless ubiquitous structure. In this talk, I will focus on incorporating locality and scale structure into deep networks for image-to-image tasks like semantic segmentation. To help cope with scale variation, our approach optimizes receptive field size during learning and then dynamically adapts it during inference. We parameterize scale for end-to-end learning by composing structured Gaussian filters with free-form filters. In effect, the structured parameters control the degree of locality: changes to these parameters have effects that would require architectural changes in a standard network. To adapt to different scales during inference, we experiment with feedforward prediction and with iterative test-time optimization of an unsupervised entropy objective. Our compositional factorization points to a reconciliation of structure and learning, through which known visual structure is respected and unknown visual detail is learned freely.
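For readers curious about the mechanics, the compositional factorization can be pictured as a free-form convolution preceded by a Gaussian blur whose scale is itself a learned parameter. Below is a minimal PyTorch sketch of that idea, not the speaker's implementation: the module name ComposedConv2d, the log-sigma parameterization, and the fixed blur radius are illustrative assumptions.

    # Minimal sketch (illustrative, not the speaker's code): a free-form k x k
    # convolution composed with a depthwise Gaussian blur whose scale sigma is
    # a learnable parameter, so receptive-field size is optimized end-to-end.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def gaussian_kernel(sigma, radius):
        """Normalized 2D Gaussian kernel of size (2*radius + 1)^2 for a scalar sigma."""
        coords = torch.arange(-radius, radius + 1, dtype=sigma.dtype, device=sigma.device)
        g = torch.exp(-coords ** 2 / (2.0 * sigma ** 2))
        g = g / g.sum()
        return g[:, None] * g[None, :]  # outer product -> 2D kernel

    class ComposedConv2d(nn.Module):
        """Structured Gaussian filter composed with a free-form filter (hypothetical module)."""

        def __init__(self, in_ch, out_ch, kernel_size=3, init_sigma=1.0, radius=4):
            super().__init__()
            self.free = nn.Conv2d(in_ch, out_ch, kernel_size, padding=kernel_size // 2)
            # Keep sigma positive by learning it in the log domain (an assumption).
            self.log_sigma = nn.Parameter(torch.log(torch.tensor(float(init_sigma))))
            self.radius = radius
            self.in_ch = in_ch

        def forward(self, x):
            sigma = self.log_sigma.exp()
            blur = gaussian_kernel(sigma, self.radius)
            # Depthwise blur: the same Gaussian is applied to every input channel.
            weight = blur[None, None].expand(self.in_ch, 1, -1, -1).contiguous()
            x = F.conv2d(x, weight, padding=self.radius, groups=self.in_ch)
            return self.free(x)

    # Usage: gradients reach log_sigma as well as the free-form weights, so the
    # degree of locality is learned rather than fixed by the architecture.
    layer = ComposedConv2d(in_ch=16, out_ch=16)
    out = layer(torch.randn(1, 16, 32, 32))

At inference, the same scale parameters could in principle be adapted by iteratively minimizing the entropy of the network's predictions on unlabeled inputs, as the abstract describes.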


Bio: Evan Shelhamer is a research scientist at Adobe and a visiting researcher at MIT. He received his PhD from UC Berkeley in 2019, advised by Trevor Darrell. He was the lead developer of the Caffe deep learning framework from version 0.1 to 1.0 and shared the Mark Everingham service award for Caffe at ICCV’17. His joint work on fully convolutional networks won the best paper honorable mention at CVPR’15. Before Berkeley, he studied computer science (AI concentration) and psychology at the University of Massachusetts Amherst advised by Erik Learned-Miller. He likes his networks deep and his coffee black.