Aishwarya Agrawal, Georgia Tech Wednesday, October 31, 2018 11:30am – 12:30pm in MCS 148 Hariri Institute for Computing 111 Cummington Mall Boston, MA 02215 Abstract: In this talk, I will present our work on Visual Question Answering (VQA) — I will provide a brief overview of the VQA task, dataset, and baseline models, as well as […]
Category: Archived AIR Seminars
BU postdoctoral associate Bryan Plummer introduces an approach to improving many image-language tasks in which neural networks learn a set of models, each of which captures a different concept useful to the task.
Regina Barzilay, a professor in the Department of Electrical Engineering and Computer Science at MIT, describes a number of tasks where NLP-based models can make a difference in clinical practice and introduces new functionalities such as interpretable neural models that provide the rationales underlying their predictions and semi-supervised methods for information extraction.
Aaron Hertzmann, principal scientist at Adobe and an ACM Distinguished Scientist, describes his work in artistic image rendering and stylization, also called Non-Photorealistic Rendering.
Trevor Darrell, an adjunct professor at UC Berkeley’s Department of Electrical Engineering and Computer Sciences, presents recent long-term recurrent network models that learn cross-modal description and explanation, using implicit and explicit approaches, which can be applied to domains including fine-grained recognition and visuomotor policies.
Led by Kate Saenko, Stan Sclaroff, Brian Kulis, and Margrit Betke, the initiative was featured during the Data Science Initiative’s BUDS 2018 conference and has launched an exciting seminar series for the spring 2018 semester that will feature top AI experts from across the country.
This talk by Mike Jones, senior principal research scientist at Mitsubishi Electric Research Labs, will demonstrate that L2 distance is not the best basis of comparison to use with convolutional neural network (CNN) features for face verification and will propose hyperplane similarity, a more appropriate similarity function derived from the softmax loss used to train the network.
This talk by Zhengming Ding, a graduate student at Northeastern University, outlines a proposal to build a large-scale face recognizer capable of overcoming the data-imbalance difficulty that existing machine learning approaches face in mimicking human visual intelligence. To obtain a more effective general classifier, Ding and colleagues develop a novel generative model that synthesizes meaningful data for one-shot classes by adapting the data variances of other, well-represented classes.
This talk by Andrei Barbu, a research scientist at MIT, will discuss a program to unify research around a number of vision-language problems into a single mathematical framework culminating in a robotic platform that is able to follow natural language commands, store knowledge, and answer questions.
This talk by Guorong Li, an associate professor at the University of Chinese Academy of Sciences, will outline recent research in media analysis, including learning label-specific features for multi-label classification, learning a common space for cross-modal retrieval, and tracking cars in UAV video.