“Visual Question Answering and Beyond”

Aishwarya Agrawal, Georgia Tech
Wednesday, October 31, 2018
11:30am – 12:30pm in MCS 148
Hariri Institute for Computing
111 Cummington Mall
Boston, MA 02215

Abstract: In this talk, I will present our work on Visual Question Answering (VQA). I will provide a brief overview of the VQA task, dataset, and baseline models, and highlight some of the problems with existing VQA models. Additionally, I will talk about our work on fixing some of these problems by proposing:

a new evaluation protocol
a new model architecture
a novel objective function

Towards the end of the talk, I will also present some very recent work towards building agents that can generate diverse programs for scenes when conditioned on instructions and trained using reinforced adversarial learning.

Aishwarya’s Bio: Aishwarya Agrawal is a fifth-year Ph.D. student in the School of Interactive Computing at Georgia Tech, working with Dhruv Batra and Devi Parikh. Her research interests lie at the intersection of computer vision, machine learning, and natural language processing. The Visual Question Answering (VQA) work by Aishwarya and her colleagues has witnessed tremendous interest in a short period of time (3 years). Aishwarya is a recipient of the NVIDIA Graduate Fellowship 2018-2019, one of the Rising Stars in EECS 2018, and a finalist of the Foley Scholars Award 2018 and Microsoft and Adobe Research Fellowships 2017-2018. As a research intern, Aishwarya has spent time at Google DeepMind, Microsoft Research, and the Allen Institute for Artificial Intelligence. Aishwarya received her bachelor’s degree in Electrical Engineering with a minor in Computer Science and Engineering from the Indian Institute of Technology (IIT) Gandhinagar in 2014.
