Title: Deep Learning Algorithms for Background Subtraction and People Detection
Presenter: M. Ozan Tezcan
Date: January 28, 2020
Time: 1:30 PM - 3:30 PM
Location: PHO 404/428
Advisor: Professor Janusz Konrad, ECE
Chair: Professor Prakash Ishwar, ECE
Committee: Professor Brian Kulis, ECE; Professor Pierre-Marc Jodoin, University of Sherbrooke
Abstract:
Background subtraction is a fundamental task in computer vision and video processing, often applied as a preprocessing step for object tracking, action recognition, and other higher-level analyses. To date, many successful background subtraction algorithms have been proposed. Unsurprisingly, the top-performing methods on “changedetection.net” today are supervised deep-learning approaches. Their excellent performance, however, hinges on the availability of annotated frames from the test video during training.
In this prospectus, I propose an alternative framework for background subtraction via supervised deep learning and its application to people detection. To date, I have developed a novel algorithm, Background Subtraction for Unseen Videos (BSUV-Net), based on a fully convolutional neural network. Unlike past approaches, the input to the network consists of the current frame and two background frames captured at different time scales, along with their semantic segmentation maps. To reduce the chance of overfitting, I introduced a new data-augmentation technique that mitigates the impact of illumination variations.
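The multi-scale input described above can be illustrated with a minimal sketch. This is not the published BSUV-Net code; the function name, argument order, and channel layout (each RGB frame followed by its single-channel segmentation map, giving 12 channels total) are illustrative assumptions:

```python
import numpy as np

def build_bsuv_input(current, bg_recent, bg_empty,
                     seg_current, seg_recent, seg_empty):
    """Stack a BSUV-Net-style network input along the channel axis.

    Each frame is an H x W x 3 uint8 RGB array; each seg_* is an
    H x W foreground-probability map produced by a semantic
    segmentation network. Layout here is an illustrative assumption:
    (empty background, recent background, current frame), each
    followed by its segmentation channel -> H x W x 12.
    """
    parts = []
    for frame, seg in ((bg_empty, seg_empty),
                       (bg_recent, seg_recent),
                       (current, seg_current)):
        parts.append(frame.astype(np.float32) / 255.0)   # 3 channels
        parts.append(seg[..., np.newaxis].astype(np.float32))  # 1 channel
    return np.concatenate(parts, axis=-1)
```

A fully convolutional network can then consume this 12-channel tensor directly, so the same weights apply to videos of any resolution.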
To explore the potential of background subtraction as a preprocessing step for more complex visual analysis, I participated in a proof-of-concept study of people detection and counting from overhead fisheye cameras. Our most successful approach leverages people-detection algorithms optimized for standard images (perspective-projection geometry) to detect people in fisheye images (radial geometry) via pre- and post-processing. Combining this approach with background subtraction affords a significant speed-up.
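One common way to realize such pre- and post-processing is to rotate the fisheye image about its center so that people appear upright for a standard detector, and then map the detections back. The helper below is a hypothetical sketch of only the coordinate mapping (it assumes rotation about the image center and ignores cropping and interpolation details):

```python
import math

def rotate_point(x, y, angle_deg, width, height):
    """Rotate a point by angle_deg about the image center (cx, cy)."""
    cx, cy = width / 2.0, height / 2.0
    a = math.radians(angle_deg)
    dx, dy = x - cx, y - cy
    return (cx + dx * math.cos(a) - dy * math.sin(a),
            cy + dx * math.sin(a) + dy * math.cos(a))

def unrotate_point(x, y, angle_deg, width, height):
    """Map a detection found in the rotated frame back to the
    original fisheye frame (inverse of rotate_point)."""
    return rotate_point(x, y, -angle_deg, width, height)
```

In the full pipeline, the detector would run on several rotated copies of the frame, and `unrotate_point` would place each bounding-box center back into fisheye coordinates before non-maximum suppression.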
For the remainder of my PhD thesis research, I propose to further study background subtraction using video-agnostic neural networks that do not memorize video categories or specific challenges (e.g., snow, shadows), but instead focus exclusively on detecting changes. One approach I will consider is a GAN framework in which a discriminator network tries to classify the input video from BSUV-Net’s feature embedding, while an adversarial loss trains the feature extractor to fool the discriminator. I expect this architecture to produce video-agnostic features. As for people detection, I will investigate the use of background subtraction in an attention mechanism that focuses a network on foreground pixels. Since people are typically part of the foreground, this is expected to improve people-detection performance.
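The two adversarial loss terms in this idea can be sketched numerically. This is a minimal, hypothetical formulation, not the proposed implementation: the discriminator is trained with cross-entropy to identify the source video from an embedding, while the feature extractor is trained against a uniform target so that videos become indistinguishable:

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def adversarial_losses(logits, video_ids):
    """Sketch of the two loss terms (illustrative formulation).

    logits: (N, num_videos) discriminator outputs on feature embeddings.
    d_loss: cross-entropy so the discriminator identifies each video.
    g_loss: cross-entropy against a uniform target distribution, which
            pushes the feature extractor toward video-agnostic features.
    """
    p = softmax(np.asarray(logits, dtype=np.float64))
    n = p.shape[0]
    d_loss = -np.log(p[np.arange(n), video_ids] + 1e-12).mean()
    # Uniform target (1/k per class) reduces to the mean negative log-prob.
    g_loss = -np.log(p + 1e-12).mean(axis=1).mean()
    return d_loss, g_loss
```

In training, the discriminator would minimize `d_loss` on detached embeddings, while the background-subtraction network would add a weighted `g_loss` term to its segmentation loss.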