From Pose Estimation to Fine Grained Activity Recognition: Micha Andriluka, Max Planck Institute for Informatics
- Starts: 1:00 pm on Tuesday, October 2, 2012
- Ends: 2:00 pm on Tuesday, October 2, 2012
Speaker's name: Micha Andriluka
Affiliation: Department of Computer Vision and Multimodal Computing of
the Max Planck Institute for Informatics
Title: From Pose Estimation to Fine Grained Activity Recognition
Abstract:
Human pose estimation and activity recognition in monocular images are
challenging problems,
especially when these tasks must be solved in unconstrained
environments such as street scenes. The
major sources of complexity are cluttered and dynamically changing
backgrounds and the presence of
multiple people that often partially or fully occlude each other.
While previous work has largely neglected interactions between people,
we show that modeling them is
crucial for good performance. In the first part of the talk I will
demonstrate this for two tasks:
people detection in crowded street scenes and
monocular 3D pose estimation. In
the case of people detection we propose a new occlusion-aware detector
that exploits the patterns
emerging from person-person occlusions, and quantify its performance
on several publicly available
benchmarks, improving over the state of the art. In the case of human
pose estimation we propose to
incorporate interactions at two levels. The 2D poses of people are
inferred with a multi-person
pictorial structures model that captures interactions between
subjects. The 3D poses are then
recovered by lifting the 2D poses to 3D, relying on a learned joint
prior model of human poses and
motion. We demonstrate that including interactions between subjects
both in 2D and in 3D improves
pose estimation results.
In the second part of the talk I will focus on the challenge of fine
grained activity recognition,
where the goal is to recognize a large number of visually similar
activities such as those performed
during a complex medical procedure, device maintenance or cooking. I
will use cooking
activities as a working example and describe our recently introduced
dataset, containing over 65
cooking activities and about 9 hours of video footage. I will present
initial results on the dataset
and discuss open questions related to the use of pose estimation for
fine grained activity
recognition.
Bio:
Dr. Mykhaylo Andriluka studied mathematics and computer science at
the I.I. Mechnikov National University in Odessa, Ukraine, and at the
TU Darmstadt, Germany. He graduated in 2010 with a Ph.D. in computer
science from the TU Darmstadt. His doctoral work in the area of
computer vision has resulted in several highly cited publications. The
approach to human pose estimation proposed in this work has been
widely used as a foundation for further research in this area. He joined
the department of Computer Vision and Multimodal Computing of the Max
Planck Institute for Informatics as a postdoctoral researcher in 2011.
Prior to joining MPI he worked at the Disney Research
Lab in Pittsburgh.
- Location: MCS 148