From Pose Estimation to Fine Grained Activity Recognition: Micha Andriluka, Max Planck Institute for Informatics

  • Starts: 1:00 pm on Tuesday, October 2, 2012
  • Ends: 2:00 pm on Tuesday, October 2, 2012
Speaker's name: Micha Andriluka
Affiliation: Department of Computer Vision and Multimodal Computing at the Max Planck Institute for Informatics
Title: From Pose Estimation to Fine Grained Activity Recognition
Abstract: Human pose estimation and activity recognition in monocular images are challenging problems, especially when these tasks must be solved in unconstrained environments such as street scenes. The major sources of complexity are cluttered and dynamically changing backgrounds and the presence of multiple people who often partially or fully occlude each other. While previous work has largely neglected interactions between people, we show that modeling them is crucial for good performance. In the first part of the talk I will demonstrate this for two cases: detection of people in crowded street scenes and monocular 3D pose estimation. For people detection, we propose a new occlusion-aware detector that exploits the patterns emerging from person-person occlusions, and we quantify its performance on several publicly available benchmarks, improving over the state of the art. For human pose estimation, we propose to incorporate interactions at two levels. The 2D poses of people are inferred with a multi-person pictorial structures model that captures interactions between subjects. The 3D poses are then recovered by lifting the 2D poses to 3D, relying on a learned joint prior model of human poses and motion. We demonstrate that including interactions between subjects in both 2D and 3D improves pose estimation results. In the second part of the talk I will focus on the challenge of fine-grained activity recognition, where the goal is to recognize a large number of visually similar activities, such as those performed during a complex medical procedure, device maintenance, or cooking.
I will rely on cooking activities as a working example and describe our recently introduced dataset, which contains over 65 cooking activities and about 9 hours of video footage. I will present initial results on the dataset and discuss open questions related to the use of pose estimation for fine-grained activity recognition.
Bio: Dr. Mykhaylo Andriluka studied mathematics and computer science at the I.I. Mechnikov National University in Odessa, Ukraine, and at the TU Darmstadt, Germany. He graduated in 2010 with a Ph.D. in computer science from the TU Darmstadt. His doctoral work in the area of computer vision has resulted in several highly cited publications. The approach to human pose estimation proposed in this work has been widely used as a foundation for further research in this area. He joined the Department of Computer Vision and Multimodal Computing of the Max Planck Institute for Informatics as a postdoctoral researcher in 2011. Prior to joining MPI, he worked at the Disney Research Lab in Pittsburgh.
Location:
MCS 148