Video Condensation

Zhuang Li (EE ’09), Wei Liu (EE ’09), Huan-Yu Wu (EE ’10)
Prof. Prakash Ishwar, Prof. Janusz Konrad

Funding: National Science Foundation

Background: Efficient browsing of long video sequences is a key capability in visual surveillance, e.g., for post-event video forensics, but it can also be used to review motion pictures and home videos. While frame skipping is straightforward to implement, its performance is quite limited. More efficient techniques have been developed, such as video summarization and video montage, but they lose either the temporal or the semantic context of events. A recent method called video synopsis provides even better performance, but it involves multiple processing stages and is fairly complex.

Description: We have been inspired by image seam carving, a method for content-aware resizing of still images, where vertical and horizontal seams (connected paths) with the lowest cost, e.g., the sum of luminance-gradient magnitudes along the seam, are removed recursively to meet the target image size. Based on this idea, we developed a novel approach to video synopsis, which we call video condensation. Our approach extends the concept of an image seam to a video ribbon, a 3-D surface that is rigid either horizontally (vertical ribbon) or vertically (horizontal ribbon). This structure of the ribbon permits the use of dynamic programming, as originally proposed for seam carving. The ribbon model is flexible and permits an easy adjustment of the compromise between the temporal condensation ratio and the anachronism of events. Although our approach permits the use of spatio-temporal luminance gradients, the most interesting results have been obtained for costs derived from motion labels (moving/static) computed from the video by means of background subtraction. The method is novel in the way information is removed from the space-time video volume, and it is conceptually simple, relatively easy to implement, and very effective.
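The idea can be illustrated with a minimal sketch; it is not the published implementation. It assumes a binary motion-label volume of shape (T, H, W), takes the cost of a ribbon to be the number of moving pixels it crosses, and models a ribbon as one frame index per image row, constant across each row, with neighbouring rows differing by at most `flex` frames. The function names (`find_ribbon`, `carve_ribbon`, `condense`) and the parameters `flex` and `max_cost` are hypothetical, and the cost definition and stopping rule in the paper may differ.

```python
import numpy as np

def find_ribbon(cost, flex):
    """Minimum-cost path t = f(y) through a (T, H) cost map by dynamic
    programming, with |f(y+1) - f(y)| <= flex (assumed flexibility model)."""
    T, H = cost.shape
    acc = np.empty((T, H), dtype=float)   # accumulated cost up to row y
    back = np.zeros((T, H), dtype=int)    # best predecessor frame index
    acc[:, 0] = cost[:, 0]
    for y in range(1, H):
        for t in range(T):
            lo, hi = max(0, t - flex), min(T, t + flex + 1)
            prev = lo + int(np.argmin(acc[lo:hi, y - 1]))
            back[t, y] = prev
            acc[t, y] = cost[t, y] + acc[prev, y - 1]
    # Backtrack from the cheapest end point in the last row.
    f = np.empty(H, dtype=int)
    f[-1] = int(np.argmin(acc[:, -1]))
    for y in range(H - 1, 0, -1):
        f[y - 1] = back[f[y], y]
    return f, float(acc[f[-1], -1])

def carve_ribbon(volume, f):
    """Remove one time sample per image row, shortening the volume by one frame."""
    T, H, W = volume.shape
    out = np.empty((T - 1, H, W), dtype=volume.dtype)
    for y in range(H):
        out[:, y, :] = np.delete(volume[:, y, :], f[y], axis=0)
    return out

def condense(labels, flex, max_cost=0):
    """Carve ribbons until the cheapest one would cost more than max_cost."""
    vol = labels.copy()
    while vol.shape[0] > 1:
        cost = vol.sum(axis=2)            # moving pixels per (frame, row)
        f, c = find_ribbon(cost, flex)
        if c > max_cost:
            break
        vol = carve_ribbon(vol, f)
    return vol

# Toy volume: one object in the top rows early, another in the bottom rows late.
labels = np.zeros((6, 4, 3), dtype=int)
labels[0:2, 0:2, :] = 1
labels[4:6, 2:4, :] = 1
for flex in (0, 1, 2):
    print(flex, condense(labels, flex).shape[0])
```

With `flex=0` a ribbon is a whole frame, so only frames without motion are dropped. Larger values of `flex` let the ribbon bend in time, which shortens the toy volume further at the price of showing the two objects closer together in time than they originally occurred.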

Results: Video condensation applied to typical surveillance data (e.g., pedestrian or motor traffic) achieves about a 10-fold reduction in video length while preserving all moving objects. This is possible because objects are “moved” in time, as shown in the figure below. Although this may lead to event anachronism, the effect is easily controlled through the ribbon flexibility parameter.

Three frames from the original video show three different pedestrians, whereas a condensed frame on the right shows the three pedestrians simultaneously, thus allowing fast video browsing.

Read more about video condensation at http://vip.bu.edu/projects/video/video-condensation

Publications:

Z. Li, P. Ishwar, and J. Konrad, “Video condensation by ribbon carving,” IEEE Transactions on Image Processing, vol. 18, pp. 2572-2583, November 2009.

Website: http://vip.bu.edu