From Visual 'Stuff' to Obstacle Detection: Environment Interaction Trains Perceptual Structures

M. Anthony Lewis & Lucia S. Simó
Iguana Robotics, Inc., P.O. Box 628, Mahomet, IL 61853

Let us define an obstacle as a feature of the environment with the potential to destabilize or impede gait. Humans can visually detect obstacles and make the necessary gait modifications before reaching them. Because observers vary in size and capability, an obstacle to one observer may not be an obstacle to another. It seems reasonable to assume that an observer learns which features are obstacles and which are not through interaction with the environment. Visual perception in some circumstances may thus be shaped by motor experience.

In recent experiments in our lab, we have used a bipedal robot mechanism to model the acquisition of obstacle perception, as well as appropriate adaptive motor response to the detection of an obstacle during ongoing locomotion.

A small, 10 cm tall biped was constructed with two downward-looking stereo cameras. As the robot walks, disparity data from the cameras vary periodically due to the natural up-and-down movement of the cameras. Thus the visual data are complex and depend on the dynamics of the robot.

Proprioceptive, tactile, and gait-phase information is combined and used to predict the disparity map and its periodic variation. This predictive system is trained using a simple learning rule. The prediction is subtracted from the incoming visual data, and each element of the residual is then multiplied by a gain. This gain field is a function of both the phase of the gait and the spatial location in the incoming image. Via a tuning algorithm, the gain is highest during portions of the gait where the prediction is best and lowest where the prediction is poorest. A simple thresholding of the result indicates sensory novelty. Novelty is thus defined as those sensory events that cannot be predicted from past experience.
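
The pipeline described above (predict, subtract, scale by a gain, threshold) can be sketched for a single disparity value as follows. This is only an illustration under stated assumptions, not the implementation used on the robot: the abstract does not name the learning rule or the tuning algorithm, so the delta (LMS) rule, the linear predictor, the sine/cosine encoding of gait phase, the inverse-error gain, the synthetic walking signal, and every function name and constant below are hypothetical.

```python
import math

def phase_features(phase):
    """Hypothetical non-visual input: gait phase encoded as sin, cos, bias."""
    return [math.sin(phase), math.cos(phase), 1.0]

def predict(weights, features):
    """Predicted disparity: weighted sum of the non-visual features."""
    return sum(w * f for w, f in zip(weights, features))

def train_step(weights, features, disparity, lr=0.1):
    """One delta-rule (LMS) update; the rule itself is an assumption."""
    error = disparity - predict(weights, features)
    return [w + lr * error * f for w, f in zip(weights, features)]

def gain_from_error(mean_abs_error):
    """Assumed gain tuning: high where prediction error is low, and vice versa."""
    return 1.0 / (1.0 + mean_abs_error)

def is_novel(weights, features, disparity, gain, threshold=0.5):
    """Subtract the prediction, scale the residual by the gain, then threshold."""
    residual = disparity - predict(weights, features)
    return abs(gain * residual) > threshold

def train_on_walking(cycles=200, steps=20):
    """Train on synthetic obstacle-free walking: disparity varies with phase."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(cycles):
        for k in range(steps):
            phase = 2.0 * math.pi * k / steps
            disparity = 2.0 + 0.5 * math.sin(phase)  # made-up periodic signal
            weights = train_step(weights, phase_features(phase), disparity)
    return weights
```

In this sketch, a disparity value that follows the learned periodic pattern leaves a near-zero residual, while an unexpected offset (such as an object entering the field of view) exceeds the threshold and is flagged as novel. On the robot, the gain would be indexed by both gait phase and image location rather than passed in as a single number.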

These novel events set off an associated 'eligibility trace.' Eligibility peaks several hundred milliseconds after the first occurrence of a novel stimulus. If the robot subsequently collides with an obstacle, an association is made between the cells signaling novel events with high eligibility values and a motor program that causes the robot to lift its foot to clear the obstacle. Further, the phase of gait at which the robot hits the obstacle is used to form an association between the cells signaling novel events with high eligibility values and an adjustment in stride length just prior to going over the obstacle.
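
One way to picture the eligibility mechanism described above is the sketch below. It is an assumption-laden illustration rather than the method used on the robot: the abstract does not give the shape of the trace or the form of the association rule, so the alpha-function trace, the 300 ms time constant, the Hebbian-style update, and all names are hypothetical.

```python
import math

def eligibility(t_since_novelty_ms, tau_ms=300.0):
    """Assumed alpha-function trace: 0 at onset, peaks at 1.0 when t == tau_ms."""
    if t_since_novelty_ms < 0:
        return 0.0
    x = t_since_novelty_ms / tau_ms
    return x * math.exp(1.0 - x)

def associate_on_collision(weights, onset_times_ms, collision_time_ms, lr=0.5):
    """Hebbian-style update applied when a collision occurs.

    Each novelty cell's link to the foot-lift motor program is strengthened
    in proportion to that cell's eligibility at the moment of collision.
    onset_times_ms[i] is None if cell i never signalled novelty.
    """
    new_weights = []
    for w, onset in zip(weights, onset_times_ms):
        e = 0.0 if onset is None else eligibility(collision_time_ms - onset)
        new_weights.append(w + lr * e)
    return new_weights
```

With this trace shape, a cell that fired about one time constant before the collision receives the largest weight change, a cell that fired only moments before receives a smaller one, and a silent cell is left unchanged. The same eligibility values could weight a second association, to a stride-length adjustment indexed by the gait phase at collision.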

Our results indicate that an association can readily be made between visual stimuli and the motor activity needed to step over the obstacle. An association can also be made between visual stimuli and the motor activity needed to adjust stride length.

Interestingly, upon examining the weights mapping visual stimuli to stride length modification, we found that the robot responds to novel features moving toward it at a given speed. The 'receptive' field is a motion-tuned cell, scaled by the stride length of the robot (not by absolute speed). Thus the tuning is related to the size and motion of the robot, and receptive fields modify the stereo information to produce movement-depth sensitive cells.

Further, note that in the prediction layer, we have cells that create 'synthetic' visual images based on the weighted average of other sensory stimuli. These synthetic visual 'negatives' can be activated by appropriate stimulation of non-visual sensory channels.

It is reasonable to assume that a great deal of visual processing is used to support locomotion. We feel that these results support the position that important aspects of visual perception are shaped by environment interaction during the life of the observer.