Visual impairment resulting from normal aging or disease (e.g., stroke, head trauma, or retinal disorders) often interferes with visually guided mobility. Everyday activities such as walking along a sidewalk among other pedestrians, monitoring other cars on the road while driving, intercepting or following a target moving in a cluttered environment, traveling on foot within shopping malls or food stores, attending social gatherings, or reaching out to grasp and manipulate objects become very difficult for the visually impaired. Thus there is clearly an acute need for devices that help visually guided mobility, since visual impairment is one of the most disabling and confining handicaps. The current solutions, such as sonar glasses and GPS-based systems, do not work very well in real situations: they require that people adapt to them (which is cumbersome and impractical), they are expensive, and they are by and large unreliable.
This project is the first step towards a new device to assist visually guided mobility, the Magic Hat, which is designed to overcome the above limitations. Through compact and robust representation of the user's environment and fast learning methods, the Magic Hat will adapt to the user's needs and handle his/her specific environment (different lighting and weather conditions). It will be inexpensive because it is based on cheap, mass-producible analog VLSI technology, and it will use fast pattern recognition methods combined with learning to ensure reliability.
A circular array of sensors placed on the hat will monitor changes in the visible environment due to the motion of the user and the motion of objects. The major strength of this project, which distinguishes it from previous approaches, is its combination of coarse representation, learning, and a wide field of view. Coarse representations, such as those obtained by broadly tuned spatiotemporal filtering, are rapidly computable and are ideal for global analysis of visual motion. In contrast, precise representations such as optic flow are hard to compute and are more appropriate for local analysis. The robustness of the coarse representation will be enhanced by methods that learn to suppress the effects of noise. The wide field of view of the sensor will further provide redundant information for reliable classification of user motion.
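To make the idea of a coarse, global motion signal from a ring of sensors concrete, here is a minimal sketch. It is not the project's method: it assumes a hypothetical ring of brightness sensors sampled at two successive instants, uses a simple circular box average as a stand-in for a broadly tuned spatial filter, and uses a Reichardt-style correlator (a swapped-in, classical technique) to report only the direction of apparent rotation around the ring. The names `broad_smooth` and `rotation_sign` are illustrative.

```python
def broad_smooth(ring, width=5):
    """Circular box average: each sensor is averaged with its neighbours.

    Stand-in (an assumption) for a broadly tuned spatial filter.
    """
    n = len(ring)
    half = width // 2
    return [
        sum(ring[(i + k) % n] for k in range(-half, half + 1)) / (2 * half + 1)
        for i in range(n)
    ]


def rotation_sign(prev_frame, curr_frame, width=5):
    """Return +1, -1, or 0 for the coarse direction of pattern motion on the ring.

    +1 means the pattern moved toward increasing sensor index,
    -1 means it moved toward decreasing sensor index.
    """
    a = broad_smooth(prev_frame, width)
    b = broad_smooth(curr_frame, width)
    n = len(a)
    # Reichardt-style correlator: compare each sensor's past value with its
    # neighbour's present value, minus the mirror-image pairing.
    response = sum(
        a[i] * b[(i + 1) % n] - a[(i + 1) % n] * b[i] for i in range(n)
    )
    return (response > 0) - (response < 0)


# Usage: a 36-sensor ring whose pattern shifts by one sensor between frames.
pattern = [(i * 7) % 11 for i in range(36)]
shifted = pattern[-1:] + pattern[:-1]  # shifted[n] == pattern[n - 1]
direction = rotation_sign(pattern, shifted)
```

The output is deliberately a single sign rather than a per-sensor velocity, which reflects the contrast drawn above between rapidly computable global measures and precise local ones such as optic flow.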
During this project, different possible coarse representations will be explored, and their performance, coupled with learning and applied to real images, will be compared quantitatively and qualitatively. The most reliable representation that is compatible with both VLSI architectures and fast learning will be used to develop algorithms that classify the motion of the user and of objects in the environment. The wide field of view (360 deg) of the environment will play a key role in a novel approach to resolving classical ambiguities in estimating motion parameters. Methods to compute time to collision will be developed and tested in simulations of realistic situations. Experiments with various adaptive learning methods, including Hebbian learning, cooperative learning, and HyperBFs, will be carried out to choose the one that ensures fast and stable improvement in performance. The performance of the methods on realistic image sequences will be assessed at all stages of this project.
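As an illustration of what a time-to-collision computation can look like, the sketch below uses the classical relation tau = theta / (d theta / dt), where theta is the angular size of an approaching object. This is an assumption chosen for illustration; the methods to be developed in the project are not specified here. The function name `time_to_collision` and its parameters are hypothetical.

```python
def time_to_collision(theta_prev, theta_curr, dt):
    """Estimate time to collision from two angular sizes (radians).

    theta_prev and theta_curr are the object's angular sizes at two
    samples taken dt seconds apart. Uses tau = theta / (d theta / dt)
    with a finite-difference estimate of the expansion rate.
    Returns infinity if the object is not expanding (not approaching).
    """
    rate = (theta_curr - theta_prev) / dt
    if rate <= 0:
        return float("inf")
    return theta_curr / rate


# Usage: a 1 m wide object approaching at 2 m/s, seen at 10.0 m and then 9.8 m
# (small-angle approximation: angular size is roughly width / distance).
tau = time_to_collision(1.0 / 10.0, 1.0 / 9.8, dt=0.1)
```

Note that the estimate needs neither the object's physical size nor its distance, only how fast its image expands; with a finite-difference rate it is approximate, and it degrades as the expansion rate approaches the noise level.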