In this document, we report the methods that have been developed in the Magic Hat project (NSF SGER grant CDA9528079) for collision warning based on visual information.
The goal of the project is to demonstrate the feasibility of a device capable of warning blind and low-vision people about impending danger (collision) by processing visual motion information. The primary consideration was to develop a collision warning method that is fast and can be demonstrated using existing technology in as close to real time as possible.
Visual motion information is obtained by processing a sequence of video images from an off-the-shelf camcorder. Our implementation of collision detection has the following setup:
Figure 1. Schematic of the collision detection demo setup
The NTSC signal from the camcorder is digitized by the built-in digitizer. The digitizer action is controlled from Collision Demo, a modified version of the public-domain program NIH Image. Collision Demo works at about 3 frames per second on the PowerMac 8500/120, and provides warning about impending collisions. The general outline of the collision warning method is as follows:
Figure 2. Flow chart of the collision warning method
We explain each of the steps in detail below:
Computation of derivatives
The x, y, and t derivatives are computed from two successive frames in the following way. First, the two frames are smoothed (within the processing loop, only the current frame is smoothed; the earlier frame was smoothed in the previous iteration). The smoothed first image is convolved with the x and y derivative kernels shown below:
Figure 3. x and y derivative kernels
Results of applying these kernels to an image are shown in the figure below. The image, its x-derivative, and its y-derivative are shown, respectively, from left to right.
The temporal derivative is computed as the pixelwise difference between the two smoothed frames.
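As a concrete illustration, the derivative step can be sketched in a few lines of Python. This is a minimal sketch, not the Collision Demo code: the 3x3 box smoothing filter and the central-difference derivative kernels are assumptions standing in for the kernels of Figure 3.

```python
def convolve3x3(img, k):
    """Convolve an image (list of lists) with a 3x3 kernel; border pixels stay 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            s = 0.0
            for j in range(3):
                for i in range(3):
                    s += k[j][i] * img[y + j - 1][x + i - 1]
            out[y][x] = s
    return out

BOX = [[1 / 9.0] * 3 for _ in range(3)]          # smoothing kernel (assumed)
KX = [[0, 0, 0], [-0.5, 0, 0.5], [0, 0, 0]]      # x-derivative (central difference)
KY = [[0, -0.5, 0], [0, 0, 0], [0, 0.5, 0]]      # y-derivative (central difference)

def derivatives(prev_smoothed, frame):
    """Return (Ex, Ey, Et, smoothed frame); the earlier frame was smoothed
    in the previous iteration, as described in the text."""
    cur = convolve3x3(frame, BOX)
    ex = convolve3x3(cur, KX)
    ey = convolve3x3(cur, KY)
    h, w = len(cur), len(cur[0])
    # temporal derivative: pixel-wise difference of the two smoothed frames
    et = [[cur[y][x] - prev_smoothed[y][x] for x in range(w)] for y in range(h)]
    return ex, ey, et, cur
```

On a horizontal intensity ramp, for example, the interior x-derivative is 1 per pixel while the y-derivative vanishes.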
Flow computation
Normal flow is computed at locations where the local spatial gradient magnitude is above a threshold value. The gradient method based on the optic flow constraint equation shown below is used to calculate the normal flow:

E_{x}u + E_{y}v + E_{t} = 0,

where (u, v) is the optic flow and E_{x}, E_{y}, E_{t} are the spatial and temporal derivatives of the image intensity E. The normal flow is the component of the flow along the local gradient; its magnitude is -E_{t} / |(E_{x}, E_{y})|.
Near-flat intensity distributions and noisy spots give rise to large normal flow magnitudes, and rounding approximations give rise to small flow measurements where none exist. Thus, normal flow measurements with very small or very large magnitudes are ignored.
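The per-pixel computation can be sketched as follows. The threshold constants are illustrative placeholders, not the values used in Collision Demo.

```python
import math

GRAD_MIN = 1.0   # minimum gradient magnitude (assumed value)
FLOW_MIN = 0.05  # reject tiny flows caused by rounding (assumed value)
FLOW_MAX = 10.0  # reject huge flows from flat or noisy spots (assumed value)

def normal_flow(ex, ey, et):
    """Return (magnitude, unit gradient direction) of the normal flow at a
    pixel with derivatives ex, ey, et, or None if the measurement is rejected."""
    g = math.hypot(ex, ey)
    if g < GRAD_MIN:
        return None                      # gradient too weak
    mag = -et / g                        # from the optic flow constraint
    if not (FLOW_MIN < abs(mag) < FLOW_MAX):
        return None                      # implausibly small or large flow
    return mag, (ex / g, ey / g)         # flow vector is mag * direction
```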
If the size of the moving region is larger than a threshold, the 3D motion is considered to be self-motion; otherwise, the moving object is isolated and all further processing is restricted to the moving object (the moving object is enclosed by a white rectangle in the display window).
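A minimal sketch of this decision, assuming a boolean mask of pixels with significant normal flow; the area fraction used as the self-motion threshold is an assumed placeholder.

```python
SELF_MOTION_FRACTION = 0.5  # assumed fraction of the image area

def classify_motion(mask):
    """Decide between self-motion and an isolated moving object.
    Returns ('self-motion', None) or ('object', bounding rectangle)."""
    h, w = len(mask), len(mask[0])
    pts = [(x, y) for y in range(h) for x in range(w) if mask[y][x]]
    if len(pts) > SELF_MOTION_FRACTION * h * w:
        return "self-motion", None
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    # bounding rectangle of the moving object (drawn white in the demo window)
    return "object", (min(xs), min(ys), max(xs), max(ys))
```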
FOE Computation
The focus of expansion (FOE) is computed based on the principle that flow vectors are oriented in specific directions relative to the FOE. The horizontal component h_{L} of an optic flow vector L to the left of the FOE points leftward. For an optic flow vector R to the right of the FOE, the horizontal component h_{R} points rightward, as shown in the figure on the left. By a counting method, we can determine the location about which a majority of horizontal components diverge. We have extended this principle to the field of normal flow vectors. Thus, we can find the FOE from a normal flow vector field, provided that the intensity gradient (edge) distribution in the image is uniform (i.e., the probability that a certain orientation is seen at an image location is the same as for any other orientation). The proof for this method is given in the Appendix.
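The counting method for one coordinate of the FOE can be sketched as follows. This is an illustrative one-dimensional version over the horizontal components only; the report applies the same idea to normal flow components and to both axes.

```python
def foe_column(vectors, width):
    """Estimate the FOE column by counting divergence votes.
    vectors: list of (x, h) pairs, where x is a pixel column and h the
    horizontal flow component at that location."""
    best_col, best_votes = 0, -1
    for x0 in range(width):
        # a vector votes for x0 if its horizontal component points away from x0
        votes = sum(1 for x, h in vectors
                    if (x < x0 and h < 0) or (x > x0 and h > 0))
        if votes > best_votes:
            best_col, best_votes = x0, votes
    return best_col
```

For a purely expanding field centred at column 5, every vector votes for column 5, so the count peaks there.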
Computation of Time to Collision (TTC)
The time to collision (TTC) can be computed from the parameters of the first-order approximation to the optic flow field. The first-order (affine) approximation is given by

u(x, y) = a_{1} + a_{2}x + a_{3}y,
v(x, y) = a_{4} + a_{5}x + a_{6}y.

The parameters a_{2} and a_{6} encode the divergence of the flow. It can be shown that for surfaces that give rise to a flow that is nearly affine, the TTC is given by the expression

TTC = 2 / (a_{2} + a_{6}).

The parameters a_{2} and a_{6} can be computed from the normal flow over a 2D region. However, to speed up the computation, we consider only the horizontal line and the vertical line passing through the FOE. The following figure illustrates the simplification achieved by this choice:
Arbitrary local edge directions have been chosen to show the normal flow of two optic flow vectors: one along the vertical line and the other along the horizontal line passing through the FOE. Note that these optic flow vectors must lie along these lines because of the structure of translational flow fields. It can be easily shown that the magnitude of the horizontal optic flow vector is

|u| = |E_{t} / E_{x}|,

and that of the vertical optic flow vector is

|v| = |E_{t} / E_{y}|,

where E is the spatiotemporal intensity function. These derivatives have been computed earlier. Along these lines the affine field reduces to u = a_{2}x_{h} and v = a_{6}y_{v}, so from several measurements along the horizontal and vertical lines we can calculate the desired affine parameters by a least-squares fit:

a_{2} = (Σ u_{i}x_{h,i}) / (Σ x_{h,i}^{2}),   a_{6} = (Σ v_{i}y_{v,i}) / (Σ y_{v,i}^{2}),
where x_{h} and y_{v} are transformed coordinates with the FOE as the origin. The TTC can now be computed. 
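Putting the last two steps together, the TTC computation can be sketched as follows. This is an illustrative sketch under stated assumptions: a least-squares fit of a line through the origin for each axis, and the standard affine-flow result that TTC = 2 / (a_{2} + a_{6}).

```python
def fit_slope(samples):
    """Least-squares slope through the origin for (coordinate, flow) pairs."""
    num = sum(c * f for c, f in samples)
    den = sum(c * c for c, _ in samples)
    return num / den

def time_to_collision(horizontal_samples, vertical_samples):
    """horizontal_samples: (xh, u) pairs along the horizontal line through
    the FOE; vertical_samples: (yv, v) pairs along the vertical line.
    Coordinates are FOE-centred, flows come from u = -Et/Ex and v = -Et/Ey."""
    a2 = fit_slope(horizontal_samples)
    a6 = fit_slope(vertical_samples)
    return 2.0 / (a2 + a6)
```

For a flow expanding at 10% per frame in both directions, the estimate is 10 frames to collision.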



Figure 5. Sample result sequences demonstrating collision warning
Figure 6. A typical indoor image, and edge distribution of a sequence containing this image
Contrary to our assumption (but consistent with empirical observations in the literature on the statistics of indoor images), we found a skewed distribution, with peaks corresponding to the horizontal and vertical edges. However, an analysis of the effect of this skewed distribution suggests that a preponderance of vertical and horizontal edges will not hurt (and will in fact enhance) the performance of our FOE detection algorithm, provided that the 3D horizontal and vertical edges project as horizontal and vertical edges on the image plane (not necessarily in that order; e.g., a vertical edge could project as a horizontal edge on the image if the camera is rotated by 90 degrees). This is because we use the horizontal and vertical axes as the reference frame onto which the components of the normal flow are projected (see the explanation for vertical edges in the Appendix).

The preceding observations do not imply, however, that the method will fail when the distribution of edges is skewed in an unfavorable fashion. A complete analysis of failure would have to consider not only the edge distribution but also the distribution of motion directions. Since this joint distribution is not well studied, we do not attempt such an extended analysis here. We conclude by noting that in all the real indoor sequences that we processed, the collision warning computation was successful.
Theorem: With high probability (greater than 0.5), the signs of the components of a normal flow vector agree with the signs of the components of the corresponding optic flow vector, provided the local intensity gradient direction is distributed uniformly.
Proof:
Let the local gradient subtend an angle α, and the optic flow vector an angle θ, with the x axis. Without loss of generality, we can assume that θ is in the range 0 to π/2, while α ranges over 0 to 2π. The local gradient is along [cos(α), sin(α)] and the optic flow vector is along [cos(θ), sin(θ)]. The normal flow vector is in the direction cos(θ - α)[cos(α), sin(α)].
For the x component of this (normal) vector to have the same sign as the x component of the optic flow vector, we need
cos(θ - α) cos(α) cos(θ) > 0.
We consider two possibilities:
1. cos(θ - α) > 0 and cos(α) cos(θ) > 0;
2. cos(θ - α) < 0 and cos(α) cos(θ) < 0.
The first situation occurs if |θ - α| < π/2 and α and θ are in the same vertical set of quadrants (first and fourth, or second and third); for a given θ, this gives α a range of size π - θ in which the signs agree. The second situation occurs if cos(θ - α) < 0 and cos(α) has the sign opposite to cos(θ); this corresponds to another range of size π - θ for α. Since the total possible range for α is 0 to 2π, assuming a uniform distribution for α, the probability that the signs of the x components agree is 2(π - θ) / 2π = (π - θ) / π.
In passing, we note that for α = 0 or π (vertical edges), the signs of the x components always agree!
Using a similar argument, it can easily be shown that the probability that the y components agree is (π/2 + θ) / π.
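The probability derived above can be checked numerically. The sketch below (an illustration, not part of the original analysis) draws the gradient angle uniformly on [0, 2π) and measures how often the x components of the normal flow and the optic flow agree in sign; for θ = π/4 the frequency should approach (π - θ)/π = 0.75.

```python
import math
import random

def x_sign_agreement(theta, trials=200000, seed=1):
    """Monte-Carlo estimate of the probability that the x components of the
    normal flow and optic flow agree in sign, for flow angle theta."""
    rng = random.Random(seed)
    agree = 0
    for _ in range(trials):
        a = rng.uniform(0.0, 2.0 * math.pi)
        # agreement condition from the proof: cos(theta - a) cos(a) cos(theta) > 0
        if math.cos(theta - a) * math.cos(a) * math.cos(theta) > 0:
            agree += 1
    return agree / trials
```

For θ = 0 (flow along the x axis) the components agree almost surely, matching the remark about vertical edges.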