This chapter describes vision chips that implement only a spatial image processing function, ranging from simple local smoothing operations to more complicated global operations such as object orientation detection. Several different categories can be easily recognized among these vision chips.
A majority of spatial image processing chips, which have been dubbed
silicon retinas, are based on models of the vertebrate retina.
Some of the general characteristics of the vertebrate retina, which
have been given considerable attention, are the adaptation to local
and global light intensity, and edge enhancement. Various models have
been proposed for the form and function of the retina, such as
Laplacian of Gaussian (LOG), Difference of Gaussian (DOG), a kernel
derived directly from the biharmonic equation, and linear and
multiplicative lateral inhibition. Not surprisingly, the convolution
kernel in all of these models has the Mexican-hat shape shown in
Figure 2.1, though the underlying mathematical or biological theories
may be quite different. Which of these models best approximates the
function of the retina remains an open question, pending more
experience with the models and with the retina itself.
Figure 2.1: The Mexican hat. A generic kernel with different
explanations and models.
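To make the Mexican-hat shape concrete, the following is a minimal sketch of a sampled 1-D Difference of Gaussian kernel. The function names and the two widths (`sigma_center`, `sigma_surround`) are illustrative assumptions and are not taken from any of the chips discussed in this chapter.

```python
import math


def gaussian(x, sigma):
    """Unit-area 1-D Gaussian evaluated at offset x."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))


def dog_kernel(radius, sigma_center=1.0, sigma_surround=2.0):
    """Sampled Difference of Gaussian kernel on integer offsets -radius..radius.

    The widths are arbitrary illustrative values (assumption); the narrow
    Gaussian forms the excitatory centre, the wide one the inhibitory surround.
    """
    return [gaussian(x, sigma_center) - gaussian(x, sigma_surround)
            for x in range(-radius, radius + 1)]


kernel = dog_kernel(radius=8)
centre = kernel[8]        # positive peak at offset 0
surround = kernel[8 + 3]  # negative lobe a few samples away
```

With these widths the kernel is positive near the centre, negative in the surround, and sums to approximately zero, so a region of uniform intensity produces almost no response.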
Gaussian filtering plays an important role in most of the models used to implement silicon retinas. A smoothing operation at any stage, and especially at the front end, helps to reduce noise. In some silicon retinas Gaussian filtering is followed by a subtraction or division stage, which enhances edges and makes the image invariant to the local intensity over a neighborhood determined by the characteristics of the Gaussian filter. In many silicon retinas a simple 1-D or 2-D resistive network serves as the basic element for approximating the Gaussian smoothing function. Only one implementation uses a more accurate approximation to Gaussian filtering [Kobayashi et al. 95b].
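The smoothing-then-subtraction idea can be sketched numerically. The example below is a software model under stated assumptions, not a description of any particular chip: each node of a 1-D resistive network is tied to its input through one conductance and to its neighbours through another, and `lam` (a hypothetical parameter name) is the ratio of the lateral to the input conductance. Kirchhoff's current law at each node gives a tridiagonal linear system, solved here with the Thomas algorithm. Note that the impulse response of such a network decays roughly exponentially with distance, so it only approximates a Gaussian.

```python
def resistive_smooth(u, lam):
    """Steady-state node voltages of a 1-D resistive network driven by inputs u.

    Node i satisfies  v[i] + lam * sum_over_neighbours(v[i] - v[j]) = u[i],
    where lam = lateral conductance / input conductance (assumed model).
    The ends are open, so boundary nodes have a single neighbour.
    """
    n = len(u)
    diag = [1.0 + lam * ((i > 0) + (i < n - 1)) for i in range(n)]
    off = -lam  # identical sub- and super-diagonal entries

    # Thomas algorithm: forward elimination ...
    c = [0.0] * n
    d = [0.0] * n
    c[0] = off / diag[0]
    d[0] = u[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - off * c[i - 1]
        c[i] = off / denom
        d[i] = (u[i] - off * d[i - 1]) / denom

    # ... followed by back substitution.
    v = [0.0] * n
    v[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        v[i] = d[i] - c[i] * v[i + 1]
    return v


def centre_surround(u, lam):
    """Subtraction stage: input minus its smoothed version."""
    return [a - b for a, b in zip(u, resistive_smooth(u, lam))]


step = [0.0] * 10 + [1.0] * 10
edges = centre_surround(step, lam=2.0)
```

For the step input, `edges` is negative just before the transition and positive just after it, while a uniform input yields a response of approximately zero. A larger `lam` widens the smoothing neighbourhood.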
Another group of spatial processing vision chips targets more global features of the image, such as the object position and orientation chip [Standley 91b] or the centroid computation chip [Deweerth 92].
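As an illustration of what such global features are, the sketch below computes the centroid and principal-axis orientation of an image from its first- and second-order moments. This is the standard moment-based formulation in software and is only an assumption about the quantities involved; the circuits used in the cited chips are not described in this section.

```python
import math


def centroid_and_orientation(image):
    """Return (cx, cy, theta) for an image given as a list of rows.

    cx, cy : intensity-weighted centroid in pixel coordinates.
    theta  : principal-axis angle in radians, measured from the x axis,
             computed from the second-order central moments.
    """
    m00 = m10 = m01 = 0.0
    for y, row in enumerate(image):
        for x, val in enumerate(row):
            m00 += val
            m10 += x * val
            m01 += y * val
    if m00 == 0.0:
        raise ValueError("image has zero total intensity")
    cx, cy = m10 / m00, m01 / m00

    mu20 = mu02 = mu11 = 0.0
    for y, row in enumerate(image):
        for x, val in enumerate(row):
            dx, dy = x - cx, y - cy
            mu20 += dx * dx * val
            mu02 += dy * dy * val
            mu11 += dx * dy * val

    theta = 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
    return cx, cy, theta


# A 5x5 image containing a single diagonal line of bright pixels.
diagonal = [[1.0 if x == y else 0.0 for x in range(5)] for y in range(5)]
cx, cy, theta = centroid_and_orientation(diagonal)
```

For the diagonal line the centroid lies at the middle pixel and the orientation is 45 degrees in image coordinates, where y increases downward.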
Foveated sensors constitute another group of spatial vision chips. In these sensors the physical size and placement of the photodetectors form a log-polar mapping of the image. Under this mapping, rotation and scaling of the image about the centre become simple shifts along the angular and log-radial axes, which is the sense in which the mapping is rotation and scale invariant. The resolution is high at the centre and decreases logarithmically away from it.
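The behaviour of the log-polar mapping under rotation and scaling can be checked with a few lines of code. The sketch below is a coordinate transform only, with a hypothetical function name and an assumed centre at the origin; it does not model the photodetector layout of any specific sensor.

```python
import math


def to_log_polar(x, y, cx=0.0, cy=0.0):
    """Map Cartesian (x, y) to (rho, theta) about the centre (cx, cy).

    rho   = natural log of the distance from the centre
    theta = angle in radians in the range (-pi, pi]
    The centre itself has no finite log-polar image.
    """
    dx, dy = x - cx, y - cy
    r = math.hypot(dx, dy)
    if r == 0.0:
        raise ValueError("the centre point has no log-polar coordinates")
    return math.log(r), math.atan2(dy, dx)


rho, theta = to_log_polar(3.0, 4.0)

# Scaling the point by 2 about the centre shifts rho by log(2) only.
rho_scaled, theta_scaled = to_log_polar(6.0, 8.0)

# Rotating the point by 30 degrees shifts theta by pi/6 only.
a = math.pi / 6
xr = 3.0 * math.cos(a) - 4.0 * math.sin(a)
yr = 3.0 * math.sin(a) + 4.0 * math.cos(a)
rho_rot, theta_rot = to_log_polar(xr, yr)
```

Because both transformations reduce to translations in the (rho, theta) plane, later processing stages only need to tolerate shifts, provided the object is centred on the sensor.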