
Saccadic Motion Control for Monocular Fixation in a Robotic Vision Head: A Comparative Study


Jacques Waldmann and Edvaldo Marques Bispo

CTA-ITA-IEEE 12228-900

São José dos Campos – SP

jacques@ele.ita.cta.br

Abstract A comparative evaluation of two methods for visual tracking by saccade control of an active vision head with anthropomorphic characteristics, conducted at the ITA/INPE Active Computer Vision and Perception Laboratory, is presented. The first method accomplishes fixation by detecting motion and controlling gaze direction based on gray-level segmentation. The second method aligns images from different viewpoints so that static-camera motion detection can be applied; morphological opening is then employed to compensate for image alignment errors. Results from experiments in a controlled environment show that both approaches can deal with non-rigid shapes and scenes of limited dynamics while operating at about 1 Hz. The comparative evaluation shows, however, that image alignment improves tracking robustness to variations in lighting conditions and background texture. The results obtained so far encourage further applications in autonomous robotics and vision-aided robotic rotorcraft navigation.

Keywords: Monocular fixation, active vision, robotic autonomous navigation, experimental evaluation of active vision algorithms.

1 Introduction

Computer vision is an experimental science centered on visual information processing. Understanding vision processes, as well as creating and validating algorithms for automatic information recovery from image data, are certainly among its main purposes. Biological vision systems are a source of inspiration for the study of vision processes and provide motivation and useful solutions for conceiving and building computer vision systems. The observation of biological systems encourages an approach that considers not only the computational aspects of vision but its supporting behavior as well. Biological systems equipped with vision exert this ability by interacting in some way with their surroundings, and the competence shown in such interaction is a measure of the performance and robustness of the vision system. Knowledge is derived and used to cope with uncertainties at every processing level.

One sound definition of early vision is that it essentially consists of processes that realize inverse optics, i.e., that deal with the recovery of physical properties of the 3D world from 2D intensity arrays. Among such processes are edge detection, shape-from-X (X standing for contours, texture, shading, motion, stereo), optical flow estimation, stereo matching and others. The reconstructive paradigm advocates [3] that the above processes do not require domain-dependent knowledge, only general constraints about the 3D world and the imaging process. This paradigm deals with the proper choice of a computational theory embedded within the visual process and with the issues of input/output representations coupled to the processing algorithm at various levels of increasing complexity. Successful completion of vision-based tasks is assumed to depend on accurate descriptions of the 3D world, and these tasks are regarded as applications of a general-purpose vision system. Obviously, such a degree of generality imposes an extreme complexity on the design and operation of actual vision systems. Real implementations would involve so many simplifications and assumptions that their range of operating conditions would become quite limited. The reason for this is that inverse optics is an ill-posed problem [13]: either its solution is not unique or it is very sensitive to noise in the input data. Information is lost in the imaging process, and assumptions are then needed in order to derive a unique solution.

Real vision systems are to be conceived within a well-defined context. The context is dictated by those actions and/or behaviors which are consistent with the realization of a desired task; in its turn, this context motivates the need for vision. The paradigm known as active perception, or active vision, or action-oriented vision, or purposive vision, or animate vision [2,8,11,16], whose inspiration dates back to [6], claims that beings equipped with vision are not static but rather carry out intentional movements. A computational analysis allows us to conclude that, indeed, controlled actions and/or behaviors make it possible to find simple and robust solutions to a variety of ill-conditioned reconstruction problems that arise in attempts to design and implement vision systems [1,8,11,16]. Embedded within the active vision paradigm, the purposive and qualitative approach for conceiving and building vision systems is based on the selection of early-vision processes in accordance with the required task. The vision system becomes a collection of modules, each module, or a cooperating group of modules, solving a particular task. Quantitatively accurate descriptions are exchanged for qualitative statements about the surroundings. Finally, domain-dependent knowledge is used to keep the vision system design and operation simple. Therefore, we shall relate to vision as an ability by which creatures acquire information about their environment, followed by the activation of actions and/or behaviors. Since a system conceived under such a paradigm could employ an unbounded amount of data made available by the activation of adequate combinations of actions and/or behaviors, there exists a motivation for a minimalistic approach that searches for those actions and/or behaviors strictly necessary for the successful completion of the desired task. This approach thus requires a definition of system purpose and its provision with some sort of strategy for visual attention.

Visual attention is motivated by the need to adequately allocate the visual field according to the tasks involved in the system purpose and under the constraint of limited computational resources. A basic mechanism in visual processes which is present in a wide variety of biological vision systems is the fixation of conspicuous objects. By fixation we mean the eye (camera) movements caused by visual stimuli. Previous works have highlighted the importance of fixation in active computer vision systems [2,5,9]. Although fixation-related actions and/or behaviors have been classified into a series of distinct ocular motions [12], two basic ones are considered in [14]: pursuit and saccadic motions. Pursuit is a smooth and slow motion aiming at the stabilization of the image over the retina. As the stabilization error accumulates, fast saccadic motions occur to reduce the error. Saccadic motions also occur when the focus of attention is changed due to a new visual stimulus.

This paper is organized as follows. The next section describes the motivation for our study and evaluation of saccadic motion control algorithms for visual fixation purposes, namely, the task of robotic guidance and obstacle avoidance in an unstructured environment. Section 3 presents the methodology by describing two approaches for the implementation of saccade control. A description of the experiment is presented in Section 4. Finally, a comparative analysis along with conclusions and suggestions is presented in Section 5.

2 Motivation

A general framework for rotorcraft robotic navigation, guidance and obstacle avoidance based on the fusion of image and inertial data is depicted in Figure 1 [15]. Navigation is defined as the generation of a sequence of points through which a robotic vehicle should pass, based on a low-resolution map of the 3D world, together with the estimation of the vehicle's position and velocity relative to such a map. Autonomous guidance requires planning and acting within the robot's local environment in order to detect and steer clear of obstacles. A 3D representation is needed for updating the low-resolution map based on evidence provided by sensors onboard the robot. Perceptions about the surroundings, available from a diversity of sensors, must be fused together with map data in a knowledge representation and management scheme to allow the robot to carry out the requested task. Often the robots are vehicles that contain an inertial navigation system (INS), whose use significantly reduces the computational load of image processing. Vision-aided navigation and guidance often relies on landmark location and stabilization within the field of view. It is essential for the vision system to possess the ability to detect and track independently moving objects whose motion is not consistent with ego-motion.

Figure 1
- Rotorcraft robotic navigation, guidance and obstacle avoidance [15].

A vision system has been proposed in [7] for dealing with the task of obstacle detection in the near-field loop. Further research focused on the experimental evaluation of relaxation labeling-based image matching for 3D structure recovery [10]. Adequate depth estimates and robustness to changes in operator-selected parameters were observed, thus encouraging further studies on vision-aided robotic rotorcraft navigation.

This work presents results from experiments with monocular fixation carried out in the ITA/INPE Active Computer Vision and Perception Laboratory, aiming at the evaluation of two distinct vision algorithms for control of saccades under realistic conditions. The laboratory utilizes the binocular vision head Helpmate Bi-Sight with four mechanical degrees of freedom, namely asymmetric vergence, pan and tilt, as shown in Figure 2. Maximum angular acceleration and velocity, as well as motion repeatability, emulate the dynamics of neck and eye motions observed in biological vision systems with anthropomorphic characteristics. Video acquisition produces a digital image with a resolution of 320 by 240 pixels in RGB format. The vision head is equipped with two 2/3" Hitachi KP-M1U monochromatic cameras and two Fujinon H10x11E-MPX31 servo-actuated TV lenses for control of focus, zoom and aperture. 12-bit resolution D/A converters are used to command the analog driving hardware for lens control. Two 166 MHz Pentium PCs are employed to control the vision head and lenses and to process the images. A graphic interface has been developed that allows the operator to input high-level commands and reference levels to the vision head and to receive driver status information and encoder data. At the present stage, fixation is to be achieved by control of the angular position of the head; smooth pursuit is not considered here due to real-time constraints in the video acquisition hardware.

3 Methodology

3.1 Approach 1 - Motion Detection Followed by Gray-Level Segmentation

3.1.1 Activation of Focus of Attention by Motion Detection and Initial Centroid Location

The initiation of fixation demands a reaction of the system focus of attention to the detection of a moving stimulus in an otherwise static scene. This is accomplished by subtraction of consecutive images and thresholding the resulting image:

| I(i,j,t1) − I(i,j,t0) | > T        (1)

where I(i,j,t) denotes the gray level at pixel (i,j) and instant t, and T is a detection threshold.

The centroid of the pixels detected in the subtraction image because of the moving stimulus is computed, and the gray level at the centroid location is used to produce a gray level-based segmentation of the moving object in subsequent images:

(ic, jc) = (1/Ns) Σk (ik, jk),        gc = I(ic, jc, t1)        (2)

where (ik, jk), k = 1, ..., Ns, are the coordinates of the Ns pixels that have been detected in the subtraction image at instant t1, and gc denotes the gray level at the centroid location.

Figure 2 - ITA/INPE robotic vision head with its mechanical degrees of freedom.

It should be stressed that no knowledge about the moving object is used in this work for improving the gray level-based segmentation. Adequate detection of motion followed by gaze direction control thus dictates the admissible magnitude of motion for successful fixation. Limiting the magnitude of the initial motion stimulus keeps the centroid detected by the motion criterion (1) from falling upon an image area that corresponds to the static background, as shown in Figure 3. The centroid coordinates are then employed by the head angular position control module, whose output is a command for an initial saccade with the purpose of shifting the gaze direction so as to bring the centroid towards the image center.
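As an illustration of this detection step, the sketch below (Python with NumPy) implements the frame differencing of equation (1) and the centroid and gray-level extraction of equation (2); the function name, the threshold value and the 8-bit image assumption are ours, not the paper's.

```python
import numpy as np

def detect_initial_centroid(img_t0, img_t1, thresh=25):
    """Frame differencing (eq. (1)) and initial centroid/gray level (eq. (2)).

    img_t0, img_t1 : consecutive grayscale frames as 2-D uint8 arrays.
    thresh         : detection threshold T (illustrative value).
    Returns ((i_c, j_c), g_c), or None if no motion was detected.
    """
    diff = np.abs(img_t1.astype(np.int16) - img_t0.astype(np.int16))  # eq. (1)
    moving = np.argwhere(diff > thresh)          # coordinates (i_k, j_k), k = 1..Ns
    if moving.size == 0:
        return None
    i_c, j_c = moving.mean(axis=0)               # eq. (2): centroid of detected pixels
    g_c = int(img_t1[int(round(i_c)), int(round(j_c))])  # gray level used for later segmentation
    return (i_c, j_c), g_c
```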

3.1.2 Gray Level-Based Object Segmentation and Centroid Location

The fixation process requires stabilization of the centroid within the visual field. However, camera motion induces an apparent image movement of the static background, and the present implementation of the fixation process does not compensate for this effect. As a result, image subtraction between subsequent frames cannot be used while the camera tracks the moving object. The fixation process in this stage is therefore based on the segmentation of pixels whose gray level is compatible with that of the centroid. For this purpose a segmentation threshold Tg is tuned:

| I(i,j,t) − gc | ≤ Tg        (3)

The computation of the centroid of the moving object's projection on the image plane in this stage employs the coordinates of those pixels detected by the gray level-based segmentation, as seen in equation (4).

(ic, jc) = (1/Nc) Σk (ik, jk)        (4)

where (ik, jk), k = 1, ..., Nc, are the coordinates of the Nc pixels that have been detected by the gray-level segmentation. The location of the computed centroid is then passed on to the fuzzy control module that commands the saccades. It should be stressed that the present implementation assumes the existence of only one independently moving object within the field of view, with gaze direction control being triggered only at the beginning of the fixation process. Figure 4 depicts the information processing that occurs in this approach.
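For illustration only, the following sketch segments pixels whose gray level is compatible with gc and recomputes the centroid, in the spirit of equations (3) and (4); the segmentation threshold value is a hypothetical, experimentally tuned parameter.

```python
import numpy as np

def gray_level_centroid(img, g_c, t_g=15):
    """Segment pixels compatible with the reference gray level and locate their centroid.

    img : current frame as a 2-D uint8 array.
    g_c : gray level recorded at the initial motion centroid.
    t_g : segmentation threshold of eq. (3); illustrative value.
    Returns (i_c, j_c) from eq. (4), or None if no pixel passed the test.
    """
    mask = np.abs(img.astype(np.int16) - int(g_c)) <= t_g   # eq. (3)
    pix = np.argwhere(mask)                                  # (i_k, j_k), k = 1..Nc
    if pix.size == 0:
        return None
    return tuple(pix.mean(axis=0))                           # eq. (4)
```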

Figure 3
- Excessive initial motion magnitude and its effect on the gray level of the computed initial centroid. The cross locates the centroid position. i,j are coordinates in the sensor plane.

3.1.3 Fuzzy control of saccades

The fixation process (or image stabilization) is accomplished by driving the motors that control the angular position of the tilt and pan axes. A fuzzy approach has been motivated by the mechanical assembly of the vision head, which induces a translation of the optical center during pan motion. The resulting optical flow thus has a component that depends on the 3-D scene structure, and the associated reconstruction problem is known to be ill-posed [1,13]. The fuzzy approach circumvents the need for the identification of actuator dynamics and the calibration of system parameters.
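The paper does not detail the membership functions or rule base of the fuzzy controller, so the sketch below is only a generic Mamdani-style illustration of how a centroid error in pixels could be mapped to an angular saccade command; the linguistic terms, singleton consequents and gains are assumptions, not the authors' design.

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    return float(np.maximum(np.minimum((x - a) / (b - a + 1e-9),
                                       (c - x) / (c - b + 1e-9)), 0.0))

def fuzzy_saccade_command(err_px, half_width=160, max_step_deg=10.0):
    """Map a centroid error (pixels from the image center) to an angular saccade step.

    Generic Mamdani-style controller with five linguistic terms and singleton
    consequents; membership shapes and gains are illustrative only.
    """
    e = float(np.clip(err_px / half_width, -1.0, 1.0))   # normalized error
    mu = {
        "NL": tri(e, -1.5, -1.0, -0.5), "NS": tri(e, -1.0, -0.5, 0.0),
        "ZE": tri(e, -0.5, 0.0, 0.5), "PS": tri(e, 0.0, 0.5, 1.0),
        "PL": tri(e, 0.5, 1.0, 1.5),
    }
    out = {"NL": -1.0, "NS": -0.5, "ZE": 0.0, "PS": 0.5, "PL": 1.0}
    num = sum(mu[k] * out[k] for k in mu)
    den = sum(mu.values()) + 1e-9
    return max_step_deg * num / den                        # defuzzified command in degrees
```

The same mapping can be applied independently to the horizontal and vertical components of the centroid error to obtain pan and tilt commands.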

Figure 4 -
Flowchart - approach 1.

3.2 Approach 2 - Morphological Operation-Aided Motion Tracking

The thresholding described in (3) works properly only if the pixels belonging to the image of the tracked object can be associated with a range of gray levels, and if other objects produce gray levels that do not fall within that range. Another limitation is the maximum speed of the tracked object: fixation will fail if the projections of the object on the image plane in two consecutive images do not overlap, because a wrong gray level will then be associated with the centroid of the moving object. In order to overcome these problems, an adaptation of the algorithm proposed in [4] has been evaluated. The algorithm is based on the idea that a moving object can be detected by comparing the image acquired at time tk (IMAGE_T) with an estimate of the stabilized image (IMAGE_EST). The latter assumes a static environment. This estimate is obtained using the previous image acquired at time tk-1 (IMAGE_T1) and a measurement of the vision head motion between times tk-1 and tk. The comparison between IMAGE_T and IMAGE_EST is done by thresholding the difference of these two images (IMAGE_DIF). The thresholding operation marks not only the pixels belonging to the moving object in IMAGE_T, but also those pixels belonging to the moving object in IMAGE_T1. Marked pixels in IMAGE_T are separated from those marked in IMAGE_T1 by carrying out a logical AND operation between the difference image and the edges of IMAGE_T. Unfortunately, this method fails if static objects in IMAGE_T produce strong edges that coincide with pixels labeled as belonging to the moving object in IMAGE_T1. This effect occurs when background texture that was occluded by the moving object in IMAGE_T1 becomes apparent in IMAGE_T.

Due to approximations in the computation of IMAGE_EST and other sources of error related to the image acquisition, the identification of the moving object in the image plane is not perfect. Small patches in the image that correspond to stationary objects are detected erroneously. These incorrect patches always appear in small areas of the image near the edges of stationary objects where the error in the computation of IMAGE_EST is more critical. Morphological opening is applied as proposed in [4] before thresholding the absolute difference image IMAGE_DIF. The idea is first to erode IMAGE_DIF in order to eliminate the incorrect patches and afterwards to apply dilation in order to recover the original size of the moving object. The detected moving edges are then used to control the vision head movements. An overview of the tracking algorithm is presented in Figure 5. It is discussed in more detail in Sections 3.2.1 to 3.2.4.
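A compact sketch of one iteration of this pipeline is given below, assuming NumPy/SciPy and a caller-supplied warping routine that implements the stabilization of Section 3.2.2; the threshold values, the structuring element and the function names are illustrative placeholders rather than the parameters used in [4] or in the paper.

```python
import numpy as np
from scipy import ndimage

def track_step(image_t1, image_t, warp_to_current, l1=30, l2=60,
               struct=np.zeros((3, 3))):
    """One iteration of the stabilization-based motion detector (approach 2).

    image_t1, image_t : previous (tk-1) and current (tk) frames, 2-D uint8 arrays.
    warp_to_current   : caller-supplied function implementing the stabilization of
                        Section 3.2.2; returns (image_est, valid_mask).
    l1, l2            : difference and gradient thresholds (illustrative values).
    struct            : flat structuring element for the gray-level opening.
    Returns the motion centroid (i_c, j_c), or None if no moving pixels were found.
    """
    image_est, valid = warp_to_current(image_t1)
    # IMAGE_DIF: absolute difference where the stabilized image is defined.
    image_dif = np.where(valid, np.abs(image_t.astype(int) - image_est.astype(int)), 0)
    # Gray-level opening (erosion followed by dilation) removes small spurious patches.
    opened = ndimage.grey_opening(image_dif, structure=struct)
    moving_mask = opened > l1
    # Strong edges of IMAGE_T (Prewitt gradient magnitude).
    gx = ndimage.prewitt(image_t.astype(float), axis=1)
    gy = ndimage.prewitt(image_t.astype(float), axis=0)
    edges = np.hypot(gx, gy) > l2
    # Logical AND keeps only moving pixels that lie on edges of the current image.
    moving_edges = moving_mask & edges
    pix = np.argwhere(moving_edges)
    return tuple(pix.mean(axis=0)) if pix.size else None
```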

3.2.1 Moving the vision head

A combination of saccadic movements of vergence, pan and tilt is employed in order to keep the moving object at the center of the image plane. These motions are controlled in such a way as to emulate the behavior of a human being: eye motion uses vergence movements, while neck motion employs pan and tilt movements. A pure vergence movement allows the computation of IMAGE_EST because the camera rotates approximately about its optical center. This is no longer possible when a pan movement occurs, since the pan rotation axis is located far away from the camera optical center and the distance between them is unknown. Therefore, each time a pan movement occurs, the tracking algorithm goes back to its initial state and waits for the acquisition of two new images before resuming tracking.
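The following fragment sketches one possible reading of this policy: small centroid errors are absorbed by vergence and tilt, while a large accumulated vergence angle triggers a pan (neck) movement and a tracker reset. The thresholds and the pixel-to-degree conversion are invented for the example.

```python
def plan_head_motion(err_x_px, err_y_px, verg_angle_deg,
                     verg_limit_deg=20.0, px_to_deg=0.28):
    """Choose between eye (vergence/tilt) and neck (pan) movements for one saccade.

    err_x_px, err_y_px : centroid error relative to the image center, in pixels.
    verg_angle_deg     : current vergence angle of the tracking camera.
    verg_limit_deg     : beyond this, the neck (pan) takes over (illustrative value).
    px_to_deg          : rough pixel-to-angle conversion (illustrative value).
    Returns (command, reset): the angular commands and a flag telling the tracker
    to reinitialize, since background compensation does not hold across a pan.
    """
    d_verg = err_x_px * px_to_deg
    d_tilt = err_y_px * px_to_deg
    if abs(verg_angle_deg + d_verg) <= verg_limit_deg:
        return {"vergence": d_verg, "tilt": d_tilt, "pan": 0.0}, False
    # Large accumulated vergence: recenter the eye with a pan and restart tracking.
    return {"vergence": -verg_angle_deg, "tilt": d_tilt,
            "pan": verg_angle_deg + d_verg}, True
```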

3.2.2 Computing the stabilized image

In order to detect a moving object using the difference between two consecutive images acquired at times tk and tk-1, the compensation of camera motion between these instants is required. This process, background compensation, finds a correspondence between pixels representing the same 3D point in the consecutive images.

Figure 5 –
Flowchart - approach 2.

As discussed in the previous section, background compensation is applied only when vergence movements of the vision head occur. It is reasonable to assume that the rotation axes pass through the camera optical center. Thus, in order to derive the equations for background compensation, a sensor coordinate system (xcam, ycam, zcam) is assumed attached to the camera and defined in such a way that: 1) its origin coincides with the camera optical center; 2) the zcam axis points in the viewing direction; 3) the xcam axis is defined in such a way that vergence corresponds to a rotation around it. Consequently, tilt corresponds to a rotation around ycam. Measurements of the vergence and tilt angles θvergence(tk-1), θtilt(tk-1), θvergence(tk) and θtilt(tk) about the respective xcam and ycam axes are available from the encoders. Background compensation requires the computation of a rotation matrix D. This transformation relates the 3D coordinates of a scene point in the sensor coordinate system at time tk-1 with the coordinates of the same point in the sensor coordinate system at time tk, as follows:

[X(tk)  Y(tk)  Z(tk)]T = D [X(tk-1)  Y(tk-1)  Z(tk-1)]T        (5)

where (X, Y, Z) are the coordinates of the scene point in the sensor coordinate system at the indicated instants.

The rotation matrix D is composed of a sequence of rotations given by:

D = Rx(θvergence(tk)) Ry(θtilt(tk) − θtilt(tk-1)) Rx(−θvergence(tk-1))        (6)

where Rx(θ) and Ry(θ) represent rotations of angle θ around the x and y axes, respectively. This rotation sequence yields the entries dij, i,j = 1,2,3, of matrix D. The relationship between 3D points in the sensor coordinate system and the corresponding image plane coordinates is assumed to be described by pin-hole camera optics. The correspondence between the projection of a stationary 3D point on the image plane at time tk-1, defined by (x(tk-1), y(tk-1)), and the projection of the same 3D point at time tk, defined by (x(tk), y(tk)), is then:

x(tk) = f [d11 x(tk-1) + d12 y(tk-1) + d13 f] / [d31 x(tk-1) + d32 y(tk-1) + d33 f]

y(tk) = f [d21 x(tk-1) + d22 y(tk-1) + d23 f] / [d31 x(tk-1) + d32 y(tk-1) + d33 f]        (7)

where f denotes the lens focal length.

Equation (7) is used to compute the stabilized image IMAGE_EST by means of data supplied by the encoders.
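A possible implementation of this background compensation is sketched below: the rotation matrix of equation (6) is assembled from the encoder angles, and each pixel of IMAGE_T1 is forward-mapped through equation (7), leaving unpredicted pixels marked as invalid (the white regions mentioned in Section 4.2). The axis conventions, the scatter-style warping and the focal length expressed in pixels are our assumptions.

```python
import numpy as np

def rot_x(a):
    """Rotation of angle a (radians) about the x axis."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(a):
    """Rotation of angle a (radians) about the y axis."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def stabilize(image_t1, verg_prev, tilt_prev, verg_now, tilt_now, f_px):
    """Predict IMAGE_EST from IMAGE_T1 and encoder readings, following eqs. (5)-(7).

    Angles are in radians; f_px is the focal length in pixels (assumed calibration).
    Returns (image_est, valid_mask).
    """
    D = rot_x(verg_now) @ rot_y(tilt_now - tilt_prev) @ rot_x(-verg_prev)  # eq. (6)
    h, w = image_t1.shape
    est = np.zeros_like(image_t1)
    valid = np.zeros((h, w), dtype=bool)
    ci, cj = (h - 1) / 2.0, (w - 1) / 2.0          # principal point assumed at image center
    jj, ii = np.meshgrid(np.arange(w), np.arange(h))
    x, y = jj - cj, ii - ci                        # image-plane coordinates at tk-1
    den = D[2, 0] * x + D[2, 1] * y + D[2, 2] * f_px
    xn = f_px * (D[0, 0] * x + D[0, 1] * y + D[0, 2] * f_px) / den         # eq. (7)
    yn = f_px * (D[1, 0] * x + D[1, 1] * y + D[1, 2] * f_px) / den
    jn = np.round(xn + cj).astype(int)
    in_ = np.round(yn + ci).astype(int)
    ok = (in_ >= 0) & (in_ < h) & (jn >= 0) & (jn < w)
    est[in_[ok], jn[ok]] = image_t1[ii[ok], jj[ok]]   # forward-map (scatter) each pixel
    valid[in_[ok], jn[ok]] = True                     # pixels never written stay invalid
    return est, valid
```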

3.2.3 Morphological Operations: Erosion and Dilation

Consider the following functions:

  • I(x,y), the gray-level image, defined over the image domain;

  • M(x,y), the structuring element (mask), defined over its support MD.

The gray-level erosion and dilation of I(x,y) by M(x,y), denoted (I ero M)(x,y) and (I dil M)(x,y), respectively, can be defined as:

(I ero M)(x,y) = min(s,t)∈MD { I(x+s, y+t) − M(s,t) }

(I dil M)(x,y) = max(s,t)∈MD { I(x−s, y−t) + M(s,t) }

The tracking algorithm uses a gray-level morphological opening (gray-level erosion followed by gray-level dilation) to avoid the erroneous detection of moving edges due to errors in image stabilization. M(x,y) is equal to zero for any (x,y) belonging to MD. The objective of the opening is to eliminate from IMAGE_DIF pixels that are associated with stationary objects. To reduce the computational load inherent in the implementation of morphological operations, the opening has been applied only to pixels of IMAGE_DIF with intensities larger than L1 (see Figure 5). The performance of the opening operation depends critically on the shape and size of the structuring element M(x,y) used in the morphological opening. Experiments have been conducted with many different shapes and sizes of masks, leading to an adequate trade-off between performance and computational load.
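A direct, unoptimized sketch of the gray-level erosion, dilation and opening defined above is given below; with the flat (zero-valued) structuring element used in the paper, it reduces to a sliding-window minimum followed by a maximum. The 3 x 3 mask shape and the loop-based implementation are illustrative only.

```python
import numpy as np

def grey_erode(img, mask):
    """Gray-level erosion: (I ero M)(x,y) = min over MD of I(x+s, y+t) - M(s,t)."""
    mh, mw = mask.shape
    oh, ow = mh // 2, mw // 2
    padded = np.pad(img.astype(int), ((oh, oh), (ow, ow)), mode="edge")
    out = np.empty(img.shape, dtype=int)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.min(padded[i:i + mh, j:j + mw] - mask)
    return out

def grey_dilate(img, mask):
    """Gray-level dilation: (I dil M)(x,y) = max over MD of I(x-s, y-t) + M(s,t)."""
    mh, mw = mask.shape
    oh, ow = mh // 2, mw // 2
    padded = np.pad(img.astype(int), ((oh, oh), (ow, ow)), mode="edge")
    out = np.empty(img.shape, dtype=int)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.max(padded[i:i + mh, j:j + mw] + mask[::-1, ::-1])
    return out

def grey_open(img, mask=np.zeros((3, 3), dtype=int)):
    """Opening = erosion followed by dilation; removes small bright spurious patches."""
    return grey_dilate(grey_erode(img, mask), mask)
```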

3.2.4 Computing the gradient image

The gradient image has been computed with the Prewitt operator. The first experiments with the tracking algorithm employed image blurring as a preliminary stage, either by averaging or by smoothing with a Gaussian filter within a neighborhood, in order to reduce the effect of image noise. However, the final version of the algorithm dispensed with this smoothing, as no significant performance improvement had been observed.
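For reference, the sketch below computes the Prewitt gradient magnitude with SciPy, without any prior smoothing, matching the final version described above; the threshold value shown in the usage comment is illustrative.

```python
import numpy as np
from scipy import ndimage

def prewitt_magnitude(img):
    """Gradient magnitude of img using the Prewitt operator (no prior smoothing)."""
    f = img.astype(float)
    gx = ndimage.prewitt(f, axis=1)   # horizontal derivative
    gy = ndimage.prewitt(f, axis=0)   # vertical derivative
    return np.hypot(gx, gy)

# Example usage with an illustrative edge threshold:
# image_grad = prewitt_magnitude(image_t) > 60
```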

4 Experimental results

Some representative results of our experiments using the algorithms for monocular tracking are described in Sections 4.1 and 4.2. All experiments have been conducted with a Pacific Corporation VPC-500 monochromatic camera with a 1/3" CCD sensor array of 510 horizontal by 492 vertical elements, equipped with a 3.7 mm focal length lens providing a horizontal field of view of approximately 90°. The experiments were carried out in the computer vision laboratory with the moving object at a distance of approximately 2 m from the camera. In the particular case of approach 1, controlled setup conditions in terms of lighting and background were arranged with the purpose of providing an adequate contrast between object and background.

4.1 Experimental results - approach 1

An example of the results obtained using approach 1 can be seen in Figure 6, which shows the qualitative performance of the fixation behavior as the moving subject tries to evade the visual field of the camera. Each frame shown in the figure corresponds to the situation just after a saccade has been completed. The cycle comprising the control actions and the vision algorithm resulted in the system operating at a frequency of approximately 1 Hz. A limit cycle was observed while the object remained practically static. Even though the computed motion centroid is not displayed, the target is clearly seen to be kept within the field of view. Nevertheless, it should be noted that under certain illumination conditions the system ended up fixating spurious objects entering the background as the camera moved; such objects were seen to produce gray levels similar to that of the moving subject. The experiments confirmed that robust gray-level segmentation is required for successful fixation in realistic conditions.

4.2 Experimental results - approach 2

The implementation of the second tracking algorithm has been carried out on 80 × 60 sub-sampled images. This enabled operation at a frequency of approximately 1.4 Hz while practically maintaining the same performance as that obtained with the original resolution. Figure 7 shows the result of a typical iteration of the tracking algorithm. IMAGE_T is the image acquired at the present time (tk), IMAGE_T1 is the image acquired at the previous time (tk-1), and IMAGE_EST is an estimate of how IMAGE_T would look if all the objects in the environment were stationary (see Section 3.2.2). The white regions in IMAGE_EST correspond to regions of IMAGE_T that could not be estimated using IMAGE_T1 and the encoder measurements. These regions appear either due to digitizing errors in the calculation of IMAGE_EST or due to the change of view between tk-1 and tk caused by camera motion. IMAGE_DIF is the difference between IMAGE_T and IMAGE_EST, IMAGE_EROS is the result of the erosion of IMAGE_DIF, and IMAGE_DIL is the result of the dilation of IMAGE_EROS. The saturated white regions in IMAGE_DIF, IMAGE_EROS and IMAGE_DIL correspond to regions in which these images are not defined. Pixels in IMAGE_DIL with gray level larger than L1 appear black in panel A and are identified as belonging to those regions in IMAGE_T and IMAGE_T1 associated with a moving object. Panel B depicts the thresholded gradient of IMAGE_T, IMAGE_GRAD. Black pixels belonging to both panels A and B are labeled by the tracking algorithm as being associated with a moving object. These pixels are superimposed in black on the last picture. The white cross corresponds to the computed motion centroid. This centroid is used by the tracking algorithm (see Section 3.2.1) to control the movements of the vision head.

Figure 6 -
Monocular fixation - approach 1.

Figure 8 is illustrative of the typical performance of the algorithm. The experiments showed that performance can deteriorate considerably depending on the texture of the image background: when the morphological opening is not able to eliminate all spurious regions, tracking degrades. Despite the problems discussed so far, the tracking algorithm was very robust and failed completely only in a few extreme situations in which the background presented a strong texture.


Figure 7 - Monocular fixation - approach 2.

5 Conclusions and future work

A comparison between the two approaches revealed the clear superiority of approach 2, which presents increased robustness and advantages such as: 1) the image of the moving object does not have to present a gray-level range completely different from that of the background, and 2) unlike in approach 1, the speed of the moving object does not have to be limited so that two consecutive projections of the moving object on the image plane overlap. Approach 2 enabled the vision head to follow a moving target even when a great variation in background scenery occurred, as shown in Figure 8. The fuzzy approach to head motion control was employed with both methods and compared equally to the heuristics briefly described in Section 3.2.1, which were used solely in method 2. The main differences found so far between fixation algorithms for saccade control were dictated essentially by the visual processing.

The experimental results allow us to conclude that image stabilization and gray-level morphological opening as proposed in [4] greatly improve the robustness to background texture and lighting conditions of a motion segmentation-based fixation algorithm when compared to plain gray level-based segmentation, which does not consider available information about camera motion. Present work is focusing on integrating smooth pursuit with saccadic motion in order to explore the full potential of the vision head. Future work shall concentrate on stereo fixation and vergence control for centering the target on both cameras. The results obtained so far encourage further studies on vision-aided robotic rotorcraft navigation. Efforts are now being made to set up additional experiments with motion tracking by arranging a mechanical platform and installing a microcamera and video link in a radio-controlled rotorcraft. Vision algorithms developed in the laboratory will be useful for evaluating the feasibility of controlling helicopter motion, subject to the constraints of the platform linkages, with data provided by onboard inertial sensors as well as information extracted from the acquired images.

Figure 8
- Tracking a target subject to large variations in background texture

6 References

  • [1] C.Fermüller and Y.Aloimonos. Vision and Action. Image and Vision Computing, 13(10):725-744, 1995.
  • [2] D.H.Ballard. Animate Vision. Journal of Artificial Intelligence, 48:57-86, 1991.
  • [3] D.Marr. Vision. W.H.Freeman, 1981.
  • [4] D.Murray and A.Basu. Motion Tracking with an Active Camera. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(5):449-459, 1994.
  • [5] D.W.Murray, P.F.McLaughlin, I.D.Read, and P.M.Sharkey. Reactions to Peripheral Image Motion Using a Head/Eye Platform. In Proc. of 4th International Conference on Computer Vision, IEEE Computer Society Press, Berlin, Germany, pages 403-411, 1993.
  • [6] J.J.Gibson. The Perception of The Visual World, Houghton Mifflin, Boston, 1950.
  • [7] J.Waldmann and S. Merhav. Fusion of Depth Estimates from Instantaneous Stereo and Recursive Motion for 3-D Reconstruction. In Proc. of 11th International Congress on Pattern Recognition, 1:5-8, The Hague, Netherlands, 1992.
  • [8] K.Pahlavan, T.Uhlin and J.-O.Eklundh. Active Vision as a Methodology. In Active Vision (Y.Aloimonos, Ed.), Advances in Computer Science Series, 19-46, Lawrence Erlbaum Associates, 1993.
  • [9] K.Pahlavan, T.Uhlin, and J.-O.Eklundh. Dynamic Fixation. In Proc. of 4th International Conference on Computer Vision, IEEE Computer Society Press, Berlin, Germany, pages 412-419, 1993.
  • [10] N.M.F.de Oliveira and J.Waldmann. Experimental Sensitivity Analysis of Image Matching by Relaxation Labelling. In Proc. of II Workshop on Cybernetic Vision, São Carlos, Brazil, IEEE Computer Society, pages 213-218, 1996.
  • [11] R.Bajcsy. Active Perception vs. Passive Perception. In Proc. of The Third IEEE Workshop on Computer Vision, pages 55-59, 1985.
  • [12] R.H.S.Carpenter. Movements of the Eyes. Pion Press, London, 1988.
  • [13] T.Poggio, V.Torre and C.Koch. Computational Vision and Regularization Theory. Nature, 317:314-319, 1985.
  • [14] T.Uhlin. Fixation and Seeing Systems. PhD Thesis, Kungl Tekniska Högskolan (Royal Institute of Technology), TRITA-NA-P96/10, Sweden, 1996.
  • [15] V.H.L.Cheng. Obstacle-Avoidance Automatic Guidance: A Concept-Development Study. In Proc. of AIAA Conference on Guidance, Navigation and Control, Minneapolis, pages 1142-1151, 1988.
  • [16] Y.Aloimonos, I.Weiss and A.Bandyopadhyay. Active Vision. In Proc. of The First International Conference on Computer Vision, pages 35-54, 1987.
