5 Implementation and Experiments
The system was implemented on a Sun SPARCstation 10 with a Data Cell
S2200 frame grabber. The manipulator is a Scorbot ER-7 robot arm, which
has 5 degrees of freedom and a parallel-jawed gripper.
The robot has its own 68000-based controller which implements
the low-level control loop and provides a Cartesian kinematic model.
Images are obtained from two inexpensive CCD cameras placed 1m-3m
from the robot's workspace. The angle between the cameras is in the range of
15-30 degrees (figure 4).
Figure 4: The experimental setup. Uncalibrated stereo cameras viewing a robot gripper and target object.
When the system is started up, it begins by opening and closing the jaws of
the gripper. By observing the image difference, it is able to locate the
gripper and set up a pair of affine trackers as instances of a hand-made 2D
template. The trackers will then follow the
gripper's movements continuously. Stereo tracking runs on the
Sun at over 10 Hz.
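The motion-differencing step can be sketched as follows. This is a minimal illustration, not the original implementation: the function name, threshold value and synthetic frames are all assumptions.

```python
import numpy as np

def locate_moving_region(frame_a, frame_b, threshold=25):
    """Locate a moving object (here, the opening/closing gripper jaws)
    as the centroid of pixels that change between two frames.
    Threshold and names are illustrative assumptions."""
    diff = np.abs(frame_a.astype(int) - frame_b.astype(int))
    mask = diff > threshold            # pixels that changed significantly
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return None                    # no motion detected
    return xs.mean(), ys.mean()        # centroid of the changed region

# Synthetic example: a bright patch appears between the two frames.
a = np.zeros((100, 100), dtype=np.uint8)
b = a.copy()
b[40:60, 40:60] = 200
print(locate_moving_region(a, b))      # centroid near (49.5, 49.5)
```

In the real system this centroid would seed the placement of the affine trackers on the gripper.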
The robot moves to four preset points to calibrate the system in
terms of the controller's coordinate space.
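Under affine stereo the concatenated image coordinates of a point (u, v in each camera) are an affine function of its robot coordinates, so four non-coplanar preset points give exactly enough equations to fix the map. A minimal sketch, assuming a least-squares formulation; the function names are illustrative, not from the original system:

```python
import numpy as np

def calibrate_affine_stereo(robot_pts, image_pts):
    """Fit the affine projection q = P x + p0 mapping robot coordinates
    x (3-vectors) to concatenated stereo image coordinates q (4-vectors).
    Four non-coplanar points give exactly 16 equations for the 16
    unknowns; lstsq also accepts more points."""
    X = np.hstack([robot_pts, np.ones((len(robot_pts), 1))])   # N x 4
    M, *_ = np.linalg.lstsq(X, image_pts, rcond=None)          # 4 x 4
    return M[:3].T, M[3]                                       # P (4x3), p0 (4,)

def image_to_robot(P, p0, q):
    """Invert the affine map by least squares: recover robot coordinates
    from a 4-vector of stereo image measurements."""
    x, *_ = np.linalg.lstsq(P, q - p0, rcond=None)
    return x

# Synthetic check with a made-up affine camera pair:
P_true = np.array([[1., 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]])
p0_true = np.array([0.5, -0.5, 1.0, 2.0])
robot = np.array([[0., 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]])
P, p0 = calibrate_affine_stereo(robot, robot @ P_true.T + p0_true)
print(image_to_robot(P, p0, P_true @ np.array([0.3, -0.2, 0.7]) + p0_true))
# recovers approximately [0.3, -0.2, 0.7]
```

Because the map is fitted in the controller's own coordinate space, any linear change to that space is absorbed the next time the four points are visited.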
A target object is found by similar means - observing the image
changes when it is placed in the
manipulator's workspace. Alternatively it may be selected from a monitor screen
using the mouse. There is no pre-defined model of the target shape, so
a pair of `expanding' B-spline snakes 
are used to locate the contours delimiting the target surface in each of the
images. The snakes are then converted to a pair of affine trackers.
The target surface is then tracked, to compensate for unexpected motions of
either the target or the two cameras.
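The contour-finding step can be illustrated with a much-simplified stand-in for the expanding snake: rays cast outward from a seed point at the object centre stop at the first strong intensity change. This radial search is only a sketch; the original system fits B-spline snakes, and the parameter values here are assumptions.

```python
import numpy as np

def expand_contour(image, centre, n_rays=16, edge_threshold=50, max_radius=200):
    """Crude 'expanding' contour search: from a seed point, step outward
    along each of n_rays directions until a strong intensity change marks
    the object boundary. Returns a list of (y, x) boundary samples."""
    cy, cx = centre
    contour = []
    for theta in np.linspace(0, 2 * np.pi, n_rays, endpoint=False):
        dy, dx = np.sin(theta), np.cos(theta)
        prev = image[cy, cx]
        for r in range(1, max_radius):
            y = int(round(cy + r * dy))
            x = int(round(cx + r * dx))
            if not (0 <= y < image.shape[0] and 0 <= x < image.shape[1]):
                break                              # ray left the image
            if abs(int(image[y, x]) - int(prev)) > edge_threshold:
                contour.append((y, x))             # boundary sample found
                break
            prev = image[y, x]
    return contour
```

A real snake would then be fitted through these samples and tracked from frame to frame.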
By introducing modifications and offsets to the feedback mechanism
(which would otherwise
try to superimpose the gripper and the target), two `behaviours' have
been implemented.
The tracking behaviour causes the gripper to follow the target continuously,
hovering a few centimetres above it (figure 5).
The grasping behaviour causes the gripper
to approach the
target from above (to avoid collisions) with the gripper turned through an
angle of 90 degrees, to grasp it normal to its visible surface.
Figure 5: The robot tracking its quarry, guided by the position and
orientation of the target contour (view through the left camera). On the
target surface is an affine snake: an affine tracker obtained by
`expanding' a B-spline snake from the centre of the object. A slight
offset has been introduced into the control loop to cause the gripper to
hover above it. Last frame: one of the cameras has been rotated and
zoomed, but the system continues to operate successfully with visual
feedback.
Affine stereo and visual feedback used to grasp a planar object.
Without feedback control,
the robot locates its target only approximately (typically to within
5cm in a 50cm workspace), reflecting the approximate nature of
affine stereo and of calibration from only four points. With a feedback
gain of 0.75 the gripper converges on its target in
three or four control iterations. If the system is not disturbed,
the gripper takes a straight-line path.
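The convergence rate can be illustrated with a purely proportional model of the feedback loop, in which each iteration removes a fraction `gain` of the remaining positioning error. This is a simplification: real convergence also depends on tracking noise and kinematic error, and the tolerance value below is an assumption.

```python
def iterations_to_converge(initial_error_cm, gain, tolerance_cm=0.5):
    """Under a purely proportional model, the residual error after each
    control iteration shrinks geometrically as (1 - gain)^n. Count the
    iterations needed to fall below the tolerance."""
    error, n = initial_error_cm, 0
    while error > tolerance_cm:
        error *= (1.0 - gain)   # fraction of error left after one correction
        n += 1
    return n

# A 5cm initial error (the open-loop accuracy quoted above) with gain
# 0.75 falls below 0.5cm in two iterations of this idealised model;
# the three or four iterations observed in practice reflect the
# unmodelled noise and distortion.
print(iterations_to_converge(5.0, 0.75))
```

The same model shows why a lower gain (used when perspective effects are strong) costs extra iterations: halving the gain roughly doubles the number of corrections needed.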
The system has so far demonstrated its robustness by continuing to
track and grasp objects despite:
- Linear offsets or scalings of the controller's coordinate system,
which are absorbed by the self-calibration process with complete
transparency.
- Slight nonlinear distortions to the kinematics, which are corrected
for by the visual feedback loop, though large errors introduce a risk
of ringing and instability unless the gain is reduced.
- Small translations (e.g. 20cm), rotations (e.g. 30 degrees) and
zooms (e.g. 200% change in focal length) of the cameras, even after
the system has self-calibrated.
Large disturbances to camera geometry cause the gripper to take a
curved path towards the target, and require more control iterations to
converge.
The condition of weak perspective throughout the robot's workspace does
not seem to be essential for image-based control, and the system can
function with the cameras as close as 1.5 metres (the robot's reach is
a little under 1 metre). However, the feedback gain must then be
reduced, or the system will overshoot on motions towards the cameras.
Figure 5 shows four frames from a tracking sequence (all
taken through the same camera). The cameras are about two metres from
the workspace. Tracking of position and orientation is maintained even
when one of the cameras is rotated about its optical axis and zoomed
(figure 5, bottom right).