Omnidirectional Visual Odometry for a Planetary Rover
Peter Corke† and Dennis Strelow† and Sanjiv Singh†
CSIRO ICT Centre, Queensland, Australia
†Robotics Institute, Carnegie Mellon University, Pittsburgh, USA
Abstract—Position estimation for planetary rovers has typically been limited to odometry based on proprioceptive measurements such as the integration of distance traveled and measurement of heading change. Here we present and compare two methods of online visual odometry suited for planetary rovers. Both methods use omnidirectional imagery to estimate the motion of the rover. One method is based on robust estimation of optical flow and subsequent integration of the flow. The second method is a full structure-from-motion solution. To make the comparison meaningful we use the same set of raw corresponding visual features for each method. The dataset is a sequence of 2000 images taken during a field experiment in the Atacama Desert, for which high-resolution GPS ground truth is available.
I. INTRODUCTION
Since GPS is not available on Mars, estimating rover motion over long distances by integrating odometry measurements inevitably produces estimates that drift due to odometry measurement noise and wheel slippage. Recent experiments have shown that in planetary analog environments such as the Atacama Desert in Chile, odometric error is approximately 5 percent of distance traveled. Such error can increase further in loose soil because wheels can slip considerably. We would like a method that compensates for such error.
Much has been written in the biological literature about estimation of motion using sequences of visual images, and several research efforts have attempted to use these concepts (see, for example, [1]). This type of work seeks inspiration from the many ways in which insects use cues derived from optical flow for navigational purposes, such as safe landing, obstacle avoidance, and dead reckoning. We have similar motivations but seek analytical methods with high accuracy.
Motion estimation from imagery taken by onboard cameras has the potential to greatly increase the accuracy of rover motion estimation, because images and the rover's motion can be used together to establish the three-dimensional positions of environmental features relative to the rover, and because the rover's position can in turn be estimated with respect to these external landmarks over the subsequent motion. While visual odometry has its own drift due to discretization and mistracking of visual features, the advantage is that it is not correlated with the errors associated with wheel- and gyro-based odometry.
Fig. 1: Hyperion is a solar-powered robot developed at Carnegie Mellon University intended for autonomous navigation in planetary analog environments.
Relative to conventional cameras, omnidirectional (panospheric) cameras trade resolution for an increased field of view. In our experience this tradeoff is beneficial for motion estimation, and as others have shown, estimating camera motion from omnidirectional images does not suffer from some ambiguities that conventional image motion estimation suffers from [2]. This is primarily because in an environment with sufficient optical texture, motion in any direction produces good optical flow. This is in contrast to conventional cameras, which must be pointed orthogonal to the direction of motion. In addition, as the camera moves through the environment, environmental features whose three-dimensional positions have been established are retained longer in the wide field of view of an omnidirectional camera than in a conventional camera's field of view, providing a stronger reference for motion estimation over long intervals.
Our approach uses a single camera rather than stereo cameras for motion estimation. An advantage of this method is that the range of external points whose three-dimensional positions can be established is larger than the range of external points whose three-dimensional positions can be established by stereo cameras, since the baseline over which points can be estimated in the former method can be much larger. A strategy that we have not investigated, but that has been examined by other researchers (for example, [3]), integrates both stereo pairs and feature tracking over time, and this is a promising approach for the future.

This paper compares two methods of online (not batch) visual odometry. The first method is based on robust optical flow from salient visual features tracked between pairs of images. The terrain around the robot is approximated as a plane and a displacement is computed for each frame in the sequence using an optimization method that also computes camera intrinsics and extrinsics at every step. Motion estimation is done by integrating the three-DOF displacement found at each step. The second method, implemented as an iterated extended Kalman filter, estimates both the motion of the camera as well as the three-dimensional locations of visual features in the environment. This method makes no assumption about the planarity of visual features and tracks the features over many successive images. In this case the six-DOF pose of the camera as well as the three-DOF positions of the feature points are extracted. We report comparisons of visual odometry generated by the two methods on a sequence of 2000 images taken during a desert traverse. To make the comparison meaningful we use the same set of raw corresponding visual features for each method.
II. EXPERIMENTAL PLATFORM
A. Hyperion
Carnegie Mellon's Hyperion, shown in Figure 1, is a solar-powered rover testbed for the development of science and autonomy techniques suitable for large-scale explorations of life in planetary analogs such as the Atacama Desert in Chile. Hyperion's measurement and exploration technique combines long traverses, sampling measurements on a regional scale, and detailed measurements of individual targets. Because Hyperion seeks to emulate the long communication delays between Earth and Mars, it must be able to perform much of this exploration autonomously, including the estimation of the rover's position without GPS. Having been demonstrated to autonomously navigate over extended periods of time in the Arctic Circle during the summer of 2001, Hyperion was recently used in field tests in Chile's Atacama Desert on April 5-28, 2003.

B. Omnidirectional camera
Recent omnidirectional camera designs combine a conventional camera with a convex mirror that greatly expands the camera's field of view, typically to 360 degrees in azimuth and 90-140 degrees in elevation. On five days during Hyperion's field test, the rover carried an omnidirectional camera developed at Carnegie Mellon and logged high-resolution color images from the camera for visualization and for motion estimation experiments. This camera is shown in Figure 3.
Fig. 2: An example omnidirectional image from our sequence, taken by Hyperion in the Atacama Desert.
Fig. 3: The omnidirectional camera used in our experiments. The mirror used has a profile that produces equi-angular resolution; that is, each pixel in the radial direction has exactly the same vertical field of view. This camera was designed and fabricated at Carnegie Mellon University.
An example image taken from the omnidirectional camera while mounted on Hyperion is shown in Figure 2. The dark circle in the center of the image is the center of the mirror, while the rover's solar panel is visible in the bottom central part of the image. The ragged border around the outside of the image is an ad-hoc iris constructed in the field to prevent the sun from being captured in the images and saturating them.
The camera design is described by [4] and is summarized in Figure 5. The mirror has the property that the angle of the outgoing ray from vertical is proportional to the angle of the ray from the camera to the mirror; see Figure 5.
Fig. 4: Typical feature flow between consecutive frames.
The scale factor α is the elevation gain and is approximately 10.
III. FEATURE TRACKING
In this work we have investigated two approaches to feature tracking. The first is based on independently extracting salient features in image pairs and using correlation to establish correspondence. The search problem can be greatly reduced by using first-order image-plane feature motion prediction and constraints on possible inter-frame motion. This strategy involves no history or long-term feature tracking. An example of this approach is [5], which uses the Harris corner extractor to identify salient feature points followed by zero-mean normalized cross correlation to establish correspondence.
An alternate strategy is to extract features in one image, and then use some variant of correlation to find each feature's position in the second image. Using this approach, the feature's location can be identified not only in the second image, but in every subsequent image where it is visible, and this advantage can be exploited to improve the estimates of both the point's position and the rover's motion by algorithms such as the online shape-from-motion algorithm described in Section V.
One method for finding the best correlated feature location in the second and subsequent images is Lucas-Kanade [6], which uses Gauss-Newton minimization to minimize the sum of squared intensity errors between the intensity mask of the feature being tracked and the intensities visible in the current image, with respect to the location of the feature in the new image. Coupled with bilinear interpolation for computing intensities at non-integer locations in the current image, Lucas-Kanade is capable of tracking features to subpixel resolution, and one-tenth of a pixel is an accuracy that is commonly cited. One method for extracting features suitable for tracking with Lucas-Kanade chooses features in the original image that provide the best conditioning for the system that Lucas-Kanade's Gauss-Newton minimization solves on each iteration. We have used this method in our experiment, but in practice any sufficiently textured region of the image can be tracked using Lucas-Kanade.

Fig. 5: Panoramic camera notation.
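As an illustration of this tracking pipeline (not the authors' own implementation), the sketch below uses OpenCV's Shi-Tomasi corner selection, which scores exactly the conditioning of the Lucas-Kanade normal equations, together with pyramidal Lucas-Kanade tracking. The frame format (grayscale), feature count, and window size are assumptions.

```python
import cv2
import numpy as np

def track_sequence(frames, max_corners=200):
    """Track well-conditioned corners through a grayscale image sequence with
    pyramidal Lucas-Kanade, keeping per-feature ids so that correspondences
    over many frames can be reused later."""
    pts = cv2.goodFeaturesToTrack(frames[0], maxCorners=max_corners,
                                  qualityLevel=0.01, minDistance=10)
    ids = np.arange(len(pts))
    tracks = {i: [(0, tuple(p.ravel()))] for i, p in zip(ids, pts)}
    prev = frames[0]
    for k, gray in enumerate(frames[1:], start=1):
        if len(pts) == 0:                          # all features lost
            break
        nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev, gray, pts, None,
                                                  winSize=(15, 15), maxLevel=3)
        keep = status.ravel() == 1                 # drop features that were lost
        pts, ids, prev = nxt[keep], ids[keep], gray
        for i, p in zip(ids, pts):
            tracks[i].append((k, tuple(p.ravel())))
    return tracks
```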
In this paper we have adopted the second of these paradigms, and used Lucas-Kanade to track features through the image sequence as long as they are visible. Although only pairwise correspondences are required by the robust optical flow approach described in Section IV, correspondences through multiple images are required for the online shape-from-motion approach that we describe in Section V. So, adopting this approach allows us to perform a meaningful comparison between the two methods using the same tracking data. A typical inter-frame feature flow pattern is shown in Figure 4.
IV. ROBUST OPTICAL FLOW METHOD
A. Algorithm
For each visual feature, (u, v), we can compute a ray in space as shown in Figure 5. From similar triangles we can write $\tan\theta = u/f$. We will approximate the origin to be at the center of the mirror, so an arbitrary ray can be written in parametric form as

$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \lambda \begin{bmatrix} a \\ b \\ 1 \end{bmatrix}$$

where $a = \tan(\alpha\tan^{-1}(u/f))\cos\beta$ and $b = \tan(\alpha\tan^{-1}(v/f))\sin\beta$. We will further assume that the ground is an arbitrary plane $Ax + By + Cz = 1$, which the ray intersects at the point on the line given by

$$\lambda = \frac{1}{\begin{bmatrix} A & B & C \end{bmatrix}\begin{bmatrix} a \\ b \\ 1 \end{bmatrix}}$$
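A minimal sketch of this image-to-ground projection follows. It assumes that (u, v) are measured relative to the principal point, that β is the feature's azimuth atan2(v, u), and that the flat-ground plane is expressed through the known camera height h; none of these conventions are stated explicitly above, so they are assumptions of the sketch.

```python
import numpy as np

def image_to_ground(u, v, f, alpha, plane):
    """Intersect the ray through image point (u, v) with the ground plane
    Ax + By + Cz = 1, following the equations above.  (u, v) are assumed to be
    measured relative to the principal point, and beta is assumed to be the
    feature's azimuth atan2(v, u)."""
    A, B, C = plane
    beta = np.arctan2(v, u)                                   # assumed azimuth
    a = np.tan(alpha * np.arctan(u / f)) * np.cos(beta)
    b = np.tan(alpha * np.arctan(v / f)) * np.sin(beta)
    lam = 1.0 / (A * a + B * b + C)                           # ray substituted into plane equation
    return lam * np.array([a, b, 1.0])                        # ground point (x, y, z)

# Example: with the z-axis pointing down toward the ground, flat ground at the
# known camera height h is the plane z = h, i.e. (1/h) z = 1.
h = 1.2                                                       # assumed camera height in metres
x, y, z = image_to_ground(u=35.0, v=-20.0, f=999.0, alpha=10.0, plane=(0.0, 0.0, 1.0 / h))
```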
Thus an image-plane point (u, v) is projected onto the ground plane at (x, y). If the robot moves by (Δx, Δy, Δθ) that point becomes (x′, y′), which can be mapped back to the image plane as (u′, v′), from which we can compute the image-plane displacement or optical flow

$$(\hat{du}, \hat{dv}) = P(u, v, u_0, v_0, f, \alpha, \Delta x, \Delta y, \Delta\theta) \qquad (1)$$

which is a function of the feature coordinate, the camera intrinsic parameters (principal point (u0, v0), focal length f, and elevation gain α) and the vehicle motion. We assume that the camera height, h, is known. Our observation is the displacement at a number of image coordinates, and our cost function is based on the median of the error norm between the estimated and observed displacements,

$$e_1 = \operatorname{med}_i \left\| (\hat{du}_i, \hat{dv}_i) - (du_i, dv_i) \right\|$$
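The sketch below shows how a single frame-to-frame motion could be recovered from this robust cost. Here `predict_flow` stands for the mapping P of equation (1) (it would be built from the ground-plane projection sketched above), the intrinsics are held fixed for brevity even though the paper estimates them simultaneously, and Nelder-Mead is an assumed optimizer since the text does not name one.

```python
import numpy as np
from scipy.optimize import minimize

def median_flow_error(motion, pts_uv, observed_duv, predict_flow):
    """The robust cost e1: median over features of the norm of the difference
    between predicted and observed image-plane displacements."""
    dx, dy, dtheta = motion
    predicted = np.array([predict_flow(u, v, dx, dy, dtheta) for u, v in pts_uv])
    return np.median(np.linalg.norm(predicted - observed_duv, axis=1))

def estimate_frame_motion(pts_uv, observed_duv, predict_flow, guess=(0.0, 0.0, 0.0)):
    # Nelder-Mead is one reasonable choice for this non-smooth, median-based cost.
    result = minimize(median_flow_error, np.asarray(guess),
                      args=(pts_uv, observed_duv, predict_flow),
                      method="Nelder-Mead")
    return result.x                                  # (dx, dy, dtheta) for this frame pair
```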
Parameter   Value   Units
u0          247
v0          199
α
f           999
Fig. 6: Results of simultaneous fit to motion (top) and camera intrinsics (bottom). dx, dy, and dθ are the estimated incremental motion between frames. u0, v0 are the coordinates of the principal point, α the mirror's elevation gain, and f the focal length.
V. ONLINE SHAPE-FROM-MOTION METHOD

The re-projection of point j is:

$$x_j = \Pi\left( R(\rho)^T (X_j - t) \right) \qquad (2)$$
Here, ρ and t are the camera-to-world rotation Euler angles and translation of the camera, R(ρ) is the rotation matrix described by ρ, and X_j is the three-dimensional world coordinate system position of point j, so that R(ρ)^T (X_j - t) is the camera coordinate system location of point j. Π is the omnidirectional projection model that computes the image location of the camera coordinate system point. This measurement equation is nonlinear in the estimated parameters, which motivates our use of the iterated extended Kalman filter rather than the standard Kalman filter, which assumes that observations are a linear function of the estimated parameters corrupted by Gaussian noise. We typically assume that the Gaussian errors in the observed feature locations are isotropic with variance (2.0 pixels)² in both the image x and y directions.

Fig. 7: Comparison of path from integrated velocity (solid) with ground truth from GPS (dashed). Top is all 2000 frames, bottom is the region around the starting point.
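A minimal sketch of this measurement model follows. The Euler-angle convention and the exact form of the equi-angular projection Π are not specified in the surviving text, so the versions below (Z-Y-X Euler angles and a projection derived from the mirror description in Section IV, with the intrinsic values from the table above) are assumptions rather than the noncentral model of [7].

```python
import numpy as np

def rotation_from_euler(rho):
    """R(rho) from Z-Y-X Euler angles (yaw, pitch, roll); the paper's exact
    Euler-angle convention is not stated, so this is an assumed one."""
    yaw, pitch, roll = rho
    cz, sz = np.cos(yaw), np.sin(yaw)
    cy, sy = np.cos(pitch), np.sin(pitch)
    cx, sx = np.cos(roll), np.sin(roll)
    Rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    Ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    return Rz @ Ry @ Rx

def project_equiangular(p_cam, u0=247.0, v0=199.0, f=999.0, alpha=10.0):
    """An assumed equi-angular projection Pi: camera-frame point -> pixel,
    built from the mirror model of Section IV (psi = alpha * theta)."""
    X, Y, Z = p_cam
    psi = np.arctan2(np.hypot(X, Y), Z)          # angle of the ray from vertical
    r = f * np.tan(psi / alpha)                  # radial image distance
    beta = np.arctan2(Y, X)                      # azimuth
    return np.array([u0 + r * np.cos(beta), v0 + r * np.sin(beta)])

def reproject(rho, t, X_world):
    """Measurement equation (2): x_j = Pi(R(rho)^T (X_j - t))."""
    p_cam = rotation_from_euler(rho).T @ (np.asarray(X_world) - np.asarray(t))
    return project_equiangular(p_cam)
```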
As described, the filter is susceptible to gross errors in the two-dimensional tracking. To improve performance in the face of mis-tracking, we discard the point with the highest residual after the measurement step if that residual is over some threshold. The measurement step is then re-executed from the propagation step estimate, and this process is repeated until no points have a residual greater than the threshold. We have found this to be an effective method for identifying points that are mis-tracked, become occluded, or are on independently moving objects in the scene. We typically choose this threshold to be some fraction or small multiple of the expected observation variances, and in our experience choosing a threshold of less than a pixel generally produces the highest accuracy in the estimated motion. However, this requires a highly accurate camera calibration, and we revisit this point in our experimental results.
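This rejection loop can be sketched as follows; `measurement_update` and `predict_obs` are hypothetical stand-ins for the filter's own measurement update and predicted feature locations, and the one-pixel default threshold reflects the discussion above.

```python
import numpy as np

def robust_measurement_step(state_pred, cov_pred, obs, measurement_update,
                            predict_obs, threshold=1.0):
    """Repeat the filter's measurement step, discarding the single worst
    feature whenever its residual exceeds `threshold` pixels.  The update is
    always re-run from the propagation-step estimate, as described above."""
    active = dict(obs)                                   # feature id -> observed (u, v)
    while active:
        state, cov = measurement_update(state_pred, cov_pred, active)
        residuals = {j: np.linalg.norm(np.asarray(uv) - predict_obs(state, j))
                     for j, uv in active.items()}
        worst = max(residuals, key=residuals.get)
        if residuals[worst] <= threshold:
            return state, cov, active
        del active[worst]                                # mis-tracked, occluded, or moving
    raise RuntimeError("all features rejected")
```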
Fig. 8: The GPS estimates of the (x, y) translation and the estimates from online shape-from-motion are shown as the solid and dotted lines, respectively.

An initial state estimate distribution must be available before online operation can begin. We initialize both the mean and covariance that specify the distribution using a batch algorithm, which simultaneously estimates the six degree of freedom camera positions corresponding to the first several images in the sequence and the three-dimensional positions of the tracked features visible in those images. This estimation is performed by using Levenberg-Marquardt to minimize the re-projection errors for the first several images with respect to the camera positions and point positions, and is described in detail in [7]. We add points that become visible after online operation has begun to the state estimate distribution by adapting the method for incorporating new observations in simultaneous mapping and localization with active range sensors, described by [11], to the case of image data.
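A sketch of this batch initialization is given below, reusing the `reproject` measurement-model sketch from earlier; the state layout, the use of scipy's Levenberg-Marquardt solver, and the initial guess x0 are assumptions of the sketch, with the full procedure described in [7].

```python
import numpy as np
from scipy.optimize import least_squares

def batch_initialize(tracks, n_cams, n_pts, x0):
    """Jointly refine the first n_cams 6-DOF camera poses and n_pts 3-D point
    positions by minimizing re-projection error with Levenberg-Marquardt.
    `tracks` maps (camera index, point index) to an observed pixel (u, v)."""
    def residuals(x):
        poses = x[:6 * n_cams].reshape(n_cams, 6)        # rows: (rho, t)
        points = x[6 * n_cams:].reshape(n_pts, 3)
        res = [reproject(poses[i, :3], poses[i, 3:], points[j]) - uv
               for (i, j), uv in tracks.items()]
        return np.concatenate(res)

    return least_squares(residuals, x0, method="lm").x   # Levenberg-Marquardt
```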
B. Results
As mentioned in previous sections, a potential advantage of online shape-from-motion is that it can exploit features tracked over a large portion of the image stream. In the first 300 images of the omnidirectional image stream from Hyperion, 565 points were tracked, with an average of 92.37 points per image and an average of 49.04 images per point.
The ground truth (i.e., GPS) and estimated (x, y) translations that result from applying the online shape-from-motion to the first 300 images are shown together in Figure 8, as the solid and dotted lines, respectively. Because shape-from-motion only recovers shape and motion up to an unknown scale factor, we have applied the scaled rigid transformation to the recovered estimate that best aligns it with the ground truth values. In these estimates, the average and maximum three-dimensional translation errors over the 300 estimated positions are 22.9 and 72.7 cm, respectively; this average error is less than 1% of the approximately 29.2 m traveled during the traverse. The errors, which are largest at the ends of the sequence, are due primarily to the unknown transformation between the camera and mirror. After image 300 this error increases until the filter fails. We are still investigating the details of this behavior.
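One standard way to compute such a scaled rigid alignment is Umeyama's closed-form method; the paper does not name the algorithm it uses, so the sketch below, with hypothetical trajectory arrays est_xyz and gps_xyz, is only illustrative.

```python
import numpy as np

def align_similarity(est, gt):
    """Closed-form (Umeyama) estimate of the scale c, rotation R, and
    translation t that best align trajectory `est` with `gt` in the
    least-squares sense: gt_i ~ c * R @ est_i + t."""
    est, gt = np.asarray(est, float), np.asarray(gt, float)      # N x 3 arrays
    mu_x, mu_y = est.mean(axis=0), gt.mean(axis=0)
    X, Y = est - mu_x, gt - mu_y
    var_x = (X ** 2).sum() / len(est)
    U, D, Vt = np.linalg.svd(Y.T @ X / len(est))
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                                           # avoid a reflection
    R = U @ S @ Vt
    c = np.trace(np.diag(D) @ S) / var_x
    t = mu_y - c * R @ mu_x
    return c, R, t

# est_xyz, gps_xyz are hypothetical N x 3 arrays of estimated and GPS positions:
# c, R, t = align_similarity(est_xyz, gps_xyz)
# errors = np.linalg.norm(gps_xyz - (c * est_xyz @ R.T + t), axis=1)
# print(errors.mean(), errors.max())
```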
VI. CONCLUSION
In this paper we have compared two approaches to visual odometry from an omnidirectional image sequence. The robust optical flow method is able to estimate camera intrinsic parameters as well as an estimate of vehicle velocity. The shape-from-motion technique produces higher precision estimates of vehicle motion, but at the cost of greater computational expense. Our current experiments also indicate that it is important to have an accurate calibration between the camera and the curved mirror for this technique.
We plan to extend this work to include fisheye lenses, and to incorporate inertial sensor data so as to improve the robustness and reliability of the recovered position.
ACKNOWLEDGMENTS
This work was conducted while the first author was a Visiting Scientist at the Robotics Institute over the period July-October 2003.
REFERENCES
[1] J. S. Chahl and M. V. Srinivasan, “Visual computation of egomotion using an image interpolation technique,” Biological Cybernetics, 1996.
[2] P. Baker, A. S. Ogale, C. Fermüller, and Y. Aloimonos, “New eyes for robotics,” in Proc. Int. Conf. on Intelligent Robots and Systems (IROS), 2003.
[3] C. F. Olson, L. H. Matthies, M. Schoppers, and M. W. Maimone, “Robust stereo ego-motion for long distance navigation,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2000), vol. 2, Hilton Head, South Carolina, June 2000, pp. 453–458.
[4] M. Ollis, H. Herman, and S. Singh, “Analysis and design of panoramic stereo vision using equi-angular pixel cameras,” Carnegie Mellon University, Pittsburgh, Pennsylvania, Tech. Rep. CMU-RI-TR-99-04, January 1999.
[5] P. Corke, “An inertial and visual sensing system for a small autonomous helicopter,” J. Robotic Systems, vol. 21, no. 2, pp. 43–51, Feb. 2004.
[6] B. D. Lucas and T. Kanade, “An iterative image registration technique with an application to stereo vision,” in Seventh International Joint Conference on Artificial Intelligence, vol. 2, Vancouver, Canada, August 1981, pp. 674–679.
[7] D. Strelow, J. Mishler, S. Singh, and H. Herman, “Extending shape-from-motion to noncentral omnidirectional cameras,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2001), Wailea, Hawaii, October 2001.
[8] A. Gelb, Ed., Applied Optimal Estimation. Cambridge, Massachusetts: MIT Press, 1974.
[9] T. J. Broida, S. Chandrashekhar, and R. Chellappa, “Recursive 3-D motion estimation from a monocular image sequence,” IEEE Transactions on Aerospace and Electronic Systems, vol. 26, no. 4, pp. 639–656, July 1990.
[10] A. Azarbayejani and A. P. Pentland, “Recursive estimation of motion, structure, and focal length,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 17, no. 6, pp. 562–575, June 1995.
[11] R. Smith, M. Self, and P. Cheeseman, “Estimating uncertain spatial relationships in robotics,” in Autonomous Robot Vehicles, I. J. Cox and G. T. Wilfong, Eds. New York: Springer-Verlag, 1990, pp. 167–193.