Image based localization in urban environments
2006
Abstract
In this paper we present a prototype system for image-based localization in urban environments. Given a database of views of city street scenes tagged with GPS locations, the system computes the GPS location of a novel query view. We first use a wide-baseline matching technique based on SIFT features to select the closest views in the database. Because of large viewpoint changes and the presence of repetitive structures, a large percentage of the matches (> 50%) are often incorrect correspondences. The subsequent motion estimation between the query view and the reference view is therefore handled by a novel and efficient robust estimation technique capable of dealing with a large percentage of outliers. This stage is accompanied by a model selection step that chooses between the fundamental matrix and the homography. Once the motion between the query view and the closest reference views is estimated, the location of the query view is obtained by triangulation of the translation directions. Approximate solutions for cases in which triangulation cannot be obtained reliably are also described. The system is tested on the dataset used in the ICCV 2005 Computer Vision Contest and is shown to have higher accuracy than previously reported results.
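The triangulation step can be illustrated with a minimal sketch. It is not the paper's implementation, whose details the abstract does not give. The sketch assumes that the GPS positions of the reference views have already been converted to a local planar metric frame, and that each estimated translation direction has been rotated into that same frame so that it points from the reference view toward the query view. The function name `triangulate_2d` and the threshold `min_det` are hypothetical. Each direction is treated as an infinite line, and the returned point minimizes the sum of squared perpendicular distances to those lines.

```python
import math


def triangulate_2d(anchors, directions, min_det=1e-6):
    """Least-squares intersection of 2D bearing lines.

    anchors:    list of (x, y) reference-view positions in a local planar frame
                (assumed to be derived from the GPS tags).
    directions: list of (dx, dy) translation directions, one per anchor,
                expressed in the same frame (assumption of this sketch).
    Returns the estimated (x, y) of the query view, or None when the lines
    are too close to parallel for a reliable intersection.
    """
    a11 = a12 = a22 = 0.0
    b1 = b2 = 0.0
    for (px, py), (dx, dy) in zip(anchors, directions):
        norm = math.hypot(dx, dy)
        if norm == 0.0:
            raise ValueError("direction vectors must be non-zero")
        dx, dy = dx / norm, dy / norm
        # M = I - d d^T projects onto the normal of the bearing line.
        m11 = 1.0 - dx * dx
        m12 = -dx * dy
        m22 = 1.0 - dy * dy
        # Accumulate the normal equations (sum M) x = sum (M p).
        a11 += m11
        a12 += m12
        a22 += m22
        b1 += m11 * px + m12 * py
        b2 += m12 * px + m22 * py
    det = a11 * a22 - a12 * a12
    if det < min_det:
        # Near-parallel bearings: the intersection is ill-conditioned.
        return None
    x = (a22 * b1 - a12 * b2) / det
    y = (a11 * b2 - a12 * b1) / det
    return (x, y)


if __name__ == "__main__":
    # Two reference views 10 m apart, both observing a query view at (5, 5).
    print(triangulate_2d([(0.0, 0.0), (10.0, 0.0)], [(1.0, 1.0), (-1.0, 1.0)]))
```

For two views the determinant equals the squared sine of the angle between the two directions, so the `None` branch marks the near-parallel configurations in which triangulation cannot be obtained reliably. The sketch does not attempt an approximate solution for that case.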