We present a minimal, non-iterative solution to the absolute pose problem for images from rolling... more We present a minimal, non-iterative solution to the absolute pose problem for images from rolling shutter cameras. Absolute pose problem is a key problem in computer vision and rolling shutter is present in a vast majority of today's digital cameras. We propose several rolling shutter camera models and verify their feasibility for a polynomial solver. A solution based on linearized camera model is chosen and verified in several experiments. We use a linear approximation to the camera orientation, which is meaningful only around the identity rotation. We show that the standard P3P algorithm is able to estimate camera orientation within 6 degrees for camera rotation velocity as high as 30deg/frame. Therefore we can use the standard P3P algorithm to estimate camera orientation and to bring the camera rotation matrix close to the identity. Using this solution, camera position, orientation, translational velocity and angular velocity can be computed using six 2D-to-3D correspondences, with orientation error under half a degree and relative position error under 2%. A significant improvement in terms of the number of inliers in RANSAC is demonstrated.
In this paper we aim at reconstructing 3D scenes from images with unknown focal lengths downloade... more In this paper we aim at reconstructing 3D scenes from images with unknown focal lengths downloaded from photosharing websites such as Flickr. First we provide a minimal solution to finding the relative pose between a completely calibrated camera and a camera with an unknown focal length given six point correspondences. We show that this problem has up to nine solutions in general and present two efficient solvers to the problem. They are based on Gröbner basis, resp. on generalized eigenvalues, computation. We demonstrate by experiments with synthetic and real data that both solvers are correct, fast, numerically stable and work well even in some situations when the classical 6-point algorithm fails, e.g. when optical axes of the cameras are parallel or intersecting. Based on this solution we present a new efficient method for large-scale structure from motion from unordered data sets downloaded from the Internet. We show that this method can be effectively used to reconstruct 3D scenes from collection of images with very few (in principle single) images with known focal lengths 1 .
A number of minimal problems of structure from motion for cameras with radial distortion have rec... more A number of minimal problems of structure from motion for cameras with radial distortion have recently been studied and solved in some cases. These problems are known to be numerically very challenging and in several cases there exist no known practical algorithm yielding solutions in floating point arithmetic. We make some crucial observations concerning the floating point implementation of Gröbner basis computations and use these new insights to formulate fast and stable algorithms for two minimal problems with radial distortion previously solved in exact rational arithmetic only: (i) simultaneous estimation of essential matrix and a common radial distortion parameter for two partially calibrated views and six image point correspondences and (ii) estimation of fundamental matrix and two different radial distortion parameters for two uncalibrated views and nine image point correspondences. We demonstrate on simulated and real experiments that these two problems can be efficiently solved in floating point arithmetic. 1
Visual localization enables autonomous vehicles to navigate in their surroundings and augmented r... more Visual localization enables autonomous vehicles to navigate in their surroundings and augmented reality applications to link virtual to real worlds. Practical visual localization approaches need to be robust to a wide variety of viewing condition, including day-night changes, as well as weather and seasonal variations, while providing highly accurate 6 degree-of-freedom (6DOF) camera pose estimates. In this paper, we introduce the first benchmark datasets specifically designed for analyzing the impact of such factors on visual localization. Using carefully created ground truth poses for query images taken under a wide variety of conditions, we evaluate the impact of various factors on 6DOF camera pose estimation accuracy through extensive experiments with state-of-the-art localization approaches. Based on our results, we draw conclusions about the difficulty of different conditions, showing that long-term localization is far from solved, and propose promising avenues for future work, including sequence-based localization approaches and the need for better local features. Our benchmark is available at visuallocalization.net.
IEEE Transactions on Pattern Analysis and Machine Intelligence, Feb 1, 2022
We address the problem of finding reliable dense correspondences between a pair of images. This i... more We address the problem of finding reliable dense correspondences between a pair of images. This is a challenging task due to strong appearance differences between the corresponding scene elements and ambiguities generated by repetitive patterns. The contributions of this work are threefold. First, inspired by the classic idea of disambiguating feature matches using semi-local constraints, we develop an end-to-end trainable convolutional neural network architecture that identifies sets of spatially consistent matches by analyzing neighbourhood consensus patterns in the 4D space of all possible correspondences between a pair of images without the need for a global geometric model. Second, we demonstrate that the model can be trained effectively from weak supervision in the form of matching and non-matching image pairs without the need for costly manual annotation of point to point correspondences. Third, we show the proposed neighbourhood consensus network can be applied to a range of matching tasks including both category-and instance-level matching, obtaining the state-of-the-art results on the PF, TSS, InLoc and HPatches benchmarks.
Zenodo (CERN European Organization for Nuclear Research), Jul 17, 2020
Most consumer cameras are equipped with electronic rolling shutter, leading to image distortions ... more Most consumer cameras are equipped with electronic rolling shutter, leading to image distortions when the camera moves during image capture. We explore a surprisingly simple camera configuration that makes it possible to undo the rolling shutter distortion: two cameras mounted to have different rolling shutter directions. Such a setup is easy and cheap to build and it possesses the geometric constraints needed to correct rolling shutter distortion using only a sparse set of point correspondences between the two images. We derive equations that describe the underlying geometry for general and special motions and present an efficient method for finding their solutions. Our synthetic and real experiments demonstrate that our approach is able to remove large rolling shutter distortions of all types without relying on any specific scene structure.
We propose a new method for constructing elimination templates for efficient polynomial system so... more We propose a new method for constructing elimination templates for efficient polynomial system solving of minimal problems in structure from motion, image matching, and camera tracking. We first construct a particular affine parameterization of the elimination templates for systems with a finite number of distinct solutions. Then, we use a heuristic greedy optimization strategy over the space of parameters to get a template with a small size. We test our method on 34 minimal problems in computer vision. For all of them, we found the templates either of the same or smaller size compared to the state-of-the-art. For some difficult examples, our templates are, e.g., 2.1, 2.5, 3.8, 6.6 times smaller. For the problem of refractive absolute pose estimation with unknown focal length, we have found a template that is 20 times smaller. Our experiments on synthetic data also show that the new solvers are fast and numerically accurate. We also present a fast and numerically accurate solver for the problem of relative pose estimation with unknown common focal length and radial distortion.
Many automatic methods for reconstructing camera motion and scene geometry from a large number of... more Many automatic methods for reconstructing camera motion and scene geometry from a large number of images evaluate the quality of the reconstruction by error propagation from measurements to the estimated parameters. Unfortunately, uncertainty propagation is computationally challenging for large scenes and hence cannot be used for large scenes in practice. We present a new algorithm for efficient uncertainty propagation which works with millions of feature points, thousands of cameras and millions of 3D points on a single computer and achieves about twenty times speedup.
We address the problem of finding reliable dense correspondences between a pair of images. This i... more We address the problem of finding reliable dense correspondences between a pair of images. This is a challenging task due to strong appearance differences between the corresponding scene elements and ambiguities generated by repetitive patterns. The contributions of this work are threefold. First, inspired by the classic idea of disambiguating feature matches using semi-local constraints, we develop an end-to-end trainable convolutional neural network architecture that identifies sets of spatially consistent matches by analyzing neighbourhood consensus patterns in the 4D space of all possible correspondences between a pair of images without the need for a global geometric model. Second, we demonstrate that the model can be trained effectively from weak supervision in the form of matching and non-matching image pairs without the need for costly manual annotation of point to point correspondences. Third, we show the proposed neighbourhood consensus network can be applied to a range of...
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020
Most consumer cameras are equipped with electronic rolling shutter, leading to image distortions ... more Most consumer cameras are equipped with electronic rolling shutter, leading to image distortions when the camera moves during image capture. We explore a surprisingly simple camera configuration that makes it possible to undo the rolling shutter distortion: two cameras mounted to have different rolling shutter directions. Such a setup is easy and cheap to build and it possesses the geometric constraints needed to correct rolling shutter distortion using only a sparse set of point correspondences between the two images. We derive equations that describe the underlying geometry for general and special motions and present an efficient method for finding their solutions. Our synthetic and real experiments demonstrate that our approach is able to remove large rolling shutter distortions of all types without relying on any specific scene structure.
Estimating uncertainty of camera parameters computed in Structure from Motion (SfM) is an importa... more Estimating uncertainty of camera parameters computed in Structure from Motion (SfM) is an important tool for evaluating the quality of the reconstruction and guiding the reconstruction process. Yet, the quality of the estimated parameters of large reconstructions has been rarely evaluated due to the computational challenges. We present a new algorithm which employs the sparsity of the uncertainty propagation and speeds the computation up about ten times w.r.t. previous approaches. Our computation is accurate and does not use any approximations. We can compute uncertainties of thousands of cameras in tens of seconds on a standard PC. We also demonstrate that our approach can be effectively used for reconstructions of any size by applying it to smaller sub-reconstructions.
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020
The quality and speed of Structure from Motion (SfM) methods depend significantly on the camera m... more The quality and speed of Structure from Motion (SfM) methods depend significantly on the camera model chosen for the reconstruction. In most of the SfM pipelines, the camera model is manually chosen by the user. In this paper, we present a new automatic method for camera model selection in large scale SfM that is based on efficient uncertainty evaluation. We first perform an extensive comparison of classical model selection based on known Information Criteria and show that they do not provide sufficiently accurate results when applied to camera model selection. Then we propose a new Accuracy-based Criterion, which evaluates an efficient approximation of the uncertainty of the estimated parameters in tested models. Using the new criterion, we design a camera model selection method and fine-tune it by machine learning. Our simulated and real experiments demonstrate a significant increase in reconstruction quality as well as a considerable speedup of the SfM process.
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018
Visual localization enables autonomous vehicles to navigate in their surroundings and augmented r... more Visual localization enables autonomous vehicles to navigate in their surroundings and augmented reality applications to link virtual to real worlds. Practical visual localization approaches need to be robust to a wide variety of viewing condition, including day-night changes, as well as weather and seasonal variations, while providing highly accurate 6 degree-of-freedom (6DOF) camera pose estimates. In this paper, we introduce the first benchmark datasets specifically designed for analyzing the impact of such factors on visual localization. Using carefully created ground truth poses for query images taken under a wide variety of conditions, we evaluate the impact of various factors on 6DOF camera pose estimation accuracy through extensive experiments with state-of-the-art localization approaches. Based on our results, we draw conclusions about the difficulty of different conditions, showing that long-term localization is far from solved, and propose promising avenues for future work, including sequence-based localization approaches and the need for better local features. Our benchmark is available at visuallocalization.net.
2015 IEEE International Conference on Computer Vision (ICCV), 2015
The estimation of the epipolar geometry of two cameras from image matches is a fundamental proble... more The estimation of the epipolar geometry of two cameras from image matches is a fundamental problem of computer vision with many applications. While the closely related problem of estimating relative pose of two different uncalibrated cameras with radial distortion is of particular importance, none of the previously published methods is suitable for practical applications. These solutions are either numerically unstable, sensitive to noise, based on a large number of point correspondences, or simply too slow for real-time applications. In this paper, we present a new efficient solution to this problem that uses 10 image correspondences. By manipulating ten input polynomial equations, we derive a degree 10 polynomial equation in one variable. The solutions to this equation are efficiently found using the Sturm sequences method. In the experiments, we show that the proposed solution is stable, noise resistant, and fast, and as such efficiently usable in a practical Structure-from-Motion pipeline.
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016
We present a solution to the rolling shutter (RS) absolute camera pose problem with known vertica... more We present a solution to the rolling shutter (RS) absolute camera pose problem with known vertical direction. Our new solver, R5Pup, is an extension of the general minimal solution R6P, which uses a double linearized RS camera model initialized by the standard perspective P3P. Here, thanks to using known vertical directions, we avoid double linearization and can get the camera absolute pose directly from the RS model without the initialization by a standard P3P. Moreover, we need only five 2D-to-3D matches while R6P needed six such matches. We demonstrate in simulated and real experiments that our new R5Pup is robust, fast and a very practical method for absolute camera pose computation for modern cameras on mobile devices. We compare our R5Pup to the state of the art RS and perspective methods and demonstrate that it outperforms them when vertical direction is known in the range of accuracy available on modern mobile devices. We also demonstrate that when using R5Pup solver in structure from motion (SfM) pipelines, it is better to transform already reconstructed scenes into the standard position, rather than using hard constraints on the verticality of up vectors.
Zenodo (CERN European Organization for Nuclear Research), Mar 10, 2020
We present a complete classification of minimal problems for generic arrangements of points and l... more We present a complete classification of minimal problems for generic arrangements of points and lines in space observed partially by three calibrated perspective cameras when each line is incident to at most one point. This is a large class of interesting minimal problems that allows missing observations in images due to occlusions and missed detections. There is an infinite number of such minimal problems; however, we show that they can be reduced to 140616 equivalence classes by removing superfluous features and relabeling the cameras. We also introduce camera-minimal problems, which are practical for designing minimal solvers, and show how to pick a simplest camera-minimal problem for each minimal problem. This simplification results in 74575 equivalence classes. Only 76 of these were known; the rest are new. In order to identify problems that have potential for practical solving of image matching and 3D reconstruction, we present several smaller natural subfamilies of camera-minimal problems as well as compute solution counts for all camera-minimal problems which have less than 300 solutions for generic data.
Zenodo (CERN European Organization for Nuclear Research), Jul 17, 2020
We present a method for solving two minimal problems for relative camera pose estimation from thr... more We present a method for solving two minimal problems for relative camera pose estimation from three views, which are based on three view correspondences of (i) three points and one line and (ii) three points and two lines through two of the points. These problems are too difficult to be efficiently solved by the state of the art Gröbner basis methods. Our method is based on a new efficient homotopy continuation (HC) solver, which dramatically speeds up previous HC solving by specializing HC methods to generic cases of our problems. We characterize their number of solutions and show with simulated experiments that our solvers are numerically robust and stable under image noise. We show in real experiments that (i) SIFT feature location and orientation provide good enough point-and-line correspondences for three-view reconstruction and (ii) that we can solve difficult cases with too few or too noisy tentative matches where the state of the art structure from motion initialization fails.
SIAM Journal on Applied Algebra and Geometry, Dec 20, 2022
We consider Galois/monodromy groups arising in computer vision applications, with a view towards ... more We consider Galois/monodromy groups arising in computer vision applications, with a view towards building more efficient polynomial solvers. The Galois/monodromy group allows us to decide when a given problem decomposes into algebraic subproblems, and whether or not it has any symmetries. Tools from numerical algebraic geometry and computational group theory allow us to apply this framework to classical and novel reconstruction problems. We consider three classical cases-3-point absolute pose, 5-point relative pose, and 4-point homography estimation for calibrated cameras-where the decomposition and symmetries may be naturally understood in terms of the Galois/monodromy group. We then show how our framework can be applied to novel problems from absolute and relative pose estimation. For instance, we discover new symmetries for absolute pose problems involving mixtures of point and line features. We also describe a problem of estimating a pair of calibrated homographies between three images. For this problem of degree 64, we can reduce the degree to 16, the latter better reflecting the intrinsic difficulty of algebraically solving the problem. As a byproduct, we obtain new constraints on compatible homographies, which may be of independent interest.
We present a complete classification of minimal problems for generic arrangements of points and l... more We present a complete classification of minimal problems for generic arrangements of points and lines in space observed partially by three calibrated perspective cameras when each line is incident to at most one point. This is a large class of interesting minimal problems that allows missing observations in images due to occlusions and missed detections. There is an infinite number of such minimal problems; however, we show that they can be reduced to 140616 equivalence classes by removing superfluous features and relabeling the cameras. We also introduce camera-minimal problems, which are practical for designing minimal solvers, and show how to pick a simplest camera-minimal problem for each minimal problem. This simplification results in 74575 equivalence classes. Only 76 of these were known; the rest are new. In order to identify problems that have potential for practical solving of image matching and 3D reconstruction, we present several smaller natural subfamilies of camera-minimal problems as well as compute solution counts for all camera-minimal problems which have less than 300 solutions for generic data.
Uploads
Papers by Tomas Pajdla