Abstract
Image-to-point cloud (I2P) registration is a fundamental task for robots and autonomous vehicles to achieve cross-modality data fusion and localization.
Existing I2P registration methods estimate correspondences at the point/pixel level, often overlooking global alignment.
However, I2P matching can easily converge to a local optimum when performed without high-level guidance from global constraints.
To address this issue, this paper introduces CoFiI2P, a novel I2P registration network that extracts correspondences in a coarse-to-fine manner
to achieve the globally optimal solution. First, the image and point cloud data are processed through a Siamese encoder-decoder network for
hierarchical feature extraction. Second, a coarse-to-fine matching module is designed to leverage these features and establish robust feature correspondences.
Specifically, in the coarse matching phase, a novel I2P transformer module is employed to capture both homogeneous and heterogeneous global information from
the image and point cloud data. This enables the estimation of coarse super-point/super-pixel matching pairs with discriminative descriptors.
In the fine matching module, point/pixel pairs are established with the guidance of super-point/super-pixel correspondences. Finally, based on matching pairs,
the transformation matrix is estimated with the EPnP-RANSAC algorithm. Extensive experiments conducted on the KITTI dataset demonstrate that CoFiI2P
achieves impressive results, with a relative rotation error (RRE) of 1.14 degrees and a relative translation error (RTE) of 0.29 meters.
These results represent a significant improvement of 84% in RRE and 89% in RTE compared to the current state-of-the-art (SOTA) method.
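
For readers unfamiliar with the two reported metrics, the following minimal sketch shows one common way to compute RRE and RTE from a ground-truth and an estimated pose. It assumes RRE is the geodesic angle of the residual rotation and RTE is the Euclidean distance between the translation vectors; the exact definitions used in the paper may differ (for example, some I2P works sum the absolute Euler angles instead). All function names here are hypothetical and are not taken from the CoFiI2P code.

```python
import math

def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose3(a):
    """Transpose a 3x3 matrix given as nested lists."""
    return [[a[j][i] for j in range(3)] for i in range(3)]

def relative_rotation_error_deg(r_gt, r_est):
    """Geodesic angle (degrees) of the residual rotation R_gt^T * R_est.

    Assumption: this is one common RRE definition, not necessarily the paper's.
    """
    r_delta = matmul3(transpose3(r_gt), r_est)
    trace = r_delta[0][0] + r_delta[1][1] + r_delta[2][2]
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))

def relative_translation_error(t_gt, t_est):
    """Euclidean distance between translations, in the unit of the inputs."""
    return math.sqrt(sum((g - e) ** 2 for g, e in zip(t_gt, t_est)))

def rot_z(deg):
    """Helper for the example: rotation about the z-axis by `deg` degrees."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

# Toy poses (made-up values, not results from the paper).
rre = relative_rotation_error_deg(rot_z(10.0), rot_z(12.0))
rte = relative_translation_error([0.0, 0.0, 0.0], [0.3, 0.4, 0.0])
```

In this toy example the two rotations differ by 2 degrees about the same axis and the translations differ by a 3-4-5 triangle scaled to 0.5, so `rre` and `rte` come out at roughly those values.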
Introduction Video
If you are not interested in our in-depth analysis of CoFiI2P, please skip ahead to 38s for qualitative results.
Acknowledgements:
We borrow this template from Mobile-Seed.