CoFiI2P: Coarse-to-Fine Correspondences-Based Image-to-Point Cloud Registration

Shuhao Kang¹,*, Youqi Liao²,³,*, Jianping Li²,⁴,†, Fuxun Liang², Yuhao Li², Xianghong Zou², Fangning Li⁵, Xieyuanli Chen⁶, Zhen Dong²,³, Bisheng Yang²
¹ Technical University of Munich ² Wuhan University ³ Hubei Luojia Laboratory ⁴ Nanyang Technological University
⁵ Beijing Urban Construction Exploration and Surveying Design Research Institute ⁶ National University of Defense Technology
* The first two authors contribute equally. † Corresponding author.

[Paper] [Video] [Code] [BibTeX]

What can CoFiI2P do?

CoFiI2P is an image-to-point cloud registration network with a coarse-to-fine pipeline. (a) shows the one-stage registration pipeline, where matching pairs are established directly at the point/pixel level, which produces a large number of mismatches. (b) shows our coarse-to-fine matching pipeline: point-to-pixel pairs are generated only within already matched super-point-to-super-pixel pairs, which eliminates most mismatches.
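The idea of restricting fine matches to already matched coarse pairs can be illustrated with a minimal sketch. It is not the network described here: plain cosine nearest-neighbor search stands in for the learned matching, and every name (`nearest_neighbor`, `coarse_to_fine_match`, `pt_to_sp`, `px_to_spx`) is hypothetical.

```python
import numpy as np

def nearest_neighbor(desc_a, desc_b):
    """For each row of desc_a, return the index of the most similar row of desc_b (cosine similarity)."""
    a = desc_a / (np.linalg.norm(desc_a, axis=1, keepdims=True) + 1e-12)
    b = desc_b / (np.linalg.norm(desc_b, axis=1, keepdims=True) + 1e-12)
    return np.argmax(a @ b.T, axis=1)

def coarse_to_fine_match(sp_desc, spx_desc, pt_desc, px_desc, pt_to_sp, px_to_spx):
    """Toy coarse-to-fine matching (assumed interface, not the paper's).

    sp_desc:   (S, D) super-point descriptors
    spx_desc:  (T, D) super-pixel descriptors
    pt_desc:   (N, E) point descriptors
    px_desc:   (M, E) pixel descriptors
    pt_to_sp:  (N,) index of the super-point each point belongs to
    px_to_spx: (M,) index of the super-pixel each pixel belongs to
    Returns a list of (point_index, pixel_index) pairs.
    """
    # Coarse stage: assign each super-point to its most similar super-pixel.
    coarse = nearest_neighbor(sp_desc, spx_desc)
    matches = []
    for sp_idx, spx_idx in enumerate(coarse):
        pts = np.flatnonzero(pt_to_sp == sp_idx)
        pxs = np.flatnonzero(px_to_spx == spx_idx)
        if pts.size == 0 or pxs.size == 0:
            continue
        # Fine stage: search only among pixels of the matched super-pixel.
        local = nearest_neighbor(pt_desc[pts], px_desc[pxs])
        matches.extend(zip(pts.tolist(), pxs[local].tolist()))
    return matches
```

Because the fine search never leaves the matched super-pixel, a point cannot be paired with a look-alike pixel elsewhere in the image, which is the failure mode of the one-stage pipeline in (a).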

Abstract

Image-to-point cloud (I2P) registration is a fundamental task for robots and autonomous vehicles to achieve cross-modality data fusion and localization. Existing I2P registration methods estimate correspondences at the point/pixel level, often overlooking global alignment. However, I2P matching can easily converge to a local optimum when performed without high-level guidance from global constraints. To address this issue, this paper introduces CoFiI2P, a novel I2P registration network that extracts correspondences in a coarse-to-fine manner to achieve the globally optimal solution. First, the image and point cloud data are processed through a Siamese encoder-decoder network for hierarchical feature extraction. Second, a coarse-to-fine matching module is designed to leverage these features and establish robust feature correspondences. Specifically, in the coarse matching phase, a novel I2P transformer module is employed to capture both homogeneous and heterogeneous global information from the image and point cloud data. This enables the estimation of coarse super-point/super-pixel matching pairs with discriminative descriptors. In the fine matching phase, point/pixel pairs are established with the guidance of super-point/super-pixel correspondences. Finally, based on the matching pairs, the transformation matrix is estimated with the EPnP-RANSAC algorithm. Extensive experiments conducted on the KITTI dataset demonstrate that CoFiI2P achieves impressive results, with a relative rotation error (RRE) of 1.14 degrees and a relative translation error (RTE) of 0.29 meters. These results represent a significant improvement of 84% in RRE and 89% in RTE compared to the current state-of-the-art (SOTA) method.
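For readers unfamiliar with the two metrics, the sketch below shows one common way to compute them from an estimated and a ground-truth pose. It is an assumption, not the evaluation code behind the numbers above: RRE is taken as the geodesic angle of the residual rotation and RTE as the Euclidean distance between translations, while the paper may use a different convention (for example, a sum of Euler-angle magnitudes). The function names are hypothetical.

```python
import numpy as np

def relative_rotation_error_deg(R_est, R_gt):
    """Angle (degrees) of the residual rotation R_gt^T @ R_est. Assumed definition."""
    R_delta = R_gt.T @ R_est
    # For a rotation matrix, trace = 1 + 2*cos(angle); clip guards against round-off.
    cos_angle = (np.trace(R_delta) - 1.0) / 2.0
    return float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))

def relative_translation_error(t_est, t_gt):
    """Euclidean distance between translations, in the units of the inputs. Assumed definition."""
    return float(np.linalg.norm(np.asarray(t_est, dtype=float) - np.asarray(t_gt, dtype=float)))
```

With these definitions, an estimate that equals the ground truth gives zero for both metrics, and the rotation error is independent of the axis about which the residual rotation occurs.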

Introduction Video

If you are not interested in our in-depth analysis of CoFiI2P, please skip ahead to 38s for qualitative results.

Registration results

Correspondences quality

BibTeX

@article{kang2023cofii2p,
  title={CoFiI2P: Coarse-to-Fine Correspondences-Based Image-to-Point Cloud Registration},
  author={Shuhao Kang and Youqi Liao and Jianping Li and Fuxun Liang and Yuhao Li and Xianghong Zou and Fangning Li and Xieyuanli Chen and Zhen Dong and Bisheng Yang},
  journal={arXiv preprint arXiv:2309.14660},
  year={2023}
}

Acknowledgements: We borrow this template from Mobile-Seed.