PatchAugNet: Patch feature augmentation-based heterogeneous point cloud place recognition in large-scale street scenes

Xianghong Zou1 Jianping Li4,† Yuan Wang2 Fuxun Liang1
Weitong Wu1 Haiping Wang1 Bisheng Yang1,† Zhen Dong1,3
1 State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University
2 School of Geography and Environment, Jiangxi Normal University
3 Hubei Luojia Laboratory
4 School of Electrical and Electronic Engineering, Nanyang Technological University
† Corresponding authors.

[Paper]      [Code]     [BibTeX]

Abstract

Point Cloud Place Recognition (PCPR) in street scenes is an essential task in autonomous driving, robot navigation, and urban map updating. However, the domain gap between heterogeneous point clouds and the difficulty of feature characterization in large-scale, complex street scenes pose significant challenges for existing PCPR methods. Most PCPR methods only consider point clouds collected by the same platforms and sensors, and therefore transfer poorly across domains. In this paper, we propose PatchAugNet, which utilizes patch feature augmentation and adaptive pyramid feature aggregation to achieve better performance and generalizability for Heterogeneous Point Cloud-based Place Recognition (HPCPR) tasks. First, multi-scale local features are extracted by the pyramid feature extraction module. Second, the local features are enhanced by the patch feature augmentation module to overcome the domain gap problem and achieve better feature representation as well as network generalizability. Finally, a global feature is generated by an adaptive pyramid feature aggregation module, which automatically adjusts and balances the proportion of intra-scale and inter-scale features according to the scene content. To evaluate the performance of PatchAugNet, we collected a large-scale heterogeneous point cloud dataset consisting of high-precision Mobile Laser Scanning (MLS) point clouds and helmet-mounted Portable Laser Scanning (PLS) point clouds. The dataset covers various street scenes with a total length of over 20 km. Comprehensive experimental results indicate that PatchAugNet achieves State-Of-The-Art (SOTA) performance with 83.43% recall@top1% and 60.34% recall@top1 on unseen large-scale street scenes, outperforming existing SOTA PCPR methods by +9.57 recall@top1% and +15.50 recall@top1, while exhibiting better generalizability.


PatchAugNet, based on patch feature augmentation and adaptive pyramid feature aggregation, achieves better performance and generalizability for Heterogeneous Point Cloud-based Place Recognition tasks. 1) The patch feature augmentation module largely overcomes the domain gap problem and achieves better feature representation as well as network generalization. 2) The adaptive pyramid feature aggregation module automatically adjusts and balances the proportion of intra-scale and inter-scale features in feature aggregation according to the scene content, which effectively improves the discriminability of the global features.

Introduction

[Youtube]

Experimental Datasets (Self-collected heterogeneous point clouds)

Overview of Experimental Data: The red and blue lines indicate the acquisition trajectories of the MLS and helmet-mounted PLS systems (manually offset for display purposes). The dotted box represents the MLS and PLS point clouds collected at the same location (colored by elevation), and the MLS and PLS systems in black boxes are CHCNAV Alpha 3D and WHU-Helmet respectively.

Comparison with baseline methods

Recall curves of different methods on experimental data.

Recall curves of different methods on public datasets.
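For readers unfamiliar with the metric behind these curves, the following is a minimal sketch of how recall@topN and recall@top1% are commonly computed in place recognition benchmarks. It is not taken from the PatchAugNet code: the function names, the toy descriptors, and the convention of skipping queries without a ground-truth match are assumptions made for illustration.

```python
import math

def recall_at_k(query_feats, db_feats, positives, k):
    """Fraction of queries whose k nearest database descriptors contain a true match.

    positives[i] is the set of database indices that are true matches for query i
    (hypothetical ground-truth format). Queries with no true match are skipped,
    which is an assumed convention.
    """
    hits = 0
    evaluated = 0
    for q_idx, q in enumerate(query_feats):
        true_matches = positives[q_idx]
        if not true_matches:
            continue
        evaluated += 1
        # Rank all database entries by Euclidean distance to the query descriptor.
        ranked = sorted(range(len(db_feats)), key=lambda i: math.dist(q, db_feats[i]))
        if any(i in true_matches for i in ranked[:k]):
            hits += 1
    return hits / evaluated if evaluated else 0.0

def top_one_percent(db_size):
    """Number of candidates inspected for recall@top1% (at least one)."""
    return max(1, round(db_size / 100))

# Toy 2-D "global descriptors" standing in for real network outputs.
db = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (9.0, 9.0)]
queries = [(0.1, 0.0), (4.0, 4.0), (8.0, 8.0)]
gt = [{0}, {3}, {3}]

recall_top1 = recall_at_k(queries, db, gt, k=1)
recall_top1_percent = recall_at_k(queries, db, gt, k=top_one_percent(len(db)))
print(recall_top1, recall_top1_percent)
```

A recall curve is then obtained by evaluating `recall_at_k` for a range of `k` values. The brute-force ranking here is only suitable for small examples; a nearest-neighbour index would normally replace it for large databases.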

Cases are presented to demonstrate the proposed method for place recognition on dataset B. Each case is represented by a single row, with the first column containing the query submap and the 2nd to 6th columns containing the top 1 to top 5 retrieved submaps, respectively.


Visualization of patch feature augmentation

Patch reconstruction results of PatchAugNet: the reconstruction results of all patches in the submap are plotted together for the convenience of presentation.

Cases of hard negative patch mining based on geometric similarity (the value below each image is the KL divergence, shown at one-thousandth of its true value).
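The KL value reported in the caption above is presumably a Kullback-Leibler divergence between geometric descriptors of two patches. The page does not state how those descriptors are built, so the sketch below only illustrates the general idea under stated assumptions: each patch is summarised by a hypothetical histogram, and the hardest negative is the non-matching patch with the smallest divergence from the anchor. The names `kl_divergence` and `hardest_negative` are illustrative and are not the authors' API.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete histograms given as lists of non-negative numbers.

    Both inputs are smoothed by eps and normalised to sum to one, so empty bins
    do not produce a division by zero or log(0).
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    p_sum = sum(p) + eps * len(p)
    q_sum = sum(q) + eps * len(q)
    total = 0.0
    for pi, qi in zip(p, q):
        pn = (pi + eps) / p_sum
        qn = (qi + eps) / q_sum
        total += pn * math.log(pn / qn)
    return total

def hardest_negative(anchor_hist, negative_hists):
    """Index of the negative patch that is geometrically closest to the anchor.

    negative_hists must contain only patches known to come from other places
    (assumed to be filtered beforehand).
    """
    return min(range(len(negative_hists)),
               key=lambda i: kl_divergence(anchor_hist, negative_hists[i]))

# Hypothetical two-bin geometric histograms.
anchor = [0.5, 0.5]
negatives = [[0.9, 0.1], [0.55, 0.45], [0.1, 0.9]]
print(hardest_negative(anchor, negatives))
```

A small divergence means the two patches look alike geometrically even though they come from different places, which is what makes such a patch a useful hard negative during training.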


Effect of PFA and APFA

Place recognition cases with and without the patch feature augmentation (PFA) and adaptive pyramid feature aggregation (APFA) modules. Red boxes indicate failure and green boxes indicate success.


BibTeX

@article{zou2023PatchAugNet,
  title={PatchAugNet: Patch feature augmentation-based heterogeneous point cloud place recognition in large-scale street scenes},
  author={Xianghong Zou and Jianping Li and Yuan Wang and Fuxun Liang and Weitong Wu and Haiping Wang and Bisheng Yang and Zhen Dong},
  journal={ISPRS Journal of Photogrammetry and Remote Sensing},
  volume={206},
  pages={273--292},
  year={2023},
  publisher={Elsevier}
}

Acknowledgements: We borrow this template from FreeReg.