Mobile-Seed: Joint Semantic Segmentation and Boundary Detection for Mobile Robots

Youqi Liao1 Shuhao Kang2 Jianping Li1,3,† Yang Liu4
Yun Liu5 Zhen Dong1 Bisheng Yang1 Xieyuanli Chen6
1 Wuhan University 2 Technical University of Munich 3 Nanyang Technological University 4 King's College London
5 Agency for Science, Technology and Research (A*STAR) 6 National University of Defense Technology
† Corresponding author.

[Paper]      [Video]     [Code]     [BibTeX]

What can Mobile-Seed do?


Mobile-Seed simultaneously infers the boundary map and the semantic map of a 2D RGB image in real time. (a) shows the motivation of Mobile-Seed: joint predictions provide strong constraints for downstream tasks, e.g., instance segmentation, semantic SLAM, and sensor calibration. (b) and (c) show the key idea of Mobile-Seed: assembling a semantic segmentation stream and a boundary detection stream in a shared framework so that the two tasks learn in a mutually reinforcing manner while maintaining real-time efficiency.
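To make the dual-stream design concrete, below is a minimal PyTorch sketch of one way such a shared framework could be wired. All module names and layer choices here are hypothetical illustrations, not the released Mobile-Seed code.

import torch
import torch.nn as nn

# Hypothetical sketch of a dual-stream, dual-head network (not the official code).
class DualStreamSeg(nn.Module):
    def __init__(self, num_classes: int, channels: int = 64):
        super().__init__()
        # One pathway for category-aware semantics, one for boundary cues.
        self.semantic_stream = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(inplace=True))
        self.boundary_stream = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(inplace=True))
        self.semantic_head = nn.Conv2d(channels, num_classes, 1)
        self.boundary_head = nn.Conv2d(channels, 1, 1)  # per-pixel boundary logit

    def forward(self, x):
        sem = self.semantic_stream(x)
        bnd = self.boundary_stream(x)
        # Both heads are supervised jointly, so the two streams can reinforce
        # each other during training while sharing one forward pass at test time.
        return self.semantic_head(sem), self.boundary_head(bnd)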

Abstract

Precise and rapid delineation of sharp boundaries and robust semantics is essential for numerous downstream robotic tasks, such as robot grasping and manipulation, real-time semantic mapping, and online sensor calibration performed on edge computing units. Although boundary detection and semantic segmentation are complementary tasks, most studies focus on lightweight models for semantic segmentation and overlook the critical role of boundary detection. In this work, we introduce Mobile-Seed, a lightweight dual-task framework tailored for simultaneous semantic segmentation and boundary detection. Our framework features a two-stream encoder, an active fusion decoder (AFD), and a dual-task regularization approach. The encoder is divided into two pathways: one captures category-aware semantic information, while the other discerns boundaries from multi-scale features. The AFD module dynamically adapts the fusion of semantic and boundary information by learning channel-wise relationships, allowing precise weight assignment for each channel. Furthermore, we introduce a regularization loss to mitigate the conflicts between dual-task learning and deep diversity supervision. Compared to existing methods, Mobile-Seed offers a lightweight framework that simultaneously improves semantic segmentation performance and accurately locates object boundaries. Experiments on the Cityscapes val dataset show that Mobile-Seed achieves a notable improvement over the state-of-the-art (SOTA) baseline of 2.2 percentage points (pp) in mIoU and 4.2 pp in mF-score, while maintaining an online inference speed of 23.9 frames per second (FPS) with 1024×2048-resolution input on an RTX 2080Ti GPU. Additional experiments on the CamVid and PASCAL Context datasets confirm our method's generalizability.
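As one reading of the AFD description above, the channel-wise fusion could resemble a squeeze-and-excitation-style gate computed over the concatenated semantic and boundary features. The sketch below is an assumption-laden illustration of that idea; the class name ActiveFusionDecoder and every layer choice are ours, not the paper's implementation.

import torch
import torch.nn as nn

# Illustrative AFD-style fusion (hypothetical, not the paper's code): learn a
# weight in (0, 1) for every channel of the concatenated features, then fuse.
class ActiveFusionDecoder(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                    # global context per channel
            nn.Conv2d(2 * channels, channels // 4, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, 2 * channels, 1),
            nn.Sigmoid())                               # per-channel weights in (0, 1)
        self.project = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, sem_feat, bnd_feat):
        fused = torch.cat([sem_feat, bnd_feat], dim=1)  # (B, 2C, H, W)
        return self.project(fused * self.gate(fused))   # weighted fusion -> C channels

Gating on pooled global context keeps the re-weighting cheap (a few 1x1 convolutions), which is consistent with the real-time budget reported in the abstract.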

Introduction Video

If you are not interested in our in-depth analysis of Mobile-Seed, please skip ahead to 3:35 for the qualitative results.

Semantic and Boundary Results

Semantic Segmentation



Semantic Boundary

BibTex

@article{liao2023mobileseed,
  title={Mobile-Seed: Joint Semantic Segmentation and Boundary Detection for Mobile Robots},
  author={Youqi Liao and Shuhao Kang and Jianping Li and Yang Liu and Yun Liu and Zhen Dong and Bisheng Yang and Xieyuanli Chen},
  journal={arXiv preprint arXiv:2311.12651},
  year={2023}
}

Acknowledgements: We borrow this template from FreeReg.