Abstract
3D single object tracking is the task of localizing a target object in a search point cloud frame. In this letter, we present a multi-level structure-enhanced tracking model to improve tracking performance in sparse 3D point clouds. To this end, we first encode the target and the search point clouds efficiently in two near-neighbor graphs, which allows structural information to flow between neighbors along graph edges. We then design a cross-graph attention mechanism, which associates similar nodes across the target graph and the search graph and pushes dissimilar graph nodes further apart. Integrating this mechanism into the Siamese feature learning of both the target and the search frame, we strengthen the structural correlation between the two. This makes it much simpler to distinguish the potential target from the background in the search frame. Finally, we design a U-shaped sparse convolutional block to aggregate the structural features of the potential target in the search frame. Integrating this block into an existing target localization module (Hui et al., 2021), we localize target centers accurately. Experiments on the KITTI benchmark demonstrate that our method outperforms some state-of-the-art models, achieving at least a 3.2% improvement in average tracking precision.
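The first two steps of the abstract (neighbor graphs, then attention across the two graphs) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `knn_graph` and `cross_graph_attention`, the choice of `k`, the use of raw coordinates as node features, and plain scaled dot-product attention are all assumptions made for illustration only.

```python
import numpy as np


def knn_graph(points, k):
    """Return an (N, k) array holding each point's k nearest neighbors.

    Hypothetical helper: brute-force squared distances, self excluded.
    """
    diff = points[:, None, :] - points[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)  # squared Euclidean distances
    np.fill_diagonal(dist, np.inf)               # a point is not its own neighbor
    return np.argsort(dist, axis=1)[:, :k]


def cross_graph_attention(target_feats, search_feats):
    """Let every search node attend over all target nodes.

    Assumption: scaled dot-product similarity with a row-wise softmax,
    standing in for the cross-graph attention described in the abstract.
    """
    d = target_feats.shape[1]
    scores = search_feats @ target_feats.T / np.sqrt(d)  # (Ns, Nt)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights, weights @ target_feats


# Toy data: random points stand in for the target and search point clouds.
rng = np.random.default_rng(0)
target = rng.normal(size=(32, 3))
search = rng.normal(size=(64, 3))

target_nbrs = knn_graph(target, k=4)
search_nbrs = knn_graph(search, k=4)
weights, attended = cross_graph_attention(target, search)
```

In this sketch, each row of `weights` shows how strongly one search node is associated with each target node. A learned model would replace the raw coordinates with features propagated along the graph edges.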
| Original language | English |
|---|---|
| Pages (from-to) | 9-16 |
| Number of pages | 8 |
| Journal | IEEE Robotics and Automation Letters |
| Volume | 8 |
| Issue number | 1 |
| DOIs | |
| State | Published - 1 Jan 2023 |
| Externally published | Yes |
Keywords
- Cross-graph learning
- deep learning for visual perception
- deep learning methods
- space refinement
- visual tracking
Fingerprint
Dive into the research topics of 'Multi-Level Structure-Enhanced Network for 3D Single Object Tracking in Sparse Point Clouds'. Together they form a unique fingerprint.