Abstract
Object manipulation in object-stacking scenes is an important but challenging skill for intelligent robots. In most cases, the relationships among objects should be considered before manipulation to prevent disorder and damage. However, the analysis of object relationships in object-stacking scenes, especially for robotic manipulation, remains unsolved. To this end, this paper presents a new convolutional neural network (CNN) architecture, called Visual Manipulation Relationship Network (VMRN), to recognize the visual manipulation relationships (VMRs) between objects in real time. By considering the manipulation relationships in object-stacking scenes, VMRN ensures that the robot can complete manipulation tasks safely and reliably. The core of our model is the Object Pairing Pooling Layer (OP2L), which makes it possible to recognize objects and all possible VMRs in a single forward pass. Moreover, to train VMRN, we contribute a dataset named Visual Manipulation Relationship Dataset (VMRD), consisting of 4,683 images with more than 16,000 object instances and the VMRs between each object pair. The experimental results show that the proposed network architecture can detect objects and predict VMRs.
| Original language | English |
|---|---|
| Pages (from-to) | 34-42 |
| Number of pages | 9 |
| Journal | Pattern Recognition Letters |
| Volume | 140 |
| State | Published - Dec 2020 |
| Externally published | Yes |
Keywords
- Grasp precondition
- Robot vision
- Visual manipulation relationship