TY - GEN
T1 - Multi-label X-Ray Imagery Classification via Bottom-Up Attention and Meta Fusion
AU - Hu, Benyi
AU - Zhang, Chi
AU - Wang, Le
AU - Zhang, Qilin
AU - Liu, Yuehu
N1 - Publisher Copyright: © 2021, Springer Nature Switzerland AG.
PY - 2021
Y1 - 2021
N2 - Automatic security inspection has received increasing interest in recent years. Due to the fixed top-down perspective of X-ray scanning of often tightly packed luggage, such images typically suffer from penetration-induced occlusions, severe object overlapping and drastic changes in appearance. Few research efforts have been made for this particular application. To deal with overlapping in X-ray image classification, we propose a novel Security X-ray Multi-label Classification Network (SXMNet). Our hypothesis is that different overlapping levels and scale variations are the primary challenges in the multi-label classification of prohibited items. To address these challenges, we propose to incorporate 1) spatial attention to locate prohibited items despite shape, color and texture variations; and 2) anisotropic fusion of per-stage predictions to dynamically fuse hierarchical visual information under drastic variations. Motivated by these, SXMNet is boosted by bottom-up attention and neural-guided Meta Fusion. The raw input image is exploited to generate high-quality attention masks in a bottom-up way for pyramid feature refinement. Subsequently, the per-stage predictions from the refined features are automatically re-weighted and fused via a soft selection guided by neural knowledge. Comprehensive experiments on the Security Inspection X-ray (SIXray) and Occluded Prohibited Items X-ray (OPIXray) datasets demonstrate the superiority of the proposed method.
UR - https://www.scopus.com/pages/publications/85103252220
U2 - 10.1007/978-3-030-69544-6_11
DO - 10.1007/978-3-030-69544-6_11
M3 - Conference contribution
AN - SCOPUS:85103252220
SN - 9783030695439
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 173
EP - 190
BT - Computer Vision – ACCV 2020 - 15th Asian Conference on Computer Vision, 2020, Revised Selected Papers
A2 - Ishikawa, Hiroshi
A2 - Liu, Cheng-Lin
A2 - Pajdla, Tomas
A2 - Shi, Jianbo
PB - Springer Science and Business Media Deutschland GmbH
T2 - 15th Asian Conference on Computer Vision, ACCV 2020
Y2 - 30 November 2020 through 4 December 2020
ER -