TY - GEN
T1 - POWER-LLAVA
T2 - 31st IEEE International Conference on Image Processing, ICIP 2024
AU - Wang, Jiahao
AU - Li, Mingxuan
AU - Luo, Haichen
AU - Zhu, Jinguo
AU - Yang, Aijun
AU - Rong, Mingzhe
AU - Wang, Xiaohua
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
AB - Power transmission line inspection has made notable progress in the past few years, primarily due to the integration of deep learning technology. However, current inspection approaches still struggle with generalization and intelligence, which restricts their broader applicability. In this paper, we introduce Power-LLaVA, the first large language and vision assistant designed to offer professional and reliable inspection services for power transmission lines by engaging in dialogue with humans. We also construct a large-scale, high-quality dataset specialized for the inspection task. By employing a two-stage training strategy on the constructed dataset, Power-LLaVA demonstrates exceptional performance at a comparatively low training cost. Extensive experiments further demonstrate the capabilities of Power-LLaVA in power transmission line inspection. Code will be released.
KW - Large language-vision assistant
KW - Power transmission line inspection
KW - Two-stage training strategy
UR - https://www.scopus.com/pages/publications/85213467773
U2 - 10.1109/ICIP51287.2024.10648271
DO - 10.1109/ICIP51287.2024.10648271
M3 - Conference contribution
AN - SCOPUS:85213467773
T3 - Proceedings - International Conference on Image Processing, ICIP
SP - 963
EP - 969
BT - 2024 IEEE International Conference on Image Processing, ICIP 2024 - Proceedings
PB - IEEE Computer Society
Y2 - 27 October 2024 through 30 October 2024
ER -