TY - GEN
T1 - GSA-Gaze
T2 - 26th IEEE International Conference on Intelligent Transportation Systems, ITSC 2023
AU - Han, Hongcheng
AU - Tian, Zhiqiang
AU - Liu, Yuying
AU - Li, Shengpeng
AU - Zhang, Dong
AU - Du, Shaoyi
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Estimating driver gaze accurately is critical for human-machine cooperative driving, but the significant facial appearance variations caused by background, illumination, personal characteristics, etc. pose a challenge to the generalizability of gaze estimation models. In this paper, we propose a generative self-adversarial learning mechanism for generalized gaze estimation that aims to learn general gaze features while eliminating sample-specific features and preventing cross-domain feature over-fitting. Firstly, to reduce information redundancy, the feature encoder is designed based on pyramid-grouped convolution to extract a sparse feature representation from the facial appearance. Secondly, the gaze regression module supervises the model to learn as many gaze-relevant features as possible. Thirdly, the adversarial image reconstruction task prompts the model to eliminate domain-specific features. The adversarial learning of the gaze regression and image reconstruction tasks guides the model to learn only general gaze features across domains, preventing cross-domain feature over-fitting and enhancing the domain generalization capability. The results of cross-domain testing on four active gaze datasets prove the effectiveness of the proposed method. The code is available at https://github.com/HongchengHan/GSA-Gaze
AB - Estimating driver gaze accurately is critical for human-machine cooperative driving, but the significant facial appearance variations caused by background, illumination, personal characteristics, etc. pose a challenge to the generalizability of gaze estimation models. In this paper, we propose a generative self-adversarial learning mechanism for generalized gaze estimation that aims to learn general gaze features while eliminating sample-specific features and preventing cross-domain feature over-fitting. Firstly, to reduce information redundancy, the feature encoder is designed based on pyramid-grouped convolution to extract a sparse feature representation from the facial appearance. Secondly, the gaze regression module supervises the model to learn as many gaze-relevant features as possible. Thirdly, the adversarial image reconstruction task prompts the model to eliminate domain-specific features. The adversarial learning of the gaze regression and image reconstruction tasks guides the model to learn only general gaze features across domains, preventing cross-domain feature over-fitting and enhancing the domain generalization capability. The results of cross-domain testing on four active gaze datasets prove the effectiveness of the proposed method. The code is available at https://github.com/HongchengHan/GSA-Gaze
UR - https://www.scopus.com/pages/publications/85186533180
U2 - 10.1109/ITSC57777.2023.10421891
DO - 10.1109/ITSC57777.2023.10421891
M3 - Conference contribution
AN - SCOPUS:85186533180
T3 - IEEE Conference on Intelligent Transportation Systems, Proceedings, ITSC
SP - 1610
EP - 1615
BT - 2023 IEEE 26th International Conference on Intelligent Transportation Systems, ITSC 2023
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 24 September 2023 through 28 September 2023
ER -