TY - GEN
T1 - A design of reward function in multi-target trajectory recovery with deep reinforcement learning
AU - He, Liang
AU - Chu, Yanjie
AU - Shen, Chao
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/5
Y1 - 2019/5
N2 - In the field of object trajectory detection, detectors often receive a set of geographical locations without any identifying information about the targets. Using the location information received by the sensors to reconstruct the trajectory of each target, while distinguishing the targets in each frame, is known as multi-target trajectory recovery and can be solved with Deep Reinforcement Learning (DRL). A mathematical model of the direction and curvature of the target trajectory is proposed according to the characteristics of trajectories. Then, a reward function based on the Trajectory Osculating Circle (TOC) is designed on the basis of this mathematical model. Firstly, the multi-target trajectory recovery problem is introduced and cast as a model that can be implemented with DRL. Secondly, a DRL structure for this problem is proposed and tested with the designed reward function. Finally, a mathematical derivation and a physical interpretation of the proposed reward function are given. The experimental results show that, guided by the TOC reward function, DRL can recover the trajectories more effectively than the state-of-the-art clustering method, and the recovered traces correspond with the actual trajectories.
AB - In the field of object trajectory detection, detectors often receive a set of geographical locations without any identifying information about the targets. Using the location information received by the sensors to reconstruct the trajectory of each target, while distinguishing the targets in each frame, is known as multi-target trajectory recovery and can be solved with Deep Reinforcement Learning (DRL). A mathematical model of the direction and curvature of the target trajectory is proposed according to the characteristics of trajectories. Then, a reward function based on the Trajectory Osculating Circle (TOC) is designed on the basis of this mathematical model. Firstly, the multi-target trajectory recovery problem is introduced and cast as a model that can be implemented with DRL. Secondly, a DRL structure for this problem is proposed and tested with the designed reward function. Finally, a mathematical derivation and a physical interpretation of the proposed reward function are given. The experimental results show that, guided by the TOC reward function, DRL can recover the trajectories more effectively than the state-of-the-art clustering method, and the recovered traces correspond with the actual trajectories.
KW - Deep reinforcement learning
KW - Q-function
KW - Sequential decision
KW - Trajectory osculating circle
UR - https://www.scopus.com/pages/publications/85071082516
U2 - 10.1109/ITAIC.2019.8785878
DO - 10.1109/ITAIC.2019.8785878
M3 - Conference contribution
AN - SCOPUS:85071082516
T3 - Proceedings of 2019 IEEE 8th Joint International Information Technology and Artificial Intelligence Conference, ITAIC 2019
SP - 286
EP - 293
BT - Proceedings of 2019 IEEE 8th Joint International Information Technology and Artificial Intelligence Conference, ITAIC 2019
A2 - Xu, Bing
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 8th IEEE Joint International Information Technology and Artificial Intelligence Conference, ITAIC 2019
Y2 - 24 May 2019 through 26 May 2019
ER -