TY - JOUR
T1 - Token-based deep reinforcement learning for Heterogeneous VRP with Service Time Constraints
AU - Wang, Yujun
AU - Hong, Xiaopeng
AU - Wang, Yabin
AU - Zhao, Junzhou
AU - Sun, Guanghui
AU - Qin, Baoxing
N1 - Publisher Copyright: © 2024 Elsevier B.V.
PY - 2024/9/27
Y1 - 2024/9/27
AB - Heterogeneous Vehicle Routing aims to construct routes for various vehicles while optimizing an objective with a series of constraints. However, existing deep reinforcement learning-based methods often ignore the service time constraints, which prohibit vehicles from leaving their current nodes until the service time is met. This limitation restricts their practical application. To address these concerns, we introduce the Heterogeneous Vehicle Routing Problem with Service Time Constraints (HVRP-STC) and formulate it as a Markov Decision Process with Service Time Constraints. We propose a novel deep reinforcement learning-based model, Token-based Deep Reinforcement Learning (TDRL), to solve this problem. To provide sufficient and timely information for decision making, we design a State Token Coding (STC) mechanism that encodes and updates individual and overall vehicle and node states as tokens of different types. To determine the pairs of vehicles and nodes and generate actions, we propose a Heterogeneous Decoder (HD) with a vehicle-selector and multiple vehicle-specific node-selectors. This decouples the vehicle-node selection tasks and customizes the task of choosing nodes to visit for individual vehicles, better catering to the heterogeneous nature of HVRP-STC. We evaluate the proposed method on four types of datasets with instances of different sizes, large spatial coverage, and varied mathematical models. Our results show that TDRL consistently outperforms state-of-the-art DRL methods. We will release the datasets and the source code of this benchmark with the paper via https://github.com/Vision-Intelligence-and-Robots-Group/ToDRL.
KW - Deep reinforcement learning
KW - Heterogeneous Vehicle Routing
KW - Service Time Constraints
KW - Task allocation
KW - Task scheduling
UR - https://www.scopus.com/pages/publications/85198019849
U2 - 10.1016/j.knosys.2024.112173
DO - 10.1016/j.knosys.2024.112173
M3 - Article
AN - SCOPUS:85198019849
SN - 0950-7051
VL - 300
JO - Knowledge-Based Systems
JF - Knowledge-Based Systems
M1 - 112173
ER -