TY - JOUR
T1 - A two-stage framework based on RL for Truck-Drone Collaborative Delivery Problem
AU - Li, Yuanbo
AU - Zhang, Chengwei
AU - Liu, Wanting
AU - Li, Chao
AU - An, Dou
AU - Wang, Qi
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2025
Y1 - 2025
N2 - With the explosive growth of e-commerce, efficient last-mile delivery has emerged as a critical challenge. Truck-drone collaborative delivery has garnered significant attention as a promising solution to this problem. In this work, we formulate the truck-drone collaborative delivery problem as a collaborative optimization problem, aiming to minimize the total completion time for delivering packages to customers by leveraging the complementary strengths of the drone’s speed and the truck’s endurance. We propose a two-stage framework to address this challenge. In the first stage, the Lin-Kernighan-Helsgaun (LKH) algorithm is employed to generate a high-quality initial Traveling Salesman Problem (TSP) solution, serving as a robust starting point. In the second stage, a Sequence Allocate Policy (SAPPO), based on Proximal Policy Optimization, refines the TSP solution by optimizing the truck-drone collaborative path using a specially designed action space. Extensive experiments conducted on both random datasets and TSPLIB benchmarks demonstrate that our method significantly outperforms existing algorithms in delivery time, while exhibiting improved scalability and requiring less training time.
AB - With the explosive growth of e-commerce, efficient last-mile delivery has emerged as a critical challenge. Truck-drone collaborative delivery has garnered significant attention as a promising solution to this problem. In this work, we formulate the truck-drone collaborative delivery problem as a collaborative optimization problem, aiming to minimize the total completion time for delivering packages to customers by leveraging the complementary strengths of the drone’s speed and the truck’s endurance. We propose a two-stage framework to address this challenge. In the first stage, the Lin-Kernighan-Helsgaun (LKH) algorithm is employed to generate a high-quality initial Traveling Salesman Problem (TSP) solution, serving as a robust starting point. In the second stage, a Sequence Allocate Policy (SAPPO), based on Proximal Policy Optimization, refines the TSP solution by optimizing the truck-drone collaborative path using a specially designed action space. Extensive experiments conducted on both random datasets and TSPLIB benchmarks demonstrate that our method significantly outperforms existing algorithms in delivery time, while exhibiting improved scalability and requiring less training time.
KW - Drones
KW - Reinforcement learning
KW - Traveling salesman problem
KW - Vehicle routing
UR - https://www.scopus.com/pages/publications/105024710745
U2 - 10.1109/JIOT.2025.3641887
DO - 10.1109/JIOT.2025.3641887
M3 - Article
AN - SCOPUS:105024710745
SN - 2327-4662
JO - IEEE Internet of Things Journal
JF - IEEE Internet of Things Journal
ER -