TY - JOUR
T1 - Energy Minimization in UAV-Aided Networks
T2 - Actor-Critic Learning for Constrained Scheduling Optimization
AU - Yuan, Yaxiong
AU - Lei, Lei
AU - Vu, Thang X.
AU - Chatzinotas, Symeon
AU - Sun, Sumei
AU - Ottersten, Bjorn
N1 - Publisher Copyright:
© 1967-2012 IEEE.
PY - 2021/5
Y1 - 2021/5
N2 - In unmanned aerial vehicle (UAV) applications, the UAV's limited energy supply and storage have triggered the development of intelligent energy-conserving scheduling solutions. In this paper, we investigate energy minimization for UAV-aided communication networks by jointly optimizing data-transmission scheduling and UAV hovering time. The formulated problem is combinatorial and non-convex with bilinear constraints. To tackle the problem, we first provide an optimal algorithm (OPT) and a golden section search heuristic algorithm (GSS-HEU). Both solutions serve as offline performance benchmarks but may not be suitable for online operation. To this end, from a deep reinforcement learning (DRL) perspective, we propose an actor-critic-based deep stochastic online scheduling (AC-DSOS) algorithm and develop a set of approaches to confine the action space. Compared to conventional RL/DRL, the novelty of AC-DSOS lies in handling two major issues, i.e., the exponentially growing action space and infeasible actions. Numerical results show that AC-DSOS is able to provide feasible solutions and save around 25-30% energy compared to two conventional deep AC-DRL algorithms. Compared to the developed GSS-HEU, AC-DSOS consumes around 10% more energy but reduces the computation time from the second level to the millisecond level.
AB - In unmanned aerial vehicle (UAV) applications, the UAV's limited energy supply and storage have triggered the development of intelligent energy-conserving scheduling solutions. In this paper, we investigate energy minimization for UAV-aided communication networks by jointly optimizing data-transmission scheduling and UAV hovering time. The formulated problem is combinatorial and non-convex with bilinear constraints. To tackle the problem, we first provide an optimal algorithm (OPT) and a golden section search heuristic algorithm (GSS-HEU). Both solutions serve as offline performance benchmarks but may not be suitable for online operation. To this end, from a deep reinforcement learning (DRL) perspective, we propose an actor-critic-based deep stochastic online scheduling (AC-DSOS) algorithm and develop a set of approaches to confine the action space. Compared to conventional RL/DRL, the novelty of AC-DSOS lies in handling two major issues, i.e., the exponentially growing action space and infeasible actions. Numerical results show that AC-DSOS is able to provide feasible solutions and save around 25-30% energy compared to two conventional deep AC-DRL algorithms. Compared to the developed GSS-HEU, AC-DSOS consumes around 10% more energy but reduces the computation time from the second level to the millisecond level.
KW - UAV
KW - actor-critic
KW - deep reinforcement learning
KW - energy optimization
KW - hovering time allocation
KW - user scheduling
UR - https://www.scopus.com/pages/publications/85105077754
U2 - 10.1109/TVT.2021.3075860
DO - 10.1109/TVT.2021.3075860
M3 - Article
AN - SCOPUS:85105077754
SN - 0018-9545
VL - 70
SP - 5028
EP - 5042
JO - IEEE Transactions on Vehicular Technology
JF - IEEE Transactions on Vehicular Technology
IS - 5
M1 - 9416816
ER -