TY - JOUR
T1 - Federated multi-objective reinforcement learning
AU - Zhao, Fangyuan
AU - Ren, Xuebin
AU - Yang, Shusen
AU - Zhao, Peng
AU - Zhang, Rui
AU - Xu, Xinxin
N1 - Publisher Copyright:
© 2022 Elsevier Inc.
PY - 2023/5
Y1 - 2023/5
N2 - Multi-objective reinforcement learning (MORL) has significant potential for solving complex decision problems with conflicting objectives. Since MORL demands sufficient training samples, achieving federated MORL in large-scale distributed settings is promising. However, it still suffers from poor efficiency and high privacy risks. To mitigate the inefficiency issue, we first propose a novel probabilistic algorithm, PMORL, that can seek an optimal policy via the expectation maximization (EM) algorithm with high efficiency. To extend PMORL to distributed settings with privacy protection, we then present the first federated MORL algorithm, Fed-PMORL, with client-level differential privacy (DP). In Fed-PMORL, personalized actors are trained and maintained at local clients, whereas critics are aggregated and sanitized at the central server. Extensive experimental results in benchmark MORL environments demonstrate that Fed-PMORL under DP guarantees can achieve superior performance with high efficiency. In particular, compared with the state-of-the-art methods, PMORL and Fed-PMORL can save up to 50% of training episodes for achieving the same model utility. With a sufficient number of clients (e.g., 1000 clients), Fed-PMORL with a formal DP guarantee shows utility comparable to that of the non-private algorithm.
AB - Multi-objective reinforcement learning (MORL) has significant potential for solving complex decision problems with conflicting objectives. Since MORL demands sufficient training samples, achieving federated MORL in large-scale distributed settings is promising. However, it still suffers from poor efficiency and high privacy risks. To mitigate the inefficiency issue, we first propose a novel probabilistic algorithm, PMORL, that can seek an optimal policy via the expectation maximization (EM) algorithm with high efficiency. To extend PMORL to distributed settings with privacy protection, we then present the first federated MORL algorithm, Fed-PMORL, with client-level differential privacy (DP). In Fed-PMORL, personalized actors are trained and maintained at local clients, whereas critics are aggregated and sanitized at the central server. Extensive experimental results in benchmark MORL environments demonstrate that Fed-PMORL under DP guarantees can achieve superior performance with high efficiency. In particular, compared with the state-of-the-art methods, PMORL and Fed-PMORL can save up to 50% of training episodes for achieving the same model utility. With a sufficient number of clients (e.g., 1000 clients), Fed-PMORL with a formal DP guarantee shows utility comparable to that of the non-private algorithm.
KW - Differential privacy
KW - Federated learning
KW - Graph model
KW - Multi-objective optimization
KW - Reinforcement learning
UR - https://www.scopus.com/pages/publications/85146099661
U2 - 10.1016/j.ins.2022.12.083
DO - 10.1016/j.ins.2022.12.083
M3 - Article
AN - SCOPUS:85146099661
SN - 0020-0255
VL - 624
SP - 811
EP - 832
JO - Information Sciences
JF - Information Sciences
ER -