TY - JOUR
T1 - Multi-Task Multi-Agent Reinforcement Learning With Task-Entity Transformers and Value Decomposition Training
AU - Zhu, Yuanheng
AU - Huang, Shangjing
AU - Zuo, Binbin
AU - Zhao, Dongbin
AU - Sun, Changyin
N1 - Publisher Copyright:
© 2004-2012 IEEE.
PY - 2025
Y1 - 2025
N2 - Multi-task multi-agent reinforcement learning aims to control multiple agents to perform well on multiple tasks. It encounters three core challenges: the varying number of agents and entities, the disparities in cooperative behaviors among different tasks, and the training imbalance caused by varying task difficulty levels. To address these issues, we propose a novel framework named Task-Entity Transformer Qmix (TETQmix), which employs pretrained language models for task encoding, utilizes the proposed Task-Entity Transformer to handle observations across various tasks, and adjusts task learning weights to achieve balanced multi-task training. The Task-Entity Transformer not only enables handling multi-task scenarios with varying numbers of agents and entities, but also leverages cross-attention modules to integrate observation and task embeddings, so that each agent can obtain individual values and decisions for multiple tasks. We then utilize a transformer-based mixer to monotonically combine the individual values, and train the whole network's parameters using temporal-difference errors. To facilitate multi-task training, we define task regret as the difference between the current-stage return and the candidate best one, and adjust the learning weight of each task based on its task regret. Experiments are conducted on both simulated multi-particle environments and real-world multi-robot systems. Compared with existing baselines, our method is not only superior in multi-task learning efficiency, but also shows promising transfer ability on unseen tasks.
AB - Multi-task multi-agent reinforcement learning aims to control multiple agents to perform well on multiple tasks. It encounters three core challenges: the varying number of agents and entities, the disparities in cooperative behaviors among different tasks, and the training imbalance caused by varying task difficulty levels. To address these issues, we propose a novel framework named Task-Entity Transformer Qmix (TETQmix), which employs pretrained language models for task encoding, utilizes the proposed Task-Entity Transformer to handle observations across various tasks, and adjusts task learning weights to achieve balanced multi-task training. The Task-Entity Transformer not only enables handling multi-task scenarios with varying numbers of agents and entities, but also leverages cross-attention modules to integrate observation and task embeddings, so that each agent can obtain individual values and decisions for multiple tasks. We then utilize a transformer-based mixer to monotonically combine the individual values, and train the whole network's parameters using temporal-difference errors. To facilitate multi-task training, we define task regret as the difference between the current-stage return and the candidate best one, and adjust the learning weight of each task based on its task regret. Experiments are conducted on both simulated multi-particle environments and real-world multi-robot systems. Compared with existing baselines, our method is not only superior in multi-task learning efficiency, but also shows promising transfer ability on unseen tasks.
KW - Multi-agent systems
KW - Multi-task learning
KW - Pretrained language model
KW - Reinforcement learning
KW - Transformer
UR - https://www.scopus.com/pages/publications/105001962369
U2 - 10.1109/TASE.2024.3501580
DO - 10.1109/TASE.2024.3501580
M3 - Article
AN - SCOPUS:105001962369
SN - 1545-5955
VL - 22
SP - 9164
EP - 9177
JO - IEEE Transactions on Automation Science and Engineering
JF - IEEE Transactions on Automation Science and Engineering
ER -