TY - GEN
T1 - Uni-Perceiver-MoE
T2 - 36th Conference on Neural Information Processing Systems, NeurIPS 2022
AU - Zhu, Jinguo
AU - Zhu, Xizhou
AU - Wang, Wenhai
AU - Wang, Xiaohua
AU - Li, Hongsheng
AU - Wang, Xiaogang
AU - Dai, Jifeng
N1 - Publisher Copyright:
© 2022 Neural information processing systems foundation. All rights reserved.
PY - 2022
Y1 - 2022
N2 - To build an artificial neural network like the biological intelligence system, recent works have unified numerous tasks into a generalist model, which can process various tasks with shared parameters and do not have any task-specific modules. While generalist models achieve promising results on various benchmarks, they have performance degradation on some tasks compared with task-specialized models. In this work, we find that interference among different tasks and modalities is the main factor to this phenomenon. To mitigate such interference, we introduce the Conditional Mixture-of-Experts (Conditional MoEs) to generalist models. Routing strategies under different levels of conditions are proposed to take both the training/inference cost and generalization ability into account. By incorporating the proposed Conditional MoEs, the recently proposed generalist model Uni-Perceiver can effectively mitigate the interference across tasks and modalities, and achieves state-of-the-art results on a series of downstream tasks via prompt tuning on 1% of downstream data. Moreover, the introduction of Conditional MoEs still holds the generalization ability of generalist models to conduct zero-shot inference on new tasks, e.g., video-text retrieval and video caption. Code and pre-trained generalist models are publicly released at https://github.com/fundamentalvision/Uni-Perceiver.
AB - To build an artificial neural network like the biological intelligence system, recent works have unified numerous tasks into a generalist model, which can process various tasks with shared parameters and do not have any task-specific modules. While generalist models achieve promising results on various benchmarks, they have performance degradation on some tasks compared with task-specialized models. In this work, we find that interference among different tasks and modalities is the main factor to this phenomenon. To mitigate such interference, we introduce the Conditional Mixture-of-Experts (Conditional MoEs) to generalist models. Routing strategies under different levels of conditions are proposed to take both the training/inference cost and generalization ability into account. By incorporating the proposed Conditional MoEs, the recently proposed generalist model Uni-Perceiver can effectively mitigate the interference across tasks and modalities, and achieves state-of-the-art results on a series of downstream tasks via prompt tuning on 1% of downstream data. Moreover, the introduction of Conditional MoEs still holds the generalization ability of generalist models to conduct zero-shot inference on new tasks, e.g., video-text retrieval and video caption. Code and pre-trained generalist models are publicly released at https://github.com/fundamentalvision/Uni-Perceiver.
UR - https://www.scopus.com/pages/publications/85153043789
M3 - 会议稿件
AN - SCOPUS:85153043789
T3 - Advances in Neural Information Processing Systems
BT - Advances in Neural Information Processing Systems 35 - 36th Conference on Neural Information Processing Systems, NeurIPS 2022
A2 - Koyejo, S.
A2 - Mohamed, S.
A2 - Agarwal, A.
A2 - Belgrave, D.
A2 - Cho, K.
A2 - Oh, A.
PB - Neural information processing systems foundation
Y2 - 28 November 2022 through 9 December 2022
ER -