Uni-Perceiver-MoE: Learning Sparse Generalist Models with Conditional MoEs

  • Jinguo Zhu
  • Xizhou Zhu
  • Wenhai Wang
  • Xiaohua Wang
  • Hongsheng Li
  • Xiaogang Wang
  • Jifeng Dai
  • Xi'an Jiaotong University
  • Shanghai Artificial Intelligence Laboratory
  • SenseTime Group Limited
  • Chinese University of Hong Kong
  • Tsinghua University

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

48 citations (Scopus)

Abstract

To build an artificial neural network that resembles a biological intelligence system, recent works have unified numerous tasks into a generalist model, which processes various tasks with shared parameters and has no task-specific modules. While generalist models achieve promising results on various benchmarks, they show performance degradation on some tasks compared with task-specialized models. In this work, we find that interference among different tasks and modalities is the main cause of this phenomenon. To mitigate such interference, we introduce Conditional Mixture-of-Experts (Conditional MoEs) to generalist models. Routing strategies under different levels of conditions are proposed to take both the training/inference cost and generalization ability into account. By incorporating the proposed Conditional MoEs, the recently proposed generalist model Uni-Perceiver can effectively mitigate the interference across tasks and modalities, and achieves state-of-the-art results on a series of downstream tasks via prompt tuning on 1% of downstream data. Moreover, the introduction of Conditional MoEs preserves the ability of generalist models to conduct zero-shot inference on new tasks, e.g., video-text retrieval and video captioning. Code and pre-trained generalist models are publicly released at https://github.com/fundamentalvision/Uni-Perceiver.
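
The idea of routing on a condition rather than on each token alone can be illustrated with a small sketch. The code below is not taken from the released repository and is not the authors' implementation. It assumes a top-k softmax gate whose input is a condition vector (for example, a task-level or modality-level embedding), and it models each expert as a plain linear map. All names (`top_k_gate`, `conditional_moe`, `gate_W`, `experts`) are hypothetical and chosen for illustration only.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def top_k_gate(gate_W, condition, k):
    """Pick the k experts with the highest gate logits.

    The logits depend only on `condition` (an assumed task or modality
    embedding), so every token sharing that condition is routed identically.
    Returns a dict {expert_index: normalized_weight}.
    """
    logits = matvec(gate_W, condition)
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    weights = softmax([logits[i] for i in top])
    return dict(zip(top, weights))


def conditional_moe(token, condition, gate_W, experts, k=2):
    """Weighted sum of the outputs of the selected experts.

    `experts` is a list of matrices; each one stands in for an expert network.
    Only the k selected experts are evaluated, which is what makes the
    layer sparse.
    """
    gates = top_k_gate(gate_W, condition, k)
    out = [0.0] * len(experts[0])
    for idx, weight in gates.items():
        y = matvec(experts[idx], token)
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out, gates


# Illustrative setup: 4 experts, 2-dimensional condition, 2-dimensional tokens.
gate_W = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0], [0.0, 2.0]]
experts = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 0.0], [0.0, 2.0]],
    [[0.0, 1.0], [1.0, 0.0]],
    [[-1.0, 0.0], [0.0, -1.0]],
]
task_a = [1.0, 0.0]  # hypothetical embedding for one task
task_b = [0.0, 1.0]  # hypothetical embedding for another task

out_a, gates_a = conditional_moe([0.5, -0.5], task_a, gate_W, experts, k=2)
out_b, gates_b = conditional_moe([0.5, -0.5], task_b, gate_W, experts, k=2)
```

In this sketch the same token is sent to different experts depending on the condition, which is one way that tasks or modalities could be kept from interfering with each other through shared parameters. Because the gate ignores the token content, the routing decision can be computed once per condition instead of once per token. This is only one possible reading of the trade-off the abstract mentions between training/inference cost and generalization ability.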

Original language: English
Title of host publication: Advances in Neural Information Processing Systems 35 - 36th Conference on Neural Information Processing Systems, NeurIPS 2022
Editors: S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, A. Oh
Publisher: Neural Information Processing Systems Foundation
ISBN (electronic): 9781713871088
Publication status: Published - 2022
Event: 36th Conference on Neural Information Processing Systems, NeurIPS 2022 - New Orleans, United States
Duration: 28 Nov 2022 to 9 Dec 2022

Publication series

Name: Advances in Neural Information Processing Systems
Volume: 35
ISSN (print): 1049-5258

Conference

Conference: 36th Conference on Neural Information Processing Systems, NeurIPS 2022
Country/Territory: United States
City: New Orleans
Period: 28/11/22 to 9/12/22
