
IconDM: Text-Guided Icon Set Expansion Using Diffusion Models

  • Jiawei Lin
  • Zhaoyun Jiang
  • Jiaqi Guo
  • Shizhao Sun
  • Ting Liu
  • Zijiang Yang
  • Jian Guang Lou
  • Dongmei Zhang
  • Xi'an Jiaotong University
  • Microsoft USA

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Citations (Scopus)

Abstract

Icons are ubiquitous visual elements in graphic design, yet their creation is often complex and time-consuming. To address this problem, we draw inspiration from the booming text-to-image field and propose Text-Guided Icon Set Expansion, a novel task that helps users design high-quality icons from textual descriptions. In addition, users can control the style consistency of the created icons by providing a few hand-crafted icons as style references. Despite its practicality, the task poses two unique challenges. (i) Abstract Concept Visualization. Abstract concepts such as technology and health are frequently encountered in icon creation, but their visualization is not straightforward and requires a grounding process that translates them into physical, easy-to-depict objects. (ii) Fine-grained Style Transfer. Unlike ordinary images, icons exhibit richer fine-grained stylistic elements, including tones, line widths, shapes, and shadow effects, which places higher demands on capturing and preserving detailed styles during icon generation. To address these challenges, we propose IconDM, a method based on pre-trained text-to-image (T2I) diffusion models. Our approach incorporates a one-time domain adaptation process and an online style transfer process. In domain adaptation, we enhance the existing T2I model's ability to understand abstract concepts by fine-tuning it on high-quality icon-text pairs. To this end, we construct a large-scale dataset, IconBank, containing 2.3 million icon samples, and leverage a state-of-the-art vision-language model to generate a textual description for each icon. In style transfer, we introduce a Style Enhancement Module into the T2I model. It explicitly extracts fine-grained style features from the given reference icons and is jointly optimized with the T2I model during DreamBooth tuning. To assess IconDM, we present IconBench, a structured evaluation suite with 30 icon sets and 100 concepts (including 50 abstract concepts). Quantitative results, qualitative analysis, and extensive ablation studies demonstrate the effectiveness of IconDM.

Original language: English
Title of host publication: MM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia
Publisher: Association for Computing Machinery, Inc
Pages: 156-165
Number of pages: 10
ISBN (Electronic): 9798400706868
DOI
Publication status: Published - 28 Oct 2024
Event: 32nd ACM International Conference on Multimedia, MM 2024 - Melbourne, Australia
Duration: 28 Oct 2024 to 1 Nov 2024

Publication series

Name: MM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia

Conference

Conference: 32nd ACM International Conference on Multimedia, MM 2024
Country/Territory: Australia
City: Melbourne
Period: 28/10/24 to 1/11/24
