面向多功能张量加速器的细粒度结构化稀疏设计

Translated title of the contribution: Fine-Grained Structured Sparse Design for Versatile Tensor Accelerator
  • Huazheng Zhao
  • Shanmin Pang
  • Yinghai Zhao
  • Gaohui Hua
  • Chenyang Li
  • Zhansheng Duan
  • Kuizhi Mei

Research output: Contribution to journal › Article › peer-review

Abstract

To address the compatibility gap between model compression algorithms and the versatile tensor accelerator (VTA), this work proposes an adaptive fine-grained structured sparse design tailored to the accelerator, built by enhancing the classical YOLObile block-wise pruning method, and evaluates its performance. To match the multi-dimensional loop unrolling of VTA, the model's weight tensors are divided into 32×32 blocks. The approach integrates temporal distillation and spatial distillation to align multidimensional features. A single-stage iterative training method refines the computation of the original alternating direction method of multipliers (ADMM) algorithm, improving deployed model accuracy while reducing training cost. An adaptive layer pruning rate module dynamically allocates the total pruning rate across layers, enabling end-to-end automated pruning. Experimental results show that the improved method reduces floating-point computation by approximately 2.4% and improves the accuracy of compressed models on tasks such as image classification and object detection, with a maximum gain of 2.6%. The method offers an efficient and lightweight software solution for the sparse deployment of deep learning models on VTAs.
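As an illustration of the 32×32 block partitioning described in the abstract, the following is a minimal sketch of block-wise column pruning on a 2-D weight matrix. Only the block size comes from the abstract. The function name `block_column_prune`, the L2-norm selection criterion, and the fixed per-block pruning rate are assumptions made for this example; the paper's ADMM-based training, distillation, and adaptive per-layer rate allocation are not reproduced here.

```python
import numpy as np

BLOCK = 32  # block edge length, taken from the abstract


def block_column_prune(weight, prune_rate, block=BLOCK):
    """Zero the lowest-norm columns inside each block x block tile.

    Hypothetical helper: the selection rule (per-tile column L2 norm)
    is an assumption for illustration, not the paper's exact criterion.
    """
    rows, cols = weight.shape
    if rows % block or cols % block:
        raise ValueError("this sketch assumes dimensions divisible by the block size")
    # View the matrix as a grid of tiles:
    # (row_tiles, rows_in_tile, col_tiles, cols_in_tile).
    tiles = weight.reshape(rows // block, block, cols // block, block)
    # L2 norm of every column within every tile: (row_tiles, col_tiles, block).
    norms = np.sqrt((tiles ** 2).sum(axis=1))
    n_prune = int(round(prune_rate * block))
    # Indices of the n_prune weakest columns in each tile.
    weakest = np.argsort(norms, axis=-1)[..., :n_prune]
    keep = np.ones_like(norms, dtype=bool)
    np.put_along_axis(keep, weakest, False, axis=-1)
    # Broadcast the per-column mask over the rows of each tile.
    mask = np.broadcast_to(keep[:, None, :, :], tiles.shape)
    return (tiles * mask).reshape(rows, cols)


# Usage: prune half of the columns in every 32x32 tile of a 64x96 matrix.
rng = np.random.default_rng(0)
w = rng.standard_normal((64, 96))
pruned = block_column_prune(w, prune_rate=0.5)
```

Because whole columns are removed within fixed-size tiles, the surviving weights keep a regular layout. That regularity is the general motivation for structured sparsity on hardware that processes tensors tile by tile.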

Original language: Chinese (Traditional)
Pages (from-to): 176-184
Number of pages: 9
Journal: Hsi-An Chiao Tung Ta Hsueh/Journal of Xi'an Jiaotong University
Volume: 58
Issue number: 11
DOIs
State: Published - Nov 2024
