Exploiting Variable Precision Computation Array for Scalable Neural Network Accelerators

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In this paper, we present a flexible Variable Precision Computation Array (VPCA) component for different accelerators. It combines a sparsification scheme for activations with a low-bit serial-parallel computation unit to improve the efficiency and resiliency of accelerators. The VPCA can dynamically decompose the activation/weight width (from 32 bits down to 3 bits, depending on the accelerator) into 2-bit serial computation units, and these 2-bit units can be combined in parallel for high throughput. We also propose an on-the-fly compression and calculation strategy, SLE-CLC (single lane encoding, cross lane calculation), which further improves the performance of 2-bit parallel computing. Experimental results on image classification datasets show that VPCA outperforms DaDianNao, Stripes, and Loom-2bit by 4.67×, 2.42×, and 1.52×, respectively, on convolution layers without additional overhead.
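The idea of decomposing an operand into 2-bit serial units can be illustrated with a small software sketch. This is not the authors' hardware design: it is a generic shift-and-accumulate model that assumes unsigned integer operands, and the function names (`split_2bit`, `serial_multiply`, `dot_product`) are hypothetical. It does not model the activation sparsification or the SLE-CLC strategy.

```python
def split_2bit(value, width):
    """Split an unsigned `width`-bit integer into 2-bit digits, least significant first."""
    if value < 0 or value >= (1 << width):
        raise ValueError("value does not fit in the given width")
    n_digits = (width + 1) // 2  # ceil(width / 2) serial steps
    return [(value >> (2 * i)) & 0b11 for i in range(n_digits)]


def serial_multiply(activation, weight, width):
    """Multiply by consuming the activation two bits per step (shift and accumulate)."""
    acc = 0
    for step, digit in enumerate(split_2bit(activation, width)):
        # Each step multiplies the weight by a 2-bit digit, then shifts by 2 * step.
        acc += (digit * weight) << (2 * step)
    return acc


def dot_product(activations, weights, width):
    """Sum of 2-bit-serial products; in hardware the lanes would run in parallel."""
    return sum(serial_multiply(a, w, width) for a, w in zip(activations, weights))
```

In this model the number of serial steps is ceil(width / 2), so a 4-bit activation needs two steps while a 32-bit activation needs sixteen, which is one way to see why a reduced precision can translate into a shorter computation.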

Original language: English
Title of host publication: Proceedings - 2020 IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 315-319
Number of pages: 5
ISBN (Electronic): 9781728149226
State: Published - Aug 2020
Event: 2020 IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2020 - Genova, Italy
Duration: 31 Aug 2020 to 2 Sep 2020

Publication series

Name: Proceedings - 2020 IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2020

Conference

Conference: 2020 IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2020
Country/Territory: Italy
City: Genova
Period: 31/08/20 to 2/09/20

Keywords

  • Accelerator
  • Deep Neural Networks
  • Dynamic Quantization
  • Energy Efficiency Computing Array
  • Resiliency
