REMAP: A Spatiotemporal CNN Accelerator Optimization Methodology and Toolkit Thereof

  • Boran Zhao
  • Tian Xia
  • Haiming Zhai
  • Fulun Ma
  • Yan Du
  • Hanzhi Chang
  • Wenzhe Zhao
  • Pengju Ren

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

Designing convolutional neural network (CNN) accelerators is becoming more difficult owing to the rapidly growing variety of CNN models. Some approaches use a fixed dataflow and microarchitecture, which lowers design complexity. However, these accelerators are difficult to adapt to highly diverse CNN models and often suffer from low processing element utilization. Other accelerators resort to reconfigurable devices, such as field-programmable gate arrays (FPGAs) and coarse-grained reconfigurable arrays, to support flexible dataflows that fit diverse CNN layers. However, layer-by-layer processing may require more energy for frequent reconfiguration and off-chip DDR access. In this work, we introduce a reconfigurable pipeline accelerator (RPA) that can reduce latency and DDR access by pipelining the computation of CNN layers. Although several studies have tried to speed up the design process by automatically exploring a subset of the accelerator design space, finding an automated design tool that can effectively identify the complete and optimal design scheme remains a problem, especially for the novel RPA architecture type. Unfortunately, comprehensive exploration of the whole design space faces an excessively large search space. To tackle this problem, we propose REMAP, a toolkit for designing CNN accelerators based on the Monte Carlo tree search (MCTS) method, together with several methods that improve search efficiency over the huge design space. Evaluations show that REMAP significantly outperforms some state-of-the-art approaches: compared with GAMMA, it achieves an average speedup of 14.75× and an energy reduction of 45.45%; it also achieves a speedup of 32.6× over ConfuciuX on MobileNetV2 and ResNet50. We also present an FPGA accelerator implementation based on REMAP's search result, which achieves high performance in real-time CNN tasks. This indicates that REMAP can provide high-quality design exploration with valuable insights and useful architecture design guidance.

Original language: English
Pages (from-to): 1691-1704
Number of pages: 14
Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Volume: 42
Issue number: 5
State: Published - 1 May 2023

Keywords

  • Convolutional neural network (CNN) accelerators
  • Monte Carlo tree search (MCTS)
  • design toolkit
  • pipeline
