An Empirical Study on Model Pruning and Quantization

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

In machine learning, model compression is vital for resource-constrained Internet of Things (IoT) devices, such as unmanned aerial vehicles (UAVs) and smartphones. Several state-of-the-art (SOTA) compression methods exist, but few studies have evaluated these techniques across different models and datasets. In this paper, we present an in-depth study of two SOTA model compression methods, pruning and quantization. We apply these methods to AlexNet, ResNet18, VGG16BN, and VGG19BN on three well-known datasets: Fashion-MNIST, CIFAR-10, and UCI-HAR. From our study, we conclude that applying pruning and retraining can preserve performance, with only a small average degradation, while substantially reducing model size on spatial-domain datasets (e.g., images); on temporal-domain datasets (e.g., motion-sensor data) the performance degrades more; and the effectiveness of quantization depends on the pruning rate and the network architecture. We also compare different clustering methods and reveal their impact on model accuracy and quantization ratio. Finally, we provide some interesting directions for future research.
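To make the two techniques discussed in the abstract concrete, the following is a minimal sketch of magnitude-based pruning followed by weight-clustering quantization. It is not the authors' exact pipeline: the toy model, the 80% pruning amount, the 16 clusters, and the use of k-means (as one possible clustering method) are all assumptions chosen only for illustration.

```python
# Minimal sketch: magnitude pruning + k-means weight-sharing quantization.
# NOT the paper's exact method; model size, sparsity, and cluster count are assumed.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune
from sklearn.cluster import KMeans

# Toy model standing in for AlexNet/VGG/ResNet used in the paper.
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# 1) Unstructured magnitude pruning: zero out the smallest-magnitude weights.
#    In practice the pruned model would then be retrained to recover accuracy.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.8)  # 80% sparsity (assumed)
        prune.remove(module, "weight")  # make the pruning permanent

# 2) Weight-sharing quantization via clustering: replace each surviving weight
#    with its nearest cluster centroid, so only the centroids and per-weight
#    cluster indices need to be stored.
def cluster_quantize(weight: torch.Tensor, n_clusters: int = 16) -> torch.Tensor:
    w = weight.detach().cpu().numpy().reshape(-1, 1)
    nonzero = w.ravel() != 0  # keep pruned weights at exactly zero
    if nonzero.sum() < n_clusters:
        return weight
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(w[nonzero])
    quantized = w.copy()
    quantized[nonzero] = km.cluster_centers_[km.labels_]
    return torch.from_numpy(quantized.reshape(weight.shape)).to(weight.dtype)

with torch.no_grad():
    for module in model.modules():
        if isinstance(module, nn.Linear):
            module.weight.copy_(cluster_quantize(module.weight, n_clusters=16))

# Report sparsity and the number of distinct weight values per layer.
for name, module in model.named_modules():
    if isinstance(module, nn.Linear):
        w = module.weight
        print(f"{name}: sparsity={float((w == 0).float().mean()):.2f}, "
              f"unique values={w.unique().numel()}")
```

As the abstract notes, the achievable accuracy after such pruning and quantization depends on the pruning rate, the clustering method, and the network architecture; the sketch above only shows the mechanics of the two steps.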

Original language: English
Title of host publication: Broadband Communications, Networks, and Systems - 13th EAI International Conference, BROADNETS 2022, Proceedings
Editors: Wei Wang, Jun Wu
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 111-125
Number of pages: 15
ISBN (Print): 9783031404665
DOIs
State: Published - 2023
Externally published: Yes
Event: Proceedings of the 13th EAI International Conference on Broadband Communications, Networks, and Systems, BROADNETS 2022 - Virtual, Online
Duration: 12 Mar 2023 – 13 Mar 2023

Publication series

Name: Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST
Volume: 511 LNICST
ISSN (Print): 1867-8211
ISSN (Electronic): 1867-822X

Conference

Conference: Proceedings of the 13th EAI International Conference on Broadband Communications, Networks, and Systems, BROADNETS 2022
City: Virtual, Online
Period: 12/03/23 – 13/03/23

Keywords

  • Deep neural network
  • Edge computing
  • Model compression
