ZENG Huanqiang, HU Haolin, LIN Xiangwei, HOU Junhui, CAI Canhui. Deep Neural Network Compression and Acceleration: An Overview[J]. JOURNAL OF SIGNAL PROCESSING, 2022, 38(1): 183-194. DOI: 10.16798/j.issn.1003-0530.2022.01.021

Deep Neural Network Compression and Acceleration: An Overview

In recent years, with the rapid improvement of graphics processing unit (GPU) performance, deep neural networks (DNNs) have achieved great success in many artificial intelligence tasks. However, mainstream deep learning models suffer from high computational complexity, large memory consumption, and long inference latency, which makes them difficult to deploy on mobile devices with limited computing resources or in applications with strict latency requirements. Therefore, compressing and accelerating DNNs to obtain lightweight models, while maintaining model accuracy, has gradually attracted considerable attention from both academia and industry. This paper reviews recent DNN compression and acceleration techniques, which can be divided into four categories: quantization, model pruning, lightweight convolution kernel design, and knowledge distillation. For each category, this paper first analyzes the current state of development and existing shortcomings. It then summarizes performance evaluation methods for model compression and acceleration. Finally, the challenges in this field and possible future research directions are discussed.
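To make the first category concrete, the sketch below illustrates one common form of quantization mentioned in the abstract: symmetric uniform post-training quantization of a weight tensor to 8-bit integers. This is a minimal illustrative example using NumPy, not the specific method surveyed in the paper; the function names and bit-width choice are assumptions for demonstration.

```python
import numpy as np

def quantize_uniform(w, num_bits=8):
    """Symmetric uniform quantization of a float tensor to signed integers.

    Maps the float range [-max|w|, +max|w|] onto the signed integer grid,
    storing each weight in num_bits instead of 32 bits.
    """
    qmax = 2 ** (num_bits - 1) - 1          # e.g. 127 for 8-bit
    scale = np.max(np.abs(w)) / qmax        # one scale factor per tensor
    q = np.round(w / scale).astype(np.int8) # integer codes
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float tensor from integer codes."""
    return q.astype(np.float32) * scale

# A toy 4x4 "weight matrix": storage shrinks 4x (int8 vs float32),
# at the cost of a bounded rounding error of at most scale / 2 per weight.
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)
q, scale = quantize_uniform(w)
w_hat = dequantize(q, scale)
max_err = float(np.max(np.abs(w - w_hat)))
```

In practice, inference frameworks also quantize activations and fold the scale factors into the convolution, so the whole network runs in integer arithmetic; the principle, trading a small, bounded approximation error for lower memory and faster integer compute, is the same as in this sketch.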
