
Direct Conversion: Accelerating Convolutional Neural Networks Utilizing Sparse Input Activation

Author
정기석
Keywords
convolutional neural network; sparsity-aware acceleration; embedded system
Issue Date
2020-11
Publisher
IEEE
Citation
IECON 2020 - The 46th Annual Conference of the IEEE Industrial Electronics Society, pp. 441-446
Abstract
The amount of computation and the number of parameters of neural networks are increasing rapidly as convolutional neural networks (CNNs) grow deeper. Therefore, it is crucial to reduce both the amount of computation and the memory usage. Pruning, which compresses a neural network, has been actively studied. Depending on the layer characteristics, the sparsity level of each layer varies significantly after pruning is conducted. If weights are sparse, most results of convolution operations will be zeros. Although several studies have proposed methods that exploit weight sparsity to avoid carrying out meaningless operations, those studies do not consider that input activations may also have a high sparsity level. The Rectified Linear Unit (ReLU) is one of the most popular activation functions because it is simple yet effective. Due to the properties of the ReLU function, the input activation sparsity level is often high (up to 85%). Therefore, it is important to consider both the input activation sparsity and the weight sparsity to accelerate CNNs by minimizing meaningless computation. In this paper, we propose a new acceleration method called Direct Conversion that considers weight sparsity under the sparse input activation condition. The Direct Conversion method converts a 3D input tensor directly into a compressed format. It selectively applies one of two methods: image to Compressed Sparse Row (im2CSR) when input activations are sparse and weights are dense, and image to Compressed Sparse Overlapped Activations (im2CSOA) when both input activations and weights are sparse. Our experimental results show that Direct Conversion improves the inference speed by up to 2.82× compared to the conventional method.
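The Compressed Sparse Row (CSR) layout that im2CSR builds on can be illustrated with a minimal Python sketch. This is not the authors' im2CSR implementation (which converts a 3D input tensor directly into the compressed form); it only shows, for a single 2D activation map, how CSR stores just the nonzero ReLU outputs so that downstream multiply-accumulate work can skip the zeros. The function name `dense_to_csr` and the sample data are illustrative assumptions.

```python
import numpy as np

def dense_to_csr(act):
    """Compress a 2D activation map into CSR arrays (illustrative sketch).

    Returns (values, col_idx, row_ptr): only nonzero activations are kept,
    with row_ptr[i]:row_ptr[i+1] delimiting row i's entries in values/col_idx.
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in act:
        for j, v in enumerate(row):
            if v != 0:            # skip the zeros that ReLU produces
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return np.array(values), np.array(col_idx), np.array(row_ptr)

# A mostly-zero map, as is typical after ReLU
act = np.array([[0, 3, 0, 0],
                [0, 0, 0, 5],
                [1, 0, 0, 0]])
vals, cols, ptrs = dense_to_csr(act)
print(vals)   # [3 5 1]
print(cols)   # [1 3 0]
print(ptrs)   # [0 1 2 3]
```

With 3 nonzeros out of 12 entries, convolution kernels iterating over `values` touch only a quarter of the data; the paper's contribution is producing such a compressed form directly from the input tensor, without first materializing a dense intermediate.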
URI
https://ieeexplore.ieee.org/document/9254473
https://repository.hanyang.ac.kr/handle/20.500.11754/172630
ISBN
978-1-7281-5414-5; 978-1-7281-5413-8
ISSN
2577-1647; 1553-572X
DOI
10.1109/IECON43393.2020.9254473
Appears in Collections:
COLLEGE OF ENGINEERING[S](공과대학) > ELECTRONIC ENGINEERING(융합전자공학부) > Articles
Files in This Item:
There are no files associated with this item.
