Article Information
Implementing Cost-Effective CNNs through INT8 Quantization Aware Training on Embedded Systems
Abstract
English
The rising popularity of intelligent embedded systems, coupled with the substantial computational and memory requirements of convolutional neural networks (CNNs), necessitates cost-effective on-device model inference. Various post-training optimization techniques are used to reduce model size and precision bits; however, these techniques often cause a significant drop in accuracy. To address this, we propose a quantization-aware training (QAT) strategy that optimizes CNNs to low-bit integers, yielding faster inference and lower memory usage. We inject fake-quantization modules into the original architecture, train the model in full precision, and then convert it to 8-bit integer (INT8) form. The resulting QAT model performs all computations of the convolution, activation, and batch-normalization layers in INT8. Our method reduces the size of ResNet50 and ResNet101 by a factor of 3.9x and improves inference speed by more than 2x. We evaluate the models on the CIFAR-10 and CIFAR-100 datasets.
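The workflow summarized above (inject fake-quantization modules, train in full precision, then convert to INT8) can be sketched with PyTorch's eager-mode QAT API. This is a minimal illustration under assumed tooling (torchvision's quantizable ResNet50 and the fbgemm backend, recent PyTorch/torchvision versions), not the authors' exact implementation.

```python
import torch
from torch.ao import quantization
from torchvision.models.quantization import resnet50

# Quantizable ResNet50 wraps the network with QuantStub/DeQuantStub.
# (num_classes=10 is an assumption matching CIFAR-10.)
model = resnet50(weights=None, quantize=False, num_classes=10)
model.train()

# Fuse Conv+BN(+ReLU) so each folded block is simulated as one INT8 op.
model.fuse_model(is_qat=True)

# Attach fake-quantization observers for 8-bit weights and activations.
model.qconfig = quantization.get_default_qat_qconfig("fbgemm")
quantization.prepare_qat(model, inplace=True)

# ... ordinary full-precision training loop goes here; the fake-quant
#     modules learn per-tensor scales and zero-points during training ...

# After training, convert the fake-quantized graph into a true INT8 model.
model.eval()
model_int8 = quantization.convert(model)

# INT8 inference on a dummy input.
with torch.no_grad():
    out = model_int8(torch.randn(1, 3, 224, 224))
```

In this sketch the conversion step replaces the fake-quantization wrappers with actual INT8 kernels, which is where the reported model-size reduction and inference speedup would come from.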
Table of Contents
1. Introduction
2. Methods
2.1. Dataset
2.2. Experiment Setup
3. Experiment Results
3.1. Analysis and Future Refinement
4. Conclusions
Acknowledgement
References
