Universitas Hasanuddin
Research output: Contribution to journal, Article, peer-review

Design of Quantized Deep Neural Network Hardware Inference Accelerator Using Systolic Architecture

Rifqie D.M.

Journal of Applied Science Engineering Technology and Education

Published: 2024

Abstract

This paper presents a hardware inference accelerator architecture for quantized deep neural networks (DNNs). The proposed accelerator implements all computations in a quantized DNN, including linear transformations such as matrix multiplication, nonlinear activation functions such as ReLU, and the quantization and dequantization operations. The accelerator consists of a matrix multiplication core, implemented as a systolic array, and a QDR core that performs quantization, dequantization, and ReLU. The architecture is implemented in Verilog hardware description language (HDL) using ModelSim. To validate the design, we simulated the quantized DNN in Python and compared the results with those of the proposed hardware accelerator. The comparison shows only a very slight difference, confirming the validity of the quantized DNN hardware accelerator.
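The data path described in the abstract can be illustrated with a small Python reference model. This is a minimal sketch, not the paper's implementation: the symmetric signed 8-bit scheme, the output-stationary dataflow, and every name (`quantize`, `dequantize`, `relu`, `systolic_matmul`, `quantized_layer`, the scale parameters `sx`, `sw`, `sy`) are assumptions made for illustration, since the abstract does not specify bit widths, scales, or the array's dataflow.

```python
# Hypothetical reference model of a quantized layer. The quantization scheme
# (symmetric, signed 8-bit) and the systolic dataflow (output-stationary)
# are assumptions; the abstract does not state either.

def quantize(x, scale, zero_point=0, qmin=-128, qmax=127):
    """Map a real value to an integer and clamp it to the signed 8-bit range."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))

def dequantize(q, scale, zero_point=0):
    """Map an integer back to an approximate real value."""
    return (q - zero_point) * scale

def relu(x):
    return x if x > 0 else 0

def systolic_matmul(a, b):
    """Cycle-by-cycle model of an output-stationary systolic array for a @ b.

    Processing element (i, j) owns accumulator acc[i][j]. Row i of `a` enters
    from the left delayed by i cycles and column j of `b` enters from the top
    delayed by j cycles, so operand pair s reaches PE (i, j) at cycle s + i + j.
    """
    n, k, m = len(a), len(b), len(b[0])
    acc = [[0] * m for _ in range(n)]
    for t in range(k + n + m - 2):          # cycles until the last PE finishes
        for i in range(n):
            for j in range(m):
                s = t - i - j               # operand index arriving this cycle
                if 0 <= s < k:
                    acc[i][j] += a[i][s] * b[s][j]
    return acc

def quantized_layer(x, w, sx, sw, sy):
    """Quantize inputs, multiply in integers, then dequantize, ReLU, requantize."""
    xq = [[quantize(v, sx) for v in row] for row in x]
    wq = [[quantize(v, sw) for v in row] for row in w]
    acc = systolic_matmul(xq, wq)           # integer accumulators
    # Each accumulator represents a real value of roughly acc * sx * sw.
    return [[quantize(relu(dequantize(v, sx * sw)), sy) for v in row]
            for row in acc]

if __name__ == "__main__":
    x = [[1.0, 2.0]]
    w = [[0.5], [0.25]]
    yq = quantized_layer(x, w, sx=0.5, sw=0.25, sy=0.5)
    y_float = [[relu(sum(xv * w[s][0] for s, xv in enumerate(x[0])))]]
    print("quantized:", [[dequantize(v, 0.5) for v in row] for row in yq])
    print("float:    ", y_float)
```

Comparing the dequantized output against a floating-point result, as the `__main__` block does, mirrors the validation approach the abstract describes; any gap between the two comes from rounding and clamping in `quantize`.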

Access to Document

10.35877/454RI.asci2689

Fingerprint

Inference (Sciences)
Computer science (Sciences)
Systolic array (Sciences)
Architecture (Sciences)
Computer architecture (Sciences)
Artificial neural network (Sciences)
Hardware acceleration (Sciences)
Computer hardware (Sciences)
Embedded system (Sciences)
Artificial intelligence (Sciences)
Field-programmable gate array (Sciences)
Very-large-scale integration (Sciences)
Art (Sciences)
Visual arts (Sciences)