Leveraging the VTA-TVM Hardware-Software Stack for FPGA Acceleration of 8-bit ResNet-18 Inference
In: Proceedings of the 1st on Reproducible Quality-Efficient Systems Tournament on Co-designing Pareto-efficient Deep Learning, 2018-06-20
We present a full-stack design to accelerate deep learning inference with FPGAs. Our contribution is two-fold. At the software layer, we leverage and extend TVM, the end-to-end deep learning optimizing compiler, in order to harness FPGA-based acceleration. At the hardware layer, we present the Versatile Tensor Accelerator (VTA), which provides a generic, modular, and customizable architecture for TPU-like accelerators. We take a ResNet-18 description in MxNet and compile it down to perform 8-bit inference on a 256-PE accelerator implemented on a low-cost Xilinx Zynq FPGA, clocked at 100 MHz. Our full hardware acceleration stack will be made available for the community to reproduce and build upon at http://github.com/uwsaml/vta.
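To put the quoted configuration (256 PEs at 100 MHz) in perspective, the following is a minimal back-of-the-envelope sketch of the implied peak throughput. It assumes that each PE completes one 8-bit multiply-accumulate per cycle and that a multiply-accumulate counts as two operations; the abstract states neither, and the function and parameter names are purely illustrative.

```python
def peak_ops_per_second(num_pes: int, clock_hz: float,
                        ops_per_pe_per_cycle: int = 2) -> float:
    """Theoretical peak throughput of a PE array.

    ASSUMPTION (not from the abstract): each PE finishes one
    multiply-accumulate per cycle, counted as 2 operations.
    """
    return num_pes * clock_hz * ops_per_pe_per_cycle


# Figures taken from the abstract: 256 PEs clocked at 100 MHz.
peak = peak_ops_per_second(num_pes=256, clock_hz=100e6)
print(f"Peak throughput: {peak / 1e9:.1f} GOPS")
```

Under these assumptions the bound is 256 × 100 × 10^6 × 2 operations per second. Sustained ResNet-18 throughput would be lower, since memory transfers and layers that do not map onto the PE array are ignored here.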
Title: | Leveraging the VTA-TVM Hardware-Software Stack for FPGA Acceleration of 8-bit ResNet-18 Inference |
---|---|
Author / Contributor: | Chen, Tianqi ; Ceze, Luis ; Moreau, Thierry |
Published in: | Proceedings of the 1st on Reproducible Quality-Efficient Systems Tournament on Co-designing Pareto-efficient Deep Learning, 2018-06-20 |
Publication: | ACM, 2018 |
Media type: | unknown |
DOI: | 10.1145/3229762.3229766 |