Accelerating AI performance with the incorporation of TVM and MediaTek NeuroPilot.
In: Connection Science, vol. 35 (2023-03-01), no. 1, pp. 1-27
Online
academicJournal
The continuing prominence of machine learning has led to an increased focus on enhancing the inference performance of edge devices to reduce latency and improve efficiency. Two widely adopted strategies for accelerating computational performance are quantisation and the utilisation of AI hardware accelerators. Each type of accelerator or inference engine offers distinct advantages, with accelerators primarily designed to optimise neural network operations. In this paper, we present an innovative method for integrating TVM's quantisation flow with the MediaTek NeuroPilot AI accelerator. We outline the process of converting the TVM Relay intermediate-representation quantised neural network dialect model to a tensor-oriented quantisation format, with the aim of harnessing the full potential of both TVM and MediaTek NeuroPilot. This integration enables more efficient neural network inference while preserving the accuracy of the results. We assessed the effectiveness of our proposed integration by conducting a series of experiments and comparing the performance of our approach with that of TVM equipped with an autotuning mechanism. The findings indicate that our approach substantially outperforms TVM in both floating-point model inference and quantised model inference, with inference speedups of up to 11× and up to 70×, respectively. These results underscore the potential of our approach in accelerating AI performance across a diverse range of applications and edge devices. Moreover, a key contribution of our work is providing a valuable practical method for other hardware companies interested in integrating TVM with their own accelerators to achieve performance gains. [ABSTRACT FROM AUTHOR]
Copyright of Connection Science is the property of Taylor & Francis Ltd and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Title: | Accelerating AI performance with the incorporation of TVM and MediaTek NeuroPilot. |
---|---|
Author / Contributor: | Lee, Chao-Lin ; Chung, Chun-Ping ; Cheng, Sheng-Yuan ; Lee, Jenq-Kuen ; Lai, Robert |
Journal: | Connection Science, vol. 35 (2023-03-01), no. 1, pp. 1-27 |
Published: | 2023 |
Media type: | academicJournal |
ISSN: | 0954-0091 (print) |
DOI: | 10.1080/09540091.2023.2272586 |