Algorithm 1039: Automatic Generators for a Family of Matrix Multiplication Routines with Apache TVM.
In: ACM Transactions on Mathematical Software, vol. 50 (2024-03-01), no. 1, pp. 1-34
We explore the use of the open-source Apache TVM framework to automatically generate a family of algorithms that follow the approach taken by popular linear algebra libraries, such as GotoBLAS2, BLIS, and OpenBLAS, to obtain high-performance blocked formulations of the general matrix multiplication (gemm). In addition, we fully automate the generation process by also leveraging Apache TVM to derive a complete variety of processor-specific micro-kernels for gemm. This contrasts with the convention in high-performance libraries, which hand-encode a single micro-kernel per architecture in assembly code. Overall, the combination of our TVM-generated blocked algorithms and micro-kernels for gemm (1) improves portability and maintainability and streamlines the software life cycle; (2) provides the flexibility to tailor and optimize the solution to different data types, processor architectures, and matrix operand shapes, yielding performance on a par with (or, for specific matrix shapes, even superior to) that of hand-tuned libraries; and (3) features a small memory footprint. [ABSTRACT FROM AUTHOR]
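As a rough illustration of what a "blocked formulation" with a micro-kernel means, the following pure-Python sketch uses one commonly described loop ordering from the GotoBLAS/BLIS line of work: five blocking loops around a small tile update. It is not the article's TVM generator. The function names and the block-size parameters `mc`, `nc`, `kc`, `mr`, `nr` are assumptions chosen for this example, and the packing of operands into contiguous buffers that real libraries perform is omitted.

```python
def micro_kernel(A, B, C, i0, j0, p0, mr, nr, kc):
    """Update the mr x nr tile of C at (i0, j0) with a kc-deep partial product."""
    for p in range(p0, p0 + kc):
        for i in range(i0, i0 + mr):
            a = A[i][p]
            for j in range(j0, j0 + nr):
                C[i][j] += a * B[p][j]


def blocked_gemm(A, B, C, mc=4, nc=4, kc=4, mr=2, nr=2):
    """Compute C += A * B on lists of lists (hypothetical block sizes).

    Operand packing is deliberately left out to keep the sketch short.
    """
    m, k, n = len(A), len(B), len(B[0])
    for jc in range(0, n, nc):                  # loop 1: column panels of B and C
        j_end = min(jc + nc, n)
        for pc in range(0, k, kc):              # loop 2: slices of the inner dimension
            depth = min(kc, k - pc)
            for ic in range(0, m, mc):          # loop 3: row panels of A and C
                i_end = min(ic + mc, m)
                for jr in range(jc, j_end, nr):         # loop 4: tile columns
                    for ir in range(ic, i_end, mr):     # loop 5: tile rows
                        micro_kernel(A, B, C, ir, jr, pc,
                                     min(mr, i_end - ir),  # clip edge tiles
                                     min(nr, j_end - jr),
                                     depth)
    return C
```

In an optimized library only the micro-kernel is architecture-specific, which is why the abstract singles it out as the part that is conventionally written by hand.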
Title: | Algorithm 1039: Automatic Generators for a Family of Matrix Multiplication Routines with Apache TVM. |
---|---|
Author / Contributor: | ALAEJOS, GUILLERMO ; CASTELLÓ, ADRIÁN ; ALONSO-JORDÁ, PEDRO ; IGUAL, FRANCISCO D. ; MARTÍNEZ, HÉCTOR ; QUINTANA-ORTÍ, ENRIQUE S. |
Journal: | ACM Transactions on Mathematical Software, vol. 50 (2024-03-01), no. 1, pp. 1-34 |
Published: | 2024 |
Media type: | academicJournal |
ISSN: | 0098-3500 (print) |
DOI: | 10.1145/3638532 |