Algorithm 784: GEMM-based level 3 BLAS: Portability and optimization issues

Citation
B. Kagstrom et al., Algorithm 784: GEMM-based level 3 BLAS: Portability and optimization issues, ACM T MATH, 24(3), 1998, pp. 303-316
Number of citations
4
Subject Categories
Computer Science & Engineering
Journal title
ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE
Journal ISSN
0098-3500
Volume
24
Issue
3
Year of publication
1998
Pages
303 - 316
Database
ISI
SICI code
0098-3500(199809)24:3<303:A7GL3B>2.0.ZU;2-O
Abstract
This companion article discusses portability and optimization issues of the GEMM-based level 3 BLAS model implementations and the performance evaluation benchmark. All software comes in all four data types (single- and double-precision, real and complex) and is designed to be easy to implement and use on different platforms. Each of the GEMM-based routines has a few machine-dependent parameters that specify internal block sizes, cache characteristics, and branch points for alternative code sections. These parameters provide a means of adjusting to the characteristics of a memory hierarchy.
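
The role of a machine-dependent block-size parameter can be illustrated with a minimal sketch of a blocked matrix multiply. This is not the authors' code: the function name `gemm_blocked` and the parameter `block` are hypothetical, chosen only to show how a single tunable value changes the loop structure without changing the result.

```python
def gemm_blocked(a, b, c, alpha=1.0, beta=1.0, block=2):
    """Compute C <- alpha*A*B + beta*C on lists of lists, tile by tile.

    `block` is a hypothetical tuning parameter standing in for the
    machine-dependent block sizes mentioned in the abstract.
    """
    m, k, n = len(a), len(b), len(b[0])

    # Scale C by beta once, before any products are accumulated.
    for i in range(m):
        for j in range(n):
            c[i][j] *= beta

    # Walk over tiles of C, and over tiles of the inner dimension,
    # so that each step touches only block-sized pieces of A, B and C.
    for ii in range(0, m, block):
        for jj in range(0, n, block):
            for kk in range(0, k, block):
                for i in range(ii, min(ii + block, m)):
                    for j in range(jj, min(jj + block, n)):
                        s = 0.0
                        for p in range(kk, min(kk + block, k)):
                            s += a[i][p] * b[p][j]
                        c[i][j] += alpha * s
    return c


if __name__ == "__main__":
    a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    b = [[7.0, 8.0], [9.0, 10.0], [11.0, 12.0]]
    # The block size only affects traversal order, not the result.
    for bs in (1, 2, 5):
        print(bs, gemm_blocked(a, b, [[0.0, 0.0], [0.0, 0.0]], block=bs))
```

In a tuned implementation the block size would be picked so that the tiles in use fit in a given level of cache; here it is an ordinary argument so the effect of changing it is easy to try.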