TUNING LAPACK CODES ON HIERARCHICAL MEMORY MACHINES

Citation
M. Ojanguren et al., TUNING LAPACK CODES ON HIERARCHICAL MEMORY MACHINES, Advances in Engineering Software, 26(1), 1996, pp. 13-18
Number of citations
2
Subject Categories
Computer Application, Chemistry & Engineering; Computer Science Software Graphics Programming
ISSN journal
0965-9978
Volume
26
Issue
1
Year of publication
1996
Pages
13 - 18
Database
ISI
SICI code
0965-9978(1996)26:1<13:TLCOHM>2.0.ZU;2-1
Abstract
As important as the advance of computer technology is the development of suitable software for the exploitation of computer capacity. In this sense LAPACK (Anderson et al., LAPACK User's Guide, release 1.0, SIAM, Philadelphia, 1992) appears as the most efficient library in the dense linear algebra field, obtaining good results on vector and parallel computers, and in general, on hierarchical memory machines. However, some deficiencies in the general matrix LU decomposition were detected in DGETRF subroutine. On hierarchical memory machines not only the cache and TLB faults have to be minimized, but also the page faults, which cause excessive I/O operations. The blocking strategy used by DGETRF (and LAPACK in general) makes good use of the cache memory, but does not seem to be enough to avoid unnecessary I/O operations. Therefore, DGETRF does not provide satisfactory run times for large dimension matrices. In this paper a new code using a double blocking strategy will be described, which attains better run times than DGETRF. Copyright (C) 1996.
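
Illustrative sketch
The abstract does not spell out the double blocking strategy, so the following is only a minimal sketch of the general idea of two-level blocking in a right-looking LU factorization with partial pivoting. It is not the authors' code and not LAPACK's DGETRF. The names lu_panel and lu_factor, the block-width list, and the use of plain Python lists are all assumptions made for illustration. In a tuned code the outer width would presumably be chosen so that a panel stays resident in main memory and the inner width so that a block fits in cache; pure Python does not reproduce that memory behaviour, only the loop structure.

```python
def lu_panel(a, piv, lo, hi, sizes):
    """Factor columns lo..hi-1 (rows lo..n-1) of the square matrix a in place.

    sizes lists block widths from the outermost level to the innermost;
    an empty list means plain unblocked elimination (hypothetical helper).
    """
    n = len(a)
    if not sizes:
        for j in range(lo, hi):
            # Partial pivoting: largest entry in column j on or below the diagonal.
            p = max(range(j, n), key=lambda i: abs(a[i][j]))
            if a[p][j] == 0.0:
                raise ZeroDivisionError("matrix is singular")
            piv[j] = p
            if p != j:
                a[j], a[p] = a[p], a[j]  # swap whole rows
            for i in range(j + 1, n):
                a[i][j] /= a[j][j]       # multiplier, stored in place of L
                l = a[i][j]
                for c in range(j + 1, hi):
                    a[i][c] -= l * a[j][c]
        return
    width, inner = sizes[0], sizes[1:]
    for k in range(lo, hi, width):
        e = min(k + width, hi)
        # Factor the panel, using the next (smaller) block width inside it.
        lu_panel(a, piv, k, e, inner)
        # Block row of U: forward substitution with the unit lower triangle L11.
        for i in range(k, e):
            for t in range(k, i):
                l = a[i][t]
                for c in range(e, hi):
                    a[i][c] -= l * a[t][c]
        # Trailing update A22 -= L21 * U12, restricted to this level's columns.
        for i in range(e, n):
            for t in range(k, e):
                l = a[i][t]
                for c in range(e, hi):
                    a[i][c] -= l * a[t][c]


def lu_factor(matrix, sizes):
    """Return (lu, piv) for a square matrix; the input is left unchanged."""
    a = [[float(x) for x in row] for row in matrix]
    piv = list(range(len(a)))
    lu_panel(a, piv, 0, len(a), list(sizes))
    return a, piv
```

For example, lu_factor(A, [4, 2]) factors A with outer panels of width 4 split into inner blocks of width 2, while lu_factor(A, []) runs the same elimination unblocked. Row j was exchanged with row piv[j] at step j, L is the unit lower triangle of the result, and U is its upper triangle.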