EFFICIENT HOUSEHOLDER QR FACTORIZATION FOR SUPERSCALAR PROCESSORS

Citation
Jj. Carrig and Ggl. Meyer, EFFICIENT HOUSEHOLDER QR FACTORIZATION FOR SUPERSCALAR PROCESSORS, ACM Transactions on Mathematical Software, 23(3), 1997, pp. 362-378
Number of citations
15
Journal ISSN
0098-3500
Volume
23
Issue
3
Year of publication
1997
Pages
362 - 378
Database
ISI
SICI code
0098-3500(1997)23:3<362:EHQFFS>2.0.ZU;2-N
Abstract
To extract the potential promised by superscalar processors, algorithm designers must streamline memory references and allow for efficient data reuse throughout the memory hierarchy. Two parameterized Householder QR factorization algorithms are presented that take into account the caches and registers typical of such processors. Guidelines are developed for choosing parameter values that obtain near-optimal cache and register utilization. The new algorithms are implemented and performance-tuned on an Intel Pentium Pro system, a single thin POWER2 node of the IBM Scalable Parallel System 2 (SP2), and a single R8000 processor of a Silicon Graphics POWER Challenge XL.
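
Illustrative sketch
For orientation, the following is a minimal sketch of the textbook, unblocked Householder QR factorization, which is the baseline the abstract refers to. It is not the paper's parameterized cache- and register-aware algorithms, which this record does not describe in detail. The function name householder_qr and the list-of-rows matrix layout are assumptions made for this example only.

```python
import math


def householder_qr(a):
    """Textbook unblocked Householder QR (illustrative, not the paper's method).

    `a` is an m x n matrix stored as a list of rows, with m >= n.
    Returns (q, r) such that a = q * r, q is m x m orthogonal, and
    r is m x n upper triangular up to rounding error.
    """
    m, n = len(a), len(a[0])
    r = [[float(x) for x in row] for row in a]  # working copy, becomes R
    # Q starts as the identity and accumulates the reflections.
    q = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]

    for k in range(min(m - 1, n)):
        # Column k from the diagonal down.
        x = [r[i][k] for i in range(k, m)]
        norm_x = math.sqrt(sum(t * t for t in x))
        if norm_x == 0.0:
            continue  # nothing to eliminate in this column

        # Householder vector v = x - alpha * e1; the sign of alpha is
        # chosen opposite to x[0] to avoid cancellation.
        alpha = -math.copysign(norm_x, x[0])
        v = x[:]
        v[0] -= alpha
        vtv = sum(t * t for t in v)

        # R <- H R on rows k..m-1, where H = I - 2 v v^T / (v^T v).
        for j in range(k, n):
            s = sum(v[i] * r[k + i][j] for i in range(len(v)))
            f = 2.0 * s / vtv
            for i in range(len(v)):
                r[k + i][j] -= f * v[i]

        # Q <- Q H, so that Q = H_1 H_2 ... after the loop.
        for row in q:
            s = sum(row[k + i] * v[i] for i in range(len(v)))
            f = 2.0 * s / vtv
            for i in range(len(v)):
                row[k + i] -= f * v[i]

    return q, r
```

Calling householder_qr([[12.0, -51.0, 4.0], [6.0, 167.0, -68.0], [-4.0, 24.0, -41.0]]) returns factors whose product reproduces the input to within rounding error. The inner loops here stream over whole columns with no blocking, which is the kind of memory-reference pattern the abstract says must be restructured for data reuse in caches and registers.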