Vector-Parallel Algorithms for 1-Dimensional Fast Fourier Transform

Bibliographic Details
Published in: New Horizons of Parallel and Distributed Computing, pp. 53-66
Main Authors: Yamamoto, Yusaku; Kawamura, Hiroki; Igai, Mitsuyoshi
Format: Book Chapter
Language: English
Published: Boston, MA, Springer US

Summary: We review 1-dimensional FFT algorithms for distributed-memory machines with vector processing nodes. To attain high performance on this type of machine, one has to achieve both high single-processor performance and high parallel efficiency at the same time. We explain a general framework for designing 1-D FFT algorithms based on a 3-dimensional representation of the data that can satisfy both of these requirements. Among the many algorithms derived from this framework, two variants are shown to be optimal from the viewpoint of both parallel performance and usability. We also introduce several ideas that further improve performance and the flexibility of the user interface. Numerical experiments on the Hitachi SR2201, a distributed-memory parallel machine with pseudo-vector processing nodes, show that our program can attain 48% of the peak performance when computing the FFT of 2^26 points using 64 nodes.
Bibliography: This work was done while the author was at the Central Research Laboratory, Hitachi Ltd.
ISBN: 0387244344, 9780387244341
DOI: 10.1007/0-387-28967-4_4
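
The summary refers to a 3-dimensional representation of the data for a 1-D FFT. The sketch below illustrates that general idea only: it is a generic three-factor Cooley-Tukey decomposition, not the specific variants or the data distribution described in the chapter. The function names `dft` and `fft_3d_view`, the index mappings, and the naive per-axis transform are all assumptions made for illustration; a vector-parallel implementation would replace each axis transform with a tuned kernel and distribute the axes across nodes.

```python
import cmath


def dft(seq):
    """Naive O(n^2) DFT of a short sequence (stand-in for a tuned kernel)."""
    n = len(seq)
    return [sum(seq[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]


def fft_3d_view(x, n1, n2, n3):
    """1-D DFT of x (length n1*n2*n3) computed through a 3-D view of the data.

    Assumed index mappings (illustrative, not taken from the chapter):
      input  j = j1 + n1*j2 + n1*n2*j3
      output k = k3 + n3*k2 + n2*n3*k1
    """
    n = n1 * n2 * n3
    if len(x) != n:
        raise ValueError("len(x) must equal n1*n2*n3")

    def w(num, den):
        # Twiddle factor exp(-2*pi*i*num/den).
        return cmath.exp(-2j * cmath.pi * num / den)

    # View x as a[j1][j2][j3].
    a = [[[x[j1 + n1 * j2 + n1 * n2 * j3] for j3 in range(n3)]
          for j2 in range(n2)] for j1 in range(n1)]

    # Step 1: transform along j3 -> k3, then twiddle by w(j2*k3, n2*n3).
    for j1 in range(n1):
        for j2 in range(n2):
            row = dft(a[j1][j2])
            a[j1][j2] = [row[k3] * w(j2 * k3, n2 * n3) for k3 in range(n3)]

    # Step 2: transform along j2 -> k2, then twiddle by w(j1*(k3 + n3*k2), n).
    for j1 in range(n1):
        for k3 in range(n3):
            col = dft([a[j1][j2][k3] for j2 in range(n2)])
            for k2 in range(n2):
                a[j1][k2][k3] = col[k2] * w(j1 * (k3 + n3 * k2), n)

    # Step 3: transform along j1 -> k1 and scatter to the output index k.
    out = [0j] * n
    for k2 in range(n2):
        for k3 in range(n3):
            col = dft([a[j1][k2][k3] for j1 in range(n1)])
            for k1 in range(n1):
                out[k3 + n3 * k2 + n2 * n3 * k1] = col[k1]
    return out
```

In this sketch each of the three steps consists of many independent short transforms along one axis, which is the property that makes such a layout attractive for vectorization and for spreading work across nodes; how the axes are actually assigned to nodes and vector loops is the subject of the chapter and is not reproduced here.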