File(s) under permanent embargo
Optimization of fast Fourier transforms on the Blue Gene/L Supercomputer
conference contribution
posted on 2023-05-23, 09:14 authored by Sabharwal, Y, Saurabh Garg, Garg, R, Gunnels, JA, Sahoo, RK
We analyze the bottlenecks in the parallel FFT algorithm and describe optimizations carried out for the algorithm on the Blue Gene/L Supercomputer. We identified three avenues for improving the performance of the algorithm: single-node FFT performance, Alltoall collective performance, and overlap of computation and communication. Performance at all these levels has been optimized using the double-hummer intrinsics of the Blue Gene/L CPU, careful ordering and synchronization of messages in Alltoall communications, and suitable interleaving of message exchanges with computations. Using these optimizations, we obtained a 20% performance improvement over the baseline version on the 64-rack Blue Gene/L system. We give a brief overview of the Alltoall optimizations, describe our computation-communication overlap strategy, and present results for strong scaling and weak scaling of parallel FFT on Blue Gene/L. We also discuss the fundamental limits to scaling of the parallel transpose algorithm for computing FFT.
History
Publication title
Lecture Notes in Computer Science Volume 5374: HiPC 2008
Editors
P Sadayappan, M Parashar, R Badrinath, VK Prasanna
Pagination
309-322
ISBN
978-3-540-89893-1
Department/School
School of Information and Communication Technology
Publisher
Springer-Verlag
Place of publication
Germany
Event title
15th International Conference on High Performance Computing 2008
Event Venue
Bangalore, India
Date of Event (Start Date)
2008-12-17
Date of Event (End Date)
2008-12-20
Rights statement
Copyright 2008 Springer
Repository Status
- Restricted