Sensitivity of Parallel Applications to Large Differences in Bandwidth and Latency in Two-Layer Interconnects

Bibliographic Details
Main Authors: Aske Plaat, Henri E. Bal, Rutger F. H. Hofman, Thilo Kielmann
Other Authors: The Pennsylvania State University CiteSeerX Archives
Format: Text
Language: English
Published: 1999
Subjects:
Online Access: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.35.7495
http://www.cs.vu.nl/~kielmann/papers/fgcs00.ps.gz
Description
Summary: This paper studies application performance on systems with strongly non-uniform remote memory access. In current-generation NUMAs the speed difference between the slowest and fastest link in an interconnect (the "NUMA gap") is typically less than an order of magnitude, and many conventional parallel programs achieve good performance. We study how different NUMA gaps influence application performance, up to and including typical wide-area latencies and bandwidths. We find that for gaps larger than those of current-generation NUMAs, performance suffers considerably for applications that were designed for a uniform-access interconnect. For many applications, however, performance can be greatly improved with comparatively simple changes: traffic over slow links can be reduced by making communication patterns hierarchical, like the interconnect. We find that in four out of our six applications the size of the gap can be increased by an order of magnitude or more without severely...
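
Illustration: the summary's point about hierarchical communication can be made concrete with a small message count. The sketch below is not taken from the paper. It assumes a two-layer system in which processes are grouped into clusters, links inside a cluster are fast, and links between clusters are slow. The function names and the cluster sizes are hypothetical and chosen only for illustration. It compares a flat broadcast, where the root sends to every process directly, with a hierarchical one, where the root sends once per remote cluster and a local coordinator forwards the data.

```python
def flat_broadcast_slow_msgs(cluster_sizes, root_cluster=0):
    """Count slow-link messages when the root sends to every process directly.

    Every process outside the root's cluster costs one slow-link message.
    """
    return sum(size for i, size in enumerate(cluster_sizes) if i != root_cluster)


def hierarchical_broadcast_slow_msgs(cluster_sizes, root_cluster=0):
    """Count slow-link messages when the root sends once per remote cluster.

    A coordinator in each non-empty remote cluster receives the data over the
    slow link and forwards it to its neighbours over fast local links.
    """
    return sum(
        1 for i, size in enumerate(cluster_sizes) if i != root_cluster and size > 0
    )


# Hypothetical layout: four clusters of eight processes each.
clusters = [8, 8, 8, 8]
flat = flat_broadcast_slow_msgs(clusters)
hier = hierarchical_broadcast_slow_msgs(clusters)
print(flat, hier)  # 24 3
```

Under these assumptions the slow links carry one message per remote cluster instead of one per remote process, which is the kind of traffic reduction the summary attributes to hierarchical communication patterns.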