Software-Extended Coherent Shared Memory: Performance and Cost


Bibliographic Details
Other Authors: The Pennsylvania State University CiteSeerX Archives
Format: Text
Language: English
Subjects:
Online Access: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.70.257
http://www4.informatik.uni-erlangen.de/~tsthiel/Papers/alewife-coherence-sm.ps.gz
Description
Summary: Abstract

This paper evaluates the tradeoffs involved in the design of the software-extended memory system of Alewife, a multiprocessor architecture that implements coherent shared memory through a combination of hardware and software mechanisms. For each block of memory, Alewife implements between zero and five coherence directory pointers in hardware and allows software to handle requests when the pointers are exhausted. The software includes a flexible coherence interface that facilitates protocol software implementation. This interface is indispensable for conducting experiments and has proven important for implementing enhancements to the basic system.

Simulations of a number of applications running on a complete system (with up to 256 processors) demonstrate that the hybrid architecture with five pointers achieves between 71% and 100% of full-map directory performance at a constant cost per processing element. Our experience in designing the software protocol interfaces and experiments with a variety of system configurations lead to a detailed understanding of the interaction of the hardware and software components of the system. The results show that a small amount of shared memory hardware provides adequate performance: one-pointer systems reach between 42% and 100% of full-map performance on our parallel benchmarks. A software-only directory architecture with no hardware pointers has lower performance but minimal cost.

1 Introduction

Implementing shared memory for a large-scale multiprocessor requires balancing the performance of the system as a whole with the complexity and cost of its hardware and software components. Shared memory itself helps control the complexity of the application software written for a machine, but it requires an efficient design to achieve this goal. The Alewife architecture [3] uses a combination of hardware and software to provide shared memory at a constant cost per processing node, without sacrificing performance. Following the integrated systems approach, the architecture uses ...
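The abstract's central mechanism, a handful of hardware directory pointers per memory block with software taking over once they are exhausted, can be illustrated with a small toy model. The sketch below is not Alewife's protocol: the class name `SoftwareExtendedDirectory`, its methods, and the trap-counting rule are all hypothetical, chosen only to show how the number of hardware pointers might affect how often software has to be invoked.

```python
class SoftwareExtendedDirectory:
    """Toy directory entry for ONE memory block (illustrative assumption,
    not the Alewife implementation)."""

    def __init__(self, hw_pointers):
        # Hypothetical parameter: hardware pointers per block
        # (the abstract mentions between zero and five).
        self.hw_pointers = hw_pointers
        self.hw_sharers = []      # node ids tracked by hardware pointers
        self.sw_sharers = set()   # node ids tracked by the software handler
        self.software_traps = 0   # times the software handler was invoked

    def read(self, node):
        """Record that `node` holds a read copy of the block."""
        if node in self.hw_sharers or node in self.sw_sharers:
            return  # already tracked, nothing to do
        if len(self.hw_sharers) < self.hw_pointers:
            self.hw_sharers.append(node)   # a hardware pointer is free
        else:
            self.software_traps += 1       # pointers exhausted: software handles it
            self.sw_sharers.add(node)

    def write(self, node):
        """Give `node` exclusive access; return the nodes to invalidate."""
        to_invalidate = sorted((set(self.hw_sharers) | self.sw_sharers) - {node})
        if self.sw_sharers:
            self.software_traps += 1       # software must supply the overflow sharers
        self.hw_sharers.clear()
        self.sw_sharers.clear()
        self.read(node)                    # the writer is now the only tracked copy
        return to_invalidate


if __name__ == "__main__":
    d = SoftwareExtendedDirectory(hw_pointers=2)
    for n in range(4):
        d.read(n)
    print(d.hw_sharers, sorted(d.sw_sharers), d.software_traps)
    print(d.write(0), d.software_traps)
```

In this model, a block shared by four nodes with two hardware pointers traps to software twice on reads and once more on the invalidating write, while a zero-pointer (software-only) entry traps on every new sharer. Real costs depend on handler latency and application sharing patterns, which this sketch does not attempt to capture.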