Method for prefetching non-contiguous data structures

Bibliographic Details
Main Authors: Blumrich, Matthias A, Chen, Dong, Coteus, Paul W, Gara, Alan G, Giampapa, Mark E, Heidelberger, Philip, Hoenicke, Dirk, Ohmacht, Martin, Steinmacher-Burow, Burkhard D, Takken, Todd E, Vranas, Pavlos M
Language: unknown
Published: 2023
Subjects:
Online Access: http://www.osti.gov/servlets/purl/988154
https://www.osti.gov/biblio/988154
Description
Summary: Low-latency memory system access is provided for a weakly ordered multiprocessor system. The processors in the multiprocessor share resources, and each shared resource has an associated lock within a locking device that supports synchronization between the processors and orderly sharing of the resources. A processor has permission to access a resource only when it owns the lock associated with that resource. An attempt by a processor to own a lock requires only a single load operation, rather than a traditional atomic load followed by a store: the processor performs only a read operation, and the hardware locking device, rather than the processor, performs the subsequent write operation. A simple prefetching method for non-contiguous data structures is also disclosed. A memory line is redefined so that, in addition to the normal physical memory data, every line includes a pointer large enough to point to any other line in the memory; these pointers, rather than some other predictive algorithm, are used to determine which memory line to prefetch. This enables hardware to effectively prefetch memory access patterns that are non-contiguous but repetitive.
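
The pointer-per-line prefetching idea in the summary can be illustrated with a small software model. This is only a sketch under stated assumptions, not the disclosed hardware design: the names MemoryLine, PointerPrefetchMemory, and run_pattern are hypothetical, the prefetch buffer is assumed to hold a single line, and the rule for setting a pointer (each line remembers the line that was accessed right after it) is an assumption, since the summary does not say how the pointers are written.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class MemoryLine:
    """One memory line: normal data plus a pointer to another line."""
    data: int
    next_line: Optional[int] = None  # hypothetical per-line pointer field


class PointerPrefetchMemory:
    """Toy model of a memory whose lines carry prefetch pointers."""

    def __init__(self, num_lines: int) -> None:
        self.lines: List[MemoryLine] = [MemoryLine(data=i * 10) for i in range(num_lines)]
        self.prefetched: Optional[int] = None     # single-entry prefetch buffer (assumption)
        self.last_accessed: Optional[int] = None  # previously accessed line, if any
        self.hits = 0
        self.misses = 0

    def read(self, index: int) -> int:
        # A read hits if the requested line is the one already prefetched.
        if self.prefetched == index:
            self.hits += 1
        else:
            self.misses += 1

        # Assumed update policy: the previously accessed line now points here.
        if self.last_accessed is not None:
            self.lines[self.last_accessed].next_line = index
        self.last_accessed = index

        # Follow this line's pointer (if set) to choose the next prefetch.
        self.prefetched = self.lines[index].next_line
        return self.lines[index].data


def run_pattern(pattern: List[int], repeats: int, num_lines: int = 8) -> Tuple[int, int]:
    """Replay a non-contiguous access pattern and return (hits, misses)."""
    memory = PointerPrefetchMemory(num_lines)
    for _ in range(repeats):
        for index in pattern:
            memory.read(index)
    return memory.hits, memory.misses


if __name__ == "__main__":
    hits, misses = run_pattern([3, 7, 1, 5], repeats=3)
    print(f"hits={hits} misses={misses}")
```

In this model the first pass over the pattern 3, 7, 1, 5 misses on every access because no pointers are set yet. Later passes hit once the chain of pointers has been recorded, which is the sense in which a non-contiguous but repetitive pattern becomes prefetchable without a stride-based predictor.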