Performance evaluation of the Orca shared-object system

Bibliographic Details
Published in: ACM Transactions on Computer Systems
Main Authors: Bal, Henri E.; Bhoedjang, Raoul; Hofman, Rutger; Jacobs, Ceriel; Langendoen, Koen; Rühl, Tim; Kaashoek, M. Frans
Format: Article in Journal/Newspaper
Language: English
Published: Association for Computing Machinery (ACM) 1998
Online Access: http://dx.doi.org/10.1145/273011.273014
https://dl.acm.org/doi/pdf/10.1145/273011.273014
Description
Summary: Orca is a portable, object-based distributed shared memory (DSM) system. This article studies and evaluates the design choices made in the Orca system and compares Orca with other DSMs. The article gives a quantitative analysis of Orca's coherence protocol (based on write-updates with function shipping), the totally ordered group communication protocol, the strategy for object placement, and the all-software, user-space architecture. Performance measurements for 10 parallel applications illustrate the trade-offs made in the design of Orca and show that essentially the right design decisions have been made. A write-update protocol with function shipping is effective for Orca, especially since it is used in combination with techniques that avoid replicating objects that have a low read/write ratio. The overhead of totally ordered group communication on application performance is low. The Orca system is able to make near-optimal decisions for object placement and replication. In addition, the article compares the performance of Orca with that of a page-based DSM (TreadMarks) and another object-based DSM (CRL). It also analyzes the communication overhead of the DSMs for several applications. All performance measurements are done on a 32-node Pentium Pro cluster with Myrinet and Fast Ethernet networks. The results show that Orca programs send fewer messages and less data than the TreadMarks and CRL programs and obtain better speedups.