Sound garbage collection for C using pointer provenance
Garbage collection (GC) support for unmanaged languages can reduce programming burden in reasoning about liveness of dynamic objects. It also avoids temporal memory safety violations and memory leaks. Sound GC for weakly-typed languages such as C/C++, however, remains an unsolved problem. Current va...
Published in: | Proceedings of the ACM on Programming Languages |
---|---|
Main Authors: | , , , |
Other Authors: | |
Format: | Article in Journal/Newspaper |
Language: | English |
Published: |
Association for Computing Machinery (ACM)
2020
|
Subjects: | |
Online Access: | http://dx.doi.org/10.1145/3428244 https://dl.acm.org/doi/pdf/10.1145/3428244 |
id |
cracm:10.1145/3428244 |
---|---|
record_format |
openpolar |
spelling |
cracm:10.1145/3428244 2024-05-12T08:11:58+00:00 Sound garbage collection for C using pointer provenance Banerjee, Subarno Devecsery, David Chen, Peter M. Narayanasamy, Satish NSF 2020 http://dx.doi.org/10.1145/3428244 https://dl.acm.org/doi/pdf/10.1145/3428244 en eng Association for Computing Machinery (ACM) http://www.acm.org/publications/policies/copyright_policy#Background Proceedings of the ACM on Programming Languages volume 4, issue OOPSLA, page 1-28 ISSN 2475-1421 journal-article 2020 cracm https://doi.org/10.1145/3428244 2024-05-01T06:46:49Z Garbage collection (GC) support for unmanaged languages can reduce programming burden in reasoning about liveness of dynamic objects. It also avoids temporal memory safety violations and memory leaks. Sound GC for weakly-typed languages such as C/C++, however, remains an unsolved problem. Current value-based GC solutions examine values of memory locations to discover the pointers, and the objects they point to. The approach is inherently unsound in the presence of arbitrary type casts and pointer manipulations, which are legal in C/C++. Such language features are regularly used, especially in low-level systems code. In this paper, we propose Dynamic Pointer Provenance Tracking to realize sound GC. We observe that pointers cannot be created out-of-thin-air, and they must have provenance to at least one valid allocation. Therefore, by tracking pointer provenance from the source (e.g., malloc) through both explicit data-flow and implicit control-flow, our GC has sound and precise information to compute the set of all reachable objects at any program state. We discuss several static analysis optimizations, that can be employed during compilation aided with profiling, to significantly reduce the overhead of dynamic provenance tracking from nearly 8× to 16% for well-behaved programs that adhere to the C standards. Pointer provenance based sound GC invocation is also 13% faster and reclaims 6% more memory on average, compared to an unsound value-based GC. Article in Journal/Newspaper The Pointers ACM Publications (Association for Computing Machinery) Proceedings of the ACM on Programming Languages 4 OOPSLA 1 28 |
institution |
Open Polar |
collection |
ACM Publications (Association for Computing Machinery) |
op_collection_id |
cracm |
language |
English |
description |
Garbage collection (GC) support for unmanaged languages can reduce programming burden in reasoning about liveness of dynamic objects. It also avoids temporal memory safety violations and memory leaks. Sound GC for weakly-typed languages such as C/C++, however, remains an unsolved problem. Current value-based GC solutions examine values of memory locations to discover the pointers, and the objects they point to. The approach is inherently unsound in the presence of arbitrary type casts and pointer manipulations, which are legal in C/C++. Such language features are regularly used, especially in low-level systems code. In this paper, we propose Dynamic Pointer Provenance Tracking to realize sound GC. We observe that pointers cannot be created out-of-thin-air, and they must have provenance to at least one valid allocation. Therefore, by tracking pointer provenance from the source (e.g., malloc) through both explicit data-flow and implicit control-flow, our GC has sound and precise information to compute the set of all reachable objects at any program state. We discuss several static analysis optimizations, that can be employed during compilation aided with profiling, to significantly reduce the overhead of dynamic provenance tracking from nearly 8× to 16% for well-behaved programs that adhere to the C standards. Pointer provenance based sound GC invocation is also 13% faster and reclaims 6% more memory on average, compared to an unsound value-based GC. |
author2 |
NSF |
format |
Article in Journal/Newspaper |
author |
Banerjee, Subarno Devecsery, David Chen, Peter M. Narayanasamy, Satish |
spellingShingle |
Banerjee, Subarno Devecsery, David Chen, Peter M. Narayanasamy, Satish Sound garbage collection for C using pointer provenance |
author_facet |
Banerjee, Subarno Devecsery, David Chen, Peter M. Narayanasamy, Satish |
author_sort |
Banerjee, Subarno |
title |
Sound garbage collection for C using pointer provenance |
title_short |
Sound garbage collection for C using pointer provenance |
title_full |
Sound garbage collection for C using pointer provenance |
title_fullStr |
Sound garbage collection for C using pointer provenance |
title_full_unstemmed |
Sound garbage collection for C using pointer provenance |
title_sort |
sound garbage collection for c using pointer provenance |
publisher |
Association for Computing Machinery (ACM) |
publishDate |
2020 |
url |
http://dx.doi.org/10.1145/3428244 https://dl.acm.org/doi/pdf/10.1145/3428244 |
genre |
The Pointers |
genre_facet |
The Pointers |
op_source |
Proceedings of the ACM on Programming Languages volume 4, issue OOPSLA, page 1-28 ISSN 2475-1421 |
op_rights |
http://www.acm.org/publications/policies/copyright_policy#Background |
op_doi |
https://doi.org/10.1145/3428244 |
container_title |
Proceedings of the ACM on Programming Languages |
container_volume |
4 |
container_issue |
OOPSLA |
container_start_page |
1 |
op_container_end_page |
28 |
_version_ |
1798834221039484928 |