What's the Difference? Incremental Processing with Change Queries in Snowflake

Incremental algorithms are the heart and soul of stream processing. Low latency results depend on the ability to react to the subset of changes in a dataset over time rather than reprocessing the entirety of a dataset as it evolves. But while the SQL language is well suited for representing streams...

Bibliographic Details
Published in: Proceedings of the ACM on Management of Data
Main Authors: Akidau, Tyler; Barbier, Paul; Cseri, Istvan; Hueske, Fabian; Jones, Tyler; Lionheart, Sasha; Mills, Daniel; Pauliukevich, Dzmitry; Probst, Lukas; Semmler, Niklas; Sotolongo, Dan; Zhang, Boyuan
Format: Article in Journal/Newspaper
Language: English
Published: Association for Computing Machinery (ACM) 2023
Subjects:
DML
Online Access: http://dx.doi.org/10.1145/3589776
https://dl.acm.org/doi/pdf/10.1145/3589776
id cracm:10.1145/3589776
record_format openpolar
institution Open Polar
collection ACM Publications (Association for Computing Machinery)
op_collection_id cracm
language English
description Incremental algorithms are the heart and soul of stream processing. Low latency results depend on the ability to react to the subset of changes in a dataset over time rather than reprocessing the entirety of a dataset as it evolves. But while the SQL language is well suited for representing streams of changes (via tables) and their application to tables over time (via DML), it entirely lacks a method to query the changes to a table or view in the first place. In this paper, we present CHANGES queries and STREAM objects, Snowflake's primitives for querying and consuming incremental changes to table objects over time. CHANGES queries and STREAMs have been in use within Snowflake for three years, and see broad adoption across our customers. We describe the semantics of these primitives, discuss the implementation challenges, present an analysis of their usage at Snowflake, and contrast with other offerings.
format Article in Journal/Newspaper
author Akidau, Tyler
Barbier, Paul
Cseri, Istvan
Hueske, Fabian
Jones, Tyler
Lionheart, Sasha
Mills, Daniel
Pauliukevich, Dzmitry
Probst, Lukas
Semmler, Niklas
Sotolongo, Dan
Zhang, Boyuan
title What's the Difference? Incremental Processing with Change Queries in Snowflake
publisher Association for Computing Machinery (ACM)
publishDate 2023
url http://dx.doi.org/10.1145/3589776
https://dl.acm.org/doi/pdf/10.1145/3589776
genre DML
op_source Proceedings of the ACM on Management of Data
volume 1, issue 2, page 1-27
ISSN 2836-6573
op_doi https://doi.org/10.1145/3589776