DeepSZ: identification of Sunyaev–Zel’dovich galaxy clusters using deep learning

Bibliographic Details
Published in:Monthly Notices of the Royal Astronomical Society
Main Authors: Lin, Z, Huang, N, Avestruz, C, Wu, W L K, Trivedi, S, Caldeira, J, Nord, B
Other Authors: University of Michigan, NSF, AAG, National Science Foundation
Format: Article in Journal/Newspaper
Language:English
Published: Oxford University Press (OUP) 2021
Online Access:http://dx.doi.org/10.1093/mnras/stab2229
http://academic.oup.com/mnras/advance-article-pdf/doi/10.1093/mnras/stab2229/39583876/stab2229.pdf
https://academic.oup.com/mnras/article-pdf/507/3/4149/40354872/stab2229.pdf
Description
Summary:Galaxy clusters identified via the Sunyaev–Zel’dovich (SZ) effect are a key ingredient in multiwavelength cluster cosmology. We present and compare three methods of cluster identification: the standard matched filter (MF) method in SZ cluster finding, a convolutional neural network (CNN), and a ‘combined’ identifier. We apply the methods to simulated millimeter maps at several observing frequencies for a survey similar to SPT-3G, the third-generation camera for the South Pole Telescope. The MF requires image pre-processing to remove point sources and a model for the noise, while the CNN requires very little pre-processing of images. Additionally, the CNN requires tuning of hyperparameters in the model and takes cut-out images of the sky as input, identifying each cut-out as cluster-containing or not. We compare differences in purity and completeness. The MF signal-to-noise ratio depends on both mass and redshift. Our CNN, trained for a given mass threshold, captures a different set of clusters than the MF, some with a signal-to-noise ratio below the MF detection threshold. However, the CNN tends to misclassify cut-outs whose clusters are located near the edge of the cut-out, which can be mitigated with staggered cut-outs. We leverage the complementarity of the two methods, combining the scores from each method for identification. The purity and completeness are both 0.61 for MF, and 0.59 and 0.61 for CNN. The combined method yields 0.60 and 0.77, a significant increase for completeness with a modest decrease in purity. We advocate for combined methods that increase the confidence of many low signal-to-noise clusters.
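The purity and completeness figures quoted in the summary can be read with the usual definitions: purity is the fraction of flagged cut-outs that truly contain a cluster, and completeness is the fraction of true clusters that are flagged. The sketch below is only an illustration under stated assumptions. The function name `purity_completeness`, the toy labels, and the simple "flag if either method flags" rule are all hypothetical; the record does not say how the paper combines the MF and CNN scores, and the numbers here are invented, not the paper's results.

```python
import numpy as np

def purity_completeness(is_cluster, flagged):
    """Return (purity, completeness) for boolean truth labels and detections.

    purity       = TP / (TP + FP)
    completeness = TP / (TP + FN)
    """
    is_cluster = np.asarray(is_cluster, dtype=bool)
    flagged = np.asarray(flagged, dtype=bool)
    tp = np.sum(is_cluster & flagged)    # true clusters that were flagged
    fp = np.sum(~is_cluster & flagged)   # flagged cut-outs with no cluster
    fn = np.sum(is_cluster & ~flagged)   # true clusters that were missed
    purity = tp / (tp + fp) if (tp + fp) else float("nan")
    completeness = tp / (tp + fn) if (tp + fn) else float("nan")
    return float(purity), float(completeness)

# Invented toy data: 8 cut-outs, 4 of which contain a cluster.
truth    = [1, 1, 1, 0, 0, 1, 0, 0]
mf_flag  = [1, 0, 1, 1, 0, 0, 0, 0]   # hypothetical MF detections
cnn_flag = [0, 1, 1, 0, 1, 0, 0, 0]   # hypothetical CNN detections

# Hypothetical combination rule (an assumption, not the paper's method):
# flag a cut-out if either method flags it.
either = np.logical_or(mf_flag, cnn_flag)

mf_scores = purity_completeness(truth, mf_flag)
cnn_scores = purity_completeness(truth, cnn_flag)
combined_scores = purity_completeness(truth, either)
```

In this toy case each method alone recovers half of the clusters, while the union recovers three of four at a slightly lower purity, which mirrors the qualitative trade-off described in the summary.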