AMIGO: Sparse Multi-Modal Graph Transformer with Shared-Context Processing for Representation Learning of Giga-pixel Images

Processing giga-pixel whole slide histopathology images (WSI) is a computationally expensive task. Multiple instance learning (MIL) has become the conventional approach to process WSIs, in which these images are split into smaller patches for further processing. However, MIL-based techniques ignore...

Bibliographic Details
Main Authors:	Nakhli, Ramin, Moghadam, Puria Azadi, Mi, Haoyang, Farahani, Hossein, Baras, Alexander, Gilks, Blake, Bashashati, Ali
Format:	Text
Language:	unknown
Published:	2023
Subjects:	Computer Science - Computer Vision and Pattern Recognition inuit
Online Access:	http://arxiv.org/abs/2303.00865

id	ftarxivpreprints:oai:arXiv.org:2303.00865
record_format	openpolar
spelling	ftarxivpreprints:oai:arXiv.org:2303.00865 2023-09-05T13:20:41+02:00 AMIGO: Sparse Multi-Modal Graph Transformer with Shared-Context Processing for Representation Learning of Giga-pixel Images Nakhli, Ramin Moghadam, Puria Azadi Mi, Haoyang Farahani, Hossein Baras, Alexander Gilks, Blake Bashashati, Ali 2023-03-01 http://arxiv.org/abs/2303.00865 unknown http://arxiv.org/abs/2303.00865 Computer Science - Computer Vision and Pattern Recognition text 2023 ftarxivpreprints 2023-08-16T17:33:59Z Processing giga-pixel whole slide histopathology images (WSI) is a computationally expensive task. Multiple instance learning (MIL) has become the conventional approach to process WSIs, in which these images are split into smaller patches for further processing. However, MIL-based techniques ignore explicit information about the individual cells within a patch. In this paper, by defining the novel concept of shared-context processing, we designed a multi-modal Graph Transformer (AMIGO) that uses the cellular graph within the tissue to provide a single representation for a patient while taking advantage of the hierarchical structure of the tissue, enabling a dynamic focus between cell-level and tissue-level information. We benchmarked the performance of our model against multiple state-of-the-art methods in survival prediction and showed that ours can significantly outperform all of them including hierarchical Vision Transformer (ViT). More importantly, we show that our model is strongly robust to missing information to an extent that it can achieve the same performance with as low as 20% of the data. Finally, in two different cancer datasets, we demonstrated that our model was able to stratify the patients into low-risk and high-risk groups while other state-of-the-art methods failed to achieve this goal. 
We also publish a large dataset of immunohistochemistry images (InUIT) containing 1,600 tissue microarray (TMA) cores from 188 patients along with their survival information, making it one of the largest publicly available datasets in this context. Comment: Accepted at CVPR 2023 Text inuit ArXiv.org (Cornell University Library)
institution	Open Polar
collection	ArXiv.org (Cornell University Library)
op_collection_id	ftarxivpreprints
language	unknown
topic	Computer Science - Computer Vision and Pattern Recognition
spellingShingle	Computer Science - Computer Vision and Pattern Recognition Nakhli, Ramin Moghadam, Puria Azadi Mi, Haoyang Farahani, Hossein Baras, Alexander Gilks, Blake Bashashati, Ali AMIGO: Sparse Multi-Modal Graph Transformer with Shared-Context Processing for Representation Learning of Giga-pixel Images
topic_facet	Computer Science - Computer Vision and Pattern Recognition
description	Processing giga-pixel whole slide histopathology images (WSI) is a computationally expensive task. Multiple instance learning (MIL) has become the conventional approach to process WSIs, in which these images are split into smaller patches for further processing. However, MIL-based techniques ignore explicit information about the individual cells within a patch. In this paper, by defining the novel concept of shared-context processing, we designed a multi-modal Graph Transformer (AMIGO) that uses the cellular graph within the tissue to provide a single representation for a patient while taking advantage of the hierarchical structure of the tissue, enabling a dynamic focus between cell-level and tissue-level information. We benchmarked the performance of our model against multiple state-of-the-art methods in survival prediction and showed that ours can significantly outperform all of them including hierarchical Vision Transformer (ViT). More importantly, we show that our model is strongly robust to missing information to an extent that it can achieve the same performance with as low as 20% of the data. Finally, in two different cancer datasets, we demonstrated that our model was able to stratify the patients into low-risk and high-risk groups while other state-of-the-art methods failed to achieve this goal. We also publish a large dataset of immunohistochemistry images (InUIT) containing 1,600 tissue microarray (TMA) cores from 188 patients along with their survival information, making it one of the largest publicly available datasets in this context. Comment: Accepted at CVPR 2023
format	Text
author	Nakhli, Ramin Moghadam, Puria Azadi Mi, Haoyang Farahani, Hossein Baras, Alexander Gilks, Blake Bashashati, Ali
author_facet	Nakhli, Ramin Moghadam, Puria Azadi Mi, Haoyang Farahani, Hossein Baras, Alexander Gilks, Blake Bashashati, Ali
author_sort	Nakhli, Ramin
title	AMIGO: Sparse Multi-Modal Graph Transformer with Shared-Context Processing for Representation Learning of Giga-pixel Images
title_sort	amigo: sparse multi-modal graph transformer with shared-context processing for representation learning of giga-pixel images
publishDate	2023
url	http://arxiv.org/abs/2303.00865
genre	inuit
genre_facet	inuit
op_relation	http://arxiv.org/abs/2303.00865
_version_	1776201323956404224

Similar Items