Text2Tree: Aligning Text Representation to the Label Tree Hierarchy for Imbalanced Medical Classification ...

Deep learning approaches exhibit promising performances on various text tasks. However, they are still struggling on medical text classification since samples are often extremely imbalanced and scarce. Different from existing mainstream approaches that focus on supplementary semantics with external...

Bibliographic Details
Main Authors: Yan, Jiahuan, Gao, Haojun, Kai, Zhang, Liu, Weize, Chen, Danny, Wu, Jian, Chen, Jintai
Format: Report
Language: unknown
Published: arXiv 2023
Subjects:
DML
Online Access: https://dx.doi.org/10.48550/arxiv.2311.16650
https://arxiv.org/abs/2311.16650
id ftdatacite:10.48550/arxiv.2311.16650
record_format openpolar
institution Open Polar
collection DataCite Metadata Store (German National Library of Science and Technology)
op_collection_id ftdatacite
language unknown
topic Computation and Language cs.CL
FOS Computer and information sciences
description Deep learning approaches exhibit promising performances on various text tasks. However, they are still struggling on medical text classification since samples are often extremely imbalanced and scarce. Different from existing mainstream approaches that focus on supplementary semantics with external medical information, this paper aims to rethink the data challenges in medical texts and present a novel framework-agnostic algorithm called Text2Tree that only utilizes internal label hierarchy in training deep learning models. We embed the ICD code tree structure of labels into cascade attention modules for learning hierarchy-aware label representations. Two new learning schemes, Similarity Surrogate Learning (SSL) and Dissimilarity Mixup Learning (DML), are devised to boost text classification by reusing and distinguishing samples of other labels following the label representation hierarchy, respectively. Experiments on authoritative public datasets and real-world medical records show that our approach stably ... : EMNLP 2023 Findings. Code: https://github.com/jyansir/Text2Tree ...
format Report
author Yan, Jiahuan
Gao, Haojun
Kai, Zhang
Liu, Weize
Chen, Danny
Wu, Jian
Chen, Jintai
title Text2Tree: Aligning Text Representation to the Label Tree Hierarchy for Imbalanced Medical Classification ...
publisher arXiv
publishDate 2023
url https://dx.doi.org/10.48550/arxiv.2311.16650
https://arxiv.org/abs/2311.16650
genre DML
op_rights arXiv.org perpetual, non-exclusive license
http://arxiv.org/licenses/nonexclusive-distrib/1.0/
op_doi https://doi.org/10.48550/arxiv.2311.16650