What explains the success of cross-modal fine-tuning with ORCA?

ORCA (Shen et al., 2023) is a recent technique for cross-modal fine-tuning, i.e., applying pre-trained transformer models to modalities beyond their training data. The technique consists primarily of two components: training an embedder and fine-tuning the embedder and model. Despite its high performance on a variety of downstream tasks, we do not understand precisely how each of these components contributes to ORCA's success. We therefore run a series of ablations and find that embedder training does not help 2D tasks at all, contrary to what the original paper posits. In 1D tasks, some amount of embedder training is necessary, but more is not better. In 4 out of the 6 datasets we experiment with, it is model fine-tuning that makes the biggest difference. Through our ablations and baselines, we contribute a better understanding of the individual components of ORCA.

Bibliographic Details
Main Authors: García-de-Herreros, Paloma; Gautam, Vagrant; Slusallek, Philipp; Klakow, Dietrich; Mosbach, Marius
Format: Preprint
Language: English
Published: arXiv, 2024
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); FOS: Computer and information sciences
Online Access: https://dx.doi.org/10.48550/arxiv.2403.13537
https://arxiv.org/abs/2403.13537
License: Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/legalcode