Prompting Scientific Names for Zero-Shot Species Recognition ...

Trained on web-scale image-text pairs, Vision-Language Models (VLMs) such as CLIP can recognize images of common objects in a zero-shot fashion. However, it is underexplored how to use CLIP for zero-shot recognition of highly specialized concepts, e.g., species of birds, plants, and animals, for whi...

Bibliographic Details
Main Authors: Parashar, Shubham, Lin, Zhiqiu, Li, Yanan, Kong, Shu
Format: Article in Journal/Newspaper
Language: unknown
Published: arXiv 2023
Subjects:
Online Access: https://dx.doi.org/10.48550/arxiv.2310.09929
https://arxiv.org/abs/2310.09929
id ftdatacite:10.48550/arxiv.2310.09929
record_format openpolar
institution Open Polar
collection DataCite Metadata Store (German National Library of Science and Technology)
op_collection_id ftdatacite
language unknown
topic Computer Vision and Pattern Recognition cs.CV
Computation and Language cs.CL
FOS Computer and information sciences
description Trained on web-scale image-text pairs, Vision-Language Models (VLMs) such as CLIP can recognize images of common objects in a zero-shot fashion. However, how to use CLIP for zero-shot recognition of highly specialized concepts, e.g., species of birds, plants, and animals, whose scientific names are written in Latin or Greek, remains underexplored. Indeed, CLIP performs poorly for zero-shot species recognition with prompts that use scientific names, e.g., "a photo of Lepus timidus" (a scientific name in Latin), because these names are usually not included in CLIP's training set. To improve performance, prior works propose using large language models (LLMs) to generate descriptions (e.g., of species color and shape) and adding them to the prompts. We find that these descriptions bring only marginal gains. Instead, we translate scientific names (e.g., Lepus timidus) to common English names (e.g., mountain hare) and use the common names in the prompts. We find that common names are more likely ... : EMNLP 2023 ...
format Article in Journal/Newspaper
author Parashar, Shubham
Lin, Zhiqiu
Li, Yanan
Kong, Shu
title Prompting Scientific Names for Zero-Shot Species Recognition ...
publisher arXiv
publishDate 2023
url https://dx.doi.org/10.48550/arxiv.2310.09929
https://arxiv.org/abs/2310.09929
genre Lepus timidus
mountain hare
op_rights arXiv.org perpetual, non-exclusive license
http://arxiv.org/licenses/nonexclusive-distrib/1.0/
op_doi https://doi.org/10.48550/arxiv.2310.09929
_version_ 1784274508702547968
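
The prompting idea in the description above (replace a scientific name with its common English name before filling the prompt) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the lookup table `COMMON_NAMES` and the helpers `normalize` and `build_prompt` are hypothetical names introduced here, and this record does not say how the paper obtains its translations. Only the "a photo of ..." template and the Lepus timidus / mountain hare pair come from the abstract. The resulting strings would then be passed to a VLM text encoder such as CLIP's, which is not shown.

```python
# Hypothetical lookup table (an assumption for illustration). Only this one
# pair appears in the abstract; the paper's translation source is not given here.
COMMON_NAMES = {
    "Lepus timidus": "mountain hare",
}

# Prompt template as quoted in the abstract.
PROMPT_TEMPLATE = "a photo of {name}"


def normalize(scientific_name: str) -> str:
    """Normalize a binomial to 'Genus species' capitalization."""
    parts = scientific_name.strip().split()
    if not parts:
        raise ValueError("empty scientific name")
    return " ".join([parts[0].capitalize()] + [p.lower() for p in parts[1:]])


def build_prompt(scientific_name: str, common_names=COMMON_NAMES) -> str:
    """Build a prompt, preferring the common name when one is known.

    Falls back to the normalized scientific name if no translation exists.
    """
    key = normalize(scientific_name)
    name = common_names.get(key, key)
    return PROMPT_TEMPLATE.format(name=name)


if __name__ == "__main__":
    print(build_prompt("Lepus Timidus"))
    print(build_prompt("Vulpes lagopus"))  # not in the table, so it falls back
```

Falling back to the scientific name keeps every class represented in the prompt set even when no translation is available, which is one possible design choice among several.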