Prompting Scientific Names for Zero-Shot Species Recognition ...

Trained on web-scale image-text pairs, Vision-Language Models (VLMs) such as CLIP can recognize images of common objects in a zero-shot fashion. However, it is underexplored how to use CLIP for zero-shot recognition of highly specialized concepts, e.g., species of birds, plants, and animals, for whi...

Bibliographic Details
Main Authors:	Parashar, Shubham, Lin, Zhiqiu, Li, Yanan, Kong, Shu
Format:	Article in Journal/Newspaper
Language:	unknown
Published:	arXiv 2023
Subjects:	Computer Vision and Pattern Recognition cs.CV Computation and Language cs.CL FOS Computer and information sciences Lepus timidus mountain hare
Online Access:	https://dx.doi.org/10.48550/arxiv.2310.09929 https://arxiv.org/abs/2310.09929

id	ftdatacite:10.48550/arxiv.2310.09929
record_format	openpolar
institution	Open Polar
collection	DataCite Metadata Store (German National Library of Science and Technology)
op_collection_id	ftdatacite
language	unknown
topic	Computer Vision and Pattern Recognition cs.CV Computation and Language cs.CL FOS Computer and information sciences
description	Trained on web-scale image-text pairs, Vision-Language Models (VLMs) such as CLIP can recognize images of common objects in a zero-shot fashion. However, it is underexplored how to use CLIP for zero-shot recognition of highly specialized concepts, e.g., species of birds, plants, and animals, whose scientific names are written in Latin or Greek. Indeed, CLIP performs poorly for zero-shot species recognition with prompts that use scientific names, e.g., "a photo of Lepus Timidus" (which is a scientific name in Latin), because these names are usually not included in CLIP's training set. To improve performance, prior works propose to use large language models (LLMs) to generate descriptions (e.g., of species color and shape) and additionally use them in prompts. We find that they bring only marginal gains. In contrast, we are motivated to translate scientific names (e.g., Lepus Timidus) to common English names (e.g., mountain hare) and use them in the prompts. We find that common names are more likely ... : EMNLP 2023 ...
format	Article in Journal/Newspaper
author	Parashar, Shubham Lin, Zhiqiu Li, Yanan Kong, Shu
title	Prompting Scientific Names for Zero-Shot Species Recognition ...
publisher	arXiv
publishDate	2023
url	https://dx.doi.org/10.48550/arxiv.2310.09929 https://arxiv.org/abs/2310.09929
genre	Lepus timidus mountain hare
op_rights	arXiv.org perpetual, non-exclusive license http://arxiv.org/licenses/nonexclusive-distrib/1.0/
op_doi	https://doi.org/10.48550/arxiv.2310.09929
_version_	1784274508702547968

Similar Items