Semantic clustering of pivot paraphrases

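Paraphrases extracted from parallel corpora by the pivot method (Bannard and Callison-Burch, 2005) constitute a valuable resource for multilingual NLP applications. In this study, we analyse the semantics of unigram pivot paraphrases and use a graph-based sense induction approach to unveil hidden sense distinctions in the paraphrase sets. The comparison of the acquired senses to gold data from the Lexical Substitution shared task (McCarthy and Navigli, 2007) demonstrates that sense distinctions exist in the paraphrase sets and highlights the need for a disambiguation step in applications using this resource.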

Bibliographic Details
Main Authors: Apidianaki, Marianna, Verzeni, Emilia, McCarthy, Diana
Other Authors: Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur (LIMSI), Université Paris Saclay (COmUE)-Centre National de la Recherche Scientifique (CNRS)-Sorbonne Université - UFR d'Ingénierie (UFR 919), Sorbonne Université (SU)-Sorbonne Université (SU)-Université Paris-Saclay-Université Paris-Sud - Paris 11 (UP11)
Format: Conference Object
Language: English
Published: HAL CCSD 2014
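Subjects: parallel corpora; sense clustering; pivot paraphrasing; [INFO]Computer Science [cs]; [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]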
Online Access: https://hal.archives-ouvertes.fr/hal-01838559
id ftccsdartic:oai:HAL:hal-01838559v1
record_format openpolar
institution Open Polar
collection Archive ouverte HAL (Hyper Article en Ligne, CCSD - Centre pour la Communication Scientifique Directe)
op_collection_id ftccsdartic
language English
topic parallel corpora
sense clustering
pivot paraphrasing
[INFO]Computer Science [cs]
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]
description Paraphrases extracted from parallel corpora by the pivot method (Bannard and Callison-Burch, 2005) constitute a valuable resource for multilingual NLP applications. In this study, we analyse the semantics of unigram pivot paraphrases and use a graph-based sense induction approach to unveil hidden sense distinctions in the paraphrase sets. The comparison of the acquired senses to gold data from the Lexical Substitution shared task (McCarthy and Navigli, 2007) demonstrates that sense distinctions exist in the paraphrase sets and highlights the need for a disambiguation step in applications using this resource.
author2 Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur (LIMSI)
Université Paris Saclay (COmUE)-Centre National de la Recherche Scientifique (CNRS)-Sorbonne Université - UFR d'Ingénierie (UFR 919)
Sorbonne Université (SU)-Sorbonne Université (SU)-Université Paris-Saclay-Université Paris-Sud - Paris 11 (UP11)
format Conference Object
author Apidianaki, Marianna
Verzeni, Emilia
McCarthy, Diana
title Semantic clustering of pivot paraphrases
publisher HAL CCSD
publishDate 2014
url https://hal.archives-ouvertes.fr/hal-01838559
op_coverage Reykjavik, Iceland
op_source International Conference on Language Resources and Evaluation
https://hal.archives-ouvertes.fr/hal-01838559
International Conference on Language Resources and Evaluation, Jan 2014, Reykjavik, Iceland
op_relation hal-01838559
https://hal.archives-ouvertes.fr/hal-01838559
_version_ 1766038497405173760