Semi-automatic Endogenous Enrichment of Collaboratively Constructed Lexical Resources: Piggybacking onto Wiktionary
The lack of large-scale, freely available and durable lexical resources, and the consequences for NLP, is widely acknowledged but the attempts to cope with usual bottlenecks preventing their development often result in dead-ends. This article introduces a language-independent,...
Main Authors: | Sajous, Franck; Navarro, Emmanuel; Gaume, Bruno; Prévot, Laurent; Chudy, Yannick |
---|---|
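Other Authors: | Cognition, Langues, Langage, Ergonomie (CLLE-ERSS); École Pratique des Hautes Études (EPHE); Université Paris Sciences et Lettres (PSL)-Université Paris Sciences et Lettres (PSL)-Université Toulouse - Jean Jaurès (UT2J); Université de Toulouse (UT)-Université de Toulouse (UT)-Université Bordeaux Montaigne (UBM)-Centre National de la Recherche Scientifique (CNRS); Institut de recherche en informatique de Toulouse (IRIT); Université Toulouse Capitole (UT Capitole); Université de Toulouse (UT)-Université de Toulouse (UT)-Université Toulouse - Jean Jaurès (UT2J); Université de Toulouse (UT)-Université Toulouse III - Paul Sabatier (UT3); Université de Toulouse (UT)-Centre National de la Recherche Scientifique (CNRS)-Institut National Polytechnique (Toulouse) (Toulouse INP); Université de Toulouse (UT)-Toulouse Mind & Brain Institut (TMBI); Université Toulouse - Jean Jaurès (UT2J); Université de Toulouse (UT)-Université de Toulouse (UT)-Université Toulouse III - Paul Sabatier (UT3); Université de Toulouse (UT); Laboratoire Parole et Langage (LPL); Aix Marseille Université (AMU)-Centre National de la Recherche Scientifique (CNRS); Hrafn Loftsson, Eiríkur Rögnvaldsson and Sigrún Helgadóttir |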
Format: | Conference Object |
Language: | English |
Published: | HAL CCSD 2010 |
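Subjects: | Random Walks; Collaboratively Constructed Lexical Resources; Endogenous Enrichment; Crowdsourcing; Wiktionary; [SHS.LANGUE]Humanities and Social Sciences/Linguistics; [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] |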
Online Access: | https://hal.science/hal-00625326 https://hal.science/hal-00625326/document https://hal.science/hal-00625326/file/sajousEtAl2010-IceTAL.pdf |
id | ftecolephe:oai:HAL:hal-00625326v1 |
---|---|
record_format | openpolar |
spelling | ftecolephe:oai:HAL:hal-00625326v1 2024-09-09T19:46:59+00:00 Semi-automatic Endogenous Enrichment of Collaboratively Constructed Lexical Resources: Piggybacking onto Wiktionary Sajous, Franck Navarro, Emmanuel Gaume, Bruno Prévot, Laurent Chudy, Yannick Cognition, Langues, Langage, Ergonomie (CLLE-ERSS) École Pratique des Hautes Études (EPHE) Université Paris Sciences et Lettres (PSL)-Université Paris Sciences et Lettres (PSL)-Université Toulouse - Jean Jaurès (UT2J) Université de Toulouse (UT)-Université de Toulouse (UT)-Université Bordeaux Montaigne (UBM)-Centre National de la Recherche Scientifique (CNRS) Institut de recherche en informatique de Toulouse (IRIT) Université Toulouse Capitole (UT Capitole) Université de Toulouse (UT)-Université de Toulouse (UT)-Université Toulouse - Jean Jaurès (UT2J) Université de Toulouse (UT)-Université Toulouse III - Paul Sabatier (UT3) Université de Toulouse (UT)-Centre National de la Recherche Scientifique (CNRS)-Institut National Polytechnique (Toulouse) (Toulouse INP) Université de Toulouse (UT)-Toulouse Mind & Brain Institut (TMBI) Université Toulouse - Jean Jaurès (UT2J) Université de Toulouse (UT)-Université de Toulouse (UT)-Université Toulouse III - Paul Sabatier (UT3) Université de Toulouse (UT) Laboratoire Parole et Langage (LPL) Aix Marseille Université (AMU)-Centre National de la Recherche Scientifique (CNRS) Hrafn Loftsson, Eiríkur Rögnvaldsson and Sigrún Helgadóttir Reykjavik, Iceland 2010-08-16 https://hal.science/hal-00625326 https://hal.science/hal-00625326/document https://hal.science/hal-00625326/file/sajousEtAl2010-IceTAL.pdf en eng HAL CCSD Springer Berlin/Heidelberg hal-00625326 https://hal.science/hal-00625326 https://hal.science/hal-00625326/document https://hal.science/hal-00625326/file/sajousEtAl2010-IceTAL.pdf info:eu-repo/semantics/OpenAccess Advances in Natural Language Processing 7th International Conference on NLP, IceTAL 2010 https://hal.science/hal-00625326 7th International Conference on NLP, IceTAL 2010, Aug 2010, Reykjavik, Iceland. pp.332-344 Random Walks Collaboratively Constructed Lexical Resources Endogenous Enrichment Crowdsourcing Wiktionary [SHS.LANGUE]Humanities and Social Sciences/Linguistics [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] info:eu-repo/semantics/conferenceObject Conference papers 2010 ftecolephe 2024-07-08T23:40:42Z International audience The lack of large-scale, freely available and durable lexical resources, and the consequences for NLP, is widely acknowledged but the attempts to cope with usual bottlenecks preventing their development often result in dead-ends. This article introduces a language-independent, semi-automatic and endogenous method for enriching lexical resources, based on collaborative editing and random walks through existing lexical relationships, and shows how this approach enables us to overcome recurrent impediments. It compares the impact of using different data sources and similarity measures on the task of improving synonymy networks. Finally, it defines an architecture for applying the presented method to Wiktionary and explains how it has been implemented. Conference Object Iceland EPHE (Ecole pratique des hautes études, Paris): HAL |
institution | Open Polar |
collection | EPHE (Ecole pratique des hautes études, Paris): HAL |
op_collection_id | ftecolephe |
language | English |
topic | Random Walks Collaboratively Constructed Lexical Resources Endogenous Enrichment Crowdsourcing Wiktionary [SHS.LANGUE]Humanities and Social Sciences/Linguistics [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] |
spellingShingle | Random Walks Collaboratively Constructed Lexical Resources Endogenous Enrichment Crowdsourcing Wiktionary [SHS.LANGUE]Humanities and Social Sciences/Linguistics [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] Sajous, Franck Navarro, Emmanuel Gaume, Bruno Prévot, Laurent Chudy, Yannick Semi-automatic Endogenous Enrichment of Collaboratively Constructed Lexical Resources: Piggybacking onto Wiktionary |
topic_facet | Random Walks Collaboratively Constructed Lexical Resources Endogenous Enrichment Crowdsourcing Wiktionary [SHS.LANGUE]Humanities and Social Sciences/Linguistics [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] |
description | International audience The lack of large-scale, freely available and durable lexical resources, and the consequences for NLP, is widely acknowledged but the attempts to cope with usual bottlenecks preventing their development often result in dead-ends. This article introduces a language-independent, semi-automatic and endogenous method for enriching lexical resources, based on collaborative editing and random walks through existing lexical relationships, and shows how this approach enables us to overcome recurrent impediments. It compares the impact of using different data sources and similarity measures on the task of improving synonymy networks. Finally, it defines an architecture for applying the presented method to Wiktionary and explains how it has been implemented. |
author2 | Cognition, Langues, Langage, Ergonomie (CLLE-ERSS) École Pratique des Hautes Études (EPHE) Université Paris Sciences et Lettres (PSL)-Université Paris Sciences et Lettres (PSL)-Université Toulouse - Jean Jaurès (UT2J) Université de Toulouse (UT)-Université de Toulouse (UT)-Université Bordeaux Montaigne (UBM)-Centre National de la Recherche Scientifique (CNRS) Institut de recherche en informatique de Toulouse (IRIT) Université Toulouse Capitole (UT Capitole) Université de Toulouse (UT)-Université de Toulouse (UT)-Université Toulouse - Jean Jaurès (UT2J) Université de Toulouse (UT)-Université Toulouse III - Paul Sabatier (UT3) Université de Toulouse (UT)-Centre National de la Recherche Scientifique (CNRS)-Institut National Polytechnique (Toulouse) (Toulouse INP) Université de Toulouse (UT)-Toulouse Mind & Brain Institut (TMBI) Université Toulouse - Jean Jaurès (UT2J) Université de Toulouse (UT)-Université de Toulouse (UT)-Université Toulouse III - Paul Sabatier (UT3) Université de Toulouse (UT) Laboratoire Parole et Langage (LPL) Aix Marseille Université (AMU)-Centre National de la Recherche Scientifique (CNRS) Hrafn Loftsson, Eiríkur Rögnvaldsson and Sigrún Helgadóttir |
format | Conference Object |
author | Sajous, Franck Navarro, Emmanuel Gaume, Bruno Prévot, Laurent Chudy, Yannick |
author_facet | Sajous, Franck Navarro, Emmanuel Gaume, Bruno Prévot, Laurent Chudy, Yannick |
author_sort | Sajous, Franck |
title | Semi-automatic Endogenous Enrichment of Collaboratively Constructed Lexical Resources: Piggybacking onto Wiktionary |
title_short | Semi-automatic Endogenous Enrichment of Collaboratively Constructed Lexical Resources: Piggybacking onto Wiktionary |
title_full | Semi-automatic Endogenous Enrichment of Collaboratively Constructed Lexical Resources: Piggybacking onto Wiktionary |
title_fullStr | Semi-automatic Endogenous Enrichment of Collaboratively Constructed Lexical Resources: Piggybacking onto Wiktionary |
title_full_unstemmed | Semi-automatic Endogenous Enrichment of Collaboratively Constructed Lexical Resources: Piggybacking onto Wiktionary |
title_sort | semi-automatic endogenous enrichment of collaboratively constructed lexical resources: piggybacking onto wiktionary |
publisher | HAL CCSD |
publishDate | 2010 |
url | https://hal.science/hal-00625326 https://hal.science/hal-00625326/document https://hal.science/hal-00625326/file/sajousEtAl2010-IceTAL.pdf |
op_coverage | Reykjavik, Iceland |
genre | Iceland |
genre_facet | Iceland |
op_source | Advances in Natural Language Processing 7th International Conference on NLP, IceTAL 2010 https://hal.science/hal-00625326 7th International Conference on NLP, IceTAL 2010, Aug 2010, Reykjavik, Iceland. pp.332-344 |
op_relation | hal-00625326 https://hal.science/hal-00625326 https://hal.science/hal-00625326/document https://hal.science/hal-00625326/file/sajousEtAl2010-IceTAL.pdf |
op_rights | info:eu-repo/semantics/OpenAccess |
_version_ | 1809916468249231360 |