From Non Word to New Word: Automatically Identifying Neologisms in French Newspapers
In this paper we present a statistical machine learning approach to neologism detection going some way beyond the use of exclusion lists. We explore the impact of three groups of features: form related, morpho-lexical and thematic features. The latter type of features has not...
Main Authors: | Falk, Ingrid; Bernhard, Delphine; Gérard, Christophe |
---|---|
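Other Authors: | Linguistique, Langues et Parole (LILPA); Université de Strasbourg (UNISTRA); Logoscope, Contrat IDEX 2012 avec l'Université de Strasbourg; Logoscope |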
Format: | Other/Unknown Material |
Language: | English |
Published: | HAL CCSD 2014 |
Subjects: | psy lang |
Online Access: | https://hal.inria.fr/hal-00959079/file/logo.pdf https://hal.inria.fr/hal-00959079 |
id | fttriple:oai:gotriple.eu:10670/1.mvkvoa |
---|---|
record_format | openpolar |
spelling | fttriple:oai:gotriple.eu:10670/1.mvkvoa 2023-05-15T16:48:15+02:00 From Non Word to New Word: Automatically Identifying Neologisms in French Newspapers Falk, Ingrid Bernhard, Delphine Gérard, Christophe Linguistique, Langues et Parole (LILPA) Université de Strasbourg (UNISTRA) Logoscope, Contrat IDEX 2012 avec l'Université de Strasbourg Logoscope Reykjavik, Iceland 2014-05-27 https://hal.inria.fr/hal-00959079/file/logo.pdf https://hal.inria.fr/hal-00959079 en eng HAL CCSD hal-00959079 10670/1.mvkvoa https://hal.inria.fr/hal-00959079/file/logo.pdf https://hal.inria.fr/hal-00959079 other Hyper Article en Ligne - Sciences de l'Homme et de la Société LREC - The 9th edition of the Language Resources and Evaluation Conference LREC - The 9th edition of the Language Resources and Evaluation Conference, May 2014, Reykjavik, Iceland psy lang Conference Output https://vocabularies.coar-repositories.org/resource_types/c_c94f/ 2014 fttriple 2023-01-22T18:25:26Z International audience In this paper we present a statistical machine learning approach to neologism detection going some way beyond the use of exclusion lists. We explore the impact of three groups of features: form related, morpho-lexical and thematic features. The latter type of features has not yet been used in this kind of application and represents a way to access the semantic context of new words. The results suggest that form related features are helpful at the overall classification task, while morpho-lexical and thematic features better single out true neologisms. Other/Unknown Material Iceland Unknown |
institution | Open Polar |
collection | Unknown |
op_collection_id | fttriple |
language | English |
topic | psy lang |
spellingShingle | psy lang Falk, Ingrid Bernhard, Delphine Gérard, Christophe From Non Word to New Word: Automatically Identifying Neologisms in French Newspapers |
topic_facet | psy lang |
description | International audience In this paper we present a statistical machine learning approach to neologism detection going some way beyond the use of exclusion lists. We explore the impact of three groups of features: form related, morpho-lexical and thematic features. The latter type of features has not yet been used in this kind of application and represents a way to access the semantic context of new words. The results suggest that form related features are helpful at the overall classification task, while morpho-lexical and thematic features better single out true neologisms. |
author2 | Linguistique, Langues et Parole (LILPA) Université de Strasbourg (UNISTRA) Logoscope, Contrat IDEX 2012 avec l'Université de Strasbourg Logoscope |
format | Other/Unknown Material |
author | Falk, Ingrid Bernhard, Delphine Gérard, Christophe |
author_facet | Falk, Ingrid Bernhard, Delphine Gérard, Christophe |
author_sort | Falk, Ingrid |
title | From Non Word to New Word: Automatically Identifying Neologisms in French Newspapers |
title_short | From Non Word to New Word: Automatically Identifying Neologisms in French Newspapers |
title_full | From Non Word to New Word: Automatically Identifying Neologisms in French Newspapers |
title_fullStr | From Non Word to New Word: Automatically Identifying Neologisms in French Newspapers |
title_full_unstemmed | From Non Word to New Word: Automatically Identifying Neologisms in French Newspapers |
title_sort | from non word to new word: automatically identifying neologisms in french newspapers |
publisher | HAL CCSD |
publishDate | 2014 |
url | https://hal.inria.fr/hal-00959079/file/logo.pdf https://hal.inria.fr/hal-00959079 |
op_coverage | Reykjavik, Iceland |
genre | Iceland |
genre_facet | Iceland |
op_source | Hyper Article en Ligne - Sciences de l'Homme et de la Société LREC - The 9th edition of the Language Resources and Evaluation Conference LREC - The 9th edition of the Language Resources and Evaluation Conference, May 2014, Reykjavik, Iceland |
op_relation | hal-00959079 10670/1.mvkvoa https://hal.inria.fr/hal-00959079/file/logo.pdf https://hal.inria.fr/hal-00959079 |
op_rights | other |
_version_ | 1766038354187517952 |