From Non Word to New Word: Automatically Identifying Neologisms in French Newspapers

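In this paper we present a statistical machine learning approach to neologism detection that goes some way beyond the use of exclusion lists. We explore the impact of three groups of features: form-related, morpho-lexical and thematic features. This last type of feature has not yet been used in this kind of application and represents a way to access the semantic context of new words. The results suggest that form-related features are helpful for the overall classification task, while morpho-lexical and thematic features better single out true neologisms.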

Bibliographic Details
Main Authors: Falk, Ingrid, Bernhard, Delphine, Gérard, Christophe
Other Authors: Linguistique, Langues et Parole (LILPA), Université de Strasbourg (UNISTRA), Logoscope, Contrat IDEX 2012 avec l'Université de Strasbourg, Logoscope
Format: Other/Unknown Material
Language: English
Published: HAL CCSD 2014
Subjects:
psy
Online Access: https://hal.inria.fr/hal-00959079/file/logo.pdf
https://hal.inria.fr/hal-00959079
id fttriple:oai:gotriple.eu:10670/1.mvkvoa
record_format openpolar
spelling fttriple:oai:gotriple.eu:10670/1.mvkvoa 2023-05-15T16:48:15+02:00 Reykjavik, Iceland 2014-05-27 en eng Conference Output https://vocabularies.coar-repositories.org/resource_types/c_c94f/ 2023-01-22T18:25:26Z
institution Open Polar
collection Unknown
op_collection_id fttriple
language English
topic psy
lang
description In this paper we present a statistical machine learning approach to neologism detection that goes some way beyond the use of exclusion lists. We explore the impact of three groups of features: form-related, morpho-lexical and thematic features. This last type of feature has not yet been used in this kind of application and represents a way to access the semantic context of new words. The results suggest that form-related features are helpful for the overall classification task, while morpho-lexical and thematic features better single out true neologisms.
author2 Linguistique, Langues et Parole (LILPA)
Université de Strasbourg (UNISTRA)
Logoscope, Contrat IDEX 2012 avec l'Université de Strasbourg
Logoscope
format Other/Unknown Material
author Falk, Ingrid
Bernhard, Delphine
Gérard, Christophe
author_sort Falk, Ingrid
title From Non Word to New Word: Automatically Identifying Neologisms in French Newspapers
title_sort from non word to new word: automatically identifying neologisms in french newspapers
publisher HAL CCSD
publishDate 2014
url https://hal.inria.fr/hal-00959079/file/logo.pdf
https://hal.inria.fr/hal-00959079
op_coverage Reykjavik, Iceland
genre Iceland
genre_facet Iceland
op_source Hyper Article en Ligne - Sciences de l'Homme et de la Société
LREC - The 9th edition of the Language Resources and Evaluation Conference
LREC - The 9th edition of the Language Resources and Evaluation Conference, May 2014, Reykjavik, Iceland
op_relation hal-00959079
10670/1.mvkvoa
https://hal.inria.fr/hal-00959079/file/logo.pdf
https://hal.inria.fr/hal-00959079
op_rights other
_version_ 1766038354187517952