French Resources for Extraction and Normalization of Temporal Expressions with HeidelTime

International audience In this paper, we describe the development of French resources for the extraction and normalization of temporal expressions with HeidelTime, an open-source multilingual, cross-domain temporal tagger. HeidelTime extracts temporal expressions from documents and normalizes them ac...

Bibliographic Details
Main Authors: Moriceau, Véronique, Tannier, Xavier
Other Authors: Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur (LIMSI), Université Paris-Sud - Paris 11 (UP11)-Sorbonne Université - UFR d'Ingénierie (UFR 919), Sorbonne Université (SU)-Sorbonne Université (SU)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)-Université Paris Saclay (COmUE), European Language Resources Association (ELRA), Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis, ANR-10-CORD-0010,ChronoLines,Génération de Chronologies Evènementielles visuelles(2010)
Format: Conference Object
Language: English
Published: HAL CCSD 2014
Subjects: [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]
Online Access: https://hal.science/hal-02489652
id ftanrparis:oai:HAL:hal-02489652v1
record_format openpolar
spelling ftanrparis:oai:HAL:hal-02489652v1 2023-11-12T04:19:19+01:00 French Resources for Extraction and Normalization of Temporal Expressions with HeidelTime Moriceau, Véronique Tannier, Xavier Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur (LIMSI) Université Paris-Sud - Paris 11 (UP11)-Sorbonne Université - UFR d'Ingénierie (UFR 919) Sorbonne Université (SU)-Sorbonne Université (SU)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)-Université Paris Saclay (COmUE) European Language Resources Association (ELRA) Nicoletta Calzolari Khalid Choukri Thierry Declerck Hrafn Loftsson Bente Maegaard Joseph Mariani Asuncion Moreno Jan Odijk Stelios Piperidis ANR-10-CORD-0010,ChronoLines,Génération de Chronologies Evènementielles visuelles(2010) Reykjavík, Iceland 2014-05-26 https://hal.science/hal-02489652 en eng HAL CCSD European Language Resources Association (ELRA) hal-02489652 https://hal.science/hal-02489652 http://creativecommons.org/licenses/by/ Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14) 9th International Conference on Language Resources and Evaluation (LREC 2014) https://hal.science/hal-02489652 9th International Conference on Language Resources and Evaluation (LREC 2014), European Language Resources Association (ELRA), May 2014, Reykjavík, Iceland. pp.3239-3243 https://aclanthology.org/L14-1382/ [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] info:eu-repo/semantics/conferenceObject Conference papers 2014 ftanrparis 2023-10-14T21:34:02Z International audience In this paper, we describe the development of French resources for the extraction and normalization of temporal expressions with HeidelTime, an open-source multilingual, cross-domain temporal tagger. HeidelTime extracts temporal expressions from documents and normalizes them according to the TIMEX3 annotation standard.
Several types of temporal expressions are extracted: dates, times, durations and temporal sets. The French resources have been evaluated in two different ways: on the French TimeBank corpus, a corpus of newspaper articles in French annotated according to the ISO-TimeML standard, and on a user application for automatic building of event timelines. Results on the French TimeBank are quite satisfying, as they are comparable to those obtained by HeidelTime in English and Spanish on newswire articles. For the user application, we used two temporal taggers to preprocess the corpus in order to compare their performance, and the results show that our application performs better on French documents with HeidelTime. The French resources and evaluation scripts are publicly available with HeidelTime. Conference Object Iceland Reykjavík Reykjavík Portail HAL-ANR (Agence Nationale de la Recherche) Reykjavík
institution Open Polar
collection Portail HAL-ANR (Agence Nationale de la Recherche)
op_collection_id ftanrparis
language English
topic [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]
spellingShingle [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]
Moriceau, Véronique
Tannier, Xavier
French Resources for Extraction and Normalization of Temporal Expressions with HeidelTime
topic_facet [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]
description International audience In this paper, we describe the development of French resources for the extraction and normalization of temporal expressions with HeidelTime, an open-source multilingual, cross-domain temporal tagger. HeidelTime extracts temporal expressions from documents and normalizes them according to the TIMEX3 annotation standard. Several types of temporal expressions are extracted: dates, times, durations and temporal sets. The French resources have been evaluated in two different ways: on the French TimeBank corpus, a corpus of newspaper articles in French annotated according to the ISO-TimeML standard, and on a user application for automatic building of event timelines. Results on the French TimeBank are quite satisfying, as they are comparable to those obtained by HeidelTime in English and Spanish on newswire articles. For the user application, we used two temporal taggers to preprocess the corpus in order to compare their performance, and the results show that our application performs better on French documents with HeidelTime. The French resources and evaluation scripts are publicly available with HeidelTime.
author2 Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur (LIMSI)
Université Paris-Sud - Paris 11 (UP11)-Sorbonne Université - UFR d'Ingénierie (UFR 919)
Sorbonne Université (SU)-Sorbonne Université (SU)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)-Université Paris Saclay (COmUE)
European Language Resources Association (ELRA)
Nicoletta Calzolari
Khalid Choukri
Thierry Declerck
Hrafn Loftsson
Bente Maegaard
Joseph Mariani
Asuncion Moreno
Jan Odijk
Stelios Piperidis
ANR-10-CORD-0010,ChronoLines,Génération de Chronologies Evènementielles visuelles(2010)
format Conference Object
author Moriceau, Véronique
Tannier, Xavier
author_facet Moriceau, Véronique
Tannier, Xavier
author_sort Moriceau, Véronique
title French Resources for Extraction and Normalization of Temporal Expressions with HeidelTime
title_short French Resources for Extraction and Normalization of Temporal Expressions with HeidelTime
title_full French Resources for Extraction and Normalization of Temporal Expressions with HeidelTime
title_fullStr French Resources for Extraction and Normalization of Temporal Expressions with HeidelTime
title_full_unstemmed French Resources for Extraction and Normalization of Temporal Expressions with HeidelTime
title_sort french resources for extraction and normalization of temporal expressions with heideltime
publisher HAL CCSD
publishDate 2014
url https://hal.science/hal-02489652
op_coverage Reykjavík, Iceland
geographic Reykjavík
geographic_facet Reykjavík
genre Iceland
Reykjavík
Reykjavík
genre_facet Iceland
Reykjavík
Reykjavík
op_source Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
9th International Conference on Language Resources and Evaluation (LREC 2014)
https://hal.science/hal-02489652
9th International Conference on Language Resources and Evaluation (LREC 2014), European Language Resources Association (ELRA), May 2014, Reykjavík, Iceland. pp.3239-3243
https://aclanthology.org/L14-1382/
op_relation hal-02489652
https://hal.science/hal-02489652
op_rights http://creativecommons.org/licenses/by/
_version_ 1782335787209064448