French Resources for Extraction and Normalization of Temporal Expressions with HeidelTime
International audience In this paper, we describe the development of French resources for the extraction and normalization of temporal expressions with HeidelTime, an open-source multilingual, cross-domain temporal tagger. HeidelTime extracts temporal expressions from documents and normalizes them ac...
Main Authors: | Moriceau, Véronique; Tannier, Xavier |
---|---|
Other Authors: | , , , , , , , , , , , , , |
Format: | Conference Object |
Language: | English |
Published: | HAL CCSD 2014 |
Subjects: | [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] |
Online Access: | https://hal.science/hal-02489652 |
id |
ftuniparissaclay:oai:HAL:hal-02489652v1 |
---|---|
record_format |
openpolar |
spelling |
ftuniparissaclay:oai:HAL:hal-02489652v1 2023-11-12T04:19:19+01:00 French Resources for Extraction and Normalization of Temporal Expressions with HeidelTime Moriceau, Véronique Tannier, Xavier Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur (LIMSI) Université Paris-Sud - Paris 11 (UP11)-Sorbonne Université - UFR d'Ingénierie (UFR 919) Sorbonne Université (SU)-Sorbonne Université (SU)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)-Université Paris Saclay (COmUE) European Language Resources Association (ELRA) Nicoletta Calzolari Khalid Choukri Thierry Declerck Hrafn Loftsson Bente Maegaard Joseph Mariani Asuncion Moreno Jan Odijk Stelios Piperidis ANR-10-CORD-0010,ChronoLines,Génération de Chronologies Evènementielles visuelles(2010) Reykjavík, Iceland 2014-05-26 https://hal.science/hal-02489652 en eng HAL CCSD European Language Resources Association (ELRA) hal-02489652 https://hal.science/hal-02489652 http://creativecommons.org/licenses/by/ Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14) 9th International Conference on Language Resources and Evaluation (LREC 2014) https://hal.science/hal-02489652 9th International Conference on Language Resources and Evaluation (LREC 2014), European Language Resources Association (ELRA), May 2014, Reykjavík, Iceland. pp.3239-3243 https://aclanthology.org/L14-1382/ [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] info:eu-repo/semantics/conferenceObject Conference papers 2014 ftuniparissaclay 2023-10-14T21:50:52Z International audience In this paper, we describe the development of French resources for the extraction and normalization of temporal expressions with HeidelTime, an open-source multilingual, cross-domain temporal tagger. HeidelTime extracts temporal expressions from documents and normalizes them according to the TIMEX3 annotation standard. 
Several types of temporal expressions are extracted: dates, times, durations and temporal sets. French resources have been evaluated in two different ways: on the French TimeBank corpus, a corpus of newspaper articles in French annotated according to the ISO-TimeML standard, and on a user application for automatic building of event timelines. Results on the French TimeBank are quite satisfying, as they are comparable to those obtained by HeidelTime in English and Spanish on newswire articles. For the user application, we preprocessed the corpus with two temporal taggers in order to compare their performance; the results show that our application performs better on French documents with HeidelTime. The French resources and evaluation scripts are publicly available with HeidelTime. Conference Object Iceland Reykjavík Reykjavík Archives ouvertes de Paris-Saclay Reykjavík |
institution |
Open Polar |
collection |
Archives ouvertes de Paris-Saclay |
op_collection_id |
ftuniparissaclay |
language |
English |
topic |
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] |
spellingShingle |
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] Moriceau, Véronique Tannier, Xavier French Resources for Extraction and Normalization of Temporal Expressions with HeidelTime |
topic_facet |
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] |
description |
International audience In this paper, we describe the development of French resources for the extraction and normalization of temporal expressions with HeidelTime, an open-source multilingual, cross-domain temporal tagger. HeidelTime extracts temporal expressions from documents and normalizes them according to the TIMEX3 annotation standard. Several types of temporal expressions are extracted: dates, times, durations and temporal sets. French resources have been evaluated in two different ways: on the French TimeBank corpus, a corpus of newspaper articles in French annotated according to the ISO-TimeML standard, and on a user application for automatic building of event timelines. Results on the French TimeBank are quite satisfying, as they are comparable to those obtained by HeidelTime in English and Spanish on newswire articles. For the user application, we preprocessed the corpus with two temporal taggers in order to compare their performance; the results show that our application performs better on French documents with HeidelTime. The French resources and evaluation scripts are publicly available with HeidelTime. |
author2 |
Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur (LIMSI) Université Paris-Sud - Paris 11 (UP11)-Sorbonne Université - UFR d'Ingénierie (UFR 919) Sorbonne Université (SU)-Sorbonne Université (SU)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)-Université Paris Saclay (COmUE) European Language Resources Association (ELRA) Nicoletta Calzolari Khalid Choukri Thierry Declerck Hrafn Loftsson Bente Maegaard Joseph Mariani Asuncion Moreno Jan Odijk Stelios Piperidis ANR-10-CORD-0010,ChronoLines,Génération de Chronologies Evènementielles visuelles(2010) |
format |
Conference Object |
author |
Moriceau, Véronique Tannier, Xavier |
author_facet |
Moriceau, Véronique Tannier, Xavier |
author_sort |
Moriceau, Véronique |
title |
French Resources for Extraction and Normalization of Temporal Expressions with HeidelTime |
title_short |
French Resources for Extraction and Normalization of Temporal Expressions with HeidelTime |
title_full |
French Resources for Extraction and Normalization of Temporal Expressions with HeidelTime |
title_fullStr |
French Resources for Extraction and Normalization of Temporal Expressions with HeidelTime |
title_full_unstemmed |
French Resources for Extraction and Normalization of Temporal Expressions with HeidelTime |
title_sort |
french resources for extraction and normalization of temporal expressions with heideltime |
publisher |
HAL CCSD |
publishDate |
2014 |
url |
https://hal.science/hal-02489652 |
op_coverage |
Reykjavík, Iceland |
geographic |
Reykjavík |
geographic_facet |
Reykjavík |
genre |
Iceland Reykjavík Reykjavík |
genre_facet |
Iceland Reykjavík Reykjavík |
op_source |
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14) 9th International Conference on Language Resources and Evaluation (LREC 2014) https://hal.science/hal-02489652 9th International Conference on Language Resources and Evaluation (LREC 2014), European Language Resources Association (ELRA), May 2014, Reykjavík, Iceland. pp.3239-3243 https://aclanthology.org/L14-1382/ |
op_relation |
hal-02489652 https://hal.science/hal-02489652 |
op_rights |
http://creativecommons.org/licenses/by/ |
_version_ |
1782335787003543552 |