ANCOR_Centre, a Large Free Spoken French Coreference Corpus: description of the Resource and Reliability Measures
International audience This article presents ANCOR_Centre, a French coreference corpus, available under the Creative Commons Licence. With a size of around 500,000 words, the corpus is large enough to serve the needs of data-driven approaches in NLP and represents one of the largest coreference reso...
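Main Authors: | Judith Muzerelle, Anaïs Lefeuvre, Emmanuel Schang, Jean-Yves Antoine, Aurore Pelletier, Denis Maurel, Iris Eshkol, Jeanne Villaneau |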
---|---|
Other Authors: | , , , , , , , , , , , , , , |
Format: | Conference Object |
Language: | English |
Published: |
HAL CCSD
2014
|
Subjects: | |
Online Access: | https://hal.science/hal-01075679 https://hal.science/hal-01075679/document https://hal.science/hal-01075679/file/2014_LREC_ANCOR.pdf |
id |
ftecolecentrpar:oai:HAL:hal-01075679v1 |
---|---|
record_format |
openpolar |
spelling |
ftecolecentrpar:oai:HAL:hal-01075679v1 2023-08-15T12:41:50+02:00 ANCOR_Centre, a Large Free Spoken French Coreference Corpus: description of the Resource and Reliability Measures Muzerelle, Judith Lefeuvre, Anaïs Schang, Emmanuel Antoine, Jean-Yves Pelletier, Aurore Maurel, Denis Eshkol, Iris Villaneau, Jeanne Laboratoire Ligérien de Linguistique (LLL) Université d'Orléans (UO)-Université de Tours (UT) Bases de données et traitement des langues naturelles (BDTLN) Laboratoire d'Informatique Fondamentale et Appliquée de Tours (LIFAT) Université de Tours (UT)-Institut National des Sciences Appliquées - Centre Val de Loire (INSA CVL) Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS)-Université de Tours (UT)-Institut National des Sciences Appliquées - Centre Val de Loire (INSA CVL) Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS) SEarch, Analyze, Synthesize and Interact with Data Ecosystems (SEASIDE) Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA) Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes) Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-Institut National de Recherche en Informatique et en Automatique (Inria)-Télécom Bretagne-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes) Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-Institut National de Recherche en Informatique et en Automatique (Inria)-Télécom Bretagne-CentraleSupélec-Centre National de 
la Recherche Scientifique (CNRS) Région Centre ELRA Projet ANCOR Reykjavik, Iceland 2014-05-26 https://hal.science/hal-01075679 https://hal.science/hal-01075679/document https://hal.science/hal-01075679/file/2014_LREC_ANCOR.pdf en eng HAL CCSD hal-01075679 https://hal.science/hal-01075679 https://hal.science/hal-01075679/document https://hal.science/hal-01075679/file/2014_LREC_ANCOR.pdf info:eu-repo/semantics/OpenAccess LREC'2014, 9th Language Resources and Evaluation Conference. https://hal.science/hal-01075679 LREC'2014, 9th Language Resources and Evaluation Conference., May 2014, Reykjavik, Iceland. pp.843-847 http://www.lrec-conf.org/proceedings/lrec2014/index.html French spoken language free annotated corpus coreference anaphora [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] info:eu-repo/semantics/conferenceObject Conference papers 2014 ftecolecentrpar 2023-07-25T20:50:12Z International audience This article presents ANCOR_Centre, a French coreference corpus, available under the Creative Commons Licence. With a size of around 500,000 words, the corpus is large enough to serve the needs of data-driven approaches in NLP and represents one of the largest coreference resources currently available. The corpus focuses exclusively on spoken language and aims to represent a variety of spoken genres. ANCOR_Centre includes anaphora as well as coreference relations which involve nominal and pronominal mentions. The paper describes in detail the annotation scheme and the reliability measures computed on the resource. Conference Object Iceland École Centrale Paris: HAL-ECP |
institution |
Open Polar |
collection |
École Centrale Paris: HAL-ECP |
op_collection_id |
ftecolecentrpar |
language |
English |
topic |
French spoken language free annotated corpus coreference anaphora [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] |
spellingShingle |
French spoken language free annotated corpus coreference anaphora [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] Muzerelle, Judith Lefeuvre, Anaïs Schang, Emmanuel Antoine, Jean-Yves Pelletier, Aurore Maurel, Denis Eshkol, Iris Villaneau, Jeanne ANCOR_Centre, a Large Free Spoken French Coreference Corpus: description of the Resource and Reliability Measures |
topic_facet |
French spoken language free annotated corpus coreference anaphora [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] |
description |
International audience This article presents ANCOR_Centre, a French coreference corpus, available under the Creative Commons Licence. With a size of around 500,000 words, the corpus is large enough to serve the needs of data-driven approaches in NLP and represents one of the largest coreference resources currently available. The corpus focuses exclusively on spoken language and aims to represent a variety of spoken genres. ANCOR_Centre includes anaphora as well as coreference relations which involve nominal and pronominal mentions. The paper describes in detail the annotation scheme and the reliability measures computed on the resource. |
author2 |
Laboratoire Ligérien de Linguistique (LLL) Université d'Orléans (UO)-Université de Tours (UT) Bases de données et traitement des langues naturelles (BDTLN) Laboratoire d'Informatique Fondamentale et Appliquée de Tours (LIFAT) Université de Tours (UT)-Institut National des Sciences Appliquées - Centre Val de Loire (INSA CVL) Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS)-Université de Tours (UT)-Institut National des Sciences Appliquées - Centre Val de Loire (INSA CVL) Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS) SEarch, Analyze, Synthesize and Interact with Data Ecosystems (SEASIDE) Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA) Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes) Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-Institut National de Recherche en Informatique et en Automatique (Inria)-Télécom Bretagne-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes) Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-Institut National de Recherche en Informatique et en Automatique (Inria)-Télécom Bretagne-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS) Région Centre ELRA Projet ANCOR |
format |
Conference Object |
author |
Muzerelle, Judith Lefeuvre, Anaïs Schang, Emmanuel Antoine, Jean-Yves Pelletier, Aurore Maurel, Denis Eshkol, Iris Villaneau, Jeanne |
author_facet |
Muzerelle, Judith Lefeuvre, Anaïs Schang, Emmanuel Antoine, Jean-Yves Pelletier, Aurore Maurel, Denis Eshkol, Iris Villaneau, Jeanne |
author_sort |
Muzerelle, Judith |
title |
ANCOR_Centre, a Large Free Spoken French Coreference Corpus: description of the Resource and Reliability Measures |
title_short |
ANCOR_Centre, a Large Free Spoken French Coreference Corpus: description of the Resource and Reliability Measures |
title_full |
ANCOR_Centre, a Large Free Spoken French Coreference Corpus: description of the Resource and Reliability Measures |
title_fullStr |
ANCOR_Centre, a Large Free Spoken French Coreference Corpus: description of the Resource and Reliability Measures |
title_full_unstemmed |
ANCOR_Centre, a Large Free Spoken French Coreference Corpus: description of the Resource and Reliability Measures |
title_sort |
ancor_centre, a large free spoken french coreference corpus: description of the resource and reliability measures |
publisher |
HAL CCSD |
publishDate |
2014 |
url |
https://hal.science/hal-01075679 https://hal.science/hal-01075679/document https://hal.science/hal-01075679/file/2014_LREC_ANCOR.pdf |
op_coverage |
Reykjavik, Iceland |
genre |
Iceland |
genre_facet |
Iceland |
op_source |
LREC'2014, 9th Language Resources and Evaluation Conference. https://hal.science/hal-01075679 LREC'2014, 9th Language Resources and Evaluation Conference., May 2014, Reykjavik, Iceland. pp.843-847 http://www.lrec-conf.org/proceedings/lrec2014/index.html |
op_relation |
hal-01075679 https://hal.science/hal-01075679 https://hal.science/hal-01075679/document https://hal.science/hal-01075679/file/2014_LREC_ANCOR.pdf |
op_rights |
info:eu-repo/semantics/OpenAccess |
_version_ |
1774295348737474560 |