Computerization of African languages-French dictionaries
8 pages. International audience. This paper reports on work done during the DiLAF project. The work consists of converting five bilingual African language-French dictionaries, originally in Word format, into XML following the LMF model. The languages processed are Bambara, Hausa, Kanuri, Tamajaq and Songhai-Zarma,...
Main Authors: | Enguehard, Chantal, Mangeot, Mathieu |
---|---|
Other Authors: | , , , , , , , , |
Format: | Conference Object |
Language: | English |
Published: | HAL CCSD 2014 |
Subjects: | |
Online Access: | https://hal.archives-ouvertes.fr/hal-00994821 https://hal.archives-ouvertes.fr/hal-00994821/document https://hal.archives-ouvertes.fr/hal-00994821/file/ENGUEHARD_DiLAF_WSLREC2014_final_en.pdf |
id |
ftccsdartic:oai:HAL:hal-00994821v1 |
---|---|
record_format |
openpolar |
spelling |
ftccsdartic:oai:HAL:hal-00994821v1 2023-05-15T16:49:32+02:00 Computerization of African languages-French dictionaries Enguehard, Chantal Mangeot, Mathieu Laboratoire d'Informatique de Nantes Atlantique (LINA) Mines Nantes (Mines Nantes)-Université de Nantes (UN)-Centre National de la Recherche Scientifique (CNRS) Groupe d’Étude en Traduction Automatique/Traitement Automatisé des Langues et de la Parole (GETALP) Laboratoire d'Informatique de Grenoble (LIG) Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Institut National Polytechnique de Grenoble (INPG)-Centre National de la Recherche Scientifique (CNRS)-Université Pierre Mendès France - Grenoble 2 (UPMF)-Université Joseph Fourier - Grenoble 1 (UJF)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Institut National Polytechnique de Grenoble (INPG)-Centre National de la Recherche Scientifique (CNRS)-Université Pierre Mendès France - Grenoble 2 (UPMF)-Université Joseph Fourier - Grenoble 1 (UJF) Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Institut National Polytechnique de Grenoble (INPG)-Centre National de la Recherche Scientifique (CNRS)-Université Pierre Mendès France - Grenoble 2 (UPMF)-Université Joseph Fourier - Grenoble 1 (UJF) Université Savoie Mont Blanc (USMB Université de Savoie Université de Chambéry ) Financement : projet autoroutes de l'information de l'Organisation Internationale de la Francophonie Projet DiLAF Reykjavik, Iceland 2014-05-26 https://hal.archives-ouvertes.fr/hal-00994821 https://hal.archives-ouvertes.fr/hal-00994821/document https://hal.archives-ouvertes.fr/hal-00994821/file/ENGUEHARD_DiLAF_WSLREC2014_final_en.pdf en eng HAL CCSD info:eu-repo/semantics/altIdentifier/arxiv/1405.5893 hal-00994821 https://hal.archives-ouvertes.fr/hal-00994821 https://hal.archives-ouvertes.fr/hal-00994821/document 
https://hal.archives-ouvertes.fr/hal-00994821/file/ENGUEHARD_DiLAF_WSLREC2014_final_en.pdf ARXIV: 1405.5893 info:eu-repo/semantics/OpenAccess CCURL 2014 : Collaboration and Computing for Under Resourced Languages in the Linked Open Data Era https://hal.archives-ouvertes.fr/hal-00994821 CCURL 2014 : Collaboration and Computing for Under Resourced Languages in the Linked Open Data Era, May 2014, Reykjavik, Iceland. pp.121 Jibiki kanouri haoussa tamajaq zarma bambara XML LMF DiLAF [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] info:eu-repo/semantics/conferenceObject Conference papers 2014 ftccsdartic 2021-10-24T13:26:26Z 8 pages. International audience. This paper reports on work done during the DiLAF project. The work consists of converting five bilingual African language-French dictionaries, originally in Word format, into XML following the LMF model. The languages processed are Bambara, Hausa, Kanuri, Tamajaq and Songhai-Zarma, which are still considered under-resourced in terms of natural language processing tools. Once converted, the dictionaries are available online on the Jibiki platform for lookup and modification. The paper first presents the DiLAF project and then describes each dictionary. It next presents the methodology for converting .doc files to XML, with a specific discussion of Unicode usage, and details each step of the conversion into XML and LMF. The last part presents the Jibiki lexical resources management platform used for the project. Conference Object Iceland Archive ouverte HAL (Hyper Article en Ligne, CCSD - Centre pour la Communication Scientifique Directe) |
institution |
Open Polar |
collection |
Archive ouverte HAL (Hyper Article en Ligne, CCSD - Centre pour la Communication Scientifique Directe) |
op_collection_id |
ftccsdartic |
language |
English |
topic |
Jibiki kanouri haoussa tamajaq zarma bambara XML LMF DiLAF [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] |
spellingShingle |
Jibiki kanouri haoussa tamajaq zarma bambara XML LMF DiLAF [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] Enguehard, Chantal Mangeot, Mathieu Computerization of African languages-French dictionaries |
topic_facet |
Jibiki kanouri haoussa tamajaq zarma bambara XML LMF DiLAF [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] |
description |
8 pages. International audience. This paper reports on work done during the DiLAF project. The work consists of converting five bilingual African language-French dictionaries, originally in Word format, into XML following the LMF model. The languages processed are Bambara, Hausa, Kanuri, Tamajaq and Songhai-Zarma, which are still considered under-resourced in terms of natural language processing tools. Once converted, the dictionaries are available online on the Jibiki platform for lookup and modification. The paper first presents the DiLAF project and then describes each dictionary. It next presents the methodology for converting .doc files to XML, with a specific discussion of Unicode usage, and details each step of the conversion into XML and LMF. The last part presents the Jibiki lexical resources management platform used for the project. |
author2 |
Laboratoire d'Informatique de Nantes Atlantique (LINA) Mines Nantes (Mines Nantes)-Université de Nantes (UN)-Centre National de la Recherche Scientifique (CNRS) Groupe d’Étude en Traduction Automatique/Traitement Automatisé des Langues et de la Parole (GETALP) Laboratoire d'Informatique de Grenoble (LIG) Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP)-Institut National Polytechnique de Grenoble (INPG)-Centre National de la Recherche Scientifique (CNRS)-Université Pierre Mendès France - Grenoble 2 (UPMF)-Université Joseph Fourier - Grenoble 1 (UJF)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP)-Institut National Polytechnique de Grenoble (INPG)-Centre National de la Recherche Scientifique (CNRS)-Université Pierre Mendès France - Grenoble 2 (UPMF)-Université Joseph Fourier - Grenoble 1 (UJF) Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP)-Institut National Polytechnique de Grenoble (INPG)-Centre National de la Recherche Scientifique (CNRS)-Université Pierre Mendès France - Grenoble 2 (UPMF)-Université Joseph Fourier - Grenoble 1 (UJF) Université Savoie Mont Blanc (USMB Université de Savoie Université de Chambéry) Funding: information highways project (projet autoroutes de l'information) of the Organisation Internationale de la Francophonie; DiLAF project |
format |
Conference Object |
author |
Enguehard, Chantal Mangeot, Mathieu |
author_facet |
Enguehard, Chantal Mangeot, Mathieu |
author_sort |
Enguehard, Chantal |
title |
Computerization of African languages-French dictionaries |
title_short |
Computerization of African languages-French dictionaries |
title_full |
Computerization of African languages-French dictionaries |
title_fullStr |
Computerization of African languages-French dictionaries |
title_full_unstemmed |
Computerization of African languages-French dictionaries |
title_sort |
computerization of african languages-french dictionaries |
publisher |
HAL CCSD |
publishDate |
2014 |
url |
https://hal.archives-ouvertes.fr/hal-00994821 https://hal.archives-ouvertes.fr/hal-00994821/document https://hal.archives-ouvertes.fr/hal-00994821/file/ENGUEHARD_DiLAF_WSLREC2014_final_en.pdf |
op_coverage |
Reykjavik, Iceland |
genre |
Iceland |
genre_facet |
Iceland |
op_source |
CCURL 2014 : Collaboration and Computing for Under Resourced Languages in the Linked Open Data Era https://hal.archives-ouvertes.fr/hal-00994821 CCURL 2014 : Collaboration and Computing for Under Resourced Languages in the Linked Open Data Era, May 2014, Reykjavik, Iceland. pp.121 |
op_relation |
info:eu-repo/semantics/altIdentifier/arxiv/1405.5893 hal-00994821 https://hal.archives-ouvertes.fr/hal-00994821 https://hal.archives-ouvertes.fr/hal-00994821/document https://hal.archives-ouvertes.fr/hal-00994821/file/ENGUEHARD_DiLAF_WSLREC2014_final_en.pdf ARXIV: 1405.5893 |
op_rights |
info:eu-repo/semantics/OpenAccess |
_version_ |
1766039661889716224 |