MotàMot project: conversion of a French-Khmer published dictionary for building a multilingual lexical system
Economic issues related to information processing techniques are very important. The development of such technologies is a major asset for developing countries like Cambodia and Laos, and emerging ones like Vietnam, Malaysia and Thailand. The MotàMot project aims to computerize an under-resource...
Main Author: | Mangeot, Mathieu |
---|---|
Format: | Report |
Language: | unknown |
Published: | arXiv 2014 |
Subjects: | Computation and Language cs.CL FOS Computer and information sciences |
Online Access: | https://dx.doi.org/10.48550/arxiv.1405.5674 https://arxiv.org/abs/1405.5674 |
id |
ftdatacite:10.48550/arxiv.1405.5674 |
---|---|
record_format |
openpolar |
spelling |
ftdatacite:10.48550/arxiv.1405.5674 2023-05-15T16:50:47+02:00 MotàMot project: conversion of a French-Khmer published dictionary for building a multilingual lexical system Mangeot, Mathieu 2014 https://dx.doi.org/10.48550/arxiv.1405.5674 https://arxiv.org/abs/1405.5674 unknown arXiv arXiv.org perpetual, non-exclusive license http://arxiv.org/licenses/nonexclusive-distrib/1.0/ Computation and Language cs.CL FOS Computer and information sciences Preprint Article article CreativeWork 2014 ftdatacite https://doi.org/10.48550/arxiv.1405.5674 2022-04-01T13:01:31Z Economic issues related to information processing techniques are very important. The development of such technologies is a major asset for developing countries like Cambodia and Laos, and emerging ones like Vietnam, Malaysia and Thailand. The MotàMot project aims to computerize an under-resourced language: Khmer, spoken mainly in Cambodia. The main goal of the project is the development of a multilingual lexical system targeting Khmer. The macrostructure is pivot-based, with each word sense of each language linked to a pivot axie. The microstructure comes from a simplification of the explanatory and combinatory dictionary. The lexical system has been initialized with data coming mainly from the conversion of the French-Khmer bilingual dictionary of Denis Richer from Word to XML format. The French part was completed with pronunciation and parts of speech coming from the FeM French-English-Malay dictionary. The Khmer headwords transcribed in IPA in the Richer dictionary were converted to Khmer script with OpenFST, a finite state transducer tool. The resulting resource is available online for lookup, editing, download and remote programming via a REST API on a Jibiki platform. 8 pages, Language Resources and Evaluation Conference, Reykjavik, Iceland (2014) Report Iceland DataCite Metadata Store (German National Library of Science and Technology) Pivot ENVELOPE(-30.239,-30.239,-80.667,-80.667) |
institution |
Open Polar |
collection |
DataCite Metadata Store (German National Library of Science and Technology) |
op_collection_id |
ftdatacite |
language |
unknown |
topic |
Computation and Language cs.CL FOS Computer and information sciences |
spellingShingle |
Computation and Language cs.CL FOS Computer and information sciences Mangeot, Mathieu MotàMot project: conversion of a French-Khmer published dictionary for building a multilingual lexical system |
topic_facet |
Computation and Language cs.CL FOS Computer and information sciences |
description |
Economic issues related to information processing techniques are very important. The development of such technologies is a major asset for developing countries like Cambodia and Laos, and emerging ones like Vietnam, Malaysia and Thailand. The MotàMot project aims to computerize an under-resourced language: Khmer, spoken mainly in Cambodia. The main goal of the project is the development of a multilingual lexical system targeting Khmer. The macrostructure is pivot-based, with each word sense of each language linked to a pivot axie. The microstructure comes from a simplification of the explanatory and combinatory dictionary. The lexical system has been initialized with data coming mainly from the conversion of the French-Khmer bilingual dictionary of Denis Richer from Word to XML format. The French part was completed with pronunciation and parts of speech coming from the FeM French-English-Malay dictionary. The Khmer headwords transcribed in IPA in the Richer dictionary were converted to Khmer script with OpenFST, a finite state transducer tool. The resulting resource is available online for lookup, editing, download and remote programming via a REST API on a Jibiki platform. 8 pages, Language Resources and Evaluation Conference, Reykjavik, Iceland (2014) |
format |
Report |
author |
Mangeot, Mathieu |
author_facet |
Mangeot, Mathieu |
author_sort |
Mangeot, Mathieu |
title |
MotàMot project: conversion of a French-Khmer published dictionary for building a multilingual lexical system |
title_short |
MotàMot project: conversion of a French-Khmer published dictionary for building a multilingual lexical system |
title_full |
MotàMot project: conversion of a French-Khmer published dictionary for building a multilingual lexical system |
title_fullStr |
MotàMot project: conversion of a French-Khmer published dictionary for building a multilingual lexical system |
title_full_unstemmed |
MotàMot project: conversion of a French-Khmer published dictionary for building a multilingual lexical system |
title_sort |
motàmot project: conversion of a french-khmer published dictionary for building a multilingual lexical system |
publisher |
arXiv |
publishDate |
2014 |
url |
https://dx.doi.org/10.48550/arxiv.1405.5674 https://arxiv.org/abs/1405.5674 |
long_lat |
ENVELOPE(-30.239,-30.239,-80.667,-80.667) |
geographic |
Pivot |
geographic_facet |
Pivot |
genre |
Iceland |
genre_facet |
Iceland |
op_rights |
arXiv.org perpetual, non-exclusive license http://arxiv.org/licenses/nonexclusive-distrib/1.0/ |
op_doi |
https://doi.org/10.48550/arxiv.1405.5674 |
_version_ |
1766040897476100096 |