Correcting Errors in Digital Lexicographic Resources Using a Dictionary Manipulation Language

We describe a paradigm for combining manual and automatic error correction of noisy structured lexicographic data. Modifications to the structure and underlying text of the lexicographic data are expressed in a simple, interpreted programming language. Dictionary Manipulation Language (DML) commands...
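As an illustration of the idea in the abstract (commands that address nodes by unique identifier and perform operations such as create, move, and set text, applied in sequence to a source lexicon), here is a minimal Python sketch. It is not the authors' DML: the tuple-based command format, the operation names `set_text`, `create`, and `move`, the helper functions, and the toy lexicon with its element names and ids are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Toy source lexicon; element names, ids, and contents are invented for this sketch.
SOURCE = (
    '<lexicon>'
    '<entry id="e1"><form id="f1">kitab</form><sense id="s1">bok</sense></entry>'
    '<entry id="e2"><form id="f2">qalam </form><sense id="s2">pen</sense>'
    '<sense id="s3">volume</sense></entry>'
    '</lexicon>'
)


def index_by_id(root):
    """Map each unique id attribute to its element."""
    return {el.get("id"): el for el in root.iter() if el.get("id") is not None}


def parent_map(root):
    """Map each element to its parent (ElementTree stores no parent links)."""
    return {child: parent for parent in root.iter() for child in parent}


def apply_commands(root, commands):
    """Apply hypothetical DML-like commands, in order, to the tree in place."""
    for cmd in commands:
        nodes = index_by_id(root)  # rebuilt each step so created nodes are visible
        op = cmd[0]
        if op == "set_text":
            _, node_id, text = cmd
            nodes[node_id].text = text
        elif op == "create":
            _, parent_id, tag, new_id = cmd
            ET.SubElement(nodes[parent_id], tag, {"id": new_id})
        elif op == "move":
            _, node_id, new_parent_id = cmd
            node = nodes[node_id]
            parent_map(root)[node].remove(node)
            nodes[new_parent_id].append(node)
        else:
            raise ValueError(f"unknown command: {op!r}")
    return root


def generate_strip_commands(root):
    """Automatically generate set_text commands for a recurring problem:
    stray whitespace around node text."""
    for el in root.iter():
        if el.get("id") and el.text and el.text != el.text.strip():
            yield ("set_text", el.get("id"), el.text.strip())


# Hand-written commands for one-off errors.
manual = [
    ("set_text", "s1", "book"),      # fix a typo
    ("move", "s3", "e1"),            # relocate a sense attached to the wrong entry
    ("create", "e2", "note", "n1"),  # add a new node
    ("set_text", "n1", "checked"),
]

root = ET.fromstring(SOURCE)  # the SOURCE string itself is never modified
automatic = list(generate_strip_commands(root))
corrected = apply_commands(root, manual + automatic)
print(ET.tostring(corrected, encoding="unicode"))
```

Because the corrections live in a command list rather than in the edited file, the same sequence can be re-applied to a fresh parse of the source, and manual and generated commands can be mixed freely in one run.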


Bibliographic Details
Main Authors: Zajic, David; Maxwell, Michael; Doermann, David; Rodrigues, Paul; Bloodgood, Michael
Format: Article in Journal/Newspaper
Language: English
Published: Trojina Institute for Applied Slovene Studies 2011
Subjects:
XML
DML
Online Access: http://hdl.handle.net/1903/15577
https://doi.org/10.13016/M2RP4W
id ftunivmaryland:oai:drum.lib.umd.edu:1903/15577
record_format openpolar
institution Open Polar
collection University of Maryland: Digital Repository (DRUM)
op_collection_id ftunivmaryland
language English
topic computer science
computational linguistics
noisy structured data
error correction
digital lexicography
electronic lexicography
XML
digital bilingual dictionaries
description We describe a paradigm for combining manual and automatic error correction of noisy structured lexicographic data. Modifications to the structure and underlying text of the lexicographic data are expressed in a simple, interpreted programming language. Dictionary Manipulation Language (DML) commands identify nodes by unique identifiers, and manipulations are performed using simple commands such as create, move, set text, etc. Corrected lexicons are produced by applying sequences of DML commands to the source version of the lexicon. DML commands can be written manually to repair one-off errors or generated automatically to correct recurring problems. We discuss advantages of the paradigm for the task of editing digital bilingual dictionaries.
This material is based upon work supported, in whole or in part, with funding from the United States Government. Any opinions, findings and conclusions, or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the University of Maryland, College Park and/or any agency or entity of the United States Government. Nothing in this report is intended to be and shall not be treated or construed as an endorsement or recommendation by the University of Maryland, United States Government, or the authors of the product, process, or service that is the subject of this report. No one may use any information contained or based on this report in advertisements or promotional materials related to any company product, process, or service or in support of other commercial purposes.
format Article in Journal/Newspaper
author Zajic, David
Maxwell, Michael
Doermann, David
Rodrigues, Paul
Bloodgood, Michael
title Correcting Errors in Digital Lexicographic Resources Using a Dictionary Manipulation Language
publisher Trojina Institute for Applied Slovene Studies
publishDate 2011
url http://hdl.handle.net/1903/15577
https://doi.org/10.13016/M2RP4W
genre DML
op_relation Center for Advanced Study of Language
Digital Repository at the University of Maryland
University of Maryland (College Park, Md)
doi:10.13016/M2RP4W
David Zajic, Michael Maxwell, David Doermann, Paul Rodrigues, and Michael Bloodgood. 2011. Correcting errors in digital lexicographic resources using a dictionary manipulation language. In Proceedings of Electronic Lexicography in the 21st Century (eLex), pages 297-301, Bled, Slovenia, November. Trojina Institute for Applied Slovene Studies.
http://hdl.handle.net/1903/15577
op_doi https://doi.org/10.13016/M2RP4W
_version_ 1766397369554829312