Nonparametric Model for Inupiaq Morphology Tokenization
We show how to use English translations for unsupervised word segmentation of low-resource languages. Inference uses a dynamic programming algorithm for efficient blocked Gibbs sampling. We apply the model to Inupiaq morphological analysis and obtain better results than a monolingual model as well as...
Main Authors: | Thuy Linh N, Stephan Vogel |
---|---|
Other Authors: | |
Format: | Text |
Language: | English |
Subjects: | |
Online Access: | http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.368.1099 http://aclweb.org/anthology/C/C12/C12-3042.pdf |
id |
ftciteseerx:oai:CiteSeerX.psu:10.1.1.368.1099 |
---|---|
record_format |
openpolar |
institution |
Open Polar |
collection |
Unknown |
op_collection_id |
ftciteseerx |
language |
English |
topic |
Nonparametric model; Gibbs sampling; morphology tokenization. Proceedings of COLING 2012: Demonstration Papers, pages 337–344, COLING 2012, Mumbai, December 2012 |
description |
We show how to use English translations for unsupervised word segmentation of low-resource languages. Inference uses a dynamic programming algorithm for efficient blocked Gibbs sampling. We apply the model to Inupiaq morphological analysis and obtain better results than a monolingual model as well as Morfessor output. |
author2 |
The Pennsylvania State University CiteSeerX Archives |
format |
Text |
author |
Thuy Linh N, Stephan Vogel |
author_sort |
Thuy Linh |
title |
Nonparametric Model for Inupiaq Morphology Tokenization |
url |
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.368.1099 http://aclweb.org/anthology/C/C12/C12-3042.pdf |
genre |
Inupiaq |
op_source |
http://aclweb.org/anthology/C/C12/C12-3042.pdf |
op_relation |
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.368.1099 http://aclweb.org/anthology/C/C12/C12-3042.pdf |
op_rights |
Metadata may be used without restrictions as long as the oai identifier remains attached to it. |