IceNLP: A Natural Language Processing Toolkit for Icelandic

Icelandic is a morphologically complex language, for which language technology resources are scarce. Only a few years ago, it could be stated that language technology was practically non-existent in Iceland. In this paper, we describe the development of an NLP toolkit for processing the language, th...

Bibliographic Details
Main Authors: Hrafn Loftsson, Eiríkur Rögnvaldsson
Other Authors: The Pennsylvania State University CiteSeerX Archives
Format: Text
Language: English
Subjects:
Online Access: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.330.5629
http://www.ru.is/faculty/hrafn/Papers/IceNLP_final2.pdf
id ftciteseerx:oai:CiteSeerX.psu:10.1.1.330.5629
record_format openpolar
institution Open Polar
collection Unknown
op_collection_id ftciteseerx
language English
description Icelandic is a morphologically complex language, for which language technology resources are scarce. Only a few years ago, it could be stated that language technology was practically non-existent in Iceland. In this paper, we describe the development of an NLP toolkit for processing the language, the challenges faced and the decisions made during development. The current version of the toolkit consists of a tokeniser/sentence segmentiser, a morphological analyser, a linguistic rule-based tagger, and a finite-state parser. The development of our toolkit is a step towards building a Basic Language Resource Toolkit (BLARK) for the Icelandic language. Index Terms: morphological analysis, part-of-speech tagging, finite-state parsing
author2 The Pennsylvania State University CiteSeerX Archives
format Text
author Hrafn Loftsson
Eiríkur Rögnvaldsson
title IceNLP: A Natural Language Processing Toolkit for Icelandic
url http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.330.5629
http://www.ru.is/faculty/hrafn/Papers/IceNLP_final2.pdf
genre Iceland
op_source http://www.ru.is/faculty/hrafn/Papers/IceNLP_final2.pdf
op_relation http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.330.5629
http://www.ru.is/faculty/hrafn/Papers/IceNLP_final2.pdf
op_rights Metadata may be used without restrictions as long as the oai identifier remains attached to it.