The Norwegian Dependency Treebank

Bibliographic Details
Main Authors: Solberg, Per Erik; Skjærholt, Arne; Øvrelid, Lilja; Hagen, Kristin; Johannessen, Janne Bondi
Format: Book Part
Language: English
Published: 2014
Online Access: http://hdl.handle.net/10852/39236
http://urn.nb.no/URN:NBN:no-44125
Description
Summary: The Norwegian Dependency Treebank is a new syntactic treebank for Norwegian Bokmål and Nynorsk with manual syntactic and morphological annotation, developed at the National Library of Norway in collaboration with the University of Oslo. It is the first publicly available treebank for Norwegian. This paper presents the core principles behind the syntactic annotation and how these principles were applied in certain specific cases. We then present the selection of texts and their distribution across genres, as well as the annotation process and an evaluation of the inter-annotator agreement. Finally, we present the first results of data-driven dependency parsing of Norwegian, contrasting four state-of-the-art dependency parsers trained on the treebank. The consistency and the parsability of this treebank are shown to be comparable to those of other large treebank initiatives. Proceedings of LREC 2014, the Ninth International Conference on Language Resources and Evaluation, Reykjavik, Iceland. http://www.lrec-conf.org/proceedings/lrec2014/index.html