Linguistics vs. digital editions: The Tromsø Old Russian and OCS Treebank
Source at http://e-scripta.ilit.bas.bg/archives/year-2015/issue-14-15 . Journal home page at http://e-scripta.ilit.bas.bg/ . The Tromsø Old Russian and OCS Treebank (TOROT, nestor.uit.no) is, along with its parent treebank, the PROIEL corpus (foni.uio.no), the only existing treebank of Old Church S...
Main Authors: | Eckhoff, Hanne Martine; Berdicevskis, Aleksandrs |
---|---|
Format: | Article in Journal/Newspaper |
Language: | English |
Published: | Institute for Literature, Bulgarian Academy of Sciences, 2015 |
Subjects: | |
Online Access: | https://hdl.handle.net/10037/22366 |
_version_ | 1829300235137974272 |
---|---|
author | Eckhoff, Hanne Martine; Berdicevskis, Aleksandrs |
author_facet | Eckhoff, Hanne Martine; Berdicevskis, Aleksandrs |
author_sort | Eckhoff, Hanne Martine |
collection | University of Tromsø: Munin Open Research Archive |
description | Source at http://e-scripta.ilit.bas.bg/archives/year-2015/issue-14-15 . Journal home page at http://e-scripta.ilit.bas.bg/ . The Tromsø Old Russian and OCS Treebank (TOROT, nestor.uit.no) is, along with its parent treebank, the PROIEL corpus (foni.uio.no), the only existing treebank of Old Church Slavonic (OCS), Old East Slavic and Middle Russian texts. There are other tagged resources, such as the Old Russian subcorpus of the Russian National Corpus and the Manuskript corpus, but none of them, to our knowledge, currently provide syntactic annotation. The TOROT presently contains approximately 160,000 word tokens of fully annotated OCS (Codex Marianus and Codex Suprasliensis), 85,000 word tokens of fully annotated Kiev-era Old East Slavic, and 60,000 word tokens of fully annotated 15th–17th-century Middle Russian. In addition, it contains the Codex Zographensis with automatic and partially hand-corrected morphological annotation and lemmatisation (sections of the Gospels missing in the Codex Marianus also have full syntactic annotation), and the PROIEL version of the Greek Gospels, with which the Codex Marianus and the Codex Zographensis are both aligned at token level (automatically, then hand-corrected). |
format | Article in Journal/Newspaper |
genre | Tromsø |
genre_facet | Tromsø |
geographic | Tromsø |
geographic_facet | Tromsø |
id | ftunivtroemsoe:oai:munin.uit.no:10037/22366 |
institution | Open Polar |
language | English |
op_collection_id | ftunivtroemsoe |
op_relation | Scripta & e-Scripta info:eu-repo/grantAgreement/RCN/FRIHUMSAM/222506/Norway/Birds & Beasts: Shaping Events in Old Russian// FRIDAID 1266416 https://hdl.handle.net/10037/22366 |
op_rights | openAccess Copyright 2015 The Author(s) |
publishDate | 2015 |
publisher | Institute for Literature, Bulgarian Academy of Sciences |
record_format | openpolar |
spelling | ftunivtroemsoe:oai:munin.uit.no:10037/22366 2025-04-13T14:27:34+00:00 Linguistics vs. digital editions: The Tromsø Old Russian and OCS Treebank Eckhoff, Hanne Martine Berdicevskis, Aleksandrs 2015 https://hdl.handle.net/10037/22366 eng eng Institute for Literature, Bulgarian Academy of Sciences Scripta & e-Scripta info:eu-repo/grantAgreement/RCN/FRIHUMSAM/222506/Norway/Birds & Beasts: Shaping Events in Old Russian// FRIDAID 1266416 https://hdl.handle.net/10037/22366 openAccess Copyright 2015 The Author(s) VDP::Humanities: 000::Linguistics: 010::Russian language: 028 VDP::Humaniora: 000::Språkvitenskapelige fag: 010::Russisk språk: 028 VDP::Humanities: 000::Linguistics: 010::Other Slavic languages: 029 VDP::Humaniora: 000::Språkvitenskapelige fag: 010::Andre slaviske språk: 029 Journal article Tidsskriftartikkel Peer reviewed publishedVersion 2015 ftunivtroemsoe 2025-03-14T05:17:56Z Source at http://e-scripta.ilit.bas.bg/archives/year-2015/issue-14-15 . Journal home page at http://e-scripta.ilit.bas.bg/ . The Tromsø Old Russian and OCS Treebank (TOROT, nestor.uit.no) is, along with its parent treebank, the PROIEL corpus (foni.uio.no), the only existing treebank of Old Church Slavonic (OCS), Old East Slavic and Middle Russian texts. There are other tagged resources, such as the Old Russian subcorpus of the Russian National Corpus and the Manuskript corpus, but none of them, to our knowledge, currently provide syntactic annotation. The TOROT presently contains approximately 160,000 word tokens of fully annotated OCS (Codex Marianus and Codex Suprasliensis), 85,000 word tokens of fully annotated Kiev-era Old East Slavic, and 60,000 word tokens of fully annotated 15th–17th-century Middle Russian. In addition, it contains the Codex Zographensis with automatic and partially hand-corrected morphological annotation and lemmatisation (sections of the Gospels missing in the Codex Marianus also have full syntactic annotation), and the PROIEL version of the Greek Gospels, with which the Codex Marianus and the Codex Zographensis are both aligned at token level (automatically, then hand-corrected). Article in Journal/Newspaper Tromsø University of Tromsø: Munin Open Research Archive Tromsø |
title | Linguistics vs. digital editions: The Tromsø Old Russian and OCS Treebank |
title_sort | linguistics vs. digital editions: the tromsø old russian and ocs treebank |
topic | VDP::Humanities: 000::Linguistics: 010::Russian language: 028 VDP::Humaniora: 000::Språkvitenskapelige fag: 010::Russisk språk: 028 VDP::Humanities: 000::Linguistics: 010::Other Slavic languages: 029 VDP::Humaniora: 000::Språkvitenskapelige fag: 010::Andre slaviske språk: 029 |
url | https://hdl.handle.net/10037/22366 |