Usage of XSL Stylesheets for the annotation of the Sámi language corpora
This paper describes an annotation system for Sámi language corpora, which consist of structured, running texts. The annotation of the texts is fully automatic, starting from the original documents in different formats. The texts are first extracted from the original documents preserving the origin...
Main Author: | Saara Huhmarniemi |
---|---|
Other Authors: | The Pennsylvania State University CiteSeerX Archives |
Format: | Text |
Language: | English |
Subjects: | |
Online Access: | http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.75.4828 http://acl.ldc.upenn.edu/w/w07/w07-1507.pdf |
id | ftciteseerx:oai:CiteSeerX.psu:10.1.1.75.4828 |
---|---|
record_format | openpolar |
spelling | ftciteseerx:oai:CiteSeerX.psu:10.1.1.75.4828 2023-05-15T18:14:46+02:00 Usage of XSL Stylesheets for the annotation of the Sámi language corpora Saara Huhmarniemi The Pennsylvania State University CiteSeerX Archives application/pdf http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.75.4828 http://acl.ldc.upenn.edu/w/w07/w07-1507.pdf en eng http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.75.4828 http://acl.ldc.upenn.edu/w/w07/w07-1507.pdf Metadata may be used without restrictions as long as the oai identifier remains attached to it. http://acl.ldc.upenn.edu/w/w07/w07-1507.pdf text ftciteseerx 2016-01-08T19:04:29Z This paper describes an annotation system for Sámi language corpora, which consist of structured, running texts. The annotation of the texts is fully automatic, starting from the original documents in different formats. The texts are first extracted from the original documents preserving the original structural markup. The markup is enhanced by a document-specific XSLT script which contains document-specific formatting instructions. The overall maintenance is achieved by system-wide XSLT scripts. Text Sámi Unknown |
institution | Open Polar |
collection | Unknown |
op_collection_id | ftciteseerx |
language | English |
description | This paper describes an annotation system for Sámi language corpora, which consist of structured, running texts. The annotation of the texts is fully automatic, starting from the original documents in different formats. The texts are first extracted from the original documents preserving the original structural markup. The markup is enhanced by a document-specific XSLT script which contains document-specific formatting instructions. The overall maintenance is achieved by system-wide XSLT scripts. |
author2 | The Pennsylvania State University CiteSeerX Archives |
format | Text |
author | Saara Huhmarniemi |
spellingShingle | Saara Huhmarniemi Usage of XSL Stylesheets for the annotation of the Sámi language corpora |
author_facet | Saara Huhmarniemi |
author_sort | Saara Huhmarniemi |
title | Usage of XSL Stylesheets for the annotation of the Sámi language corpora |
title_short | Usage of XSL Stylesheets for the annotation of the Sámi language corpora |
title_full | Usage of XSL Stylesheets for the annotation of the Sámi language corpora |
title_fullStr | Usage of XSL Stylesheets for the annotation of the Sámi language corpora |
title_full_unstemmed | Usage of XSL Stylesheets for the annotation of the Sámi language corpora |
title_sort | usage of xsl stylesheets for the annotation of the sámi language corpora |
url | http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.75.4828 http://acl.ldc.upenn.edu/w/w07/w07-1507.pdf |
genre | Sámi |
genre_facet | Sámi |
op_source | http://acl.ldc.upenn.edu/w/w07/w07-1507.pdf |
op_relation | http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.75.4828 http://acl.ldc.upenn.edu/w/w07/w07-1507.pdf |
op_rights | Metadata may be used without restrictions as long as the oai identifier remains attached to it. |
_version_ | 1766187761928241152 |