Numerals and what counts

Bibliographic Details
Main Authors: Rueter, Jack; Partanen, Niko; Pirinen, Tommi A
Other Authors: Lhoneux, Miryam de; Tsarfaty, Reut; Language Technology; The National Library of Finland, Library Network Services
Format: Conference Object
Language: English
Published: 2022
Subjects:
Online Access: http://hdl.handle.net/10138/343000
Description
Summary: This study discusses the way different numerals and related expressions are currently annotated in the Universal Dependencies project, with a specific focus on the Uralic language family and only occasional references to the other language groups. We analyse different annotation conventions between individual treebanks, and aim to highlight some areas where further development work and systematization could prove beneficial. At the same time, the Universal Dependencies project already offers a wide range of conventions to mark nuanced variation in numerals and counting expressions, and the harmonization of conventions between different languages could be the next step to take. The discussion here makes specific reference to Universal Dependencies version 2.8, and some differences found may already have been harmonized in version 2.9. Regardless of whether this takes place or not, we believe that the study still forms an important documentation of this period in the project.
Peer reviewed