Data for "SuperSim: a test set for word similarity and relatedness in Swedish"

Bibliographic Details
Main Authors: Simon Hengchen, Nina Tahmasebi
Format: Other/Unknown Material
Language: Swedish
Published: Zenodo 2021
Online Access: https://doi.org/10.5281/zenodo.4660084
Description
Summary: This repository contains the data described in "SuperSim: a test set for word similarity and relatedness in Swedish" (Hengchen and Tahmasebi, 2021), available at https://aclanthology.org/2021.nodalida-main.27/. If you use part or whole of this resource, please cite the following work, or alternatively use the BibTeX entry:

Hengchen, Simon and Tahmasebi, Nina, 2021. SuperSim: a test set for word similarity and relatedness in Swedish. In The 23rd Nordic Conference on Computational Linguistics (NoDaLiDa’21).

<code>@inproceedings{hengchen-tahmasebi-2021-supersim,
  title = "{SuperSim:} a test set for word similarity and relatedness in {Swedish}",
  author = "Hengchen, Simon and Tahmasebi, Nina",
  booktitle = "Proceedings of the 23rd Nordic Conference on Computational Linguistics",
  month = may # "{--}" # jun,
  year = "2021",
  address = "Reykjavik, Iceland, and Online",
  publisher = {Link{\"o}ping University Electronic Press},
}</code>

The data contained in this repository is as follows:

The <code>code</code> folder contains:
  <code>main.py</code>
  <code>utils.py</code>
  <code>train_base_models.py</code>
  <code>perl-clean.pl</code>
  <code>requirements.txt</code>

The <code>data</code> folder contains:
  <code>gold_relatedness.tsv</code>: all relatedness judgments from all annotators, as well as the mean
  <code>gold_similarity.tsv</code>: all similarity judgments from all annotators, as well as the mean
  <code>models</code> contains baseline models:
    Trained on the Swedish Gigaword: ...