WSD algorithm based on a new method of vector-word contexts proximity calculation via epsilon-filtration

The article considers the problem of word sense disambiguation (WSD). Given sets of synonyms (synsets) and sentences containing these synonyms, the task is to select the meaning of the word in each sentence automatically. 1285 sentences were tagged by experts, namely, one of the dictionary mean...

Full description

Bibliographic Details
Main Authors: Kirillov, Alexander; Krizhanovsky, Natalia; Krizhanovsky, Andrew
Format: Text
Language: unknown
Published: arXiv 2018
Subjects:
Online Access: https://dx.doi.org/10.48550/arxiv.1805.09559
https://arxiv.org/abs/1805.09559
id ftdatacite:10.48550/arxiv.1805.09559
record_format openpolar
institution Open Polar
collection DataCite Metadata Store (German National Library of Science and Technology)
op_collection_id ftdatacite
language unknown
topic Information Retrieval cs.IR
Computation and Language cs.CL
FOS Computer and information sciences
I.5.3; H.3.1; H.3.3
68T50
description The article considers the problem of word sense disambiguation (WSD). Given sets of synonyms (synsets) and sentences containing these synonyms, the task is to select the meaning of the word in each sentence automatically. Experts tagged 1285 sentences by selecting one of the dictionary meanings for the target words. To solve the WSD problem, an algorithm based on a new method of calculating the proximity of vector-word contexts is proposed. To achieve higher accuracy, a preliminary epsilon-filtering of words is performed, both in the sentence and in the set of synonyms. An extensive program of experiments was carried out. Four algorithms were implemented, including the new one. The experiments showed that in a number of cases the new algorithm gives better results. The developed software and the tagged corpus are openly licensed and available online. Wiktionary and Wikisource are used. A brief description of this work is available as slides (https://goo.gl/9ak6Gt). A video lecture in Russian on this research is available online (https://youtu.be/-DLmRkepf58). 15 pages, 1 table, 15 figures; accepted in the journal Transactions of Karelian Research Centre of the Russian Academy of Sciences.
format Text
author Kirillov, Alexander
Krizhanovsky, Natalia
Krizhanovsky, Andrew
title WSD algorithm based on a new method of vector-word contexts proximity calculation via epsilon-filtration
publisher arXiv
publishDate 2018
genre karelian
op_relation https://dx.doi.org/10.17076/mat829
op_rights Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
cc-by-4.0
op_rightsnorm CC-BY
op_doi https://doi.org/10.48550/arxiv.1805.09559
https://doi.org/10.17076/mat829