Extraction Of Wikidata Knowledge For The Metadata Formation For Documents of Digital Mathematical Collections

Bibliographic Details
Main Authors: Гафурова, Полина Олеговна; Елизаров, Александр Михайлович; Липачёв, Евгений Константинович
Format: Article in Journal/Newspaper
Language: Russian
Published: Kazan Federal University 2022
Subjects:
DML
Online Access: https://elbib.ru/article/view/720
Description
Summary: This article presents methods for creating digital mathematical collections from unstructured sets of documents. These sets include proceedings of scientific conferences and articles from the archives of mathematical journals of the "pre-digital" period. A mandatory set of metadata for these documents was generated with the software tools of the metadata factory of the Lobachevskii DML digital mathematical library. Knowledge extraction from Wikidata was then used to refine and extend the metadata sets. A system of SPARQL queries was developed to search Wikidata for information about the documents in the collections and their authors. A set of Wikidata entities was defined that specifies the search criteria and the subsequent filtering of the results. Methods are proposed for clarifying and supplementing the bibliographic references given in the articles. For documents in retrospective collections, Wikidata was searched for the authors' years of birth and death and for URLs of web pages with information about the articles and their authors. The results of forming several new digital collections of the Lobachevskii DML digital library are presented.
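
The Wikidata lookup described in the summary can be illustrated with a minimal sketch. It is not the authors' actual query system, which this record does not reproduce. The sketch assumes that an author is matched by an exact label, and that the relevant Wikidata properties are P31 (instance of) with Q5 (human), P569 (date of birth), P570 (date of death), and P856 (official website). The function names `build_author_query`, `extract_life_years`, and `filter_plausible`, and the plausibility rule used for filtering, are hypothetical and introduced only for illustration.

```python
from typing import Optional

# Public Wikidata SPARQL endpoint; a query built below could be sent here
# with any HTTP client, asking for JSON results.
WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def build_author_query(name: str, lang: str = "ru", limit: int = 10) -> str:
    """Build a SPARQL query for humans whose label equals `name` (assumption:
    exact label match; a real system would likely also check aliases)."""
    if not lang.replace("-", "").isalnum():
        raise ValueError("unexpected language tag: " + lang)
    # Escape backslashes and double quotes for a SPARQL string literal.
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return f"""
SELECT ?person ?birth ?death ?site WHERE {{
  ?person wdt:P31 wd:Q5 ;
          rdfs:label "{escaped}"@{lang} .
  OPTIONAL {{ ?person wdt:P569 ?birth . }}
  OPTIONAL {{ ?person wdt:P570 ?death . }}
  OPTIONAL {{ ?person wdt:P856 ?site . }}
}}
LIMIT {int(limit)}
""".strip()


def _year_of(binding: dict, key: str) -> Optional[int]:
    """Read the year from a date value such as '1900-05-17T00:00:00Z'."""
    cell = binding.get(key)
    if cell is None:
        return None
    value = cell["value"]
    sign = -1 if value.startswith("-") else 1
    return sign * int(value.lstrip("+-").split("-")[0])


def extract_life_years(results: dict) -> list:
    """Flatten a SPARQL JSON result (results -> bindings) into plain dicts."""
    rows = []
    for b in results["results"]["bindings"]:
        rows.append({
            "entity": b["person"]["value"],
            "birth": _year_of(b, "birth"),
            "death": _year_of(b, "death"),
            "site": b["site"]["value"] if "site" in b else None,
        })
    return rows


def filter_plausible(rows: list, publication_year: int) -> list:
    """Hypothetical filter: drop candidates born in or after the year the
    article was published; keep candidates with an unknown birth year."""
    return [r for r in rows
            if r["birth"] is None or r["birth"] < publication_year]
```

A caller would pass the text returned by `build_author_query` to the endpoint, decode the JSON response, and then apply `extract_life_years` and `filter_plausible` before adding the years of life and URLs to the document metadata. Filtering after the query keeps the SPARQL simple, which is one possible design choice among several.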