Automated Classification and Categorization of Mathematical Knowledge
Abstract. There is a common Mathematics Subject Classification (MSC) System used for categorizing mathematical papers and knowledge. We present results of machine learning of the MSC on full texts of papers in the mathematical digital libraries DML-CZ and NUMDAM. The F1-measure achieved on classific...
Main Authors: | Radim Řehůřek, Petr Sojka |
---|---|
Other Authors: | The Pennsylvania State University CiteSeerX Archives |
Format: | Text |
Language: | English |
Subjects: | |
Online Access: | http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.221.1116 http://www.fi.muni.cz/usr/sojka/papers/mkm2008-rehurek-sojka.pdf |
id |
ftciteseerx:oai:CiteSeerX.psu:10.1.1.221.1116 |
---|---|
record_format |
openpolar |
spelling |
ftciteseerx:oai:CiteSeerX.psu:10.1.1.221.1116 2023-05-15T16:01:46+02:00 Automated Classification and Categorization of Mathematical Knowledge Radim Řehůřek Petr Sojka The Pennsylvania State University CiteSeerX Archives application/pdf http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.221.1116 http://www.fi.muni.cz/usr/sojka/papers/mkm2008-rehurek-sojka.pdf en eng http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.221.1116 http://www.fi.muni.cz/usr/sojka/papers/mkm2008-rehurek-sojka.pdf Metadata may be used without restrictions as long as the oai identifier remains attached to it. http://www.fi.muni.cz/usr/sojka/papers/mkm2008-rehurek-sojka.pdf capacity to select edit single out structure highlight group pair merge harmonize synthesize focus organize condense reduce boil down choose categorize catalog classify list abstract scan look into idealize isolate discriminate distinguish screen pigeonhole pick over sort integrate blend inspect text ftciteseerx 2016-01-07T18:18:52Z Abstract. There is a common Mathematics Subject Classification (MSC) System used for categorizing mathematical papers and knowledge. We present results of machine learning of the MSC on full texts of papers in the mathematical digital libraries DML-CZ and NUMDAM. The F1-measure achieved on classification task of top-level MSC categories exceeds 89%. We describe and evaluate our methods for measuring the similarity of papers in the digital library based on paper full texts. 1 Text DML Unknown |
institution |
Open Polar |
collection |
Unknown |
op_collection_id |
ftciteseerx |
language |
English |
topic |
capacity to select edit single out structure highlight group pair merge harmonize synthesize focus organize condense reduce boil down choose categorize catalog classify list abstract scan look into idealize isolate discriminate distinguish screen pigeonhole pick over sort integrate blend inspect |
topic_facet |
capacity to select edit single out structure highlight group pair merge harmonize synthesize focus organize condense reduce boil down choose categorize catalog classify list abstract scan look into idealize isolate discriminate distinguish screen pigeonhole pick over sort integrate blend inspect |
description |
Abstract. There is a common Mathematics Subject Classification (MSC) System used for categorizing mathematical papers and knowledge. We present results of machine learning of the MSC on full texts of papers in the mathematical digital libraries DML-CZ and NUMDAM. The F1-measure achieved on the classification task of top-level MSC categories exceeds 89%. We describe and evaluate our methods for measuring the similarity of papers in the digital library based on paper full texts. |
author2 |
The Pennsylvania State University CiteSeerX Archives |
format |
Text |
author |
Radim Řehůřek Petr Sojka |
author_facet |
Radim Řehůřek Petr Sojka |
author_sort |
Radim Řehůřek |
title |
Automated Classification and Categorization of Mathematical Knowledge |
title_short |
Automated Classification and Categorization of Mathematical Knowledge |
title_full |
Automated Classification and Categorization of Mathematical Knowledge |
title_fullStr |
Automated Classification and Categorization of Mathematical Knowledge |
title_full_unstemmed |
Automated Classification and Categorization of Mathematical Knowledge |
title_sort |
automated classification and categorization of mathematical knowledge |
url |
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.221.1116 http://www.fi.muni.cz/usr/sojka/papers/mkm2008-rehurek-sojka.pdf |
genre |
DML |
genre_facet |
DML |
op_source |
http://www.fi.muni.cz/usr/sojka/papers/mkm2008-rehurek-sojka.pdf |
op_relation |
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.221.1116 http://www.fi.muni.cz/usr/sojka/papers/mkm2008-rehurek-sojka.pdf |
op_rights |
Metadata may be used without restrictions as long as the oai identifier remains attached to it. |
_version_ |
1766397498821181440 |