Normalization of Digital Mathematics Library Content MathML Canonicalization


Bibliographic Details
Main Authors: David Formánek, Martin Líška, Michal Růžička, Petr Sojka
Other Authors: The Pennsylvania State University CiteSeerX Archives
Format: Text
Language: English
Subjects:
DML
Online Access: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.294.6791
http://ceur-ws.org/Vol-921/wip-05.pdf
Description
Summary: This paper discusses the need for data normalization in a Digital Mathematics Library (DML). Emphasis is given to canonicalizing formulae encoded in Presentation MathML, which is becoming available in several DMLs and is used by DML applications. Canonicalization is a prerequisite for advanced processing, namely math-enabled full-text search, semantic filtering, and automated classification. Different sources of MathML and their specifics are described. Several use cases of possible formula canonicalization transformations are listed and discussed in detail. Finally, the findings are summarized and the design of a canonicalization tool, yet to be developed, is outlined.