JBIG2 Supported by OCR
Digital mathematical libraries contain a large volume of PDF documents containing scanned text. In this paper, we describe how these documents can be compressed and thus delivered to users more efficiently. We introduce the JBIG2 standard for compressing bitonal images such as scanned text and w...
Main Author: | Hatlapatka Radim |
---|---|
Format: | Article in Journal/Newspaper |
Language: | English |
Published: | Neuveden 2012 |
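Subjects: | jbig2enc JBIG2 PDF size optimization compression DML OCR pdfJbIm DML-CZ EuDML optimalizace PDF komprese |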
Online Access: | https://is.muni.cz/publication/986500 |
id | ftmasarykis:oai:is.muni.cz:986500 |
---|---|
record_format | openpolar |
spelling | ftmasarykis:oai:is.muni.cz:986500 2024-09-15T18:03:50+00:00 JBIG2 Supported by OCR Hatlapatka Radim 2012 9 https://is.muni.cz/publication/986500 eng eng Neuveden https://is.muni.cz/publication/986500 info:eu-repo/semantics/restrictedAccess CEUR Workshop Proceedings, Volume 921 jbig2enc JBIG2 PDF size optimization compression DML OCR pdfJbIm DML-CZ EuDML optimalizace PDF komprese info:eu-repo/semantics/article D 2012 ftmasarykis 2024-08-29T03:18:41Z Digital mathematical libraries contain a large volume of PDF documents containing scanned text. In this paper, we describe how these documents can be compressed and thus delivered to users more efficiently. We introduce the JBIG2 standard for compressing bitonal images such as scanned text and we discuss the issues that arise when OCR is used to improve the compression ratio of the open-source jbig2enc encoder. For this purpose, we have designed an API for using OCR in jbig2enc, which we describe in this paper together with the results achieved so far. Digital mathematical libraries contain a large number of PDF documents containing scanned text. In this article, we describe how such documents can be compressed and thereby delivered to the user more efficiently. To this end, we present the JBIG2 standard for compressing bitonal images (e.g. scanned text) and discuss the benefits and problems of using OCR to increase the compression achieved by the freely available jbig2enc encoder. We designed and implemented an interface for using OCR in the jbig2enc encoder, which we describe here together with preliminary results. Article in Journal/Newspaper DML Masaryk University: Open Services of Information System |
institution | Open Polar |
collection | Masaryk University: Open Services of Information System |
op_collection_id | ftmasarykis |
language | English |
topic | jbig2enc JBIG2 PDF size optimization compression DML OCR pdfJbIm DML-CZ EuDML optimalizace PDF komprese |
spellingShingle | jbig2enc JBIG2 PDF size optimization compression DML OCR pdfJbIm DML-CZ EuDML optimalizace PDF komprese Hatlapatka Radim JBIG2 Supported by OCR |
topic_facet | jbig2enc JBIG2 PDF size optimization compression DML OCR pdfJbIm DML-CZ EuDML optimalizace PDF komprese |
description | Digital mathematical libraries contain a large volume of PDF documents containing scanned text. In this paper, we describe how these documents can be compressed and thus delivered to users more efficiently. We introduce the JBIG2 standard for compressing bitonal images such as scanned text and we discuss the issues that arise when OCR is used to improve the compression ratio of the open-source jbig2enc encoder. For this purpose, we have designed an API for using OCR in jbig2enc, which we describe in this paper together with the results achieved so far. Digital mathematical libraries contain a large number of PDF documents containing scanned text. In this article, we describe how such documents can be compressed and thereby delivered to the user more efficiently. To this end, we present the JBIG2 standard for compressing bitonal images (e.g. scanned text) and discuss the benefits and problems of using OCR to increase the compression achieved by the freely available jbig2enc encoder. We designed and implemented an interface for using OCR in the jbig2enc encoder, which we describe here together with preliminary results. |
format | Article in Journal/Newspaper |
author | Hatlapatka Radim |
author_facet | Hatlapatka Radim |
author_sort | Hatlapatka Radim |
title | JBIG2 Supported by OCR |
title_short | JBIG2 Supported by OCR |
title_full | JBIG2 Supported by OCR |
title_fullStr | JBIG2 Supported by OCR |
title_full_unstemmed | JBIG2 Supported by OCR |
title_sort | jbig2 supported by ocr |
publisher | Neuveden |
publishDate | 2012 |
url | https://is.muni.cz/publication/986500 |
genre | DML |
genre_facet | DML |
op_source | CEUR Workshop Proceedings, Volume 921 |
op_relation | https://is.muni.cz/publication/986500 |
op_rights | info:eu-repo/semantics/restrictedAccess |
_version_ | 1810441293925449728 |