Critical Survey of the Freely Available Arabic Corpora
The availability of corpora is a major factor in building natural language processing applications. However, the costs of acquiring corpora can prevent some researchers from going further in their endeavours. Easy access to freely available corpora is urgently needed in the NLP research communi...
Main Author: | Zaghouani, Wajdi |
---|---|
Format: | Text |
Language: | unknown |
Published: | 2017 |
Subjects: | Computer Science - Computation and Language |
Online Access: | http://arxiv.org/abs/1702.07835 |
id |
ftarxivpreprints:oai:arXiv.org:1702.07835 |
---|---|
record_format |
openpolar |
spelling |
ftarxivpreprints:oai:arXiv.org:1702.07835 2023-09-05T13:20:31+02:00 Critical Survey of the Freely Available Arabic Corpora Zaghouani, Wajdi 2017-02-25 http://arxiv.org/abs/1702.07835 unknown http://arxiv.org/abs/1702.07835 Computer Science - Computation and Language text 2017 ftarxivpreprints 2023-08-16T14:18:14Z The availability of corpora is a major factor in building natural language processing applications. However, the costs of acquiring corpora can prevent some researchers from going further in their endeavours. Easy access to freely available corpora is urgently needed in the NLP research community, especially for languages such as Arabic. Currently, there is no easy way to access a comprehensive and up-to-date list of freely available Arabic corpora. In this paper, we present the results of a recent survey conducted to identify the freely available Arabic corpora and language resources. Our preliminary results yielded an initial list of 66 sources. We present our findings in the various categories studied and provide direct links to the data when possible. Comment: Published in the Proceedings of the International Conference on Language Resources and Evaluation (LREC'2014), OSACT Workshop. Reykjavik, Iceland, 26-31 May 2014 Text Iceland ArXiv.org (Cornell University Library) |
institution |
Open Polar |
collection |
ArXiv.org (Cornell University Library) |
op_collection_id |
ftarxivpreprints |
language |
unknown |
topic |
Computer Science - Computation and Language |
spellingShingle |
Computer Science - Computation and Language Zaghouani, Wajdi Critical Survey of the Freely Available Arabic Corpora |
topic_facet |
Computer Science - Computation and Language |
description |
The availability of corpora is a major factor in building natural language processing applications. However, the costs of acquiring corpora can prevent some researchers from going further in their endeavours. Easy access to freely available corpora is urgently needed in the NLP research community, especially for languages such as Arabic. Currently, there is no easy way to access a comprehensive and up-to-date list of freely available Arabic corpora. In this paper, we present the results of a recent survey conducted to identify the freely available Arabic corpora and language resources. Our preliminary results yielded an initial list of 66 sources. We present our findings in the various categories studied and provide direct links to the data when possible. Comment: Published in the Proceedings of the International Conference on Language Resources and Evaluation (LREC'2014), OSACT Workshop. Reykjavik, Iceland, 26-31 May 2014 |
format |
Text |
author |
Zaghouani, Wajdi |
author_facet |
Zaghouani, Wajdi |
author_sort |
Zaghouani, Wajdi |
title |
Critical Survey of the Freely Available Arabic Corpora |
title_short |
Critical Survey of the Freely Available Arabic Corpora |
title_full |
Critical Survey of the Freely Available Arabic Corpora |
title_fullStr |
Critical Survey of the Freely Available Arabic Corpora |
title_full_unstemmed |
Critical Survey of the Freely Available Arabic Corpora |
title_sort |
critical survey of the freely available arabic corpora |
publishDate |
2017 |
url |
http://arxiv.org/abs/1702.07835 |
genre |
Iceland |
genre_facet |
Iceland |
op_relation |
http://arxiv.org/abs/1702.07835 |
_version_ |
1776201193649864704 |