Orca: A Few-shot Benchmark for Chinese Conversational Machine Reading Comprehension

The conversational machine reading comprehension (CMRC) task aims to answer questions in conversations, which has been a hot research topic in recent years because of its wide applications. However, existing CMRC benchmarks in which each conversation is assigned a static passage are inconsistent wit...

Bibliographic Details
Main Authors:	Chen, Nuo; Li, Hongguang; He, Junqing; Bao, Yinan; Lin, Xinshi; Yang, Qi; Liu, Jianfeng; Gan, Ruyi; Zhang, Jiaxing; Wang, Baoyuan; Li, Jia
Format:	Text
Language:	unknown
Published:	2023
Subjects:	Computer Science - Computation and Language; Computer Science - Artificial Intelligence; Orca
Online Access:	http://arxiv.org/abs/2302.13619

id	ftarxivpreprints:oai:arXiv.org:2302.13619
record_format	openpolar
spelling	ftarxivpreprints:oai:arXiv.org:2302.13619 2023-11-12T04:24:06+01:00 Orca: A Few-shot Benchmark for Chinese Conversational Machine Reading Comprehension Chen, Nuo Li, Hongguang He, Junqing Bao, Yinan Lin, Xinshi Yang, Qi Liu, Jianfeng Gan, Ruyi Zhang, Jiaxing Wang, Baoyuan Li, Jia 2023-02-27 http://arxiv.org/abs/2302.13619 unknown http://arxiv.org/abs/2302.13619 EMNLP 2023 Computer Science - Computation and Language Computer Science - Artificial Intelligence text 2023 ftarxivpreprints 2023-10-22T01:06:24Z The conversational machine reading comprehension (CMRC) task aims to answer questions in conversations, and it has been a hot research topic in recent years because of its wide applications. However, existing CMRC benchmarks, in which each conversation is assigned a static passage, are inconsistent with real scenarios. Thus, a model's comprehension ability in real scenarios is hard to evaluate reasonably. To this end, we propose the first Chinese CMRC benchmark, Orca, and further provide zero-shot/few-shot settings to evaluate a model's generalization ability across diverse domains. We collect 831 hot-topic driven conversations with 4,742 turns in total. Each turn of a conversation is assigned a response-related passage, aiming to evaluate a model's comprehension ability more reasonably. The topics of conversations are collected from social media platforms and cover 33 domains, trying to be consistent with real scenarios. Importantly, answers in Orca are all well-annotated natural responses rather than the specific spans or short phrases in previous datasets. Besides, we implement three strong baselines to tackle the challenge in Orca. The results indicate the great challenge of our CMRC benchmark. Our dataset and checkpoints are available at https://github.com/nuochenpku/Orca. Comment: 14 pages Text Orca ArXiv.org (Cornell University Library)
institution	Open Polar
collection	ArXiv.org (Cornell University Library)
op_collection_id	ftarxivpreprints
language	unknown
topic	Computer Science - Computation and Language Computer Science - Artificial Intelligence
spellingShingle	Computer Science - Computation and Language Computer Science - Artificial Intelligence Chen, Nuo Li, Hongguang He, Junqing Bao, Yinan Lin, Xinshi Yang, Qi Liu, Jianfeng Gan, Ruyi Zhang, Jiaxing Wang, Baoyuan Li, Jia Orca: A Few-shot Benchmark for Chinese Conversational Machine Reading Comprehension
topic_facet	Computer Science - Computation and Language Computer Science - Artificial Intelligence
description	The conversational machine reading comprehension (CMRC) task aims to answer questions in conversations, and it has been a hot research topic in recent years because of its wide applications. However, existing CMRC benchmarks, in which each conversation is assigned a static passage, are inconsistent with real scenarios. Thus, a model's comprehension ability in real scenarios is hard to evaluate reasonably. To this end, we propose the first Chinese CMRC benchmark, Orca, and further provide zero-shot/few-shot settings to evaluate a model's generalization ability across diverse domains. We collect 831 hot-topic driven conversations with 4,742 turns in total. Each turn of a conversation is assigned a response-related passage, aiming to evaluate a model's comprehension ability more reasonably. The topics of conversations are collected from social media platforms and cover 33 domains, trying to be consistent with real scenarios. Importantly, answers in Orca are all well-annotated natural responses rather than the specific spans or short phrases in previous datasets. Besides, we implement three strong baselines to tackle the challenge in Orca. The results indicate the great challenge of our CMRC benchmark. Our dataset and checkpoints are available at https://github.com/nuochenpku/Orca. Comment: 14 pages
format	Text
author	Chen, Nuo Li, Hongguang He, Junqing Bao, Yinan Lin, Xinshi Yang, Qi Liu, Jianfeng Gan, Ruyi Zhang, Jiaxing Wang, Baoyuan Li, Jia
author_facet	Chen, Nuo Li, Hongguang He, Junqing Bao, Yinan Lin, Xinshi Yang, Qi Liu, Jianfeng Gan, Ruyi Zhang, Jiaxing Wang, Baoyuan Li, Jia
author_sort	Chen, Nuo
title	Orca: A Few-shot Benchmark for Chinese Conversational Machine Reading Comprehension
title_short	Orca: A Few-shot Benchmark for Chinese Conversational Machine Reading Comprehension
title_full	Orca: A Few-shot Benchmark for Chinese Conversational Machine Reading Comprehension
title_fullStr	Orca: A Few-shot Benchmark for Chinese Conversational Machine Reading Comprehension
title_full_unstemmed	Orca: A Few-shot Benchmark for Chinese Conversational Machine Reading Comprehension
title_sort	orca: a few-shot benchmark for chinese conversational machine reading comprehension
publishDate	2023
url	http://arxiv.org/abs/2302.13619
genre	Orca
genre_facet	Orca
op_relation	http://arxiv.org/abs/2302.13619 EMNLP 2023
_version_	1782338674759827456

Similar Items