Orca 2: Teaching Small Language Models How to Reason

Orca 1 learns from rich signals, such as explanation traces, allowing it to outperform conventional instruction-tuned models on benchmarks like BigBench Hard and AGIEval. In Orca 2, we continue exploring how improved training signals can enhance smaller LMs' reasoning abilities. Research on tra...

Bibliographic Details
Main Authors:	Mitra, Arindam, Del Corro, Luciano, Mahajan, Shweti, Codas, Andres, Simoes, Clarisse, Agarwal, Sahaj, Chen, Xuxi, Razdaibiedina, Anastasia, Jones, Erik, Aggarwal, Kriti, Palangi, Hamid, Zheng, Guoqing, Rosset, Corby, Khanpour, Hamed, Awadallah, Ahmed
Format:	Text
Language:	unknown
Published:	2023
Subjects:	Computer Science - Artificial Intelligence Orca
Online Access:	http://arxiv.org/abs/2311.11045

id	ftarxivpreprints:oai:arXiv.org:2311.11045
record_format	openpolar
spelling	ftarxivpreprints:oai:arXiv.org:2311.11045 2023-12-24T10:23:59+01:00 Orca 2: Teaching Small Language Models How to Reason Mitra, Arindam Del Corro, Luciano Mahajan, Shweti Codas, Andres Simoes, Clarisse Agarwal, Sahaj Chen, Xuxi Razdaibiedina, Anastasia Jones, Erik Aggarwal, Kriti Palangi, Hamid Zheng, Guoqing Rosset, Corby Khanpour, Hamed Awadallah, Ahmed 2023-11-18 http://arxiv.org/abs/2311.11045 unknown http://arxiv.org/abs/2311.11045 Computer Science - Artificial Intelligence text 2023 ftarxivpreprints 2023-11-26T02:06:59Z Orca 1 learns from rich signals, such as explanation traces, allowing it to outperform conventional instruction-tuned models on benchmarks like BigBench Hard and AGIEval. In Orca 2, we continue exploring how improved training signals can enhance smaller LMs' reasoning abilities. Research on training small LMs has often relied on imitation learning to replicate the output of more capable models. We contend that excessive emphasis on imitation may restrict the potential of smaller models. We seek to teach small LMs to employ different solution strategies for different tasks, potentially different from the one used by the larger model. For example, while larger models might provide a direct answer to a complex task, smaller models may not have the same capacity. In Orca 2, we teach the model various reasoning techniques (step-by-step, recall then generate, recall-reason-generate, direct answer, etc.). More crucially, we aim to help the model learn to determine the most effective solution strategy for each task. We evaluate Orca 2 using a comprehensive set of 15 diverse benchmarks (corresponding to approximately 100 tasks and over 36,000 unique prompts). Orca 2 significantly surpasses models of similar size and attains performance levels similar to or better than those of models 5-10x larger, as assessed on complex tasks that test advanced reasoning abilities in zero-shot settings. We make Orca 2 weights publicly available at aka.ms/orca-lm to support research on the development, evaluation, and alignment of smaller LMs. Comment: Added url to model weights; fixed typo in Author name Text Orca ArXiv.org (Cornell University Library)
institution	Open Polar
collection	ArXiv.org (Cornell University Library)
op_collection_id	ftarxivpreprints
language	unknown
topic	Computer Science - Artificial Intelligence
spellingShingle	Computer Science - Artificial Intelligence Mitra, Arindam Del Corro, Luciano Mahajan, Shweti Codas, Andres Simoes, Clarisse Agarwal, Sahaj Chen, Xuxi Razdaibiedina, Anastasia Jones, Erik Aggarwal, Kriti Palangi, Hamid Zheng, Guoqing Rosset, Corby Khanpour, Hamed Awadallah, Ahmed Orca 2: Teaching Small Language Models How to Reason
topic_facet	Computer Science - Artificial Intelligence
description	Orca 1 learns from rich signals, such as explanation traces, allowing it to outperform conventional instruction-tuned models on benchmarks like BigBench Hard and AGIEval. In Orca 2, we continue exploring how improved training signals can enhance smaller LMs' reasoning abilities. Research on training small LMs has often relied on imitation learning to replicate the output of more capable models. We contend that excessive emphasis on imitation may restrict the potential of smaller models. We seek to teach small LMs to employ different solution strategies for different tasks, potentially different from the one used by the larger model. For example, while larger models might provide a direct answer to a complex task, smaller models may not have the same capacity. In Orca 2, we teach the model various reasoning techniques (step-by-step, recall then generate, recall-reason-generate, direct answer, etc.). More crucially, we aim to help the model learn to determine the most effective solution strategy for each task. We evaluate Orca 2 using a comprehensive set of 15 diverse benchmarks (corresponding to approximately 100 tasks and over 36,000 unique prompts). Orca 2 significantly surpasses models of similar size and attains performance levels similar to or better than those of models 5-10x larger, as assessed on complex tasks that test advanced reasoning abilities in zero-shot settings. We make Orca 2 weights publicly available at aka.ms/orca-lm to support research on the development, evaluation, and alignment of smaller LMs. Comment: Added url to model weights; fixed typo in Author name
format	Text
author	Mitra, Arindam Del Corro, Luciano Mahajan, Shweti Codas, Andres Simoes, Clarisse Agarwal, Sahaj Chen, Xuxi Razdaibiedina, Anastasia Jones, Erik Aggarwal, Kriti Palangi, Hamid Zheng, Guoqing Rosset, Corby Khanpour, Hamed Awadallah, Ahmed
author_facet	Mitra, Arindam Del Corro, Luciano Mahajan, Shweti Codas, Andres Simoes, Clarisse Agarwal, Sahaj Chen, Xuxi Razdaibiedina, Anastasia Jones, Erik Aggarwal, Kriti Palangi, Hamid Zheng, Guoqing Rosset, Corby Khanpour, Hamed Awadallah, Ahmed
author_sort	Mitra, Arindam
title	Orca 2: Teaching Small Language Models How to Reason
title_short	Orca 2: Teaching Small Language Models How to Reason
title_full	Orca 2: Teaching Small Language Models How to Reason
title_fullStr	Orca 2: Teaching Small Language Models How to Reason
title_full_unstemmed	Orca 2: Teaching Small Language Models How to Reason
title_sort	orca 2: teaching small language models how to reason
publishDate	2023
url	http://arxiv.org/abs/2311.11045
genre	Orca
genre_facet	Orca
op_relation	http://arxiv.org/abs/2311.11045
_version_	1786198342317899776

Similar Items