Orca: Progressive Learning from Complex Explanation Traces of GPT-4

Recent research has focused on enhancing the capability of smaller models through imitation learning, drawing on the outputs generated by large foundation models (LFMs). A number of issues impact the quality of these models, ranging from limited imitation signals from shallow LFM outputs and small-scale, homogeneous training data to, most notably, a lack of rigorous evaluation that results in overestimating the small model's capability, as such models tend to learn to imitate the style, but not the reasoning process, of LFMs. To address these challenges, we develop Orca (we are working with our legal team to publicly release a diff of the model weights in accordance with LLaMA's release policy, to be published at https://aka.ms/orca-lm), a 13-billion-parameter model that learns to imitate the reasoning process of LFMs. Orca learns from rich signals from GPT-4, including explanation traces, step-by-step thought processes, and other complex instructions, guided by teacher assistance from ChatGPT. To promote this progressive ...
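
As a rough illustration of the "rich signals" described above, the sketch below shows one way explanation-trace training examples could be assembled: a system instruction that elicits step-by-step reasoning is paired with a task query, and the teacher's explanation-rich response becomes the imitation target for the student model. The teacher_complete callable and the system instructions are hypothetical placeholders for illustration, not the paper's actual prompt set or data pipeline.

# Minimal sketch, assuming a generic text-in/text-out teacher interface.
# Everything named here is illustrative, not taken from the paper.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class TrainingExample:
    system_instruction: str   # elicits explanation traces / step-by-step reasoning
    user_query: str           # task drawn from an instruction-tuning collection
    teacher_response: str     # rich response the student model learns to imitate

# Illustrative system instructions that ask the teacher for its reasoning,
# not just a final answer (analogous to the "explanation traces" signal).
SYSTEM_INSTRUCTIONS: List[str] = [
    "You are a helpful assistant. Explain your answer step by step.",
    "Think through the problem carefully and justify each step of your reasoning.",
]

def build_examples(
    queries: List[str],
    teacher_complete: Callable[[str, str], str],
) -> List[TrainingExample]:
    """Pair each query with a reasoning-eliciting system instruction and
    collect the teacher's explanation-rich response as the imitation target."""
    examples = []
    for i, query in enumerate(queries):
        system = SYSTEM_INSTRUCTIONS[i % len(SYSTEM_INSTRUCTIONS)]
        response = teacher_complete(system, query)
        examples.append(TrainingExample(system, query, response))
    return examples

if __name__ == "__main__":
    # Stub teacher for demonstration; a real pipeline would call an LFM here.
    fake_teacher = lambda system, user: f"[step-by-step answer to: {user}]"
    data = build_examples(["What causes tides?"], fake_teacher)
    print(data[0])

The point of the system instruction in this sketch is that the imitation target carries the teacher's reasoning process rather than only its final answer, which is the distinction the abstract draws between imitating style and imitating reasoning.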

Bibliographic Details
Main Authors: Mukherjee, Subhabrata, Mitra, Arindam, Jawahar, Ganesh, Agarwal, Sahaj, Palangi, Hamid, Awadallah, Ahmed
Format: Report
Language: unknown
Published: arXiv 2023
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); FOS: Computer and information sciences
Online Access: https://dx.doi.org/10.48550/arxiv.2306.02707
https://arxiv.org/abs/2306.02707
License: Creative Commons Attribution 4.0 International (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/legalcode