Orca: Progressive Learning from Complex Explanation Traces of GPT-4 ...
Recent research has focused on enhancing the capability of smaller models through imitation learning, drawing on the outputs generated by large foundation models (LFMs). A number of issues impact the quality of these models, ranging from limited imitation signals from shallow LFM outputs; small scale homogeneous training data; and most notably a lack of rigorous evaluation resulting in overestimating the small model's capability, as they tend to learn to imitate the style, but not the reasoning process, of LFMs. To address these challenges, we develop Orca (We are working with our legal team to publicly release a diff of the model weights in accordance with LLaMA's release policy, to be published at https://aka.ms/orca-lm), a 13-billion parameter model that learns to imitate the reasoning process of LFMs. Orca learns from rich signals from GPT-4, including explanation traces, step-by-step thought processes, and other complex instructions, guided by teacher assistance from ChatGPT. To promote this progressive ...
Main Authors: | Mukherjee, Subhabrata; Mitra, Arindam; Jawahar, Ganesh; Agarwal, Sahaj; Palangi, Hamid; Awadallah, Ahmed |
---|---|
Format: | Report |
Language: | unknown |
Published: | arXiv, 2023 |
Subjects: | Computation and Language (cs.CL); Machine Learning (cs.LG) |
Online Access: | https://dx.doi.org/10.48550/arxiv.2306.02707 https://arxiv.org/abs/2306.02707 |
id | ftdatacite:10.48550/arxiv.2306.02707
record_format | openpolar
institution | Open Polar
collection | DataCite Metadata Store (German National Library of Science and Technology)
op_collection_id | ftdatacite
language | unknown
topic | Computation and Language (cs.CL); Machine Learning (cs.LG); FOS: Computer and information sciences
format | Report
author | Mukherjee, Subhabrata; Mitra, Arindam; Jawahar, Ganesh; Agarwal, Sahaj; Palangi, Hamid; Awadallah, Ahmed
author_sort | Mukherjee, Subhabrata
title | Orca: Progressive Learning from Complex Explanation Traces of GPT-4 ...
publisher | arXiv
publishDate | 2023
url | https://dx.doi.org/10.48550/arxiv.2306.02707 https://arxiv.org/abs/2306.02707
genre | Orca
op_rights | Creative Commons Attribution 4.0 International (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/legalcode
op_doi | https://doi.org/10.48550/arxiv.2306.02707