Deep Mutual Learning
Model distillation is an effective and widely used technique for transferring knowledge from a teacher network to a student network. The typical application is to transfer from a powerful large network or ensemble to a small network that is better suited to low-memory or fast-execution requirements. In this paper, we present a deep mutual learning (DML) strategy in which, rather than one-way transfer between a static pre-defined teacher and a student, an ensemble of students learns collaboratively and teaches each other throughout the training process. Our experiments show that a variety of network architectures benefit from mutual learning and achieve compelling results on the CIFAR-100 recognition and Market-1501 person re-identification benchmarks. Surprisingly, no prior powerful teacher network is necessary: mutual learning of a collection of simple student networks works, and moreover outperforms distillation from a more powerful yet static teacher. (10 pages, 4 figures)
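Since the record only summarizes the method, the snippet below is a minimal sketch of what one DML training step could look like, assuming PyTorch; `net1`, `net2`, `opt1`, and `opt2` are hypothetical student classifiers and their optimizers. Each student is trained on its supervised cross-entropy loss plus a KL-divergence term pulling its predictions toward the peer's predicted class distribution, which is the mimicry objective the paper describes.

```python
import torch
import torch.nn.functional as F

def dml_step(net1, net2, opt1, opt2, x, y):
    """One mutual-learning update for a two-student cohort (illustrative sketch)."""
    # Student 1: cross-entropy on the labels plus KL divergence toward
    # student 2's softmax posterior (held fixed, so gradients only flow into net1).
    logits1 = net1(x)
    with torch.no_grad():
        peer = F.softmax(net2(x), dim=1)
    loss1 = F.cross_entropy(logits1, y) + F.kl_div(
        F.log_softmax(logits1, dim=1), peer, reduction="batchmean"
    )
    opt1.zero_grad()
    loss1.backward()
    opt1.step()

    # Student 2: the same objective with the roles reversed, mimicking
    # student 1's just-updated predictions.
    logits2 = net2(x)
    with torch.no_grad():
        peer = F.softmax(net1(x), dim=1)
    loss2 = F.cross_entropy(logits2, y) + F.kl_div(
        F.log_softmax(logits2, dim=1), peer, reduction="batchmean"
    )
    opt2.zero_grad()
    loss2.backward()
    opt2.step()
    return loss1.item(), loss2.item()
```

In the paper's extension beyond two networks, each student's mimicry term averages the KL divergence to every other peer in the cohort, so the same alternating-update structure carries over.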
Main Authors: | Zhang, Ying; Xiang, Tao; Hospedales, Timothy M.; Lu, Huchuan |
---|---|
Format: | Preprint |
Language: | English |
Published: | arXiv, 2017 |
Subjects: | Computer Vision and Pattern Recognition (cs.CV) |
Rights: | arXiv.org perpetual, non-exclusive license (http://arxiv.org/licenses/nonexclusive-distrib/1.0/) |
Online Access: | https://dx.doi.org/10.48550/arxiv.1706.00384 https://arxiv.org/abs/1706.00384 |