Deep Mutual Learning

Model distillation is an effective and widely used technique to transfer knowledge from a teacher to a student network. The typical application is to transfer from a powerful large network or ensemble to a small network that is better suited to low-memory or fast execution requirements. In this paper, we present a deep mutual learning (DML) strategy where, rather than one-way transfer between a static pre-defined teacher and a student, an ensemble of students learn collaboratively and teach each other throughout the training process. Our experiments show that a variety of network architectures benefit from mutual learning and achieve compelling results on CIFAR-100 recognition and Market-1501 person re-identification benchmarks. Surprisingly, it is revealed that no prior powerful teacher network is necessary -- mutual learning of a collection of simple student networks works, and moreover outperforms distillation from a more powerful yet static teacher.
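
The mutual-learning strategy described in the abstract can be sketched in a few lines of code. The following is a minimal, illustrative PyTorch training step written from the abstract alone, not the authors' released implementation; the toy linear students, SGD settings, and equal weighting of the cross-entropy and peer KL terms are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F


def dml_step(students, optimizers, x, y):
    """One mutual-learning update: each student minimises its own cross-entropy
    plus the average KL divergence from every peer's (detached) prediction."""
    logits = [s(x) for s in students]
    losses = []
    for i, (student, opt) in enumerate(zip(students, optimizers)):
        ce = F.cross_entropy(logits[i], y)
        # Peers act as teachers for student i; detach so no gradient flows to them.
        kl = sum(
            F.kl_div(
                F.log_softmax(logits[i], dim=1),
                F.softmax(peer.detach(), dim=1),
                reduction="batchmean",
            )
            for j, peer in enumerate(logits) if j != i
        )
        loss = ce + kl / max(len(students) - 1, 1)
        opt.zero_grad()
        loss.backward()
        opt.step()
        losses.append(loss.item())
    return losses


if __name__ == "__main__":
    # Toy cohort: two linear classifiers trained mutually on random CIFAR-sized data.
    students = [nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 100)) for _ in range(2)]
    optimizers = [torch.optim.SGD(s.parameters(), lr=0.1) for s in students]
    x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 100, (8,))
    print(dml_step(students, optimizers, x, y))

Each network is updated only through its own logits, while the peers supply soft target distributions, so no pre-trained teacher is required.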

Bibliographic Details
Main Authors: Zhang, Ying; Xiang, Tao; Hospedales, Timothy M.; Lu, Huchuan
Format: Report
Language: unknown
Published: arXiv 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
FOS: Computer and information sciences
DML
Online Access: https://dx.doi.org/10.48550/arxiv.1706.00384
https://arxiv.org/abs/1706.00384
Physical Description: 10 pages, 4 figures
Institution: Open Polar
Collection: DataCite Metadata Store (German National Library of Science and Technology)
License: arXiv.org perpetual, non-exclusive license (http://arxiv.org/licenses/nonexclusive-distrib/1.0/)
DOI: https://doi.org/10.48550/arxiv.1706.00384