Småprat: DialoGPT for Natural Language Generation of Swedish Dialogue by Transfer Learning

Building open-domain conversational systems (or chatbots) that produce convincing responses is a recognized challenge. Recent state-of-the-art (SoTA) transformer-based models for the generation of natural language dialogue have demonstrated impressive performance in simulating human-like, single-tur...

Bibliographic Details
Main Authors: Adewumi, Tosin; Brännvall, Rickard; Abid, Nosheen; Pahlavan, Maryam; Sabry, Sana Sabah; Liwicki, Foteini; Liwicki, Marcus
Format: Article in Journal/Newspaper
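Language: unknown
Published: arXiv 2021
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); FOS Computer and information sciences
Online Access: https://dx.doi.org/10.48550/arxiv.2110.06273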
https://arxiv.org/abs/2110.06273
id ftdatacite:10.48550/arxiv.2110.06273
record_format openpolar
institution Open Polar
collection DataCite Metadata Store (German National Library of Science and Technology)
op_collection_id ftdatacite
language unknown
topic Computation and Language cs.CL
Machine Learning cs.LG
FOS Computer and information sciences
description Building open-domain conversational systems (or chatbots) that produce convincing responses is a recognized challenge. Recent state-of-the-art (SoTA) transformer-based models for the generation of natural language dialogue have demonstrated impressive performance in simulating human-like, single-turn conversations in English. This work investigates, by an empirical study, the potential for transfer learning of such models to the Swedish language. DialoGPT, an English-language pre-trained model, is adapted by training on three different Swedish-language conversational datasets obtained from publicly available sources. Perplexity score (an automated intrinsic language model metric) and surveys by human evaluation were used to assess the performance of the fine-tuned models, and the results indicate that the capacity for transfer learning can be exploited with considerable success. Human evaluators asked to score the simulated dialogue judged over 57% of the chatbot responses to be human-like for the model trained on the largest (Swedish) dataset. We provide the demos and model checkpoints of our English and Swedish chatbots on the HuggingFace platform for public use. Presented at the Northern Lights Deep Learning Conference (NLDL) 2022, Tromso, Norway.
format Article in Journal/Newspaper
author Adewumi, Tosin
Brännvall, Rickard
Abid, Nosheen
Pahlavan, Maryam
Sabry, Sana Sabah
Liwicki, Foteini
Liwicki, Marcus
title Småprat: DialoGPT for Natural Language Generation of Swedish Dialogue by Transfer Learning
publisher arXiv
publishDate 2021
url https://dx.doi.org/10.48550/arxiv.2110.06273
https://arxiv.org/abs/2110.06273
long_lat ENVELOPE(16.546,16.546,68.801,68.801)
geographic Norway
Tromso
op_rights Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
cc-by-4.0
op_rightsnorm CC-BY
op_doi https://doi.org/10.48550/arxiv.2110.06273
_version_ 1766218493268590592
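
Note on the perplexity metric named in the description: perplexity is commonly defined as the exponential of the mean negative log-probability that a language model assigns to each token of a held-out text, so lower values indicate a better fit. The sketch below illustrates that general definition only; it is not taken from the paper, the function name `perplexity` is hypothetical, and the input values are made up for illustration.

```python
import math


def perplexity(token_log_probs):
    """Return exp(mean negative log-probability) over a token sequence.

    token_log_probs: natural-log probabilities that a model assigned to
    each observed token (hypothetical inputs, not results from the paper).
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Illustration: a model that gives every token probability 1/4 is as
# uncertain as a uniform choice among four options, so its perplexity
# is approximately 4.
uniform_guess = [math.log(0.25)] * 4
print(perplexity(uniform_guess))
```

In practice the log-probabilities would come from the fine-tuned model's output on a held-out test set; how the authors computed their scores is not stated in this record.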