Policy Adaptation over Environmental Configurations in Survivor Type Games

This thesis explores the applicability and effectiveness of direct policy transfer in reinforcement learning (RL), with a focus on environmental configurations within the open-source game openSURVIVORS. We investigate how policy transfer can decrease the time to threshold and enhance jumpstart and a...

Bibliographic Details
Main Author: Markus Johannes Pedersen
Other Authors: Ole Christian Eidheim, Jonathan Jørgensen
Format: Master Thesis
Language: English
Published: NTNU 2023
Subjects:
Online Access: https://hdl.handle.net/11250/3099258
id ftntnutrondheimi:oai:ntnuopen.ntnu.no:11250/3099258
record_format openpolar
spelling ftntnutrondheimi:oai:ntnuopen.ntnu.no:11250/3099258 2023-11-12T04:16:50+01:00 Policy Adaptation over Environmental Configurations in Survivor Type Games Markus Johannes Pedersen Ole Christian Eidheim Jonathan Jørgensen 2023 application/pdf https://hdl.handle.net/11250/3099258 eng eng NTNU no.ntnu:inspera:145904930:35331123 https://hdl.handle.net/11250/3099258 Master thesis 2023 ftntnutrondheimi 2023-11-01T23:46:49Z This thesis explores the applicability and effectiveness of direct policy transfer in reinforcement learning (RL), with a focus on environmental configurations within the open-source game openSURVIVORS. We investigate how policy transfer can decrease the time to threshold and enhance jumpstart and asymptotic performance across various environmental configurations. Our experimental results highlight the strong influence of the game's unique environmental dynamics on policy effectiveness and transferability, with different RL agents exhibiting varied performance based on their respective training environments. While the asymptotic performance of agents trained via policy transfer remains inconclusive due to non-convergence within the study's scope, our findings demonstrate that policy transfer outperforms training from scratch, indicating its potential advantages in learning new environments. This research advances the understanding of policy transfer in RL, offering insights into training agents more effectively across diverse environments. This thesis explores the effectiveness of directly transferring the abilities of a reinforcement learning agent, with a focus on problem configurations within the open-source game openSURVIVORS. We investigate how transferring the agents' abilities can reduce training time across different problem configurations. Our experimental results highlight the strong influence of the game's dynamics on the agents' abilities across problem configurations.
The RL agents show varying performance depending on their respective training environments. Our findings show that transferring abilities across configurations performs better than training from scratch, which indicates potential advantages of this method. This research contributes to the understanding of ability transfer in RL and offers insight into training agents more effectively across different environments. Master Thesis evenene NTNU Open Archive (Norwegian University of Science and Technology)
institution Open Polar
collection NTNU Open Archive (Norwegian University of Science and Technology)
op_collection_id ftntnutrondheimi
language English
description This thesis explores the applicability and effectiveness of direct policy transfer in reinforcement learning (RL), with a focus on environmental configurations within the open-source game openSURVIVORS. We investigate how policy transfer can decrease the time to threshold and enhance jumpstart and asymptotic performance across various environmental configurations. Our experimental results highlight the strong influence of the game's unique environmental dynamics on policy effectiveness and transferability, with different RL agents exhibiting varied performance based on their respective training environments. While the asymptotic performance of agents trained via policy transfer remains inconclusive due to non-convergence within the study's scope, our findings demonstrate that policy transfer outperforms training from scratch, indicating its potential advantages in learning new environments. This research advances the understanding of policy transfer in RL, offering insights into training agents more effectively across diverse environments. Norwegian abstract (translated): This thesis explores the effectiveness of directly transferring the abilities of a reinforcement learning agent, with a focus on problem configurations within the open-source game openSURVIVORS. We investigate how transferring the agents' abilities can reduce training time across different problem configurations. Our experimental results highlight the strong influence of the game's dynamics on the agents' abilities across problem configurations. The RL agents show varying performance depending on their respective training environments. Our findings show that transferring abilities across configurations performs better than training from scratch, which indicates potential advantages of this method. This research contributes to the understanding of ability transfer in RL and offers insight into training agents more effectively across different environments.
author2 Ole Christian Eidheim
Jonathan Jørgensen
format Master Thesis
author Markus Johannes Pedersen
spellingShingle Markus Johannes Pedersen
Policy Adaptation over Environmental Configurations in Survivor Type Games
author_facet Markus Johannes Pedersen
author_sort Markus Johannes Pedersen
title Policy Adaptation over Environmental Configurations in Survivor Type Games
title_short Policy Adaptation over Environmental Configurations in Survivor Type Games
title_full Policy Adaptation over Environmental Configurations in Survivor Type Games
title_fullStr Policy Adaptation over Environmental Configurations in Survivor Type Games
title_full_unstemmed Policy Adaptation over Environmental Configurations in Survivor Type Games
title_sort policy adaptation over environmental configurations in survivor type games
publisher NTNU
publishDate 2023
url https://hdl.handle.net/11250/3099258
genre evenene
genre_facet evenene
op_relation no.ntnu:inspera:145904930:35331123
https://hdl.handle.net/11250/3099258
_version_ 1782333894087933952