Covariance Analysis as a Measure of Policy Robustness in Reinforcement Learning

In this paper we propose covariance analysis as a metric for reinforcement learning to improve the robustness of a learned policy. The local optima found during the exploration are analyzed in terms of the total cumulative reward and the local behavior of the system in the neighborhood of the optim...


Bibliographic Details
Main Authors:	Jamali, N, Kormushev, P, Ahmadzadeh, SR, Caldwell, DG
Format:	Conference Object
Language:	unknown
Published:	2014
Subjects:	Newfoundland
Online Access:	http://hdl.handle.net/10044/1/26103

id	ftimperialcol:oai:spiral.imperial.ac.uk:10044/1/26103
record_format	openpolar
spelling	ftimperialcol:oai:spiral.imperial.ac.uk:10044/1/26103 2023-05-15T17:22:36+02:00 Covariance Analysis as a Measure of Policy Robustness in Reinforcement Learning Jamali, N Kormushev, P Ahmadzadeh, SR Caldwell, DG St. Johns, Newfoundland 2014-07-14 http://hdl.handle.net/10044/1/26103 unknown Proc. MTS/IEEE Intl Conf. OCEANS 2014 © 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. http://www.rioxx.net/licenses/all-rights-reserved OCEANS'14 MTS/IEEE Conference Paper 2014 ftimperialcol 2018-09-16T05:52:55Z In this paper we propose covariance analysis as a metric for reinforcement learning to improve the robustness of a learned policy. The local optima found during the exploration are analyzed in terms of the total cumulative reward and the local behavior of the system in the neighborhood of the optima. The analysis is performed in the solution space to select a policy that exhibits robustness in uncertain and noisy environments. We demonstrate the utility of the method using our previously developed system where an autonomous underwater vehicle (AUV) has to recover from a thruster failure. When a failure is detected, the recovery system is invoked, which uses simulations to learn a new controller that utilizes the remaining functioning thrusters to achieve the goal of the AUV, that is, to reach a target position. In this paper, we use covariance analysis to examine the performance of the top n policies output by the previous algorithm.
We propose a scoring metric that uses the output of the covariance analysis, the time it takes the AUV to reach the target position, and the distance between the target position and the AUV’s final position. The top policies are simulated in a noisy environment and evaluated using the proposed scoring metric to analyze the effect of noise on their performance. The policy that exhibits more tolerance to noise is selected. We show experimental results where covariance analysis successfully selects a more robust policy that was ranked lower by the original algorithm. Conference Object Newfoundland Imperial College London: Spiral
institution	Open Polar
collection	Imperial College London: Spiral
op_collection_id	ftimperialcol
language	unknown
description	In this paper we propose covariance analysis as a metric for reinforcement learning to improve the robustness of a learned policy. The local optima found during the exploration are analyzed in terms of the total cumulative reward and the local behavior of the system in the neighborhood of the optima. The analysis is performed in the solution space to select a policy that exhibits robustness in uncertain and noisy environments. We demonstrate the utility of the method using our previously developed system where an autonomous underwater vehicle (AUV) has to recover from a thruster failure. When a failure is detected, the recovery system is invoked, which uses simulations to learn a new controller that utilizes the remaining functioning thrusters to achieve the goal of the AUV, that is, to reach a target position. In this paper, we use covariance analysis to examine the performance of the top n policies output by the previous algorithm. We propose a scoring metric that uses the output of the covariance analysis, the time it takes the AUV to reach the target position, and the distance between the target position and the AUV’s final position. The top policies are simulated in a noisy environment and evaluated using the proposed scoring metric to analyze the effect of noise on their performance. The policy that exhibits more tolerance to noise is selected. We show experimental results where covariance analysis successfully selects a more robust policy that was ranked lower by the original algorithm.
format	Conference Object
author	Jamali, N Kormushev, P Ahmadzadeh, SR Caldwell, DG
spellingShingle	Jamali, N Kormushev, P Ahmadzadeh, SR Caldwell, DG Covariance Analysis as a Measure of Policy Robustness in Reinforcement Learning
author_facet	Jamali, N Kormushev, P Ahmadzadeh, SR Caldwell, DG
author_sort	Jamali, N
title	Covariance Analysis as a Measure of Policy Robustness in Reinforcement Learning
title_short	Covariance Analysis as a Measure of Policy Robustness in Reinforcement Learning
title_full	Covariance Analysis as a Measure of Policy Robustness in Reinforcement Learning
title_fullStr	Covariance Analysis as a Measure of Policy Robustness in Reinforcement Learning
title_full_unstemmed	Covariance Analysis as a Measure of Policy Robustness in Reinforcement Learning
title_sort	covariance analysis as a measure of policy robustness in reinforcement learning
publishDate	2014
url	http://hdl.handle.net/10044/1/26103
op_coverage	St. Johns, Newfoundland
genre	Newfoundland
genre_facet	Newfoundland
op_source	OCEANS'14 MTS/IEEE
op_relation	Proc. MTS/IEEE Intl Conf. OCEANS 2014
op_rights	© 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. http://www.rioxx.net/licenses/all-rights-reserved
_version_	1766109367560568832
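
Illustrative sketch (not part of the catalog record): the description above mentions a scoring metric built from three ingredients, namely the output of a covariance analysis, the time to reach the target, and the distance between the target and the final position. The record does not give the actual formula, so the Python below is only a minimal sketch under stated assumptions: the spread of noisy rollouts is summarized as the trace of the covariance of final positions, the three terms are combined as a weighted sum, and lower scores are treated as better. The function names, weights, and sample numbers are hypothetical.

```python
import numpy as np


def robustness_score(final_positions, times, target,
                     w_cov=1.0, w_time=1.0, w_dist=1.0):
    """Score one policy from its noisy rollouts (lower is better).

    ASSUMPTION: weighted sum of covariance trace, mean time, and mean
    distance to target. The paper's actual metric is not given in the record.
    """
    final_positions = np.asarray(final_positions, dtype=float)  # (n_rollouts, dim)
    times = np.asarray(times, dtype=float)                      # (n_rollouts,)
    target = np.asarray(target, dtype=float)                    # (dim,)

    # Covariance of the final positions across rollouts (columns are variables).
    cov = np.atleast_2d(np.cov(final_positions, rowvar=False))
    spread = float(np.trace(cov))  # total variance of the final position

    mean_dist = float(np.linalg.norm(final_positions - target, axis=1).mean())
    mean_time = float(times.mean())
    return w_cov * spread + w_time * mean_time + w_dist * mean_dist


def select_policy(rollouts, target, **weights):
    """Return the name of the lowest-scoring policy and all scores."""
    scores = {name: robustness_score(pos, t, target, **weights)
              for name, (pos, t) in rollouts.items()}
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    # Hypothetical data: "rank1" is faster but scatters under noise,
    # "rank2" is slightly slower but lands consistently near the target.
    rollouts = {
        "rank1": ([[0, 0], [2, 2], [-2, 1], [1, -2]], [10, 10, 10, 10]),
        "rank2": ([[0.1, 0.0], [0.0, 0.1], [-0.1, 0.0], [0.0, -0.1]],
                  [11, 11, 11, 11]),
    }
    best, scores = select_policy(rollouts, target=[0.0, 0.0])
    print(best, scores)
```

With these made-up rollouts the lower-ranked but more consistent policy is selected, which mirrors the kind of outcome the description reports; the specific weights would need tuning for any real system.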
