Military stochastic scheduling treated as a "multi-armed bandit" problem

Bibliographic Details
Main Author: Glazebrook, Kevin D.
Other Authors: Operations Research
Format: Report
Language: unknown
Published: Monterey, California. Naval Postgraduate School 2001
Subjects:
Online Access: https://hdl.handle.net/10945/15377
Description
Summary: A Blue airborne force attacks a region defended by a single Red surface-to-air missile system (SAM). Red is uncertain about the Blues he faces, but is able to learn about them during the engagement. Red's objective is to develop a policy for shooting at the Blues to maximize the value of Blues shot down before he himself is destroyed. We show that index policies are optimal for Red in a range of scenarios and yield effective heuristics more generally. The quality of such index heuristics is confirmed in a computational study. Contract number: N00244-10-10031.