Randomized sampling for large zero-sum games


This paper addresses the solution of large zero-sum matrix games using randomized methods. We formalize a procedure, termed the sampled security policy (SSP) algorithm, by which a player can compute policies that, with high confidence, are security policies against an adversary using randomized methods to explore the possible outcomes of the game. The SSP algorithm essentially consists of solving a stochastically sampled subgame that is much smaller than the original game. We also propose a randomized algorithm, termed the sampled security value (SSV) algorithm, which computes a high-confidence security level (i.e., worst-case outcome) for a given policy, which may or may not have been obtained using the SSP algorithm. For both the SSP and SSV algorithms, we provide results that determine how many samples are needed to guarantee a desired level of confidence. We start with results for the case in which the two players sample policies with the same distribution, and subsequently extend them to the case of mismatched distributions. We demonstrate the usefulness of these results in a hide-and-seek game that exhibits exponential complexity. © 2013 Elsevier Ltd. All rights reserved.
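The sampling idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's SSP or SSV algorithm: it is restricted to pure policies, it samples uniformly with replacement, and it assumes the row player minimizes while the column player maximizes. The function names and the matrix `A` are hypothetical, chosen only for illustration.

```python
import random


def sampled_pure_security_policy(A, m_rows, m_cols, rng):
    """Pick a row by solving a randomly sampled subgame (pure policies only).

    Assumption: the row player minimizes and the column player maximizes.
    Rows and columns are sampled uniformly with replacement.
    """
    n_rows, n_cols = len(A), len(A[0])
    rows = [rng.randrange(n_rows) for _ in range(m_rows)]
    cols = [rng.randrange(n_cols) for _ in range(m_cols)]
    # Among the sampled rows, choose the one whose worst case over the
    # sampled columns is smallest.
    best_row = min(rows, key=lambda i: max(A[i][j] for j in cols))
    sampled_value = max(A[best_row][j] for j in cols)
    return best_row, sampled_value


def sampled_security_value(A, row, m_cols, rng):
    """Estimate the worst-case outcome of a fixed row from sampled columns."""
    n_cols = len(A[0])
    cols = [rng.randrange(n_cols) for _ in range(m_cols)]
    return max(A[row][j] for j in cols)


if __name__ == "__main__":
    # Hypothetical 40 x 50 cost matrix, used only to exercise the functions.
    A = [[(3 * i + 5 * j) % 7 for j in range(50)] for i in range(40)]
    rng = random.Random(0)
    row, value = sampled_pure_security_policy(A, 10, 10, rng)
    estimate = sampled_security_value(A, row, 20, rng)
    print(row, value, estimate, max(A[row]))
```

Both sampled quantities can only underestimate the true worst case of the chosen row, since they take a maximum over a subset of the columns. This is why the number of samples matters for the confidence level; the sketch does not reproduce the paper's sample-size results.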


Bopardikar, Shaunak D.; Borri, Alessandro; Hespanha, João P.; Prandini, Maria; Di Benedetto, Maria Domenica

2013-01-01



Year: 2013
Journal: Automatica
DOI: https://dx.doi.org/10.1016/j.automatica.2013.01.062
Appears in types: 1.1 Journal article


Use this identifier to cite or link to this document: https://hdl.handle.net/11697/112280
