Evaluation of Repeated Colonel Blotto Games with Semi-Bandit Feedback

Diamond, N'yoma

Etd

Public Deposited

The Colonel Blotto (CB) game is a well-studied strategic resource allocation game with applications to many real-world problems. In the CB game, two players compete by allocating resources across a number of battlefields, each aiming to overpower the opponent on as many battlefields as possible. However, approaches for evaluating resource allocation algorithms are often inconsistent or depend on context-specific information that is not broadly applicable. This is especially visible under semi-bandit feedback, where the player receives partial feedback about their decision, such as which battlefields they won and which they lost. Many performance evaluation techniques either ignore the potential uses of semi-bandit feedback entirely or rely on it exclusively, excluding cases where it is unavailable. In this thesis, we develop general definitions of payoff and regret for the CB game with the goal of unifying performance evaluation under bandit and semi-bandit feedback. Furthermore, we propose a suite of estimation metrics for approximating payoff and regret in practice under either feedback model. Finally, to compute these estimates efficiently, we propose a graph-pruning approach for identifying possible opponent decisions in the CB game. Simulations of the CB game using the proposed metrics and algorithms show empirically that they are highly effective at estimating true opponent behavior in the absence of full information.
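
To make the two feedback models concrete, the following is a minimal Python sketch of a single CB round. It is illustrative only and is not taken from the thesis: the function names are hypothetical, allocations are assumed to be non-negative integers, ties are assumed to count as losses for the player, and the opponent is assumed to spend a known budget in full. The last function enumerates opponent allocations consistent with the observed feedback by brute force; it stands in for, and is not, the graph-pruning approach proposed in the thesis.

```python
def play_round(mine, theirs):
    """Per-battlefield outcomes: True where our allocation strictly exceeds theirs.

    Assumption: ties count as losses for the player.
    """
    return [m > t for m, t in zip(mine, theirs)]


def bandit_feedback(wins):
    """Bandit feedback: only the total payoff (number of battlefields won)."""
    return sum(wins)


def semi_bandit_feedback(wins):
    """Semi-bandit feedback: the win/loss outcome on each battlefield."""
    return list(wins)


def compositions(total, parts):
    """Yield every split of `total` integer units across `parts` battlefields."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def consistent_opponent_allocations(mine, wins, opp_budget):
    """Brute-force list of opponent allocations that reproduce the observed
    per-battlefield outcomes (assumes the opponent spends the full budget)."""
    observed = list(wins)
    return [
        alloc
        for alloc in compositions(opp_budget, len(mine))
        if play_round(mine, alloc) == observed
    ]


mine = (3, 2, 0)
theirs = (1, 2, 2)  # hidden from the player in a real game
wins = play_round(mine, theirs)

payoff = bandit_feedback(wins)
per_field = semi_bandit_feedback(wins)
candidates = consistent_opponent_allocations(mine, per_field, opp_budget=5)
```

In this toy round the player learns only the total payoff under bandit feedback, while semi-bandit feedback narrows the 21 possible opponent allocations to the subset that matches each battlefield's outcome. Exhaustive enumeration grows quickly with the budget and the number of battlefields, which is the kind of cost a pruning method would aim to avoid.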

Creator