Observatorio de I+D+i UPM

Research reports
Book chapters:
Allocation Strategies based on Possibilistic Rewards for the Multi-Armed Bandit Problem: A Numerical Study and Regret Analysis
Research Areas
  • Operations research and mathematical programming
  • Statistics
In this paper, we propose a novel allocation strategy based on possibilistic rewards for the multi-armed bandit problem. First, we use possibilistic reward distributions to model the uncertainty about the expected reward of each arm; these distributions are derived from an infinite set of confidence intervals nested around the expected value. They are then converted into probability distributions using a pignistic probability transformation. Finally, a simulation experiment is carried out to identify the arm with the highest expected reward, which is then pulled. A parametric probability transformation of the proposed method is then introduced, together with a dynamic optimization. A numerical study shows that the proposed method outperforms other policies in the literature in five scenarios covering Bernoulli, Poisson and exponential reward distributions. The regret analysis suggests logarithmic asymptotic convergence for the original possibilistic reward method, whereas polynomial regret could be associated with the parametric extension and the dynamic optimization.
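The three steps in the abstract (possibility distribution, conversion to a probability distribution, sampling to choose an arm) can be illustrated with a minimal sketch. This is not the chapter's implementation: the abstract gives no formulas, so the Hoeffding-style possibility function, the restriction to Bernoulli rewards in [0, 1], the discretization grid, and every function name below are assumptions made for illustration. In particular, the curve is simply normalized over the grid, which is a crude stand-in for the pignistic transformation.

```python
import math
import random

# Candidate values for an arm's expected reward (assumed to lie in [0, 1]).
GRID = [i / 200 for i in range(201)]

def possibility(x, mean, n):
    """Assumed Hoeffding-style possibility that the expected reward equals x."""
    if n == 0:
        return 1.0  # no observations yet: every value is fully possible
    return min(1.0, 2.0 * math.exp(-2.0 * n * (x - mean) ** 2))

def sample_expected_reward(mean, n, rng):
    """Draw one value with probability proportional to its possibility.

    Normalizing over the grid replaces the pignistic transformation here.
    """
    weights = [possibility(x, mean, n) for x in GRID]
    return rng.choices(GRID, weights=weights, k=1)[0]

def choose_arm(means, counts, rng):
    """Sample once per arm and pull the arm with the largest sampled value."""
    draws = [sample_expected_reward(m, n, rng) for m, n in zip(means, counts)]
    return max(range(len(draws)), key=draws.__getitem__)

def run_bandit(true_probs, horizon, seed=0):
    """Play a Bernoulli bandit; return per-arm pull counts and expected regret."""
    rng = random.Random(seed)
    k = len(true_probs)
    counts, means = [0] * k, [0.0] * k
    best = max(true_probs)
    regret = 0.0
    for _ in range(horizon):
        arm = choose_arm(means, counts, rng)
        reward = 1.0 if rng.random() < true_probs[arm] else 0.0
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean
        regret += best - true_probs[arm]  # expected loss of this pull
    return counts, regret

if __name__ == "__main__":
    pulls, total_regret = run_bandit([0.2, 0.8], horizon=2000)
    print("pulls per arm:", pulls)
    print("cumulative regret:", round(total_regret, 2))
```

As an arm accumulates pulls, the assumed possibility curve narrows around its sample mean, so sampled values for well-explored arms concentrate while rarely pulled arms keep a wide spread and continue to be tried occasionally.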
Book Edition
Book Publishing
G. Parlier, F. Liberatore, M. Demange (eds.), Springer
Communications in Computer and Information Science 884
Book title
Operations Research and Enterprise Systems
  • Author: Miguel Carlos Martin Blanco (UPM)
  • Author: Antonio Jimenez Martin (UPM)
  • Author: Alfonso Mateos Caballero (UPM)
Related Research Groups, Departments and Institutes
  • Creator: Research Group: Decision Analysis and Statistics Group
  • Department: Artificial Intelligence