Research Reports
Communications at congresses:
Cooperative off-policy prediction of Markov decision processes in adaptive networks
Year: 2013

Research Areas
  • Electronic and communications technology

Information
Abstract
We apply diffusion strategies to propose a cooperative reinforcement learning algorithm in which agents in a network communicate with their neighbors to improve predictions about their environment. The algorithm is suitable for off-policy learning, even in large state spaces. We provide a mean-square-error performance analysis under constant step-sizes. The gain from cooperation, in the form of greater stability and lower bias and variance in the prediction error, is illustrated in the context of a classical model. We show that the improvement in performance is especially significant when the behavior policy of the agents differs from the target policy under evaluation.
International
Yes
Congress
2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
960
Place
Reviewers
Yes
ISBN/ISSN
1520-6149
Start Date
26/05/2013
End Date
31/05/2013
From page
4539
To page
4543
Proceedings of ICASSP
Participants

Related Research Groups, Departments and Institutes
  • Creator: Research Group: Grupo de Aplicaciones del Procesado de Señal (GAPS)
  • Department: Señales, Sistemas y Radiocomunicaciones