Description | |
---|---|
We apply diffusion strategies to develop a cooperative reinforcement learning algorithm in which agents in a network communicate with their neighbors to improve predictions about their environment. The algorithm is suitable for off-policy learning, even in large state spaces. We provide a mean-square-error performance analysis under constant step-sizes. The gain from cooperation, in the form of greater stability and lower bias and variance in the prediction error, is illustrated in the context of a classical model. We show that the improvement in performance is especially significant when the behavior policy of the agents differs from the target policy under evaluation. | |
International | Yes |
Conference name | 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) |
Type of participation | 960 |
Conference venue | |
Peer reviewed | Yes |
ISBN or ISSN | 1520-6149 |
DOI | |
Conference start date | 26/05/2013 |
Conference end date | 31/05/2013 |
From page | 4539 |
To page | 4543 |
Proceedings title | Proceedings of ICASSP |