Abstract

We extend earlier work on continuous potential games to the most general case: stochastic time-varying environments, stochastic rewards, non-reduced form, and constrained state-action sets. We provide conditions under which a Markov Nash equilibrium (MNE) of the game is equivalent to the solution of a single control problem. Then, we address the problem of learning this MNE when the reward and state-transition models are unknown. We follow a reinforcement learning approach and extend previous algorithms to work with constrained state-action subsets of real vector spaces. As an application example, we simulate a network flow optimization model in which the relays have batteries that deplete with a random factor. The results obtained with the proposed framework are close to optimal.

| Field | Value |
|---|---|
| International | Yes |
| Congress | IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) |
| | 970 |
| Place | Shanghai, China |
| Reviewers | Yes |
| ISBN/ISSN | 2379-190X |
| DOI | 10.1109/ICASSP.2016.7472542 |
| Start Date | 20/03/2016 |
| End Date | 25/05/2017 |
| From page | 1 |
| To page | 5 |
| | Acoustics, Speech and Signal Processing (ICASSP), 2016 IEEE International Conference on |