REINFORCEMENT LEARNING IN MARKOVIAN EVOLUTIONARY GAMES
V. S. BORKAR
School of Technology and Computer Science, Tata Institute of Fundamental Research, Homi Bhabha Road, Mumbai 400005, India
Received: 12 July 2001
Revised: 9 November 2001
Accepted: 9 December 2001
A population of agents plays a stochastic dynamic game in which an underlying state process with Markovian dynamics also affects their costs. A learning mechanism is proposed that takes intertemporal effects into account and incorporates an explicit process of expectation formation. The agents use this scheme to update their mixed strategies incrementally. The asymptotic behavior of the scheme is captured by an associated ordinary differential equation. Both the formulation and the analysis of the scheme draw upon the theory of reinforcement learning in artificial intelligence.
Keywords: Evolutionary games; stochastic dynamic games; expectation formation; actor-critic methods; reinforcement learning; generalized Nash equilibria
Cited by (6):
D. Krishna Sundar, K. Ravikumar. (2014) An actor–critic algorithm for multi-agent learning in queue-based stochastic games. Neurocomputing 127, 258-265. Online publication date: 1-Mar-2014.
Vishnu Nanduri, Tapas K. Das. (2009) A reinforcement learning algorithm for obtaining the Nash equilibrium of multi-player matrix games. IIE Transactions 41, 158-167. Online publication date: 20-Nov-2009.
V.L.R. Chinthalapati, N. Yadati, R. Karumanchi. (2006) Learning dynamic prices in MultiSeller electronic retail markets with price sensitive customers, stochastic demands, and inventory replenishments. IEEE Transactions on Systems, Man and Cybernetics, Part C (Applications and Reviews) 36, 92-106. Online publication date: 1-Jan-2006.
Alfredo Garcia, Enrique Campos-Nañez, James Reitzes. (2005) Dynamic Pricing and Learning in Electricity Markets. Operations Research 53, 231-241. Online publication date: 1-Apr-2005.
David S. Leslie, E. J. Collins. (2005) Individual Q-Learning in Normal Form Games. SIAM Journal on Control and Optimization 44, 495-514. Online publication date: 1-Jan-2005.
C.V.L. Raju, Y. Narahari, K. Ravikumar. (2003) Reinforcement learning applications in dynamic pricing of retail markets. IEEE International Conference on E-Commerce, CEC 2003, 339-346.


