Adaptive and Learning Agents: International Workshop, ALA 2011, Held at AAMAS 2011, Taipei, Taiwan, May 2, 2011, Revised Selected Papers

By Edward Robinson, Peter McBurney, Xin Yao (auth.), Peter Vrancx, Matthew Knudson, Marek Grześ (eds.)

This volume constitutes the thoroughly refereed post-conference proceedings of the International Workshop on Adaptive and Learning Agents, ALA 2011, held at the 10th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2011, in Taipei, Taiwan, in May 2011. The 7 revised full papers presented together with 1 invited talk were carefully reviewed and selected from numerous submissions. The papers are organized in topical sections on single and multi-agent reinforcement learning, supervised multiagent learning, adaptation and learning in dynamic environments, learning trust and reputation, minority games and agent coordination.

Additional info for Adaptive and Learning Agents: International Workshop, ALA 2011, Held at AAMAS 2011, Taipei, Taiwan, May 2, 2011, Revised Selected Papers

Example text

... of selecting an action based on an estimate of the actions' usefulness. Given appropriate parameters, Kapetanakis and Kudenko [6] showed experimentally that FMQ almost always converges to optimal strategies in the considered games. However, they also point out problems with stochastic rewards. In [9], an extended FMQ with improved convergence in such stochastic games is presented. The approach presented later in this work uses Lauer and Riedmiller's Distributed Q-Learning algorithm (DQL) [8].

Then any optimal joint strategy for game $G_B$ is also an optimal joint strategy for $G_C$ and vice versa.

Proof. Let $c = \max_{\hat{u} \in U} |\rho_{G_A}(\hat{u})| + \max_{\hat{u} \in U} |\rho_{G_B}(\hat{u})|$. Then, Equation 4 of transformation function $t$ can be rewritten as $\rho_{G_C}(u) = c + \rho_{G_B}(u)$, where $c$ is constant for any two fixed games $G_A$ and $G_B$. Thus, the reward function $\rho_{G_C}$ for game $G_C$ is obtained by adding a constant to the rewards of game $G_B$. Accordingly, and from Lemma 1, it follows that any optimal joint strategy $\sigma(G_B) \in \Sigma(G_B)$ for game $G_B$ from the set of optimal joint strategies is also an optimal joint strategy for game $G_C$.
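The argument can be checked numerically on a small example. The sketch below assumes common-payoff games with deterministic rewards and pure joint actions; the two reward tables `rho_A` and `rho_B` are made-up values for illustration, and `optimal_joint_actions` is a hypothetical helper, not something defined in the paper. It only illustrates that shifting every reward by the same constant $c$ leaves the set of reward-maximizing joint actions unchanged.

```python
from itertools import product


def optimal_joint_actions(reward, joint_actions):
    """Return the set of joint actions that maximize the common reward."""
    best = max(reward[u] for u in joint_actions)
    return {u for u in joint_actions if reward[u] == best}


# Two agents with three actions each (illustrative assumption).
joint_actions = list(product(range(3), repeat=2))

# Made-up reward tables for games G_A and G_B, listed in the order of joint_actions.
rho_A = dict(zip(joint_actions, [5, -20, 0, -20, 3, 1, 0, 1, 2]))
rho_B = dict(zip(joint_actions, [11, -30, 0, -30, 7, 6, 0, 0, 11]))

# c = max |rho_A| + max |rho_B|, as in the proof.
c = max(abs(v) for v in rho_A.values()) + max(abs(v) for v in rho_B.values())

# Rewards of G_C are the rewards of G_B shifted by the constant c.
rho_C = {u: c + rho_B[u] for u in joint_actions}

optimal_B = optimal_joint_actions(rho_B, joint_actions)
optimal_C = optimal_joint_actions(rho_C, joint_actions)
```

In this example both games have the same two optimal joint actions, since a constant shift preserves the ordering of the rewards.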

Then a corresponding stochastic game $\Gamma = \langle s_0, S, A, U, f, \{\rho_i\}_{i \in A} \rangle$ is constructed by:
– the agent set $A$ and the joint action set $U$ of $\Gamma$ are those of the games.
– recall the definition of the set of games $G$ [...], i.e. $S = \{s_\emptyset, s_0^1, \ldots, s_0^{n_0}, s_1^1, \ldots, s_1^{n_1}, \ldots, s_m^1, \ldots, s_m^{n_m}, s_\infty\}$. Here, $s_j^v$ denotes the state that is obtained when game $G_j$ was played for the $v$-th iteration.
– the initial state $s_0$ corresponds to state $s_\emptyset$, which is the state before the first game is played.
– the state transition function $f$ for any joint action $u \in U$ is constructed such that it stays in stage game $G_j$ until it is played $n_j$ times and then transitions to the next game $G_{j+1}$.
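One possible reading of this transition function can be sketched in a few lines of Python. The encoding is an assumption made for illustration: the state before the first game is the string `"s_empty"`, a pair `(j, v)` stands for "game G_j has been played for the v-th time", and `"s_inf"` is treated as an absorbing final state. The helper `make_transition` and the play counts `[2, 1, 3]` are hypothetical.

```python
EMPTY, END = "s_empty", "s_inf"


def make_transition(n):
    """Build a transition function f for stage games G_0..G_m.

    n[j] is the number of times game G_j is played (assumed >= 1).
    """
    if not n or any(k < 1 for k in n):
        raise ValueError("every game must be played at least once")
    m = len(n) - 1

    def f(state, joint_action=None):
        # The joint action is ignored: the successor is the same for any u in U.
        if state == END:
            return END            # assumed absorbing
        if state == EMPTY:
            return (0, 1)         # first iteration of the first game
        j, v = state
        if v < n[j]:
            return (j, v + 1)     # stay in stage game G_j
        if j < m:
            return (j + 1, 1)     # move on to the next game G_{j+1}
        return END                # last game finished

    return f


# Walk through the whole sequence for three games played 2, 1 and 3 times.
f = make_transition([2, 1, 3])
state, trajectory = EMPTY, [EMPTY]
while state != END:
    state = f(state)
    trajectory.append(state)
```

The resulting trajectory visits one state per played iteration, plus the initial and the final state, which matches the size of the state set in the construction.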

