
teamq

PURPOSE ^

Implements the team Q-learning algorithm

SYNOPSIS ^

function a = teamq(a, state, action, reward, params)

DESCRIPTION ^

Implements the team Q-learning algorithm
  A = TEAMQ(A, STATE, ACTION, REWARD, PARAMS)
  Implements team Q-learning (also known as friend-Q). This is a
  straightforward extension of single-agent Q-learning to the multiagent
  case: each agent learns a Q-table indexed by the full world state and
  the joint action.
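
  The core update can be sketched as follows (a Python sketch, since this
  page documents a MATLAB toolbox; the sizes and variable names are
  illustrative, and the eligibility-trace (lambda) machinery is omitted
  for brevity):

```python
import numpy as np

# Hypothetical sizes: 4 world states, 2 agents with 3 actions each,
# so 3 * 3 = 9 joint actions. Each agent keeps a table like this one.
n_states, n_joint_actions = 4, 9
Q = np.zeros((n_states, n_joint_actions))

alpha, gamma = 0.1, 0.95  # learning rate and discount factor

def teamq_update(Q, s, a_joint, r, s_next):
    """Plain Q-learning update over the full state and joint-action index."""
    td_target = r + gamma * Q[s_next].max()  # best joint action in next state
    Q[s, a_joint] += alpha * (td_target - Q[s, a_joint])
    return Q

# One sample transition: state 0, joint action 4, reward 1, next state 1.
Q = teamq_update(Q, s=0, a_joint=4, r=1.0, s_next=1)
```

  Because all agents observe the same state, joint action, and (team)
  reward, every agent's table converges to the same values.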

  Required values on the agent learning parameters:
   alpha           - the learning rate
   gamma           - the discount factor
   lambda          - the eligibility trace decay rate
   epsilon         - the exploration probability
  Required values on the extra parameters:
   newtrial        - only in episodic environments; whether a new trial is
                   beginning

 Supports discrete states and actions, with 1 action variable per agent.

 Can be coupled with an action function that uses a Q-table indexed on
 full world state and joint action, such as fullstatejoint_greedyact().
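
 A matching epsilon-greedy selection over the joint-action table might
 look like the following (again a Python sketch with illustrative names,
 not the actual interface of fullstatejoint_greedyact(); each agent then
 executes only its own component of the chosen joint action):

```python
import numpy as np

rng = np.random.default_rng(0)
epsilon = 0.1                     # exploration probability
n_actions_per_agent = (3, 3)      # hypothetical: 2 agents, 3 actions each
n_joint = int(np.prod(n_actions_per_agent))
Q_row = np.zeros(n_joint)         # Q-values for the current world state

def greedy_joint_action(Q_row, epsilon):
    """Pick a joint-action index epsilon-greedily from one Q-table row."""
    if rng.random() < epsilon:
        return int(rng.integers(n_joint))  # explore: random joint action
    return int(np.argmax(Q_row))           # exploit: greedy joint action

joint = greedy_joint_action(Q_row, epsilon)
# Decode the flat joint-action index into one action per agent; agent i
# executes own_actions[i].
own_actions = np.unravel_index(joint, n_actions_per_agent)
```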


  References:
  [1] Littman, M. L. (2001). Friend-or-foe Q-learning in general-sum
      games. In Proceedings of the Eighteenth International Conference on
      Machine Learning (ICML-01), pages 322-328, Williams College,
      Williamstown, Massachusetts, USA.

  See also agent_learn, teamq_init

CROSS-REFERENCE INFORMATION ^

This function calls:
This function is called by:
Generated on Wed 04-Aug-2010 16:55:08 by m2html © 2005