Home > marl > agent > learnfuns > plainq.m

plainq

PURPOSE

Implements the plain Q-learning algorithm

SYNOPSIS

function a = plainq(a, state, actions, rewards, params)

DESCRIPTION

Implements the plain Q-learning algorithm
  A = PLAINQ(A, STATE, ACTIONS, REWARDS, PARAMS)
  Implements Q-learning as described in [1], extended with an
  eligibility trace. Expects the Q-table to be initialized (see
  plainq_init). Uses flat Q and eligibility tables for fast access.

  Required fields of the agent's learning parameters:
   alpha           - the learning rate
   gamma           - the discount factor
   lambda          - the eligibility trace decay rate
   epsilon         - the exploration probability
  Required fields of the extra parameters:
   newtrial        - episodic environments only; whether a new trial
                     is beginning
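  Given these parameters, the core update can be sketched as follows. This
  is a minimal, hypothetical Python illustration of tabular Q-learning with
  a naive accumulating eligibility trace; the names `plainq_update`, `Q`,
  `E`, etc. are illustrative and not the toolbox's actual interface, and
  Watkins's Q(lambda) variant would additionally reset the traces after
  exploratory actions:

```python
import numpy as np

def plainq_update(Q, E, s, a, s_next, r, alpha, gamma, lam):
    """One Q-learning step with an accumulating eligibility trace.

    Q and E are flat tables of shape (n_states, n_actions); s, a index
    the previous state/action, s_next the resulting state, r the reward.
    Illustrative sketch only, not the toolbox's implementation.
    """
    # TD error: reward plus discounted greedy value of the next state
    delta = r + gamma * np.max(Q[s_next]) - Q[s, a]
    # bump the trace of the visited state-action pair
    E[s, a] += 1.0
    # update every entry in proportion to its trace, then decay the traces
    Q += alpha * delta * E
    E *= gamma * lam
    return Q, E
```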

  Can be coupled with an action function that uses a Q-table indexed on
  agent state and action, such as greedyact().
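  Such an action function can be sketched as the usual epsilon-greedy rule
  over a flat Q-table. The Python function below is a hypothetical
  illustration only, not the toolbox's greedyact():

```python
import numpy as np

def epsilon_greedy(Q, s, epsilon, rng=None):
    """Select an action from a flat Q-table of shape (n_states, n_actions):
    uniformly random with probability epsilon, greedy otherwise.
    Illustrative sketch, not the toolbox's greedyact()."""
    rng = rng or np.random.default_rng()
    n_actions = Q.shape[1]
    if rng.random() < epsilon:
        return int(rng.integers(n_actions))  # explore
    return int(np.argmax(Q[s]))              # exploit (ties -> lowest index)
```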

  Supports discrete states and actions, with one action variable per agent.


  References:
  [1] Watkins, C. J. C. H. and Dayan, P. (1992). Technical note:
      Q-learning. Machine Learning Journal, 8:279-292.

  See also learn, plainq_init

CROSS-REFERENCE INFORMATION

This function calls:
This function is called by:
Generated on Wed 04-Aug-2010 16:55:08 by m2html © 2005