
asfq_init

PURPOSE

Initializes the ASF-Q learning algorithm

SYNOPSIS

function a = asfq_init(a, info)

DESCRIPTION

Initializes the ASF-Q learning algorithm
  A = ASFQ_INIT(A, INFO)
  Creates the structures required for the ASF-Q learning algorithm to
  run. Initializes the Q table and an eligibility trace as flat vectors.
  Optional values in the agent's learning parameters (see asfq() for the
  list of mandatory values):
   window          - the analysis window length, default 256. This is
       the number of iterations over which the differences of Q-values
       between successive iterations are maintained. For exact analysis
       stops, make sure this is divisible by both 2 and [stops] (see below).
   stops           - the number of analysis stops per window, default 16.
       A sliding mean of the differences is maintained and subsampled
       [stops] times per window. The sliding mean starts after half a
       window has elapsed, so analysis (which requires a full window of
       data) can begin after one and a half windows.
   zeromargin      - the zero margin, determining when a value used for
       testing convergence is considered positive, negative or zero.
       Positive real between 0 and 1, default 0.05 (5%). A value will be
       considered positive if it is above [zeromargin]*SignalAmplitude,
       negative if below -[zeromargin]*SignalAmplitude, zero otherwise.
   expresetweight  - how the exploration behaviour changes upon
       switching modes, default []. Either a number between 0 and 1,
       dictating the exploration reset weight (see agent()), or the empty
       matrix, meaning switches have no effect on the exploration
       behaviour.
   reverttobestq   - if true, the algorithm maintains a cache of the
       Q-table that has behaved best so far (i.e. accumulated the highest
       reward) and reverts to this Q-table upon expansion. This only works
       in episodic environments, where "best behaviour so far" is well
       defined. The default is 0 (don't revert).
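
  The timing implied by [window] and [stops], and the three-way sign test
  controlled by [zeromargin], can be sketched as below. This is only an
  illustration under stated assumptions, not code from asfq_init.m: the
  variable names are hypothetical, the stops are assumed to be evenly spaced,
  and SignalAmplitude is assumed here to be the peak absolute value of the
  tested signal, which the help text does not specify.

```matlab
% Illustrative sketch only; names and layout are not taken from asfq_init.m.
window     = 256;    % default analysis window length
stops      = 16;     % default number of analysis stops per window
zeromargin = 0.05;   % default zero margin (5%)

% Exact analysis stops require window to be divisible by 2 and by stops.
exact = (mod(window, 2) == 0) && (mod(window, stops) == 0);

% ASSUMPTION: stops are evenly spaced within the window.
stopspacing   = window / stops;        % iterations between two subsamples
meanstart     = window / 2;            % sliding mean starts after half a window
analysisstart = window + window / 2;   % one and a half windows

% Three-way sign classification with a relative dead band around zero.
x = [0.8 -0.03 0.02 -0.5 0.1];         % made-up test values
amplitude = max(abs(x));               % ASSUMPTION: peak absolute value
threshold = zeromargin * amplitude;
s = zeros(size(x));                    % 0 means "considered zero"
s(x >  threshold) =  1;                % considered positive
s(x < -threshold) = -1;                % considered negative
```

  With the defaults, this sketch subsamples every 16 iterations and the
  first analysis becomes possible at iteration 384.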

  Values in the info parameter:
   statespacesize      - the state space size
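
  As a rough illustration of what "flat vectors" sized from statespacesize
  could look like, the sketch below allocates a Q table and an eligibility
  trace as column vectors and addresses one state-action pair by linear
  index. The number of actions, the variable names, and the ordering of the
  dimensions are assumptions made for the example; the actual layout used by
  asfq_init may differ.

```matlab
% Illustrative sketch only; the layout is an assumption, not asfq_init's own.
statespacesize = [5 5];    % example: two state variables with 5 values each
nactions       = 4;        % ASSUMPTION: number of discrete actions

dims = [statespacesize nactions];   % state dimensions first, action last
Q = zeros(prod(dims), 1);           % flat Q table, one entry per state-action pair
E = zeros(prod(dims), 1);           % flat eligibility trace, same layout

% Linear index of state (2,3) combined with action 1.
idx = sub2ind(dims, 2, 3, 1);
Q(idx) = 0.5;                       % write one Q-value through the flat index
E(idx) = 1;                         % mark the same pair as eligible
```

  Keeping both vectors in the same layout means one linear index addresses
  the matching Q-value and trace entry.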
 

  See also asfq, agent_initlearn

CROSS-REFERENCE INFORMATION

This function calls:
This function is called by:
Generated on Wed 04-Aug-2010 16:55:08 by m2html © 2005