agent

PURPOSE ^

Creates a new agent

SYNOPSIS ^

function a = agent(id, learnparam, learnfun, actfun, explorefun)

DESCRIPTION ^

Creates a new agent
  A = AGENT(ID, LEARNPARAM, LEARNFUN, ACTFUN, EXPLOREFUN)
  Creates an agent with the given parameters.

  Parameters:
   ID          - the agent's numeric ID
   LEARNPARAM  - the agent's learning parameters. The actual values
               required here depend on the agent's learning algorithm.
   LEARNFUN    - the learning function employed by the agent.
   ACTFUN      - the action function employed by the agent.
   EXPLOREFUN  - the exploration strategy function employed by the agent.

  The agent's ID must be between 1 and 999. No ordering or contiguity
  constraints are imposed on the agent IDs.
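
  For illustration, a construction call might look as follows. This is a
  hypothetical sketch: the learning-parameter fields and the function
  names qlrn, greedyact and epsgreedy are assumptions, not part of the
  toolbox.

       % hypothetical learning parameters for a Q-learning style agent
       learnparam.alpha    = 0.1;    % learning rate (assumed field)
       learnparam.gamma    = 0.95;   % discount factor (assumed field)
       learnparam.episodic = 1;      % task is episodic (documented field)
       % 'qlrn' names the files qlrn.m and qlrn_init.m on the path;
       % 'greedyact' and 'epsgreedy' are likewise hypothetical names
       a = agent(1, learnparam, 'qlrn', 'greedyact', 'epsgreedy');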

  The agent makes use of a set of functions during its life. The names of
  these functions are given by the construction parameters LEARNFUN,
  ACTFUN, and EXPLOREFUN.
 
  1. Learning function
       agent = <LEARNFUN>(agent, state, action, reward, params)
   See the lrn() template.
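
  As a sketch of this signature (not the lrn() template itself), a
  tabular Q-learning update could look like the following; the fields Q,
  alpha and gamma are assumed, while the "state" field is among the
  default features documented below.

       function agent = qlrn(agent, state, action, reward, params)
       % Hypothetical tabular Q-learning update (sketch only).
       % Assumes integer state/action indices and the assumed fields
       % agent.Q, agent.alpha, agent.gamma; agent.state holds the
       % state seen when the agent last acted.
       s0 = agent.state;
       target = reward + agent.gamma * max(agent.Q(state, :));
       agent.Q(s0, action) = agent.Q(s0, action) + ...
           agent.alpha * (target - agent.Q(s0, action));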

  2. Learning initialization function
       agent = <LEARNFUN>_init(agent, indices, info)
  This function is responsible for initializing the agent's learning
  behaviour. See the lrn_init() template.
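
  A matching initialization function might simply allocate the learned
  structures. A minimal sketch, assuming info carries the problem sizes
  under the hypothetical field names nstates and nactions:

       function agent = qlrn_init(agent, indices, info)
       % Hypothetical initialization (sketch): allocate a zero Q-table.
       agent.Q = zeros(info.nstates, info.nactions);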
  
  3. Action function
       [agent, action] = <ACTFUN>(agent, state, action, reward, params)
  The learning function and the action function must be consistent with
  each other: if the action function makes use of some kind of learned
  policy, then the learning function must compute it. The action function
  must support a special 'init' mode, activated via a field named "mode"
  of the [params] argument. See the act() template.
 
  4. Exploration strategy function
       [agent, action] = <EXPLOREFUN>(agent, state, action, reward, params)
  The exploration function must support two special modes ('init' and
  'reset'), activated via a field named "mode" of the [params] argument.
  See the explore() template.
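
  Similarly, a sketch of an exploration function supporting the 'init'
  and 'reset' modes; epsilon-greedy and the eps field are assumptions:

       function [agent, action] = epsgreedy(agent, state, action, reward, params)
       % Hypothetical epsilon-greedy exploration (sketch only).
       if isfield(params, 'mode')
           switch params.mode
               case 'init'            % one-time setup
                   agent.eps = 0.1;   % assumed exploration rate field
                   return;
               case 'reset'           % called when a new trial starts
                   return;            % nothing to reset in this sketch
           end
       end
       % with probability eps, override the proposed action at random
       if rand < agent.eps
           action = ceil(rand * size(agent.Q, 2));
       end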

  The reason why this mode-switching scheme is employed for action
  selection and exploration, while the learning initialization uses a
  separate function, is that initializing the learning behaviour is
  typically much more involved.

  The skeleton of the learning mechanism supports the following default
  features (illustrated in the sketch after this list):
   - the learning parameters of the agent contain a field named "episodic"
   signaling whether the task to be learned is episodic;
   - a clock in an agent field named "k"; this clock counts iterations
   elapsed since the agent began its life;
   - a trials clock in an agent field named "trialsk"; this clock counts
   trials elapsed since the agent began its life (maintained only in
   episodic environments);
   - the state seen when the agent last acted, in an agent field
   named "state".
 

  See also agent_control, agent_initlearn, act, learn
