AGENT Creates a new agent.

  A = AGENT(ID, LEARNPARAM, LEARNFUN, ACTFUN, EXPLOREFUN)

  Creates an agent with the given parameters.

  Parameters:
    ID          - the agent's numeric ID.
    LEARNPARAM  - the agent's learning parameters. The actual values
                  required here depend on the agent's learning algorithm.
    LEARNFUN    - the learning function employed by the agent while
                  learning.
    ACTFUN      - the action function employed by the agent.
    EXPLOREFUN  - the exploration strategy function employed by the agent.

  The agent's ID must be between 1 and 999. No ordering or contiguity
  constraints are imposed on agent IDs.

  The agent makes use of a set of functions during its life. The names of
  these functions are formed from the agent's construction parameters
  LEARNFUN, ACTFUN, and EXPLOREFUN (see the construction example at the
  end):

  1. Learning function
       agent = <LEARNFUN>(agent, state, action, reward, params)
     See the lrn() template.

  2. Learning initialization function
       agent = <LEARNFUN>_init(agent, indices, info)
     This function is in charge of initializing the learning behaviour of
     the agent. See the lrn_init() template.

  3. Action function
       [agent, action] = <ACTFUN>(agent, state, action, reward, params)
     The learning function and the action function must be correlated: if
     the action function makes use of some kind of learned policy, then
     the learning function must compute that policy. The action function
     must support a special 'init' mode, activated via a field named
     "mode" on the [params] argument. See the act() template.

  4. Exploration strategy function
       [agent, action] = <EXPLOREFUN>(agent, state, action, reward, params)
     The exploration function must support two special modes ('init' and
     'reset'), activated via a field named "mode" on the [params]
     argument. See the explore() template, and the sketch at the end of
     this help text.

  This mode-switching scheme is used for action selection and
  exploration, while learning initialization gets a separate function,
  because initializing the learning behaviour is typically much more
  involved than initializing action selection or exploration.

  The skeleton of the learning mechanism supports the following default
  features:
    - the learning parameters of the agent contain a field named
      "episodic", signaling whether the task to be learned is episodic;
    - a clock, in an agent field named "k", counting the iterations
      elapsed since the agent began its life;
    - a trials clock, in an agent field named "trialsk", counting the
      trials elapsed since the agent began its life (maintained only in
      episodic environments);
    - the state seen when the agent last acted, in an agent field named
      "state".

  See also agent_control, agent_initlearn, act, learn
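
  Construction example (an illustrative sketch: the function names qlrn,
  qact and egreedy, and the alpha/gamma fields of the learning
  parameters, are assumptions made for the example; only the "episodic"
  field is documented above):

    lp.alpha = 0.3;       % assumed learning-rate parameter
    lp.gamma = 0.95;      % assumed discount-factor parameter
    lp.episodic = 1;      % documented field: the task is episodic
    a = agent(7, lp, 'qlrn', 'qact', 'egreedy');

  Given this call, the skeleton will look for the functions qlrn(),
  qlrn_init(), qact() and egreedy(), following the naming scheme above.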
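
  Below is a minimal sketch of an exploration strategy function obeying
  the template above. Only the signature and the 'init'/'reset' mode
  handling are prescribed by this documentation; the epsilon-greedy body
  and the agent fields "epsilon" and "actioncount" are assumptions:

    function [agent, action] = egreedy(agent, state, action, reward, params)
    %EGREEDY Illustrative epsilon-greedy exploration strategy.
    if isfield(params, 'mode'),
        switch params.mode,
            case 'init',    % one-time setup of the exploration state
                agent.epsilon = 0.1;            % assumed field
                return;
            case 'reset',   % per-trial reset (nothing to do here)
                return;
        end;
    end;
    % Normal operation: with probability epsilon, replace the greedy
    % action (received in the "action" argument) by a random one.
    if rand < agent.epsilon,
        action = ceil(rand * agent.actioncount);    % assumed field
    end;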