AGENT_CONTROL Chooses an action for the agent.
  [A ACTION] = AGENT_CONTROL(A, STATE, ACTIONS, REWARDS, PARAMS)
  Performs a control step for the agent.

  Parameters:
    A       - the agent
    STATE   - the agent's view over the world state
    ACTIONS - the agent's view over the joint action
    REWARDS - the agent's view over the joint reward
    PARAMS  - a structure containing extra information on the basis of
              which the agent may reason about how to act. In episodic
              environments, it must contain a boolean field 'newtrial'
              signaling whether a new trial has just begun, and a field
              'finished' signaling whether the agent has finished its
              task in the current trial.

  Returns:
    A      - the (possibly updated) agent
    ACTION - the chosen action. May also be NaN as a special case, when
             the agent has reached its goal.

  This function calls, in sequence:
    1. the actual learning function of the agent, provided that a new
       trial has not just begun (PARAMS.newtrial), the agent has not
       finished its task (PARAMS.finished), and learning has not been
       stopped via the flag PARAMS.donotlearn (useful for running
       control after the agent has finished learning);
    2. the actual action function of the agent, provided that the agent
       has not finished its task (PARAMS.finished). If the agent has
       finished, NaN is returned as the conventional no-op action;
    3. the actual exploration function of the agent, provided that the
       action function has been called.

  The learning behaviour must have been initialized prior to calling
  this function. Usually, this function need not be called directly;
  it is handled by the learning control mechanism.

  In addition, this function handles the iterations and trials clocks,
  and makes the received state information available later on by
  storing it in the agent field 'state'.

  See also agent, agent_initlearn.
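
  Example (a minimal usage sketch, not part of the library interface;
  world_step is a hypothetical environment function standing in for
  however the agent's views over state, joint action, and joint reward
  are obtained):

    % a = agent(...); a = agent_initlearn(a, ...);  % learning must be
    %                                               % initialized first
    params = struct('newtrial', true, 'finished', false, ...
                    'donotlearn', false);
    for iter = 1:100
        % obtain the agent's views for this step (hypothetical function)
        [state, actions, rewards, params.finished] = world_step(a);
        [a, action] = agent_control(a, state, actions, rewards, params);
        if isnan(action), break; end   % NaN: the agent reached its goal
        params.newtrial = false;       % only the first step begins a trial
    end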