Implements the distributed Q-learning algorithm
  A = TEAMQ(A, STATE, ACTION, REWARD, PARAMS)
  Implements distributed Q-learning.

  Required values on the agent learning parameters:
    gamma       - the discount factor
    epsilon     - the exploration probability
  Required values on the extra parameters:
    newtrial    - only in episodic environments; whether a new trial is
                  beginning

  Supports discrete states and actions, with 1 action variable per
  agent. Should be coupled with determpiact.

  References:
  [1] Lauer, M. & Riedmiller, M.
      An Algorithm for Distributed Reinforcement Learning in Cooperative
      Multi-Agent Systems
      Proceedings 17th International Conference on Machine Learning
      (ICML-00), 2000, 535-542

  See also agent_learn, distrq_init
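
  Example (illustrative only):
  The update rule described in [1] can be sketched for a single agent as
  follows. This is a minimal Python sketch, not the toolbox code: the class
  and method names (DistributedQAgent, act, learn) are hypothetical, and it
  assumes zero-initialised Q-values and non-negative rewards. Each agent
  keeps a table over its own action only, never lowers a Q-value, and
  changes its stored policy only when a state's maximum Q-value improves.

```python
import random
from collections import defaultdict


class DistributedQAgent:
    """Hypothetical single-agent learner; it sees only its own action."""

    def __init__(self, n_actions, gamma, epsilon, seed=None):
        self.n_actions = n_actions
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability
        # Local Q-table over (state, own action), zero-initialised (assumption).
        self.q = defaultdict(lambda: [0.0] * n_actions)
        self.policy = {}        # state -> stored greedy action
        self.rng = random.Random(seed)

    def act(self, state):
        # Explore with probability epsilon, otherwise follow the stored policy.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        # An unseen state gets an arbitrary initial policy action.
        return self.policy.setdefault(state, self.rng.randrange(self.n_actions))

    def learn(self, state, action, reward, next_state):
        old_max = max(self.q[state])
        target = reward + self.gamma * max(self.q[next_state])
        # Optimistic update: keep the larger of the old value and the target.
        if target > self.q[state][action]:
            self.q[state][action] = target
        # The policy changes only if the state's best value strictly improved.
        if self.q[state][action] > old_max:
            self.policy[state] = action


agent = DistributedQAgent(n_actions=2, gamma=0.5, epsilon=0.0, seed=0)
agent.learn("s", 1, 1.0, "t")   # Q(s, 1) rises to 1.0; policy for s becomes 1
agent.learn("s", 1, 0.0, "t")   # lower target: Q(s, 1) is left unchanged
agent.learn("s", 0, 0.5, "t")   # Q(s, 0) rises, but the maximum does not
```

  Keeping the policy separate from the Q-table is the point of the sketch:
  when several joint actions are equally good, agents that each picked a
  greedy action from their own table could choose mismatched ones.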