Replays the agents' current policies in an episodic environment.
  [WORLD, AGENTS, STATS] = EPISODIC_REPLAY(WORLD, AGENTS, SPEED, MAXITER)

Parameters:
  WORLD   - the world where the agents live
  AGENTS  - the (possibly heterogeneous) cell array of agents
  SPEED   - the speed of the replay, between 1 and 10
  MAXITER - the maximum number of iterations for which the policy is allowed
            to run; may be -1 ('run forever')
Returns:
  WORLD   - the possibly altered world
  AGENTS  - the possibly altered agents
  STATS   - statistics of the replay

Replays the agents' policies over a single trial. Assumes that the agents
have already been correctly set up during the learning process. Requires
that the world implement a view; this view will be shown if not already
shown.

See also replay, episodic_learn
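
A minimal usage sketch in MATLAB. The constructor names gridworld and create_agent are assumptions for illustration only; they are not part of this function's documented interface, and the actual world and agent setup depends on the toolbox in use.

```matlab
% Hypothetical setup: 'gridworld' and 'create_agent' are assumed names.
% The world must implement a view for the replay to be displayed.
world = gridworld();
agents = {create_agent(world, 1), create_agent(world, 2)};

% Replay at medium speed (5), for at most 100 iterations per episode.
[world, agents, stats] = episodic_replay(world, agents, 5, 100);

% Passing MAXITER = -1 would let the policy run until the episode
% terminates on its own ('run forever').
```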