Implements stochastic action selection for a policy indexed on the full world state.
  [A, ACTION] = FULLSTATE_STOCHACT(A, STATE, ACTIONS, REWARDS, PARAMS)

  An action is chosen according to the agent's current stochastic policy.
  The policy entries corresponding to a given state must form a valid
  probability distribution over the (discrete) actions. The policy must be
  stored in field 'PI' of the agent, as a flat vector representing a matrix
  with dimensions agent-action-space-size x agent-state-space-size. The size
  of this matrix must be cached in field 'sizes.pi' of the agent.

  Supports discrete states and actions, with a single action variable only.

  See also agent_act
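
The storage layout and selection rule described above can be sketched as follows. This is only an illustration, not the toolbox implementation. It assumes that 'sizes.pi' holds the two-element vector [action-space-size, state-space-size], that the flat vector is stored in MATLAB's column-major order (so each state owns one contiguous block of action probabilities), and that the state has already been mapped to a linear index s. The variables nActions, nStates, P, s, offset, p, and c are hypothetical names introduced for the example.

```matlab
% Hypothetical stand-in for the agent structure. Only the field names
% 'PI' and 'sizes.pi' come from the help text; everything else is assumed.
nActions = 3;                    % assumed action-space size
nStates  = 4;                    % assumed state-space size

% Column s is the probability distribution over actions in state s.
P = [0.2  0.0  1/3  0.5;
     0.5  1.0  1/3  0.5;
     0.3  0.0  1/3  0.0];

a.sizes.pi = [nActions nStates]; % assumed format of the cached size
a.PI = P(:);                     % flat vector, column-major order

s = 2;                           % assumed: state already given as a linear index

% Recover the distribution for state s from the flat vector.
offset = (s - 1) * a.sizes.pi(1);
p = a.PI(offset + (1:a.sizes.pi(1)));

% The entries for one state must form a valid probability distribution.
assert(all(p >= 0) && abs(sum(p) - 1) < 1e-12);

% Inverse-CDF sampling: pick the first action whose cumulative
% probability exceeds a uniform random draw.
c = cumsum(p);
action = find(rand() * c(end) < c, 1, 'first');
```

With the values above, state 2 places all probability mass on action 2, so that action is always selected. For the other states the result varies from call to call, in proportion to the stored probabilities.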