Implements greedy policy-based action selection for a policy indexed on the agent's state.

  [A, ACTION] = PIGREEDYACT(A, STATE, ACTIONS, REWARDS, PARAMS)

The policy elements are interpreted as a measure of the agent's preference for the corresponding actions, and actions are chosen greedily w.r.t. this preference: the action with the highest policy value is selected, with ties broken randomly.

Supports discrete states and actions, with a single action variable only.

The policy must be stored in field 'PI' of the agent, as a flat vector representing a matrix with dimensions agent-action-space-size x agent-state-space-size. These dimensions must also be cached in field 'sizes.pi' of the agent.

See also agent_act
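
The selection rule described above (highest policy value for the current state, ties broken randomly) can be illustrated with a minimal Python sketch. This is not the PIGREEDYACT implementation: the names `greedy_action`, `pi_flat`, `n_actions`, and `n_states` are hypothetical, state and action indices are 0-based here, and the flat vector is assumed to be laid out column by column (one column per state), which the help text does not specify.

```python
import random

def greedy_action(pi_flat, n_actions, n_states, state, rng=random):
    """Return the index of a highest-preference action for `state`."""
    if len(pi_flat) != n_actions * n_states:
        raise ValueError("policy length does not match n_actions * n_states")
    # Assumption: column-major layout, so the preferences for `state`
    # occupy one contiguous run of n_actions entries.
    start = state * n_actions
    prefs = pi_flat[start:start + n_actions]
    best = max(prefs)
    # Collect every action attaining the maximum, then break ties
    # uniformly at random.
    ties = [a for a, p in enumerate(prefs) if p == best]
    return rng.choice(ties)

# Hypothetical policy: 2 actions x 3 states, stored column by column.
pi = [0.1, 0.9,   0.5, 0.5,   0.7, 0.2]
print(greedy_action(pi, 2, 3, 0))  # 1 (0.9 > 0.1)
print(greedy_action(pi, 2, 3, 2))  # 0 (0.7 > 0.2)
print(greedy_action(pi, 2, 3, 1))  # 0 or 1, chosen at random (tie)
```

Ties are detected here by exact equality of the stored values; a tolerance-based comparison would be a different design choice and is not implied by the description above.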