The repository contains downloadable material related to my research and teaching, including Matlab software, presentations, and demonstration movies. The presentations were selected for their tutorial value.

Software


  • Approximate RL and DP toolbox, latest snapshot, including bugfixes as well as new, work-in-progress algorithms and experiments, possibly with new bugs of their own. (24 November 2015, 1.8 MBytes).
  • Approximate RL and DP toolbox, July 2013 release. (13 July 2013, 1.6 MBytes).
  • Optimistic planning, a selection of algorithms as a stand-alone package. (13 July 2013, 79.3 KBytes).
  • MARL toolbox documentation, the documentation files for the MARL toolbox (4 August 2010, 223.1 KBytes).
  • MARL toolbox ver. 1.3, a Matlab multi-agent reinforcement learning toolbox (4 August 2010, 336.9 KBytes).
  • Approximate RL and DP toolbox, developed in Matlab. (6 June 2010, 967.6 KBytes).
  • makepdf, a Windows XP batch script to automate the creation of PDF files from DVI (21 November 2008, 2.4 KBytes).

Presentations


  • Reinforcement learning and planning algorithms, a high-level overview talk I gave at the IROS 2015 workshop on Machine Learning in Planning and Control of Robot Motion (2 October 2015, 3.0 MBytes).
  • Nonlinear near-optimal control using optimistic planning (Algorithms, Networked Control Systems, Real-Time Control), presented at the Italian Institute of Technology (25 September 2014, 4.6 MBytes).
  • Optimistic planning for near-optimal control in MDPs, an in-depth description of the optimistic planning algorithm for MDPs and its analysis (1 December 2011, 1.1 MBytes).
  • Reinforcement learning lectures, introducing classical and approximate RL (3 March 2010, 2.1 MBytes).

Demonstration Movies

  • Assistive mobile manipulator flipping a switch, the first demo with our Cyton Gamma 1500 robot arm mounted on a Pioneer3AT mobile base. With Elod Pall and Levente Tamas. (10 November 2015).
  • Planning to swing up a rotary pendulum in real time, using the continuous-action simultaneous optimistic optimization for planning (SOOP) algorithm. With Elod Pall. (24 November 2014).
  • Final swing-up solution, after the online LSPI learning experiment was completed. (8 January 2009, 864.9 KBytes).
  • Learning to swing up an inverted pendulum, using online least-squares policy iteration. (8 January 2009, 51.8 MBytes).
  • Robot goalkeeper learning to catch the ball, using approximate online RL and experience replay (demo by Sander Adam). (1 October 2008, 13.3 MBytes).