ApproxRL: A Matlab Toolbox for Approximate RL and DP

This toolbox contains Matlab implementations of a number of approximate reinforcement learning (RL) and dynamic programming (DP) algorithms. Notably, it contains the algorithms used in the numerical examples from the book:
L. Busoniu, R. Babuska, B. De Schutter, and D. Ernst, Reinforcement Learning and Dynamic Programming Using Function Approximators, CRC Press, Automation and Control Engineering Series. April 2010, 280 pages, ISBN 978-1439821084.
see http://www.dcsc.tudelft.nl/rlbook, as well as a number of other algorithms.

Features

Getting started

  1. Unzip the archive into a directory of your choice.
  2. Before using the toolbox, you will need to obtain two additional functions provided by MathWorks:
  3. Start up Matlab, point it to the directory where you unzipped the file, and run startupapproxrl.
  4. Navigate to the demo subdirectory and open the demos. Five demonstration scripts are provided:
       - qi_demo, illustrating the use of Q-iteration algorithms;
       - pi_demo, for offline and online policy iteration;
       - ps_opt_demo, for policy search and fuzzy Q-iteration with cross-entropy (CE) optimized membership functions (MFs);
       - cleaningrobot_demo, an interactive demo illustrating the use of the classical RL and DP algorithms and their results for the cleaning robot problem;
       - invertedpendulum_demo, an interactive demo illustrating how several approximation-based algorithms work on the inverted pendulum problem.
     Start the demo scripts from the Matlab prompt to run all the algorithms in a row, or open them in the editor and run them in cell mode, algorithm by algorithm. The comments in the demos should provide enough information for you to get started with the toolbox.
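As a rough sketch, steps 3 and 4 might look as follows at the Matlab prompt. The path stored in toolboxDir is a placeholder (an assumption, not part of the toolbox) and should be replaced with the directory where you unzipped the archive. The guard is only there so that the snippet prints a message instead of failing when the path is wrong.

```matlab
% Placeholder path (an assumption): replace with your own unzip directory.
toolboxDir = fullfile(pwd, 'approxrl');

if exist(toolboxDir, 'dir') == 7
    cd(toolboxDir);
    startupapproxrl;    % startup script named in step 3
    cd('demo');         % demo subdirectory named in step 4
    qi_demo;            % runs the Q-iteration demos in a row
else
    fprintf('Toolbox directory not found: %s\n', toolboxDir);
end
```

Any of the other four demo scripts can be started the same way by replacing qi_demo with its name.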

Software requirements

The basic toolbox requires Matlab 7.3 (R2006b) or later, with the Statistics toolbox included. Some algorithms require additional specialized software, as follows:

Additionally, fittedqi, when used with extra-trees approximators, employs the regression trees package of Pierre Geurts, which is redistributed with the toolbox (with Pierre's permission) in the subdirectory lib/regtrees. For convenience, precompiled MEX-files of the package's main entry function are included for Linux and Mac OS X (64-bit) and for Windows (32-bit).
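The basic requirements above can be checked from the Matlab prompt with a short script such as the one below. This is only a sketch using standard Matlab functions (version, ver, mexext, dir), not something shipped with the toolbox. It assumes that the current directory is the toolbox root, and that ver('stats') returns an empty result when the Statistics toolbox is not installed.

```matlab
% Compare the running release against the stated minimum, Matlab 7.3 (R2006b).
v = sscanf(version, '%d.%d');       % [major; minor] version numbers
versionOk = v(1) > 7 || (v(1) == 7 && v(2) >= 3);

% Assumption: ver('stats') is empty when the Statistics toolbox is missing.
statsOk = ~isempty(ver('stats'));

% Look for MEX-files in lib/regtrees built for the current platform.
% mexext returns the MEX-file extension used on this platform.
mexFiles = dir(fullfile('lib', 'regtrees', ['*.' mexext]));

fprintf('Matlab >= 7.3: %d, Statistics toolbox: %d, matching MEX-files: %d\n', ...
    versionOk, statsOk, numel(mexFiles));
```

If no matching MEX-file is found for your platform, the regression trees package would presumably need to be compiled locally before fittedqi can be used with extra-trees approximators.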

Contact

If you get stuck anywhere using the code, chance upon bugs or missing functions, or have any questions, comments, or suggestions, please contact me. I'll be glad to hear from you!

Lucian Busoniu, June 2010

Acknowledgments: Pierre Geurts was extremely kind to supply the code for building (ensembles of) regression trees, and to allow the redistribution of his code with the toolbox. This code was developed in close interaction with Robert Babuska, Bart De Schutter, and Damien Ernst. Several functions are taken from or inspired by code written by Robert Babuska.

Final notes: This software is provided as-is, without any warranties. So, if you decide to control your nuclear power plant with it, better do your own verifications beforehand :) I have only tested the toolbox on Windows XP, but it should also work on other operating systems, with some possible minor issues due to, e.g., the use of backslashes in paths. The main algorithm and problem files are thoroughly commented, and should not be difficult to understand given some experience with Matlab. However, this toolbox is very much a work in progress, which has some implications. In particular, you will find TODO items, WARNINGs that some code paths have not been thoroughly tested, and some options and hooks for things that have not yet been implemented. Lower-level functions generally still have descriptive comments, although these may be sparser in some cases.
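If you do run into a path problem on Linux or Mac OS X, one possible fix is to replace the hard-coded backslash with Matlab's fullfile function, which inserts the separator of the current platform. The snippet below is a generic illustration rather than code taken from the toolbox; the lib/regtrees path is used only as an example.

```matlab
% Hard-coded Windows separator: only valid as a path on Windows.
windowsOnlyPath = ['lib' '\' 'regtrees'];

% Portable alternative: fullfile inserts the platform's separator (filesep).
portablePath = fullfile('lib', 'regtrees');

fprintf('Hard-coded: %s\nPortable:   %s\n', windowsOnlyPath, portablePath);
```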