**Lecturer: Lucian Busoniu**


This course provides methods for controlling systems that are too complex or insufficiently known for classical control design techniques. The focus is on learning algorithms for control, in particular reinforcement learning (RL). Attention is also paid to model-based techniques related to RL, as they can be very useful for controlling complex systems even when a model is known. After introducing the RL problem, the dynamic programming algorithms that sit at the foundation of RL are described in the discrete-variable setting. Then, classical RL algorithms are introduced in the same setting. In the second part of the course, the dynamic programming and RL algorithms are extended with approximation techniques, in order to make them applicable to continuous-variable control as well as to large-scale discrete-variable problems. Significant space is dedicated to deep reinforcement learning techniques.

This course is part of the Master program ICAF of the Automation Department, UTCluj (1st year 2nd semester). As prerequisites, basic knowledge of analysis and linear algebra is needed, together with notions of discrete-time dynamical systems. The teacher responsible is Lucian Busoniu.

The lecture and lab sessions take place on Mondays and Wednesdays, alternating weeks, from 18:00. Lectures are online via the Microsoft Teams platform, and labs are in room C12. A detailed schedule is given below.

Grading rules:

- 50% exam.
- 10% lecture quizzes.
- 30% Matlab labs.
- 20% Python labs on deep learning and deep reinforcement learning.

The slides are made available here in time for each lecture. The slides are required material for the exam. Both the slides and the lectures are in Romanian.

- Part 1: The reinforcement learning problem (covered in lecture 1).
- Part 2: The optimal solution. Dynamic programming (covered in lectures 2 and 3). You may also download the code for the demos in parts 2 and 3.
- Part 3: Reinforcement learning (covered in lectures 3 and 4).
- Part 4: Function approximation. Approximate dynamic programming. Offline, batch RL (covered in lectures 4 and 5).
- Part 5: Online approximate RL (covered in lecture 5).
- Part 6: Introduction to neural networks (covered in lecture 6).
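To give a flavor of the dynamic programming material in Parts 2 and 3, the sketch below runs tabular Q-value iteration on a toy two-state, two-action MDP. The MDP, its numbers, and the stopping tolerance are invented for illustration; they are not taken from the course demos, which use Matlab.

```python
# A minimal sketch of tabular Q-value iteration on a hypothetical 2-state,
# 2-action MDP (toy values, not from the course demos).
import numpy as np

n_states, n_actions = 2, 2
gamma = 0.9  # discount factor

# P[s, a, s'] = transition probability, R[s, a] = expected reward
P = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.5, 0.5], [0.0, 1.0]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])

Q = np.zeros((n_states, n_actions))
for _ in range(300):
    # Bellman optimality update: Q <- R + gamma * P @ max_a' Q(s', a')
    Q_new = R + gamma * (P @ Q.max(axis=1))
    if np.abs(Q_new - Q).max() < 1e-8:  # stop once successive iterates agree
        Q = Q_new
        break
    Q = Q_new

policy = Q.argmax(axis=1)  # greedy policy extracted from the converged Q
```

Because the Bellman update is a contraction with factor gamma, the iteration converges to the unique optimal Q-function, from which the greedy policy is optimal.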

In the lab classes, a set of assignments must be solved. A solution consists of a brief report in PDF and the associated code, and must be submitted by the specified deadline. For each lab, the full code or a specified part of it should be completed during the lab session itself. Each lab is graded on a scale up to 10, reduced to 5 if handed in late. Completing the Matlab labs is required in order to sit the exam.

There is zero tolerance for copying; any copied solution means immediate forfeiture of the discipline for this year.

A discussion session with mandatory participation will be organized before the exam (the exact date will be announced later), in which the teachers will discuss the solutions with each student group separately. In this session, detailed questions will be asked to assess whether the assignment solution is original and what each student contributed to it.

- Assignment 1: Markov decision processes. Q-value and policy iteration (PDF) and the Matlab code used as basis for the assignment.
- Assignment 2: Q-learning (PDF). The Matlab code from Assignment 1 is needed; in addition, two m-files are supplied: qlearning.m, a template for implementing the Q-learning algorithm in Matlab, and gridnav_nearoptsol.m, a script that computes a (near-)optimal solution for the grid navigation problem.
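The core of Assignment 2 is the Q-learning temporal-difference update that the qlearning.m template asks you to fill in. As a hedged illustration only, the Python sketch below applies that update to a toy 5-state chain with deterministic left/right moves and a goal reward; the environment, the optimistic initialization, and all parameter values are assumptions for this sketch, not the assignment's actual grid navigation setup.

```python
# Illustrative sketch of tabular Q-learning on a hypothetical 5-state chain
# (move left = action 0, right = action 1; reward 1 on reaching the goal).
# This is NOT the assignment's grid problem, just the same update rule.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 5, 2
alpha, gamma, epsilon = 0.1, 0.95, 0.1
goal = n_states - 1

def step(s, a):
    """Deterministic toy dynamics; reward 1 only when the goal is reached."""
    s_next = min(max(s + (1 if a == 1 else -1), 0), goal)
    return s_next, float(s_next == goal)

# Optimistic initialization encourages systematic early exploration.
Q = np.ones((n_states, n_actions))
for episode in range(500):
    s = 0
    for t in range(50):
        # epsilon-greedy action selection
        a = int(rng.integers(n_actions)) if rng.random() < epsilon else int(Q[s].argmax())
        s_next, r = step(s, a)
        # Q-learning temporal-difference update
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next
        if s == goal:
            break
```

After training, the greedy policy should prefer moving right in every non-goal state. The same update line is what goes into the Matlab template, with the toy chain replaced by the grid navigation problem.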

This project aims to help students gain experience with searching the literature, writing a survey, and presenting scientific work. The project will be graded separately from the course, within the standalone semester project discipline.

The project will be performed in groups of two students. Project material:

- General project description (PDF). This description applies to all the topics; please read it carefully.
- Paper review form (DOC). Use this form to review the paper of your group.
- Presentation review form (DOC). Use this form to review a presentation tryout of your group.
- A single topic from the list below.

Each group of students will choose one topic from the following list. Have a look at the topics to determine which of them fit your background and preferences best. Select a topic and email your selection to Lucian Busoniu (contact info), including your names in the message. You will then receive a confirmation email that contains the title of the topic that has been allocated to your group. Your selection is not final until confirmed by the lecturer!

- *Reinforcement learning for HIV treatment*: description, paper.
- *Reinforcement learning for adaptive brain-machine interfaces*: description, paper.
- *Learning to stand up using hierarchical reinforcement learning*: description, paper.
- *Fuzzy Q-iteration for reinforcement learning*: description, paper.
- *Least-squares policy iteration for reinforcement learning*: description, paper.
- *Evolutionary function approximation for reinforcement learning*: description, paper.

Since English is the common language of science, all the project materials are supplied in English as well. You may, however, choose to write the deliverables (paper, presentation, and review forms) in either English or Romanian. If you need Romanian versions of the review forms, please contact the lecturer. See the project description for more details.

Comments, suggestions, questions etc. related to this course or website are welcome; please contact the lecturer.