This page lists a selection of the research and thesis projects in which I am – or have been – participating. Please contact me if you need additional information about any of these projects.

Moreover, our research group always has student projects available on a wide range of topics, from interesting applications to mobile ground and air robots, to analytical projects on control and estimation for the more mathematically inclined. We are looking for motivated, capable students ready to invest themselves fully into the project starting early on (for instance, in Bachelor year 3 or Master year 1). An up-to-date list of projects can be found on our group's website: open projects at ROCON. Of course, original project ideas from students are also more than welcome.

Research grants

  • AUF-RO grant: AI methods for the networked control of assistive UAVs (NETASSIST)

    Start: September 2016      End: December 2017 (ongoing)

    Participants: Lucian Busoniu, PI; Zoltan Kato, Hungarian side coordinator; Constantin Morarescu, French side coordinator.

    Description: This project develops methods for networked control and sensing for a team of unmanned assistive aerial vehicles that follows a group of vulnerable persons. On the control side, we consider multiagent and consensus techniques, while on the vision side the focus is egomotion estimation of the UAVs and cooperative tracking of persons with filtering techniques. NETASSIST is an international cooperation project involving the Technical University of Cluj-Napoca in Romania, the University of Szeged in Hungary, and the University of Lorraine at Nancy, France. The project is funded by the Agence Universitaire de la Francophonie (AUF) and the Romanian Institute for Atomic Physics (IFA), under contract no. 09-AUF. Besides the coordinators listed above, the project includes many PhD and MSc students, as well as staff in the three departments, for a total of 11 people.
  • PHC Brancusi grant: Artificial-intelligence-based optimization for the stable and optimal control of networked systems (AICONS)

    Start: January 2014      End: December 2016 (finalized)

    Participants: Lucian Busoniu, PI on Romanian side; Constantin Morarescu, PI on French side; Marcos Bragagnolo, PhD student; Jihene Ben Rejeb, PhD student.

    Description: The optimal operation of communication, energy, transport, and other networks is of paramount importance in today's society, and will certainly become more important in the future. Operating these networks optimally requires the effective control of their component systems. Our project AICONS therefore focused on the control of general networked systems. We considered both the coordinated behavior of multiple systems having a local view of the network, as well as the networked control of individual systems where new challenges arise from the limitations of the network. Our main innovation was to transfer optimization and planning algorithms from artificial intelligence to the control of networked systems. We exploited these algorithms' generality and adapted their guarantees on computation and optimality to the networked setting. We developed stability guarantees to complete the framework. This was a Programme Hubert Curien (PHC)-Brancusi cooperation grant with the Research Center in Automatic Control of Nancy (French PI: Constantin Morarescu), CNCS-UEFISCDI contract no. 781/2014 and Campus France grant no. 32610SE.
  • Young Teams project: Reinforcement learning and planning for large-scale systems

    Start: May 2013      End: September 2016 (finalized)

    Participants: Lucian Busoniu, principal investigator; Levente Tamas, team member; Elod Pall, team member.

    Description: Many controlled systems, such as robots in open environments, traffic and energy networks, etc., are large-scale: they have many continuous variables. Such systems may also be nonlinear, stochastic, and impossible to model accurately. Optimistic planning (OP) is a paradigm for general nonlinear and stochastic control, which works when a model is available; reinforcement learning (RL) additionally works model-free, by learning from data. However, existing OP and RL methods cannot handle the number of variables required in large-scale systems. Therefore, this project developed a planning and reinforcement learning framework for large-scale system control. On the OP side, methods were developed to deal with large-scale actions and next states. An approach that accelerates large-scale OP by integrating RL was also designed. The methods were validated theoretically as well as in applications, with an application focus on assistive robotics for mobile manipulation. This project was funded under the Young Teams program of the Romanian Authority for Scientific Research, via UEFISCDI (number PNII-RU-TE-2012-3-0040), for a total budget of 180 000 EUR.

Ongoing thesis projects

  • PhD: Learning control for power-assisted wheelchairs

    Start: October 2016      End: September 2019

    Participants: MSc Guoxi Feng, PhD candidate; Prof. Thierry Marie Guerra, advisor; Dr. Sami Mohammad, co-advisor; Lucian Busoniu, co-advisor.

    Description: Advances in medical research will be able to offer solutions to some of the disabilities considered irreversible today. With an elderly population estimated at over 2 billion in 2050, the issue of mobility is fundamental. Today's existing solutions, for example manual or electric wheelchairs and/or assistance tools, do little to address the highly heterogeneous population of the disabled or are not suited to ageing. In this project we therefore seek innovative solutions that adapt to each person according to their disability, as well as to changes in the particular disability of each person (intra-individual component), both in the long term (for example, degeneration) and in the short term (for example, fatigue). We will use available measurements to estimate relevant unmeasured variables for control (virtual sensors) using unknown input observer techniques, and control algorithms based on reinforcement learning techniques to eliminate the need for precise models. Real-time experiments will be used to validate the theoretical approaches. In particular, we will cooperate with SME Autonomad Mobility, which has significant expertise in the mobility of disabled people.

  • BSc+: Deep learning for vision

    Start: Autumn 2015      End: June 2017

    Participants: Paul Dragan, BSc student; Lucian Busoniu, advisor; Elod Pall, co-advisor.

    Description: This is an advance-start (2-year-long) BSc project dealing with deep learning techniques for computer vision. Human fall detection from camera images was realized using convolutional neural networks. Currently, Paul is visiting the Delft Center for Systems and Control via an Erasmus grant, where he will apply deep learning for vision in the context of reinforcement learning and robotics.

  • BSc+: Minimax planning for switched systems

    Start: Autumn 2015      End: June 2017

    Participants: Ioana Lal, BSc student; Lucian Busoniu, advisor; Jihene Ben Rejeb, collaborator.

    Description: Another advance-start BSc project focusing on the implementation of minimax planning algorithms for switched systems. The application focus is pursuit evasion, but the framework can also be applied e.g. to discrete-time systems with control channel delays.
  • MSc: UAV for person monitoring

    Start: December 2016      End: June 2017 (expected)

    Participants: Cristian Iuga, MSc student; Lucian Busoniu, advisor.

    Description: This project will develop a controller for a UAV to follow a person. A major focus is computer vision to observe the person's position and state. In particular, the fall detection method from Paul's project will be imported and used to alert a caretaker if the monitored person falls.
  • BSc: Optimal control of a communicating robot

    Start: December 2016      End: June 2017 (expected)

    Participants: Gyorgy Kovacs, BSc student; Lucian Busoniu, advisor; Constantin Morarescu, collaborator.

    Description: We use optimal control (e.g. approximate dynamic programming) to solve a combined problem of robot navigation and communication. A mobile robot must transmit packets to a limited-range base station, while at the same time navigating towards a goal. This project is a collaboration with CRAN Nancy.
  • BSc: Deep reinforcement learning

    Start: December 2016      End: June 2017 (expected)

    Participants: Alex Barabasi, BSc student; Lucian Busoniu, advisor.

    Description: In this project we explore deep neural network representations for reinforcement learning.

Finalized thesis projects (selection)

  • PhD: Nonlinear control for commercial drones in autonomous railway maintenance

    Start: October 2013      End: September 2016

    Participants: MSc Koppany Mathe, PhD candidate; Lucian Busoniu, co-advisor; Prof. Liviu Miclea, advisor; Dr. Laszlo Barabas, industry consultant; Prof. Jens Braband, industry consultant.

    Description: Drones are becoming widespread, and low-cost platforms already offer good flight and video recording capabilities. This project uses such drones in the context of railway maintenance by developing applications for autonomous navigation in railway environments. The primary focus of the project is to perform vision-based navigation for infrastructure inspection. Using the ROS and OpenCV libraries, object detection methods (feature detectors, optical flow, classifiers, etc.) are studied and implemented for target detection, target/rail track following, and obstacle detection. These methods will serve for basic demonstrative use cases, showing the autonomy of drones in short-range inspection and long-range track following applications. Furthermore, building upon the vision-based navigation toolset that we implement, optimization techniques are developed and evaluated for the flight trajectory tracking and planning tasks. In this context, we will investigate commonly used planning methods like RRT or MPC techniques and compare them to the novel optimistic planning algorithms. Specifically, we focus on developing methods for planning under communication or computational constraints, which are common in remote control applications and with lightweight drones. This project is supported by a grant from Siemens, reference no. 7472/3202246859, and is part of the international Rail Automation Graduate School (iRAGS).

  • PhD: Online Model Learning Algorithms for Actor-Critic Control

    Start: December 2009      End: March 2015

    Participants: MSc Ivo Grondman, PhD candidate; Lucian Busoniu, co-advisor; Prof. Robert Babuska, promotor.

    Description: Although RL is in principle meant to be completely model-free, the absence of a model implies that learning will take considerable time, as many system states will have to be visited repeatedly to gather enough knowledge about the system for an optimal policy to be found. A main challenge in RL is therefore to use the information gathered during the interaction with the system as efficiently as possible, such that an optimal policy may be reached in a short amount of time. This project aims at increasing the learning speed by constructing algorithms that search for a relation between the collected transition samples and use this relation to predict the system's behaviour by interpolation and/or extrapolation. This relation is in fact an approximation of the system's model, and as such this particular feature is referred to as 'model learning'. Furthermore, if (partial) prior knowledge about the system or desired closed-loop behaviour is available, RL algorithms should be able to use this information to their advantage. The final approach to speed up learning addressed in this thesis is to make explicit use of the reward function, instead of only gathering the function evaluations of it that come as part of a transition sample.

  • MSc: Experience replay for efficient online reinforcement learning

    Start: October 2007      End: October 2008

    Participants: Sander Adam, MSc student (graduated cum laude); Lucian Busoniu, co-advisor; Prof. Robert Babuska, advisor.

    Description: Although Reinforcement Learning (RL) is guaranteed to give an optimal controller for many control problems, its practical use is limited due to its slow learning performance. This paper introduces a new class of algorithms which dramatically speeds up learning, at moderate computational cost. As opposed to traditional RL algorithms, which use each data sample only once, the newly introduced algorithms repeatedly present all collected data samples to the learning controller in a process named experience replay (ER). The use of experience replay in RL has only been researched in some very specific applications, and has never been used as the main learning mechanism. Analysis shows that the ER algorithms learn fast, are computationally efficient, and scale up well to multidimensional state spaces. The ER algorithms are tested on a pendulum swing-up task, both in simulation and in reality. In simulation, the ER algorithms outperform a least-squares policy iteration controller in terms of learning speed and computational complexity. The ER algorithms also perform well on the real pendulum swing-up setup, where they successfully learn to swing up within 100 s. The application on a two-link robotic manipulator simulation shows the ability of the ER algorithms to scale up well to larger state-action spaces. Finally, high performance is obtained on a real robotic goalkeeper setup, illustrating the applicability of ER algorithms to practical control problems.

  • MSc: Using prior knowledge to accelerate reinforcement learning

    Start: August 2007      End: June 2008

    Participants: Maarten Vaandrager, MSc student (graduated cum laude); Lucian Busoniu, co-advisor; Prof. Robert Babuska, advisor.

    Description: An RL controller learns an optimal policy by online (real-time) exploration of the control task. Usually, RL algorithms take a long time to converge. This is an important obstacle preventing the application of RL to real-life problems. This project investigates ways to speed up the convergence of RL algorithms by using prior knowledge about the controlled process or about the solution. Several new architectures of actor-critic learning are proposed, making use of local linear regression as an approximator. Then, prior knowledge about the process and solution is added to these algorithms, in the nonparametric form of measurement samples. The resulting algorithms give better performance in simulation examples than the original algorithms, which did not use prior knowledge.
