This page lists a selection of the research and thesis projects in which I am – or have been – participating. Please contact me if you need additional information about any of these projects.

Moreover, our research group always has student projects available on a wide range of topics, from interesting applications to mobile ground and air robots, to analytical projects on control and estimation for the more mathematically inclined. We are looking for motivated, capable students ready to invest themselves fully into the project starting early on (for instance, in Bachelor year 3 or Master year 1). An up-to-date list of projects can be found on our group's website: open projects at ROCON. Of course, original project ideas from students are also more than welcome.

Research grants

  • Young Teams project: AIRGUIDE - A Learning Aerial Guide for the Elderly and Disabled

    Start: May 2018      End: April 2020 (ongoing)

    Participants: Lucian Busoniu, PI; Levente Tamas, team member; Alexandru Codrean, team member; Ioana Lal, team member.

    Description: Robotic assistants can greatly improve the life of the ever-increasing elderly and disabled population. AIRGUIDE will develop aerial assistive technology for independent mobility of an elderly or disabled person over a wide, outdoor area, via monitoring risks and guiding the person when needed. Our fundamental approach is to develop a novel learning and planning control framework, by exploiting interdisciplinary, artificial-intelligence and control-theoretic insights. The framework will be implemented and validated in a case study where an at-risk person is monitored over e.g. a park, warned about risks like falling and unsafe areas, and actively guided to safety or a desired destination when required. The project is being funded under the Young Teams program of UEFISCDI, for a total budget of about 100 000 EUR.
  • AUF-RO grant: AI methods for the networked control of assistive UAVs (NETASSIST)

    Start: September 2016      End: December 2017 (finalized)

    Participants: Lucian Busoniu, PI; Zoltan Kato, Hungarian side coordinator; Constantin Morarescu, French side coordinator.

    Description: This project develops methods for networked control and sensing in a team of unmanned, assistive aerial vehicles that follows a group of vulnerable persons. On the control side, we consider multiagent and consensus techniques, while on the vision side the focus is on egomotion estimation of the UAVs and cooperative tracking of persons with filtering techniques. NETASSIST is an international cooperation project involving the Technical University of Cluj-Napoca in Romania, the University of Szeged in Hungary, and the University of Lorraine at Nancy, France. The project is funded by the Agence Universitaire de la Francophonie (AUF) and the Romanian Institute for Atomic Physics (IFA), under contract no. 09-AUF. Besides the coordinators listed above, the project includes many PhD and MSc students, as well as staff in the three departments, for a total of 11 people.
  • PHC Brancusi grant: Artificial-intelligence-based optimization for the stable and optimal control of networked systems (AICONS)

    Start: January 2014      End: December 2016 (finalized)

    Participants: Lucian Busoniu, PI on Romanian side; Constantin Morarescu, PI on French side; Marcos Bragagnolo, PhD student; Jihene Ben Rejeb, PhD student.

    Description: The optimal operation of communication, energy, transport, and other networks is of paramount importance in today's society, and will certainly become more important in the future. Operating these networks optimally requires the effective control of their component systems. Our project AICONS therefore focused on the control of general networked systems. We considered both the coordinated behavior of multiple systems having a local view of the network, as well as the networked control of individual systems where new challenges arise from the limitations of the network. Our main innovation was to overhaul optimization and planning algorithms from artificial intelligence to the control of networked systems. We exploited these algorithms' generality and adapted their guarantees on computation and optimality to the networked setting. We developed stability guarantees to complete the framework. This was a Programme Hubert Curien (PHC)-Brancusi cooperation grant with the Research Center in Automatic Control of Nancy (French PI: Constantin Morarescu), CNCS-UEFISCDI contract no. 781/2014 and Campus France grant no. 32610SE.
  • Young Teams project: Reinforcement learning and planning for large-scale systems

    Start: May 2013      End: September 2016 (finalized)

    Participants: Lucian Busoniu, principal investigator; Levente Tamas, team member; Elod Pall, team member.

    Description: Many controlled systems, such as robots in open environments, traffic and energy networks, etc., are large-scale: they have many continuous variables. Such systems may also be nonlinear, stochastic, and impossible to model accurately. Optimistic planning (OP) is a paradigm for general nonlinear and stochastic control, which works when a model is available; reinforcement learning (RL) additionally works model-free, by learning from data. However, existing OP and RL methods cannot handle the number of variables required in large-scale systems. Therefore, this project developed a planning and reinforcement learning framework for large-scale system control. On the OP side, methods were developed to deal with large-scale actions and next states. An approach that accelerates large-scale OP by integrating RL was also designed. The methods were validated theoretically as well as in applications, with an application focus on assistive robotics for mobile manipulation. This project was funded under the Young Teams program of the Romanian Authority for Scientific Research, via UEFISCDI (number PNII-RU-TE-2012-3-0040), for a total budget of 180 000 EUR.

Ongoing thesis projects

  • PhD: Optimal sensing using control

    Start: October 2017      End: September 2020

    Participants: Zoltan Nagy, PhD candidate; Lucian Busoniu, advisor; Zsofia Lendek, advisor; Romain Postoyan, co-advisor.

    Description: Control and estimation are two major pillars of the systems and control field. However, the two problems are usually treated separately: first, an estimator is designed to recover system variables that are not measurable with the available sensors (or are measurable but only with uncertainty), and then a controller is found that takes the recovered variables as feedback and produces a command signal for the system so that certain control objectives are satisfied. In this project, we take a different perspective: the control design will be performed in a way that optimizes sensing quality. In particular, we will consider scenarios where a separate sensor system monitors a target system, with an error that depends on the difference between the states of the two systems. Examples include two unmanned aerial vehicles (drones), one of which is following the other, or two consecutive road vehicles in an automated platooning system. We then aim to design an observer and a controller for the sensor at the same time, such that the state of the target is accurately recovered, which implicitly requires that the measurement error is minimized. This framework, where the sensor is actively controlled in order to observe the target well, is called for instance active sensing or active perception in robotics, but is largely unexplored in systems and control.
  • PhD: Active perception methods in collaborative robotics

    Start: October 2017      End: September 2020

    Participants: Daniel Mezei, PhD candidate; Lucian Busoniu, advisor; Levente Tamas, advisor.

    Description: This is a sister project of the above, where we explicitly consider active perception for robotics. Whereas the project above uses control-theoretic design and analysis techniques, here we will focus on artificial intelligence tools, specifically so-called partially observable Markov decision processes (POMDPs). These allow specifying uncertainty in sensing and, when the problem is modeled appropriately, making decisions that optimize the combined objective of reducing uncertainty and actually solving the control problem. We will exploit and further develop state-of-the-art solution techniques from the planning class. Two applications will be considered. In the first, a robot must sort objects traveling on a conveyor belt into different classes, but the classifier is inaccurate, and this uncertainty is modeled using a POMDP. For the second application, we will solve a task involving a human user, and use POMDPs to model uncertainty in the behavior of the human.
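
    To make the first application more concrete, the sketch below shows a Bayes belief update over object classes given noisy classifier outputs, which is the basic ingredient of a POMDP belief state. It is only an illustration and not the project's method: the function name update_belief, the two-class setup, and the confusion-matrix values are all assumptions introduced here.

```python
def update_belief(belief, observation, confusion):
    """One Bayes update of a belief over object classes.

    belief[c] is the current probability that the object is of class c.
    confusion[c][o] is the (assumed known) probability that the classifier
    reports class o when the true class is c.
    """
    unnormalized = [b * confusion[c][observation] for c, b in enumerate(belief)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [u / total for u in unnormalized]


# Hypothetical inaccurate classifier for two classes (values are made up).
confusion = [[0.8, 0.2],
             [0.3, 0.7]]

belief = [0.5, 0.5]          # uniform prior over the two classes
for obs in [0, 0, 1]:        # three successive classifier outputs
    belief = update_belief(belief, obs, confusion)
```

    A planner could then trade off taking another look at the object (reducing uncertainty in the belief) against committing to a sorting decision.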

  • PhD: Learning control for power-assisted wheelchairs

    Start: October 2016      End: September 2019

    Participants: Guoxi Feng, PhD candidate; Thierry Marie Guerra, advisor; Sami Mohammad, co-advisor; Lucian Busoniu, co-advisor.

    Description: Advances in medical research will be able to offer solutions to some of the disabilities considered irreversible today. With an elderly population estimated at over 2 billion in 2050, the issue of mobility is fundamental. Today's existing solutions, for example manual or electric wheelchairs and/or assistance tools, do little to address the highly heterogeneous population of the disabled or are not suited to ageing. In this project we therefore seek innovative solutions that adapt to each person according to their disability, as well as to changes in the particular disability of each person (intra-individual component), both in the long term (for example degeneration) and in the short term (for example fatigue). We will use available measurements to estimate relevant unmeasured variables for control (virtual sensors) using unknown input observer techniques, and control algorithms based on reinforcement learning techniques to eliminate the need for precise models. Real-time experiments will be used to validate the theoretical approaches. In particular, we will cooperate with SME Autonomad Mobility, which has significant expertise in the mobility of disabled people.

  • PhD: Stability analysis of discounted optimal control problems

    Start: October 2016      End: September 2019

    Participants: Mathieu Granzotto, PhD candidate; Romain Postoyan, advisor; Jamal Daafouz, advisor; Dragan Nesic, co-advisor; Lucian Busoniu, co-advisor.


    Description: Artificial intelligence is rich in algorithms for optimal control; for example, the entire field of reinforcement learning is concerned with designing such algorithms when the model of the system is unknown. However, a fundamental question remains unanswered for these AI approaches: stability. In this project, we will therefore study the stability of nonlinear systems when controlled by such algorithms. To this end, we must build a bridge between control theory and artificial intelligence. We will study in particular optimal control problems over an infinite horizon where the costs are discounted. We begin by considering the fully optimal solution, after which we will include approximation errors made by practical, numerical algorithms that use a model. The ultimate goal is to analyze reinforcement learning algorithms that are model-free.
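
    As a rough illustration of the problem class named above, a discounted infinite-horizon cost is commonly written as follows. The symbols (state x_k, input u_k, dynamics f, stage cost \ell, discount factor \gamma) are notation assumed here for illustration and are not taken from the project itself.

```latex
J_\gamma(x_0, u_0, u_1, \dots) = \sum_{k=0}^{\infty} \gamma^k \, \ell(x_k, u_k),
\qquad x_{k+1} = f(x_k, u_k), \quad \gamma \in (0, 1).
```

    Because future costs are weighted by \gamma^k, an input sequence that minimizes this cost does not automatically stabilize the system, which is why stability has to be studied separately.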

  • MSc+: Control design for a ball balancing robot

    Start: Autumn 2017      End: Summer 2019

    Participants: Ioana Lal, MSc student; Lucian Busoniu, advisor; Alexandru Codrean, co-advisor.

    Description: An advance-start MSc project focusing on control design for a ball balancing robot. This project is in collaboration with Bosch Cluj, which funds both the student and the hardware itself. We start by developing a nonlinear dynamical model of the robot, either by decoupling the motion along two axes, or otherwise by considering the full 3D motion of the robot. Following that, we will design linear control based on a linearized version of the dynamics, so as to stabilize (balance) around the vertical position. Finally, we will consider nonlinear control design for stabilization, followed by tracking for e.g. fast acrobatic maneuvers.

Finalized thesis projects (selection)

  • PhD: Nonlinear control for commercial drones in autonomous railway maintenance

    Start: October 2013      End: September 2016

    Participants: MSc Koppany Mathe, PhD candidate; Lucian Busoniu, co-advisor; Prof. Liviu Miclea, advisor; Dr. Laszlo Barabas, industry consultant; Prof. Jens Braband, industry consultant.

    Description: Drones are becoming widespread, and low-cost platforms already offer good flight and video recording capabilities. This project uses such drones in the context of railway maintenance by developing applications for autonomous navigation in railway environments. The primary focus of the project is to perform vision-based navigation for infrastructure inspection. Using the ROS and OpenCV libraries, object detection methods (using feature detectors, optical flow, classifiers, etc.) are studied and implemented for target detection, target/rail track following, and obstacle detection. These methods will serve for basic demonstrative use cases, showing the autonomy of drones in short-range inspection and long-range track following applications. Furthermore, building upon the vision-based navigation toolset that we implement, optimization techniques are developed and evaluated for the flight trajectory tracking and planning tasks. In this context, we will investigate commonly used planning methods like RRT or MPC techniques and compare them to the novel optimistic planning algorithms. Specifically, we focus on developing methods for planning under communication or computational constraints, which are common in remote control applications or with lightweight drones. This project is supported by a grant from Siemens, reference no. 7472/3202246859, and is part of the international Rail Automation Graduate School (iRAGS).

  • PhD: Online model learning algorithms for actor-critic control

    Start: December 2009      End: March 2015

    Participants: MSc Ivo Grondman, PhD candidate; Lucian Busoniu, co-advisor; Prof. Robert Babuska, promotor.

    Description: Although RL is in principle meant to be completely model-free, the absence of a model implies that learning will take a considerably long time, as many system states will have to be visited repeatedly to gather enough knowledge about the system such that an optimal policy may be found. A main challenge in RL is therefore to use the information gathered during the interaction with the system as efficiently as possible, such that an optimal policy may be reached in a short amount of time. This project aims at increasing the learning speed by constructing algorithms that search for a relation between the collected transition samples and use this relation to predict the system's behaviour by interpolation and/or extrapolation. This relation is in fact an approximation of the system's model, and as such this particular feature is referred to as 'model learning'. Furthermore, if (partial) prior knowledge about the system or desired closed-loop behaviour is available, RL algorithms should be able to use this information to their advantage. The final approach to speed up learning addressed in this thesis is to make explicit use of the reward function, instead of only gathering evaluations of it that come as part of a transition sample.

  • MSc: Experience replay for efficient online reinforcement learning

    Start: October 2007      End: October 2008

    Participants: Sander Adam, MSc student (graduated cum laude); Lucian Busoniu, co-advisor; Prof. Robert Babuska, advisor.

    Description: Although Reinforcement Learning (RL) is guaranteed to give an optimal controller for many control problems, its practical use is limited due to its slow learning performance. This paper introduces a new class of algorithms that dramatically speed up learning, at moderate computational cost. Unlike traditional RL algorithms, which use each data sample only once, the newly introduced algorithms repeatedly present all collected data samples to the learning controller in a process named experience replay (ER). The use of experience replay in RL has only been researched in some very specific applications, and has never been used as the main learning mechanism. Analysis shows that the ER algorithms learn fast, are computationally efficient, and scale up well to multidimensional state spaces. The ER algorithms are tested on a pendulum swing-up task, both in simulation and in reality. In simulation, the ER algorithms outperform a least-squares policy iteration controller in terms of learning speed and computational complexity. The ER algorithms also perform well on the real pendulum swing-up setup, where they successfully learn to swing up within 100 s. The application on a two-link robotic manipulator simulation shows the ability of the ER algorithms to scale up well to larger state-action spaces. Finally, high performance is obtained on a real robotic goalkeeper setup, illustrating the applicability of ER algorithms to practical control problems.
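
    The idea of replaying stored samples can be sketched with tabular Q-learning on a toy problem. This is a minimal illustration under stated assumptions, not the algorithms developed in the thesis: the function q_learning_with_replay, the 4-state chain, and all parameter values are hypothetical.

```python
import random


def q_learning_with_replay(samples, n_states, n_actions,
                           gamma=0.95, alpha=0.5, replays=50, seed=0):
    """Tabular Q-learning that presents all stored samples `replays` times.

    Each sample is a tuple (state, action, reward, next_state, done).
    """
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    buffer = list(samples)
    for _ in range(replays):
        rng.shuffle(buffer)  # replay the stored samples in random order
        for s, a, r, s_next, done in buffer:
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
    return q


# Hypothetical 4-state chain: action 1 moves right, action 0 stays in place.
# Reaching state 3 yields reward 1 and ends the episode.
samples = [(s, 1, 1.0 if s == 2 else 0.0, s + 1, s == 2) for s in range(3)]
samples += [(s, 0, 0.0, s, False) for s in range(3)]

q_once = q_learning_with_replay(samples, 4, 2, replays=1)    # each sample used once
q_many = q_learning_with_replay(samples, 4, 2, replays=200)  # samples replayed
```

    With a single pass, the reward at the end of the chain has barely propagated back to state 0, whereas replaying the same six samples lets the values converge without any further interaction with the system.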

  • MSc: Using prior knowledge to accelerate reinforcement learning

    Start: August 2007      End: June 2008

    Participants: Maarten Vaandrager, MSc student (graduated cum laude); Lucian Busoniu, co-advisor; Prof. Robert Babuska, advisor.

    Description: An RL controller learns an optimal policy by online (real-time) exploration of the control task. Usually, RL algorithms take a long time to converge. This is an important obstacle preventing the application of RL to real-life problems. This project investigates ways to speed up the convergence of RL algorithms by using prior knowledge about the controlled process or about the solution. Several new architectures of actor-critic learning are proposed, making use of local linear regression as an approximator. Then, prior knowledge about the process and solution is added to these algorithms, in the nonparametric form of measurement samples. The resulting algorithms give better performance in simulation examples than the original algorithms, which did not use prior knowledge.
