Reinforcement Learning: A More Efficient Way for Robots to Learn

Robot nurses — myth or reality? Although this may sound far-fetched, there are already hospitals in which robots assist nurses by fetching tools, freeing the nurses to focus on providing care to their patients. Vittorio Giammarino, a fifth-year PhD candidate in Systems Engineering (SE) at Boston University, hopes that his work can be useful for applications like these.
At the Center for Information and System Engineering (CISE), Giammarino is working under Ioannis Paschalidis, the Director of the Rafik B. Hariri Institute for Computing and Computational Science & Engineering and a Distinguished Professor of Engineering, on the Multidisciplinary University Research Initiative (MURI) grant from the Department of Defense entitled Neuro-Autonomy: Neuroscience-Inspired Perception, Navigation, and Spatial Awareness for Autonomous Robots.
Giammarino’s work centers on machine learning, more specifically reinforcement learning, a method by which robots learn to select good actions. “Before, control was mainly mathematical, and now we realize that modeling everything is hard,” explains Giammarino. “So what we try to do is let the agent interact with the environment and try to learn by itself, by trial and error, or experience.”
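To make the trial-and-error idea concrete, here is a minimal sketch of tabular Q-learning, the textbook form of reinforcement learning. The tiny five-state “corridor” environment and all constants are hypothetical stand-ins for illustration, not the lab’s actual setup or Giammarino’s method:

```python
import random

# Tabular Q-learning on a hypothetical 5-state corridor: the agent
# starts at the left end and earns reward 1 for reaching the right end.
N_STATES, ACTIONS = 5, (-1, +1)          # move left or right
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.2   # step size, discount, exploration
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Deterministic dynamics: reward 1 for reaching the rightmost state."""
    nxt = max(0, min(N_STATES - 1, state + action))
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0), nxt == N_STATES - 1

for episode in range(500):
    state = 0
    for _ in range(100):                             # cap episode length
        if random.random() < EPSILON:                # explore...
            action = random.choice(ACTIONS)
        else:                                        # ...or exploit
            action = max(ACTIONS, key=lambda a: Q[(state, a)])
        nxt, reward, done = step(state, action)
        # Temporal-difference update: nudge Q toward the observed outcome.
        target = reward + GAMMA * max(Q[(nxt, a)] for a in ACTIONS)
        Q[(state, action)] += ALPHA * (target - Q[(state, action)])
        state = nxt
        if done:
            break
```

No model of the environment is written down anywhere; the value table is shaped entirely by the rewards the agent stumbles into, which is exactly the “learn by itself, by trial and error” framing in the quote above.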
Designing robots that, like humans and animals, can learn through reinforcement learning would make them far less dependent on large, curated datasets. “We want to build algorithms that do not require all this data cleaning and annotation, and all the expenses behind collecting data, when there is something that can be built and can be cheaply put into the field and can learn by itself,” Giammarino says.
However, Giammarino and his collaborators quickly realized that this would not be an easy feat; in humans and animals, the learning process takes a lifetime and is largely inefficient. To combat this inefficiency, they decided to use behavioral data from humans and animals to speed up the robots’ learning process, starting with simple experiments such as navigating from point A to point B in a room.
During the research process, Giammarino’s role begins with framing the problem: he and his collaborators formulate a research question and then look to data for answers. That data comes from experiments involving humans and animals, many of them conducted in the Center for Systems Neuroscience labs led by Michael Hasselmo and Chantal Stern. The first step is to apply imitation learning techniques, getting the robot to imitate what animals and humans do in similar settings. Next, they try to reduce the complexity of their algorithm, making it easier and quicker for the robot to run, before letting it improve on what it has learned through reinforcement learning, as sketched below. Once these algorithms are developed, neuroscientists join the effort, followed by a discussion and defense phase in which the team looks for weaknesses in the work and tries to address them.
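The “imitate first, then improve” pipeline can be illustrated with a short sketch: a policy is first trained by behavior cloning on expert (observation, action) pairs, then fine-tuned with a simple policy-gradient update. Every dimension, dataset, and reward below is a hypothetical placeholder, assuming PyTorch; this is an illustration of the general recipe, not the team’s implementation:

```python
import torch
import torch.nn as nn

OBS_DIM, N_ACTIONS = 8, 4  # hypothetical observation and action sizes

policy = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.Tanh(), nn.Linear(64, N_ACTIONS))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

# --- Stage 1: imitation (behavior cloning) ---------------------------
# Stand-ins for recorded human/animal trajectories: (observation, action).
expert_obs = torch.randn(256, OBS_DIM)
expert_act = torch.randint(0, N_ACTIONS, (256,))
for _ in range(200):
    logits = policy(expert_obs)
    loss = nn.functional.cross_entropy(logits, expert_act)  # match the expert
    optimizer.zero_grad(); loss.backward(); optimizer.step()

# --- Stage 2: reinforcement-learning fine-tuning ----------------------
# REINFORCE-style update: raise the probability of actions that earned
# high reward in the robot's own rollouts (rewards are simulated here).
obs = torch.randn(64, OBS_DIM)
dist = torch.distributions.Categorical(logits=policy(obs))
actions = dist.sample()
rewards = torch.randn(64)                 # placeholder environment returns
pg_loss = -(dist.log_prob(actions) * (rewards - rewards.mean())).mean()
optimizer.zero_grad(); pg_loss.backward(); optimizer.step()
```

The design rationale mirrors the paragraph above: stage 1 gives the robot a reasonable starting policy cheaply, and stage 2 lets trial-and-error refine it rather than learn from scratch.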
One recently published paper that Giammarino worked on, “Opportunities and Challenges from Using Animal Videos in Reinforcement Learning for Navigation,” was presented in Yokohama, Japan, last summer. In it, he and his collaborators focus on imitation learning from visual observations, a setting in which the learning agent’s sole source of supervision is video of experts. They also lay out the challenges that arise in this framework and how they plan to tackle them. Currently, Giammarino is working on the paper “Adversarial Imitation Learning from Visual Observations using Latent Information,” in which he investigates using observations from animal videos to improve the efficiency and performance of reinforcement learning in navigation tasks.
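What makes this setting hard is that videos show only what the expert saw, not which actions it took. A common workaround in observation-only adversarial imitation (a GAIfO-style construction, offered here only as a plausible illustration, not the authors’ exact method) trains a discriminator to tell expert observation transitions from the agent’s, then uses it as a surrogate reward. All shapes and batches below are hypothetical placeholders:

```python
import torch
import torch.nn as nn

OBS_DIM = 32  # hypothetical size of an encoded video frame
# Discriminator over observation transitions (o, o'), concatenated.
D = nn.Sequential(nn.Linear(2 * OBS_DIM, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(D.parameters(), lr=1e-3)

expert_pairs = torch.randn(128, 2 * OBS_DIM)  # (o, o') from expert videos
agent_pairs = torch.randn(128, 2 * OBS_DIM)   # (o, o') from agent rollouts

# Discriminator update: expert transitions labeled 1, agent transitions 0.
logits_e, logits_a = D(expert_pairs), D(agent_pairs)
d_loss = (nn.functional.binary_cross_entropy_with_logits(
              logits_e, torch.ones_like(logits_e))
          + nn.functional.binary_cross_entropy_with_logits(
              logits_a, torch.zeros_like(logits_a)))
opt.zero_grad(); d_loss.backward(); opt.step()

# Surrogate reward for the policy: transitions the discriminator rates
# as expert-like score high; any RL algorithm can then maximize this.
with torch.no_grad():
    reward = -nn.functional.logsigmoid(-D(agent_pairs)).squeeze(-1)  # -log(1 - D)
```

Because the reward is defined purely over observations, no expert actions are ever needed, which is precisely what makes learning from animal videos conceivable.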
“Vittorio is working on a very challenging and important set of problems,” said Paschalidis, Giammarino’s PhD advisor. “He has made important progress, particularly tackling the very realistic setup where one may not be able to observe the true internal states and corresponding actions of an expert we wish to imitate.”
Giammarino completed his undergraduate studies at the University of Bologna in Italy and Tongji University in China, majoring in Automation Engineering. He also received a Master of Science in Systems and Control from the Delft University of Technology. In the future, he hopes to pursue an industry career in a field such as robotics, recommendation systems, or energy.