Principled Methods for Human-Robot Collaborative Learning and Control
Enabling seamless human-robot collaboration while ensuring task success requires reducing task information to its essential features through representations that are intuitive to the user. To this end, I focus on developing methods for generating and learning task representations from motion data and other information that can be incorporated into model-based control. My goal is to enable autonomous systems to extract understandable task embeddings that allow controllers to guarantee dynamic feasibility and provably safe behavior, and that can be communicated to and interpreted by humans, a step toward explainable AI.
Credit Assignment Safety Learning from Suboptimal and Failure Demonstrations
A critical need in assistive robotics is to learn task intent and safety guarantees through user interactions in order to ensure safe task performance. Most robot learning from demonstration (LfD) and inverse reinforcement learning (IRL) methods rely primarily on optimal demonstrations in order to successfully learn a control policy, which can be challenging to acquire from novice users. Recent work does use suboptimal and failed demonstrations to learn about task intent, but few methods focus on learning safety guarantees to prevent repeating the failures experienced, which is essential for assistive robots. Furthermore, interactive human-robot learning aims to minimize effort from the human user to facilitate deployment in the real world. As such, labeling unsafe states or keyframes in the demonstrations should not be a prerequisite for learning. I developed an algorithm that learns a safety value function from a set of suboptimal and failed demonstrations, which is then used to generate a real-time safety control filter. Importantly, the algorithm includes a credit assignment method that extracts the failure states from the failed demonstrations without requiring human labeling or prespecified knowledge of unsafe regions. This method can be combined with standard LfD or IRL methods to learn a task policy that also guarantees safety during execution.
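To make the pipeline concrete, here is a minimal sketch of one way such a system could be organized. It assumes the safety value function is learned by regressing a discounted distance-to-failure that is assigned backward along each failed trajectory; the function names, the discounting scheme, and the polynomial function class are illustrative assumptions, not the exact formulation in the paper.

```python
import numpy as np

def label_failed_demo(states, gamma=0.9):
    """Backward credit assignment over a failed demonstration.

    The final state is treated as the failure; earlier states receive a
    discounted penalty based on how many steps remain before failure.
    (Illustrative scheme, not the exact formulation from the paper.)
    """
    T = len(states)
    return np.array([-gamma ** (T - 1 - t) for t in range(T)])  # near -1 close to failure

def fit_safety_value(demos, labels, degree=2):
    """Fit a simple polynomial safety value function V(x) to labeled states."""
    X = np.vstack(demos)
    y = np.concatenate(labels)
    # Polynomial features stand in for whatever function class is actually used.
    feats = np.hstack([X ** d for d in range(1, degree + 1)] + [np.ones((len(X), 1))])
    w, *_ = np.linalg.lstsq(feats, y, rcond=None)
    return lambda x: np.hstack([x ** d for d in range(1, degree + 1)] + [1.0]) @ w

def safety_filter(x, u_task, V, dynamics, candidate_inputs, threshold=-0.5):
    """Minimally modify the task input so the predicted next state stays safe."""
    if V(dynamics(x, u_task)) > threshold:
        return u_task  # the task input already keeps the system safe
    # Otherwise pick the safe candidate closest to the desired task input.
    safe = [u for u in candidate_inputs if V(dynamics(x, u)) > threshold]
    return min(safe, key=lambda u: np.linalg.norm(u - u_task)) if safe else u_task
```

The key point the sketch illustrates is that the unsafe labels come from the structure of the failed trajectories themselves, rather than from a human annotator, and the resulting value function is only consulted at runtime to filter the task policy.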
For more information about this project, check out the paper or my talk.
Learning from Variable, Imperfect Demonstrations using Ergodic Control
As robots become more ubiquitous in everyday life, they will interact with people more and will need to learn from them. At the same time, as everyday users are asked to teach robots increasingly challenging tasks and to provide demonstrations for complicated robotic systems they may be unfamiliar with, the demonstrations they give may be suboptimal or even unsuccessful. How do we enable robots to learn a task representation from a set of human demonstrations that 1) encompasses varied solutions to the same task and, perhaps more importantly, 2) can still yield optimal behavior from imperfect demonstrations provided by a non-expert user?
Here, I explore the idea of representing the set of demonstrations as an information distribution over the task space. If each demonstration is treated as adding information about the task, then imperfect, and even unsuccessful, demonstrations still contribute valuable information to the representation. As a result, a set of imperfect demonstrations can collectively produce a task representation from which we can generate controls for optimal task performance. This representation also allows more flexibility in the demonstration set, accommodating multiple solutions to the same task. Because the task is represented as information over the task space, multiple solutions can emerge from a demonstration set, while variations that are irrelevant to task success are naturally averaged out.
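As a rough illustration of this idea, the sketch below builds a spatial information distribution by placing a Gaussian kernel at every state visited in every demonstration and summing the results; the kernel choice, the unit-square task space, and the optional per-demonstration weights are assumptions for the example, not the exact construction in the paper.

```python
import numpy as np

def demo_distribution(demos, weights=None, bandwidth=0.05, grid_size=50):
    """Combine demonstrations into a spatial information distribution.

    demos   : list of (T_i, 2) arrays of visited task-space points in [0, 1]^2
    weights : optional per-demonstration weights (e.g., to downweight failures)
    Returns a normalized density on a grid over the unit square.
    """
    if weights is None:
        weights = np.ones(len(demos))
    xs = np.linspace(0.0, 1.0, grid_size)
    X, Y = np.meshgrid(xs, xs)
    grid = np.stack([X.ravel(), Y.ravel()], axis=1)        # (G, 2) grid points

    density = np.zeros(len(grid))
    for demo, w in zip(demos, weights):
        # Each visited point contributes an isotropic Gaussian "bump" of information.
        diffs = grid[:, None, :] - demo[None, :, :]         # (G, T_i, 2)
        sq_dist = np.sum(diffs ** 2, axis=-1)
        density += w * np.exp(-sq_dist / (2 * bandwidth ** 2)).sum(axis=1)

    density /= density.sum()    # normalize so the grid density sums to one
    return X, Y, density.reshape(grid_size, grid_size)
```

In this view, successful and unsuccessful demonstrations enter the same distribution, possibly with different weights, and any downstream controller only ever sees the combined density rather than any single trajectory.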
For more information about this project, check out the paper.
Robotic Visual Rendering using Information-Theoretic Methods
Drawing is a classic example of a task in which motion is used to communicate information. It falls into a unique subset of tasks where the motion is essential to accomplishing the task, yet many different motion trajectories can accomplish it successfully: people may draw the same image in completely different ways, but they ultimately produce the same result. I wanted to give a robot's task performance the same level of robustness and generality.
To do this, I represent the task as a distribution over the state space that captures the relevant task information. Representing the task as an information distribution abstracts its definition away from specific motion trajectories, and it naturally accommodates uncertainty due to trajectory variability and multiple task solutions. Using this representation, I use an information-based metric, ergodicity, to define an objective for model-based predictive control and generate controls that successfully accomplish the task with the most efficient motion, given the system dynamics and initial conditions.
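For readers unfamiliar with the metric, here is a minimal sketch of the spectral ergodic metric commonly used in this line of work (Mathew and Mezic): it compares the Fourier coefficients of a trajectory's time-averaged statistics against those of the target distribution, weighting low spatial frequencies most heavily. Normalization constants on the basis functions are omitted for brevity, and the grid-based target distribution is assumed to sum to one.

```python
import numpy as np

def dist_coeffs(phi, X, Y, K=10):
    """Fourier coefficients of a target distribution phi on the unit square."""
    coeffs = np.zeros((K, K))
    for k1 in range(K):
        for k2 in range(K):
            basis = np.cos(k1 * np.pi * X) * np.cos(k2 * np.pi * Y)
            coeffs[k1, k2] = np.sum(phi * basis)
    return coeffs

def traj_coeffs(traj, K=10):
    """Fourier coefficients of the time-averaged statistics of a trajectory."""
    coeffs = np.zeros((K, K))
    for k1 in range(K):
        for k2 in range(K):
            basis = np.cos(k1 * np.pi * traj[:, 0]) * np.cos(k2 * np.pi * traj[:, 1])
            coeffs[k1, k2] = basis.mean()
    return coeffs

def ergodic_metric(traj, phi, X, Y, K=10):
    """Weighted squared difference between trajectory and distribution coefficients."""
    ck, phik = traj_coeffs(traj, K), dist_coeffs(phi, X, Y, K)
    k1, k2 = np.meshgrid(np.arange(K), np.arange(K), indexing="ij")
    lam = (1.0 + k1 ** 2 + k2 ** 2) ** -1.5   # weights emphasizing low frequencies
    return float(np.sum(lam * (ck - phik) ** 2))
```

A metric of this form can serve directly as the tracking cost inside a model-predictive controller, so that minimizing it drives the robot's time-averaged motion toward the task's information distribution rather than toward any one reference trajectory.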
For more information about this project, check out the paper.