Carnegie Mellon University Pioneers Innovative Framework for Effective Autonomous Learning
When children learn to ride a bike, adults can only guide them so far. Much of the new skill develops through the child’s own trial and error.
After practice, and perhaps a fall or two, children eventually learn how to balance, steer around obstacles, and decide where to ride next. Adults can then send them off with confidence, knowing the children will continue to develop and improve with time.
Researchers at the Carnegie Mellon University Robotics Institute are attempting to answer a cutting-edge question: Can robots independently learn tasks in a similar way?
To examine this inquiry, Ph.D. candidate Russell Mendonca and professor Deepak Pathak joined forces with Emmanuel Panov, Bernadette Bucher, and Jiuguang Wang from the AI Institute.
In most current work, robots learn and complete tasks by imitating demonstrated behaviors or by training in human-built simulated environments. The research team recognized that this differs from how humans learn most of the time, so they aimed to design a framework that mirrors the trial-and-error process of human learning.
The team focused on three key strategies to create a reinforcement learning system for mobile robots that can learn with minimal human intervention: directing the robot toward meaningful object interactions, speeding up learning by using basic task knowledge, and creating rewards that blend human understanding with detailed observations from the robot’s environment.
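The article does not include the team’s code, but the third strategy, blending human task knowledge with the robot’s own observations into a single reward, can be illustrated with a minimal sketch. Everything below (the function names, the observation format, and the 0.1 weighting) is a hypothetical assumption for illustration, not the researchers’ actual implementation:

```python
import numpy as np

def blended_reward(observation, goal_position, task_succeeded):
    """A hypothetical sketch of a reward that blends a human-specified
    success signal with a dense, observation-driven shaping term."""
    # Sparse term: human task knowledge defines what counts as success
    # (e.g., the swept object landed inside the goal area).
    success_bonus = 1.0 if task_succeeded else 0.0

    # Dense term: distance between the tracked object and the goal,
    # estimated from the robot's camera observations (assumed to be
    # available as a position vector from a vision pipeline).
    object_position = observation["object_position"]
    distance = np.linalg.norm(object_position - goal_position)
    shaping = -distance  # closer to the goal yields a higher reward

    # Weighted blend; the 0.1 weight is an illustrative hyperparameter.
    return success_bonus + 0.1 * shaping
```

Under this kind of blend, the dense distance term gives the robot a learning signal on every attempt, while the sparse bonus anchors its behavior to the human-defined notion of success.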
The researchers used a quadruped robot for their experiment and designed four manipulation tasks to test the framework: balancing a dustpan and broom on the floor, sweeping an object into a specified goal area, moving a chair to a table in the middle of a room, and moving a chair to a table in a corner of the room. A camera system observed the tasks, allowing the team to track how well the robot responded to the integrated reward system and adjusted its behavior accordingly.
Much like children learning to ride a bike, the robot practiced each task for 8 to 10 hours at a time, steadily improving its accuracy and efficiency as practice progressed. Overall, the reinforcement learning framework achieved an average success rate of 80%, significantly outperforming current RL approaches.
The team’s reinforcement learning framework demonstrates that robots can independently improve their performance over time, much like children mastering a new skill. This continual improvement is a promising development, highlighting the potential for future robotic systems to be deployed in real-world scenarios where human supervision is limited. Although the current experiments took place in a controlled environment, the approach offers a practical pathway toward safer, more efficient robots operating in a range of dynamic environments.