Google Tasks Robots with Learning Skills from One Another via Cloud Robotics

January 02, 2017 by Dr. Steve Arar

Robots are using shared data to learn skills much faster than they could alone.

Humans use language to tap into the knowledge of others and learn skills faster. This helps us hone our intuition and go through our daily activities more efficiently. Inspired by this, Google Research, DeepMind (Google's UK artificial intelligence lab), and Google X have decided to let their robots share their experiences. By sharing the learning process among multiple robots, the research team has considerably sped up the robots' acquisition of general-purpose skills.

Training: How and Why?

Using an artificial neural network, we can teach a robot to achieve a goal by analyzing the results of its previous experiences. At first, the robot may seem to act randomly, working purely by trial and error. However, it examines the result of each trial and, if the result is satisfactory, focuses on similar attempts in the next trials. By connecting each experience with the result it produced, the robot gradually makes better choices.
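The trial-and-error loop described above can be illustrated with a very small sketch. It is not the researchers' neural-network method; it assumes a toy setting in which a robot picks one of three hypothetical grasp strategies, each with a made-up success probability, and simply keeps a running success rate per strategy. The function name and the `explore` parameter are illustrative.

```python
import random

def learn_by_trial_and_error(success_prob, trials=3000, explore=0.1, seed=0):
    """Estimate each action's success rate from repeated trials (toy example)."""
    rng = random.Random(seed)
    n_actions = len(success_prob)
    attempts = [0] * n_actions
    successes = [0] * n_actions

    def estimate(a):
        # Untried actions get an optimistic estimate so each is tried at least once.
        return successes[a] / attempts[a] if attempts[a] else 1.0

    for _ in range(trials):
        if rng.random() < explore:
            action = rng.randrange(n_actions)             # occasionally act at random
        else:
            action = max(range(n_actions), key=estimate)  # otherwise repeat what worked
        outcome = rng.random() < success_prob[action]     # simulated trial result
        attempts[action] += 1
        successes[action] += outcome
    return [estimate(a) for a in range(n_actions)], attempts

# Hypothetical success probabilities for three grasp strategies.
estimates, attempts = learn_by_trial_and_error([0.2, 0.5, 0.8])
best = max(range(3), key=lambda a: estimates[a])
print("preferred strategy:", best, "attempts per strategy:", attempts)
```

Early on the choices look random; as the estimates settle, most attempts go to the strategy that has worked best so far.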

Teaching a robot requires gathering a great wealth of experience, which is a very time-consuming process. For example, to teach a robotic arm how to grasp objects, we may need to let the robot attempt as many as 800,000 grasps. And this would be just the beginning of its learning process.

Although this learning is very time-consuming, it has interesting outcomes. Robots that are designed to perform certain pre-defined actions or interact with pre-defined objects cannot easily respond to changes in the environment. However, a robot that goes through a training process develops capabilities that depend on the wealth of its experience. As a result, the robot can adapt to slight variations in its environment.

To train the robots rapidly, Google researchers have decided to let the robots share their experiences, a concept also known as cloud robotics. Every robot uploads its own experience to the server and downloads the latest version of the training model, which reflects the combined results obtained by all of the robots. Effectively, the robots are teaching each other how to perform a certain task. This cloud-based approach significantly reduces the time required to train the network.
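A minimal sketch of this upload-and-download cycle is shown below. It is only an illustration under simplifying assumptions: the real system shares neural network training data and models, whereas this toy server pools success counts. The class `ExperienceServer`, its methods, and the grasp names are all hypothetical.

```python
from collections import defaultdict

class ExperienceServer:
    """Hypothetical central server that pools trial outcomes from many robots."""

    def __init__(self):
        self._attempts = defaultdict(int)
        self._successes = defaultdict(int)

    def upload(self, trials):
        """Accept one robot's batch of (action, succeeded) trial records."""
        for action, succeeded in trials:
            self._attempts[action] += 1
            self._successes[action] += int(succeeded)

    def download_model(self):
        """Return the pooled success-rate estimate for every action seen so far."""
        return {a: self._successes[a] / n for a, n in self._attempts.items()}

server = ExperienceServer()
server.upload([("top_grasp", True), ("side_grasp", False)])   # robot A
server.upload([("top_grasp", True), ("top_grasp", False)])    # robot B
server.upload([("side_grasp", False), ("side_grasp", True)])  # robot C

# Any robot can now act on experience it never gathered itself.
model = server.download_model()
print(model)
```

Each robot contributed only two trials, yet the downloaded model reflects all six, which is the sense in which the robots teach each other.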

Google’s Previous Robotic Data Sharing Experiment

While teaching robotic arms to grasp objects, Google observed that the robots had developed pre-grasp behaviors. They could push objects away to isolate one object from a group and then grasp it. Moreover, the robots learned to treat soft and hard objects differently. The research team achieved these capabilities solely by letting the robots learn, not by programming them before they interacted with objects.

In this experiment, which was conducted in March 2016, Google allowed 14 camera-equipped robots to try picking up objects. The robots’ experiences were monitored via the cameras, and the results were used to train the system, which was based on a convolutional neural network (CNN), a type of deep learning model well suited to image data.

By sharing the data, the robots were able to learn much faster. Each robot was experimenting under slightly different conditions. For example, the research team slightly changed parameters such as camera position, lighting, and the gripper hardware for each robot. These intentional variations allowed the robots to find a more robust solution and adapt themselves to environment changes. However, the system was still unlikely to operate successfully with significantly different hardware or in a significantly different environment.

In a more recent set of experiments, the Google team investigates data sharing on tasks such as opening a door. The team explores three approaches:

Reinforcement Learning

In the first experiment, the robots simply rely on reinforcement learning, or trial and error, combined with deep neural networks. The researchers add noise (random perturbations) to the robots' actions so that the neural networks build up data more quickly. The server monitors the results of the trials and helps the robots arrive at a better solution.

It takes the robotic arms about 20 minutes to open the door for the first time. However, in three hours, they figure out how to neatly reach for the handle, turn it, and then pull to open the door. Although the robots successfully open the door, they are not necessarily building an explicit model of this task.
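To give a feel for how random perturbation can drive trial-and-error learning, here is a deliberately simple sketch. It uses plain hill climbing on a single number rather than the deep reinforcement learning the researchers use, and the reward function, the 45-degree target, and all parameter names are made-up assumptions.

```python
import random

def reward(handle_angle, target=45.0):
    """Hypothetical reward: higher the closer the commanded angle is to the target."""
    return -(handle_angle - target) ** 2

def noisy_hill_climb(start=0.0, noise_std=5.0, trials=500, seed=0):
    """Perturb the best known angle with random noise; keep only improvements."""
    rng = random.Random(seed)
    best_angle, best_reward = start, reward(start)
    for _ in range(trials):
        candidate = best_angle + rng.gauss(0.0, noise_std)  # exploration noise
        r = reward(candidate)
        if r > best_reward:  # this trial turned out better, so remember it
            best_angle, best_reward = candidate, r
    return best_angle

angle = noisy_hill_climb()
print("learned handle angle:", round(angle, 2))
```

Nothing in the loop models the door itself. The robot in this sketch only remembers which action scored best, which mirrors the point that success does not require an explicit model of the task.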

Learning Based on a Predictive Model

In the second experiment, a predictive model is developed and tested. The scientists provide the robots with a tray of everyday objects. By nudging these objects around a table, the robots build a model that helps them predict what might happen if they take a certain course of action. This cause-and-effect model is again shared between the robots.

Then, the researchers use a computer interface showing the test environment to tell the robots to move an object to a certain location. The robots use their predictive model to find out how to move the object.
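The two steps, learning a cause-and-effect model from nudges and then using it to choose an action, can be sketched in one dimension. This is an assumption-heavy toy: the actual robots learn their predictions from camera images, while here the model is a single number fitted by least squares, and the function names and data are invented for illustration.

```python
def fit_push_model(experience):
    """Least-squares fit of displacement = gain * push from (push, displacement) pairs."""
    num = sum(push * disp for push, disp in experience)
    den = sum(push * push for push, _ in experience)
    return num / den

def plan_push(position, goal, gain, candidates):
    """Pick the candidate push whose predicted outcome lands closest to the goal."""
    return min(candidates, key=lambda push: abs(position + gain * push - goal))

# Pooled nudging experience from several robots (made-up numbers).
shared_experience = [(1.0, 0.5), (2.0, 1.0), (-1.0, -0.5), (4.0, 2.0)]
gain = fit_push_model(shared_experience)

# Ask the model which push should move an object from 0.0 to 1.5.
push = plan_push(position=0.0, goal=1.5, gain=gain,
                 candidates=[-2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.0])
print("learned gain:", gain, "chosen push:", push)
```

The planner never tries the pushes on the real object. It evaluates each candidate against the learned model and commits to the one with the best predicted result.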

Learning Based on Human Guidance

The final experiment is designed to help robots learn directly from humans. In this experiment, a human physically guides the robot's arm to reach for the door and open it.

The demonstration is analyzed and converted into a neural network, which forms the basis of the robots’ subsequent learning. Then, the researchers allow the robots to try opening the door on their own. Again, the robots are allowed to share their experiences with each other. Within a few hours, the learning process leads to even more versatile robots.

Cloud robotics can provide robots with rapidly downloadable intelligence. By employing this method, we may soon witness robots capable of learning tasks much more complicated than simply opening a door. While humans need a lot of time to grasp others' knowledge, robots would be able to put their information on a shared network and instantly acquire each other's skills.