OpenAI’s Universe Platform Lets AI Learn by Playing Games

May 20, 2017 by Robin Mitchell

We have seen computers recognize faces, prepare food, and even beat grandmasters at chess and Go. But what about everyday gaming such as GTA or SimCity?

Jack of All Trades, Master of None

Some of the first computers, such as Colossus and ENIAC, were able to solve problems much faster than vast teams of people could by hand (even when armed with calculators). For many years, computers were large and expensive devices that only a few individuals were privileged enough to use.

To justify the staggering costs of creating and maintaining such machines, computers had to do work that was either vitally important or extremely profitable. For example, banks used mainframe computers to process transactions. Other machines would crunch numbers to find oil. A few were even used by the military to calculate artillery trajectories and process data from atomic explosions.

As time passed, more and more data could be analyzed, which led to the creation of supercomputers. Now, such machines are being used in the field of artificial intelligence, where they can recognize patterns and learn to improve their ability to solve problems. IBM's supercomputer, Watson, was able to analyze hundreds of thousands of patient medical records and successfully diagnose a patient with cancer when doctors could not.


Watson also went on Jeopardy. Image courtesy of Raysonho [CC BY 3.0]


Artificial intelligence is becoming a big industry, with many different companies striving for the best AI. But is this obsession with single-minded task-solving harming AI development? A computer that can beat any Go player is great, but what else is it capable of? Could it play other games now that it has mastered Go? How about classics such as Space Invaders? Surely the best AI would have the ability to transfer skills to other situations.

This is the issue that Universe, a software platform from OpenAI, aims to solve.

OpenAI – The AI Learning Environment

OpenAI is an artificial intelligence lab whose backers include Tesla CEO Elon Musk. It recently demonstrated a unique virtual world, called “Universe”, that is designed for artificial systems to learn in.

Other companies and groups, such as Google’s DeepMind, have created similar systems in which AI agents learn to play games, but Universe is more ambitious. Like those systems, it is a software layer that sits between the AI and the target application and relies on reinforcement learning. Instead of being limited to games, however, Universe is intended to work with any software, from games to protein folding.

So what makes Universe different from other systems currently in place? The answer lies in what AI actually is. Many scientists, engineers, and even users of this website cannot agree on what artificial intelligence actually is. Some believe that Watson is intelligent whereas others (myself included) believe that Watson is an over-glorified Wikipedia-based computer with some clever database management skills.

Intelligence, for the purpose of developing AI, is generally understood to be the ability to approach new problems and generate solutions without needing to look at every possible solution. People are naturally intelligent in this way.

Consider a game of Go. A human player will naturally disregard certain moves based on past experience. Traditional chess programs, however, do not operate in this way. Instead, they rely on sheer brute force, examining the available moves for each piece and predicting the outcome of each one. Such processing, when spent on the ramifications of even clearly unwise moves, is a time-consuming waste of resources.
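
To make the cost of examining every move concrete, here is a minimal Python sketch. It is an illustration only, not how any real chess or Go engine works. The game is a toy version of Nim (players alternately take one, two, or three stones, and whoever takes the last stone wins), and the function names `brute_force`, `pruned`, and `compare` are invented for this example. The first search looks at every legal move. The second uses alpha-beta pruning, a standard technique swapped in here to stand for "disregarding" moves that cannot change the outcome.

```python
def brute_force(stones, counter):
    """Examine every legal move. Returns +1 if the player to move can force a win, else -1."""
    counter[0] += 1  # count every position visited
    if stones == 0:
        return -1  # the opponent took the last stone, so the player to move has lost
    best = -1
    for take in (1, 2, 3):
        if take > stones:
            break
        # The opponent's result, negated, is our result for this move.
        best = max(best, -brute_force(stones - take, counter))
    return best


def pruned(stones, counter, alpha=-1, beta=1):
    """Same search with alpha-beta cutoffs: stop once no remaining move can matter."""
    counter[0] += 1
    if stones == 0:
        return -1
    best = -1
    for take in (1, 2, 3):
        if take > stones:
            break
        best = max(best, -pruned(stones - take, counter, -beta, -alpha))
        alpha = max(alpha, best)
        if alpha >= beta:
            break  # a winning move was found; the rest cannot improve on it
    return best


def compare(stones):
    """Run both searches and return (value, positions seen by brute force, positions seen with pruning)."""
    full, cut = [0], [0]
    value = brute_force(stones, full)
    assert value == pruned(stones, cut)  # both searches must agree on the winner
    return value, full[0], cut[0]


if __name__ == "__main__":
    value, full, cut = compare(12)
    print("value:", value, "| brute force:", full, "positions | pruned:", cut, "positions")
```

Both searches agree on who wins, but the pruned one visits fewer positions, and the gap widens as the pile grows. Pruning is still a long way from the experience-based intuition of a human player. It only shows how much work goes into moves that cannot matter.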

One way to solve this "brute force" issue is to create an AI system that's good at multiple tasks rather than specializing in one. Here's where OpenAI’s Universe comes in. Universe currently has over 1,000 games in its collection, allowing AI systems to be presented with different unfamiliar situations.


A game of Go between professional Go player, Fan Hui, and AlphaGo. Screenshot courtesy of Google DeepMind.


To provide an interface for AI systems, Universe acts as a software layer that simulates mouse movements and keystrokes via Virtual Network Computing (VNC). The system then returns information about what happened, so that AI agents can learn through trial and error.

This is in line with other AI "training" systems, but Universe takes the idea a step further by allowing reinforcement learning to occur with any piece of software. AI agents can jump from one software package to the next so as to face new, unfamiliar challenges.
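
The trial-and-error loop described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not Universe's actual API: `ToyScreenEnv`, `train`, and `greedy_policy` are hypothetical names, the "application" is just a cursor on a five-cell strip, the "screen" is reduced to the cursor position, and the learning rule is plain tabular Q-learning, chosen for brevity. What it shares with the description is the shape of the interaction: the agent sends a key press, gets back an observation and a reward, and adjusts its behaviour.

```python
import random

ACTIONS = ("ArrowLeft", "ArrowRight")  # stand-ins for key events an agent might send


class ToyScreenEnv:
    """Hypothetical stand-in for a remotely controlled application (not the Universe API)."""

    def __init__(self, width=5, max_steps=50):
        self.width = width
        self.max_steps = max_steps
        self.cursor = 0
        self.steps = 0

    def reset(self):
        self.cursor = 0
        self.steps = 0
        return self.cursor  # the observation: where the cursor is "on screen"

    def step(self, key):
        move = 1 if key == "ArrowRight" else -1
        self.cursor = min(max(self.cursor + move, 0), self.width - 1)
        self.steps += 1
        reached = self.cursor == self.width - 1
        reward = 1.0 if reached else 0.0  # reward only for reaching the right edge
        done = reached or self.steps >= self.max_steps
        return self.cursor, reward, done


def train(env, episodes=300, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(env.width) for a in ACTIONS}
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore: try something at random
            else:
                best = max(q[(state, a)] for a in ACTIONS)
                action = rng.choice([a for a in ACTIONS if q[(state, a)] == best])
            next_state, reward, done = env.step(action)
            # The goal cell's values are never updated, so they stay 0 and add nothing here.
            target = reward + gamma * max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = next_state
    return q


def greedy_policy(q, width):
    """Best-known key for every cell except the goal cell."""
    return [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(width - 1)]


if __name__ == "__main__":
    env = ToyScreenEnv()
    print(greedy_policy(train(env), env.width))
```

Nothing in `train` knows what the keys mean. The agent only discovers that pressing right pays off because the environment keeps rewarding it, which is the same principle a Universe agent would rely on at a much larger scale.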

But it does not end there. The developers hope that AI systems will take the skills they have learned and apply them in other software packages, solving new tasks faster thanks to past experience. One common example is the use of menu items in Windows and the common names given to menu options. "File" is commonly associated with creating new files, opening existing files, and saving. But such interface experience goes deeper. New releases of Windows have features that are similar but not identical to those of their predecessors, yet users generally don't need to read a manual to understand that the new system functions much like the old one.


Future of Universe

While only games are currently available, Universe's long-term plan is to include many other software applications, including those involving protein folding, so as to provide more complex problems.

If Universe is successful in its mission, what would future AI look like? Such a system may be useful for general applications such as home automation (finally giving us the future homes seen in sci-fi movies like Her). 

Such systems may also prove useful in scenarios where split-second decision-making and experience are required. One example would be military scenarios involving close-in weapon systems. An AI system could approach new situations and come up with solutions on the fly, feasibly hundreds of times faster than a human could.


A close-in weapon system in action. Image courtesy of the U.S. Navy. Photo by Mass Communication Specialist 3rd Class Stuart Phillips.


Overall, training AI systems in many different situations and getting them to adapt to new ones has to be the way forward. Continuously creating more powerful supercomputers to solve games and perform specific tasks is not the solution when the goal of AI is to create a system that is intelligent.

It is difficult to say for certain, but Universe could be the key to creating the first truly intelligent system, one that can generate solutions when presented with a situation for the first time.