OpenAI Gym is an open source toolkit that provides a diverse collection of tasks, called environments, with a common interface for developing and testing your intelligent agent algorithms. The toolkit introduces a standard Application Programming Interface (API) for interfacing with environments designed for reinforcement learning. Each environment has a version attached to it, which ensures meaningful comparisons and reproducible results with the evolving algorithms and the environments themselves.
This article is an excerpt taken from the book, Hands-On Intelligent Agents with OpenAI Gym, written by Praveen Palanisamy. In this article, you will get to know what OpenAI Gym is, its features, and later create your own OpenAI Gym environment.
The Gym toolkit, through its various environments, provides an episodic setting for reinforcement learning, where an agent’s experience is broken down into a series of episodes. In each episode, the initial state of the agent is randomly sampled from a distribution, and the interaction between the agent and the environment proceeds until the environment reaches a terminal state. Do not worry if you are not familiar with reinforcement learning.
Some of the basic environments available in the OpenAI Gym library are shown in the following screenshot:
Examples of basic environments available in the OpenAI Gym with a short description of the task
The OpenAI Gym natively has about 797 environments spread over different categories of tasks. The famous Atari category has the largest share with about 116 (half with screen inputs and half with RAM inputs) environments! The categories of tasks/environments supported by the toolkit are listed here:
- Board games
- Classic control
- Doom (unofficial)
- Minecraft (unofficial)
- Toy text
- Robotics (newly added)
The various types of environment (or tasks) available under the different categories, along with a brief description of each environment, is given next. Keep in mind that you may need some additional tools and packages installed on your system to run environments in each of these categories. To have a detailed overview of each of these categories, head over to the book.
With that, you have a very good overview of all the different categories and types of environment that are available as part of the OpenAI Gym toolkit. It is worth noting that the release of the OpenAI Gym toolkit was accompanied by an OpenAI Gym website (gym.openai.com), which maintained a scoreboard for every algorithm that was submitted for evaluation. It showcased the performance of user-submitted algorithms, and some submissions were also accompanied by detailed explanations and source code. Unfortunately, OpenAI decided to withdraw support for the evaluation website. The service went offline in September 2017.
Now you have a good picture of the various categories of environment available in OpenAI Gym and what each category provides you with. Next, we will look at the key features of OpenAI Gym that make it an indispensable component in many of today’s advancements in intelligent agent development, especially those that use reinforcement learning or deep reinforcement learning.
Understanding the features of OpenAI Gym
Here, we will take a look at the key features that have made the OpenAI Gym toolkit very popular in the reinforcement learning community and led to it becoming widely adopted.
Simple environment interface
OpenAI Gym provides a simple and common Python interface to environments. Specifically, it takes an action as input and provides observation, reward, done and an optional info object, based on the action as the output at each step. If this does not make perfect sense to you yet, do not worry. We will go over the interface again in a more detailed manner to help you understand. This paragraph is just to give you an overview of the interface to make it clear how simple it is. This provides great flexibility for users as they can design and develop their agent algorithms based on any paradigm they like, and not be constrained to use any particular paradigm because of this simple and convenient interface.
Comparability and reproducibility
We intuitively feel that we should be able to compare the performance of an agent or an algorithm in a particular task to the performance of another agent or algorithm in the same task. For example, if an agent gets a score of 1,000 on average in the Atari game of Space Invaders, we should be able to tell that this agent is performing worse than an agent that scores 5000 on average in the Space Invaders game in the same amount of training time. But what happens if the scoring system for the game is slightly changed? Or if the environment interface was modified to include additional information about the game states that will provide an advantage to the second agent? This would make the score-to-score comparison unfair, right?
To handle such changes in the environment, OpenAI Gym uses strict versioning for environments. The toolkit guarantees that if there is any change to an environment, it will be accompanied by a different version number. Therefore, if the original version of the Atari Space Invaders game environment was named SpaceInvaders-v0 and there were some changes made to the environment to provide more information about the game states, then the environment’s name would be changed to SpaceInvaders-v1. This simple versioning system makes sure we are always comparing performance measured on the exact same environment setup. This way, the results obtained are comparable and reproducible.
Ability to monitor progress
All the environments available as part of the Gym toolkit are equipped with a monitor. This monitor logs every time step of the simulation and every reset of the environment. What this means is that the environment automatically keeps track of how our agent is learning and adapting with every step. You can even configure the monitor to automatically record videos of the game while your agent is learning to play.
Creating your first OpenAI Gym environment
MacOS and Ubuntu Linux systems come with Python installed by default. You can check which version of Python is installed by running python --version from a terminal window. If this returns python followed by a version number, then you are good to proceed to the next steps! If you get an error saying the Python command was not found, then you have to install Python.
- Install virtualenv:
$pip install virtualenv
If pip is not installed on your system, you can install it by typing sudo easy_install pip.
- Create a virtual environment named openai-gym using the virtualenv tool:
- Activate the openai-gym virtual environment:
- Install all the packages for the Gym toolkit from upstream:
$pip install -U gym
If you get permission denied or failed with error code 1 when you run the pip install command, it is most likely because the permissions on the directory you are trying to install the package to (the openai-gym directory inside virtualenv in this case) needs special/root privileges. You can either run sudo -H pip install -U gym[all] to solve the issue or change permissions on the openai-gym directory by running sudo chmod -R o+rw ~/openai-gym.
- Test to make sure the installation is successful:
$python -c 'import gym; gym.make("CartPole-v0");'
Creating and visualizing a new Gym environment
In just a minute or two, you have created an instance of an OpenAI Gym environment to get started!
Let’s open a new Python prompt and import the gym module:
Once the gym module is imported, we can use the gym.make method to create our new environment like this:
>>env = gym.make('CartPole-v0') >>env.reset() env.render()
This will bring up a window like this:
In this post, you learned what OpenAI Gym is, its features, and created your first OpenAI Gym environment. You now have a very good idea about OpenAI Gym.
If you’ve enjoyed this post, head over to the book, Hands-On Intelligent Agents with OpenAI Gym, to know about other latest learning environments and learning algorithms.
- Extending OpenAI Gym environments with Wrappers and Monitors [Tutorial]
- How to build a cartpole game using OpenAI Gym
- Top 5 tools for reinforcement learningTop 5 tools for reinforcement learning