OpenAI Gym is a toolkit that provides environments for developing and testing learning agents. It’s best suited to developing reinforcement learning agents, but it doesn’t prevent you from trying other methods, such as hard-coded game solvers or other deep learning approaches.
What Is OpenAI Gym?
OpenAI Gym is a free Python toolkit that provides developers with a set of environments for developing and testing learning agents. It’s best known for reinforcement learning, but it’s also useful for testing new learning agent ideas, running training simulations and speeding up the learning process for your algorithm.
When to Use OpenAI’s Gym Environment
There are a number of situations when using OpenAI’s Gym environment is useful. This includes:
- You want to learn reinforcement learning algorithms. There are a variety of environments for you to play with and try different reinforcement learning algorithms.
- You have a new idea for learning agents and want to test that. This environment is best suited to try new algorithms in simulation and compare the results with existing ones.
- You want to train agents on cases that we don’t want to model in reality. Deep learning requires a lot of training examples, both positive and negative, and it’s hard to provide such examples. For example, if you’re training a self-driving car to learn about accidents, it’s important that the AI knows what and how accidents can happen. However, this can be costly as well as risky to model in the real world. Simulations can help us here.
- Speeding up learning. Simulation is often better than acting in the real world because it can run many times faster than real time, allowing our agent to gather experience, and therefore learn, much more quickly.
How to Set Up OpenAI Gym
OpenAI Gym is very easy to set up.
What Are the Requirements for OpenAI Gym?
- Python 3.5+
- pip: pip is required whether you install from PyPI or from the source.
How to Install OpenAI Gym
There are two ways to install OpenAI Gym. These include:
Install Using Pip
pip install gym
Install From the Source
git clone https://github.com/openai/gym && cd gym
pip install -e .
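Whichever route you take, you can sanity-check the installation by importing the package and printing its version (the exact version string will depend on what you installed):

```python
# Quick sanity check that gym is importable after installation
import gym

print(gym.__version__)
```

If the import fails, double-check that pip installed gym into the same Python environment you are running.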
OpenAI Gym Example
First, we need to import gym.
import gym
Then we need to create an environment to try it out.
env = gym.make('MountainCar-v0')
Wait, what is an environment? Gym is all about the interaction between an agent and an environment.
There are plenty of environments for us to play with. At the time of writing, there are 797 registered environments. Use the following snippet to list them:
import gym
envs = gym.envs.registry.all()
print(envs)
print('Total envs available:', len(envs))
OpenAI Gym With a Random Agent
import gym

env = gym.make('MountainCar-v0')

# Uncomment the following line to save a video of our agent interacting
# with the environment. This can be used for debugging and studying
# how our agent is performing.
# env = gym.wrappers.Monitor(env, './video/', force=True)

# Reset the environment once to get the initial observation
observation = env.reset()
t = 0
while True:
    t += 1
    env.render()
    print(observation)
    # Sample a random action from the set of valid actions
    action = env.action_space.sample()
    observation, reward, done, info = env.step(action)
    if done:
        print("Episode finished after {} timesteps".format(t))
        break
env.close()
In the source code, we are using the following APIs of the environment:
- action_space: The set of valid actions in this environment.
- step: This takes the specified action and returns updated information gathered from the environment, such as observation, reward, whether the goal is reached or not, and miscellaneous info that can be useful for debugging.
Some other key terms to know include:
- Observation: An observation is specific to the environment. For example, in MountainCar, it returns the position and velocity of the car, which are required for building the momentum needed to achieve the goal. In some environments, it will be raw pixel data.
- Reward: This is the amount of reward achieved by the last action. By default, the goal is to maximize the total reward.
- Done: Done tells us when the episode is over, for example because the agent has achieved the goal or run out of time.
- Info: This emits debug information, which is useful when something goes wrong and you have to figure out what the agent is doing.
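To make these four values concrete, here is a minimal sketch of an environment that follows the same reset/step contract. The class, its reward of -1 per step and its five-step horizon are all invented for illustration; a real Gym environment such as MountainCar-v0 returns the same kind of tuple from step.

```python
import random

class ToyEnv:
    """A hypothetical, minimal environment mimicking Gym's classic
    reset/step interface, for illustration only."""

    def __init__(self, horizon=5):
        self.horizon = horizon  # episode ends after this many steps
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0  # initial observation

    def step(self, action):
        self.t += 1
        observation = float(self.t)    # next state seen by the agent
        reward = -1.0                  # MountainCar-style: -1 per timestep
        done = self.t >= self.horizon  # is the episode over?
        info = {"t": self.t}           # debug information
        return observation, reward, done, info

env = ToyEnv()
observation = env.reset()
done = False
total_reward = 0.0
while not done:
    action = random.choice([0, 1, 2])  # stand-in for env.action_space.sample()
    observation, reward, done, info = env.step(action)
    total_reward += reward
print(total_reward)  # -5.0 with the default 5-step horizon
```

Because the reward is -1 per timestep, maximizing the total reward means finishing the episode in as few steps as possible, which is exactly the incentive structure MountainCar uses.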
Now, our random agent above runs only a single episode. To run it repeatedly, add an outer loop, just like the number of epochs when training other deep learning algorithms. In reinforcement learning, each such run is referred to as an episode.
OpenAI Gym Template
for n times:
    while goal is not achieved:
        take_action()
        take_step()
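The template above can be written as Python against the classic Gym API used throughout this article (this is a sketch assuming gym is installed and uses the pre-0.26 four-value step signature; the episode count of 3 is arbitrary):

```python
import gym

n_episodes = 3  # illustrative number of episodes ("epochs")

env = gym.make('MountainCar-v0')
episode_lengths = []
for episode in range(n_episodes):               # for n times
    observation = env.reset()
    done = False
    t = 0
    while not done:                             # while goal is not achieved
        t += 1
        action = env.action_space.sample()      # take_action()
        observation, reward, done, info = env.step(action)  # take_step()
    episode_lengths.append(t)
    print("Episode {} finished after {} timesteps".format(episode, t))
env.close()
```

Note that rendering is omitted here so the loop runs as fast as possible; with a random agent, each MountainCar episode typically ends when the environment's time limit is reached.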