This repo implements the HIRO algorithm for Hierarchical Reinforcement Learning in the original environment using Tensorflow 2.

P-Schumacher/ant_repo

Implementation of HIRO: Data-Efficient Hierarchical Reinforcement Learning

This repository implements the HIRO algorithm for hierarchical reinforcement learning on the original AntMaze environment, as presented by Ofir Nachum et al. in "Data-Efficient Hierarchical Reinforcement Learning" (2018).

Dependencies

  • gym==0.16.0
  • mujoco-py==1.50.1.68
  • tensorflow==2.0
  • wandb==0.8.29
  • omegaconf==1.4.1
  • numpy==1.18.1

Usage

$ python3 main.py ant_config

This loads the settings in experiments/ant_config.yaml, which trains the agent for 1.5 million steps. Every 20000 timesteps, 10 evaluation episodes are played with exploratory noise turned off. The performance of the agent is recorded and the model parameters are saved. Run:

$ python3 main.py ant_render

to load the saved model and render the environment. I use OmegaConf to load different configurations: the default settings are kept in configs/ant_default, while configs for specific experiments are saved in experiments/. I use the wandb framework to save and analyse data from different runs.
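
The layering of default and experiment settings, and the resulting evaluation schedule, can be sketched in plain Python. This is only an illustration under stated assumptions, not the repository's code: the key names (`main`, `max_timesteps`, `eval_every`, `eval_episodes`) and the default value of `max_timesteps` are hypothetical, and the recursive dictionary merge stands in for what OmegaConf offers through `OmegaConf.merge`. Whether the repository evaluates exactly at multiples of 20000 steps, including the final step, is also an assumption.

```python
from copy import deepcopy


def merge_configs(base, override):
    """Recursively overlay `override` on `base` and return a new dict."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are nested sections: merge them key by key.
            merged[key] = merge_configs(merged[key], value)
        else:
            # Scalars (or mismatched types) are simply replaced.
            merged[key] = deepcopy(value)
    return merged


def evaluation_steps(max_timesteps, eval_every):
    """Timesteps at which an evaluation phase would be triggered."""
    return list(range(eval_every, max_timesteps + 1, eval_every))


# Hypothetical stand-in for configs/ant_default (real keys may differ).
default_cfg = {
    "main": {
        "max_timesteps": 1_000_000,  # placeholder default, not from the repo
        "eval_every": 20_000,
        "eval_episodes": 10,
        "render": False,
    },
}

# Hypothetical stand-in for experiments/ant_config.yaml: only overrides.
experiment_cfg = {"main": {"max_timesteps": 1_500_000}}

cfg = merge_configs(default_cfg, experiment_cfg)
steps = evaluation_steps(cfg["main"]["max_timesteps"], cfg["main"]["eval_every"])

# Number of evaluation phases, plus the first and last evaluation step.
print(len(steps), steps[0], steps[-1])  # prints: 75 20000 1500000
```

With the numbers quoted above, 1.5 million training steps and an evaluation every 20000 steps give 75 evaluation phases of 10 episodes each. Keeping experiment files down to the keys that differ from the defaults makes each experiment easy to compare against the baseline.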
