
IRLwPython

Inverse Reinforcement Learning algorithm implementations in Python.

Exploring Maximum Entropy Inverse Reinforcement Learning

My seminar paper, which is based on IRLwPython version 0.0.1, can be found in paper.

Implemented Algorithms

Maximum Entropy IRL (MEIRL):

An implementation of the maximum entropy inverse reinforcement learning algorithm from [1], based on the implementation of lets-do-irl. It is an IRL algorithm that uses Q-learning with a maximum entropy update function for the IRL reward estimation. The next action is selected based on the maximum of the Q-values.
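
A minimal sketch of the maximum entropy reward update, assuming a linear reward over one-hot state features so that the weight vector theta has one entry per discrete state; the names and the learning rate are illustrative, not the exact API of this repository:

import numpy as np

N_STATES = 400

def irl_reward(theta, state_idx):
    """Estimated IRL reward of a discretized state under the current weights."""
    return theta[state_idx]

def maxent_update(theta, expert_state_freq, learner_state_freq, lr=0.05):
    """Move theta towards matching the expert's state-visitation frequencies.

    For a linear reward, the maximum entropy IRL gradient [1] is the difference
    between expert and learner feature expectations, which for one-hot state
    features are simply the state-visitation frequencies.
    """
    return theta + lr * (expert_state_freq - learner_state_freq)

Q-learning then uses irl_reward(theta, state) in place of the environment reward inside the usual temporal-difference update.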

Maximum Entropy Deep IRL (MEDIRL):

An implementation of the maximum entropy inverse reinforcement learning algorithm that uses a neural network for the actor. The estimated IRL reward is learned similarly to MEIRL. It is an IRL algorithm using deep Q-learning with a maximum entropy update function. The next action is selected with an epsilon-greedy strategy over the Q-values.
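
A sketch of the neural-network actor and the epsilon-greedy action selection described above; the layer sizes and function names are assumptions for illustration, not the repository's exact code:

import random
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Maps a (position, velocity) observation to one Q-value per action."""

    def __init__(self, n_inputs=2, n_actions=3, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_inputs, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state):
        return self.net(state)

def select_action(q_net, state, epsilon, n_actions=3):
    """Epsilon-greedy: random action with probability epsilon, else argmax of Q."""
    if random.random() < epsilon:
        return random.randrange(n_actions)
    with torch.no_grad():
        q_values = q_net(torch.as_tensor(state, dtype=torch.float32))
    return int(q_values.argmax().item())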

Maximum Entropy Deep RL (MEDRL):

MEDRL is an RL counterpart of the MEDIRL algorithm. It obtains the real rewards directly from the environment instead of estimating IRL rewards. The network architecture and action selection are the same as in MEDIRL.
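
The difference can be summarized by the temporal-difference target used in the deep Q-update; a sketch with illustrative names, assuming a linear IRL reward as above:

import numpy as np

def medirl_target(theta, state_idx, next_q, gamma=0.99):
    # MEDIRL: the reward is the estimated IRL reward of the visited state.
    return theta[state_idx] + gamma * np.max(next_q)

def medrl_target(env_reward, next_q, gamma=0.99):
    # MEDRL: the reward comes directly from the gym environment.
    return env_reward + gamma * np.max(next_q)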

Experiment

MountainCar-v0

MountainCar-v0 is used to evaluate the different algorithms. For this, the MDP implementation of the MountainCar environment from gym is used.

The expert demonstrations for MountainCar-v0 are the same as those used in lets-do-irl.

Heatmap of expert demonstrations with 400 states:
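
The 400 states suggest a 20 x 20 grid over the car's position and velocity; the following is a sketch of such a discretization, with the binning chosen here as an assumption rather than the repository's exact implementation:

import numpy as np

POS_RANGE = (-1.2, 0.6)    # MountainCar-v0 position bounds
VEL_RANGE = (-0.07, 0.07)  # MountainCar-v0 velocity bounds
N_BINS = 20                # 20 * 20 = 400 discrete states

def discretize(observation):
    """Map a continuous (position, velocity) observation to a state index in [0, 399]."""
    position, velocity = observation
    pos_bin = int(np.clip((position - POS_RANGE[0]) / (POS_RANGE[1] - POS_RANGE[0]) * N_BINS,
                          0, N_BINS - 1))
    vel_bin = int(np.clip((velocity - VEL_RANGE[0]) / (VEL_RANGE[1] - VEL_RANGE[0]) * N_BINS,
                          0, N_BINS - 1))
    return pos_bin * N_BINS + vel_bin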

Comparing the Algorithms

The following comparison covers the results of training and testing the two IRL algorithms, Maximum Entropy IRL and Maximum Entropy Deep IRL. Results for the RL algorithm Maximum Entropy Deep RL are also included to highlight the differences between IRL and RL.

For each algorithm, the plots show:

Training curves after 1000 and 5000 episodes.

Learner state frequencies after 1000, 2000, and 5000 episodes.

Estimated IRL rewards after 1000, 2000, 5000, and 14000 episodes (not every episode count is available for every algorithm; Maximum Entropy Deep RL has none, since it uses the environment reward directly instead of estimating IRL rewards).

Testing results over 100 runs.

References

The implementations of MaxEntropyIRL and MountainCar are based on the implementation in lets-do-irl.

[1] B. D. Ziebart et al., "Maximum Entropy Inverse Reinforcement Learning," AAAI 2008.

Installation

cd IRLwPython
pip install .

Usage

usage: irl-runner [-h] [--version] [--training] [--testing] [--render] ALGORITHM

Implementation of IRL algorithms

positional arguments:
  ALGORITHM   Currently supported training algorithm: [max-entropy, max-entropy-deep, max-entropy-deep-rl]

options:
  -h, --help  show this help message and exit
  --version   show program's version number and exit
  --training  Enables training of model.
  --testing   Enables testing of previously created model.
  --render    Enables visualization of mountaincar.
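
For example, training the deep IRL agent and then testing the previously created model with rendering could look like this (flags as listed in the help text above):

irl-runner --training max-entropy-deep
irl-runner --testing --render max-entropy-deep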