
Implement the actor-critic methods #1

Open
originholic opened this issue Apr 10, 2016 · 1 comment

@originholic

Hello,
In the asynchronous DQN paper they also describe an on-policy method, the advantage actor-critic (A3C), which achieved better results than the other methods. Do you have any plans to include it in this repo as well?
I am using this repo as a starting point and attempting to reproduce the A3C results on the continuous action domain, but I am still trying to figure out the network model they used in the physical-state case when applied to MuJoCo, and how the policy gradient is accumulated.
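For reference, here is a minimal sketch of what accumulating the policy gradient over an n-step segment could look like for a continuous-action (Gaussian) policy. It assumes a linear actor and critic and a fixed exploration sigma purely for illustration; the paper's network outputs both the mean and the variance, and all names and shapes below are hypothetical, not taken from this repo.

```python
# Illustrative only: A3C-style gradient accumulation for a Gaussian policy,
# assuming a linear actor (mean) and a linear critic with made-up shapes.
import numpy as np

state_dim, action_dim = 4, 1
gamma = 0.99

W_mu = np.zeros((action_dim, state_dim))   # actor weights (Gaussian mean)
w_v = np.zeros(state_dim)                  # critic weights (state value)
sigma = 0.1                                # fixed exploration std (simplification)

def policy_mean(s):
    return W_mu @ s

def value(s):
    return w_v @ s

def accumulate_gradients(segment):
    """segment: list of (state, action, reward) for one n-step rollout.
    Gradients are summed over the segment before a single update,
    as in the A3C pseudocode (bootstrap R with value(s_last) if non-terminal)."""
    dW_mu = np.zeros_like(W_mu)
    dw_v = np.zeros_like(w_v)
    R = 0.0
    for s, a, r in reversed(segment):
        R = r + gamma * R
        advantage = R - value(s)
        # grad of log N(a; mu(s), sigma^2) w.r.t. the mean, for a linear mean
        d_logp_dmu = (a - policy_mean(s)) / (sigma ** 2)
        dW_mu += np.outer(d_logp_dmu, s) * advantage   # accumulated policy-gradient direction
        dw_v += advantage * s                          # accumulated critic update direction
    return dW_mu, dw_v
```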

@Zeta36
Owner

Zeta36 commented Apr 10, 2016

No, originholic. I'm working on other things right now :(.

Maybe in the future I'll try the advantage actor-critic, but not now. I'm sorry.

Regards.
Samu.
