Implement the actor-critic methods #1

originholic · 2016-04-10T00:04:24Z

Hello,
In the asynchronous DQN paper, the authors also describe an on-policy method, asynchronous advantage actor-critic (A3C), which achieved better results than the other methods. Do you currently have any plans to include this method in this repo as well?
I am using this repo as a starting point and attempting to reproduce the results of the A3C method in the continuous action domain, but I am still trying to figure out the network model they used for the physical-state case when applied to MuJoCo, and how the policy gradient is accumulated.

Zeta36 · 2016-04-10T07:13:22Z

No, originholic. I'm working on other things right now :(.

Maybe in the future I will try the advantage actor-critic, but not now. I'm sorry.

Regards.
Samu.
