Hi,
When comparing your code to the pseudocode snippet in the article, you seem to use the bootstrapped value R_t unchanged in every iteration where you append values to the R array (currently line 255). Shouldn't we instead append something along the lines of `rewards[i] + GAMMA*R[i]`, where the first element is either 0.0 or V(s_t, Theta'_t), depending on whether s_t is terminal?
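To make the question concrete, here is a minimal sketch of the backward recursion as I read it from the pseudocode. The function name `discounted_returns` and its parameters are placeholders of my own, not names from the repository; `bootstrap_value` stands in for V(s_t, Theta'_t), and the value of `GAMMA` is an assumption.

```python
GAMMA = 0.99  # assumed discount factor, not taken from the repository


def discounted_returns(rewards, bootstrap_value, terminal, gamma=GAMMA):
    """Compute n-step returns backwards: R[i] = rewards[i] + gamma * R[i + 1].

    The recursion starts from 0.0 if the last state was terminal,
    otherwise from the critic's estimate of that state (bootstrap_value).
    """
    running = 0.0 if terminal else bootstrap_value
    returns = []
    for reward in reversed(rewards):
        # Each step discounts the return accumulated so far and adds the reward.
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()  # restore chronological order
    return returns


# Three steps with reward 1.0 each and gamma = 0.5 for easy arithmetic.
print(discounted_returns([1.0, 1.0, 1.0], 10.0, terminal=True, gamma=0.5))
# [1.75, 1.5, 1.0]
print(discounted_returns([1.0, 1.0, 1.0], 10.0, terminal=False, gamma=0.5))
# [3.0, 4.0, 6.0]
```

With this recursion every entry of the returns list differs, whereas appending the bootstrapped value itself would repeat the same number for each step.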
Kind regards,
joabim