In your pseudocode for calculating q*, if π is deterministic (as stated in the initialization and in the pseudocode given for v*), then you don't need to loop over all a∈A in step 2, and you don't need to weight over all a' in the Q(s,a) calculation: with a deterministic policy the only next action with nonzero probability is π(s'), so the weighted sum reduces to Q(s', π(s')).
Again, in step 3 you shouldn't loop over a, because the deterministic policy already gives you a single old-action, π(s), for each state.
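To make the suggestion concrete, here is a minimal Python sketch of policy iteration on action values with a deterministic policy. It is not your pseudocode: the function name `policy_iteration_q`, the array layout `P[s, a, s']` and `R[s, a]`, and the parameters `gamma` and `theta` are all assumptions made for illustration.

```python
import numpy as np

def policy_iteration_q(P, R, gamma=0.9, theta=1e-8):
    """Policy iteration on Q for a hypothetical tabular MDP.

    P[s, a, s2] is the transition probability, R[s, a] the expected reward.
    """
    n_states, n_actions = R.shape
    Q = np.zeros((n_states, n_actions))
    pi = np.zeros(n_states, dtype=int)  # deterministic: one action per state
    states = np.arange(n_states)
    while True:
        # Step 2 (evaluation): the backup reads Q(s', pi(s')) only,
        # so there is no weighted sum over a'.
        while True:
            next_value = Q[states, pi]           # Q(s', pi(s')) for each s'
            Q_new = R + gamma * (P @ next_value)  # shape (n_states, n_actions)
            delta = np.max(np.abs(Q_new - Q))
            Q = Q_new
            if delta < theta:
                break
        # Step 3 (improvement): old-action is simply pi(s), one per state.
        new_pi = np.argmax(Q, axis=1)
        if np.array_equal(new_pi, pi):
            return Q, pi
        pi = new_pi
```

Note that this sketch still stores Q(s,a) for every pair, because its improvement step takes an argmax over actions; the simplification is in the backup target and in old-action.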
Thanks for considering this fix ;) Have a nice day !