For those who struggle with tf.nn.softmax_cross_entropy_with_logits_v2 in Cartpole REINFORCE Monte Carlo Policy Gradients #85

gekator · 2022-12-08T16:46:35Z

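Guys, if you struggle with this line
neg_log_prob = tf.nn.softmax_cross_entropy_with_logits_v2(logits = fc3, labels = actions)
in Cartpole REINFORCE Monte Carlo Policy Gradients:
I spent some time working out what is happening there. It applies a softmax to the logits and then takes the cross-entropy against the one-hot actions.
You can write that out explicitly by changing the code as below: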

y_hat_softmax = tf.nn.softmax(fc3)

y_cross = actions * tf.log(y_hat_softmax)

neg_log_prob = - tf.reduce_sum(y_cross, 1)

loss = tf.reduce_mean(neg_log_prob * discounted_episode_rewards_)

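Also declare the actions placeholder as tf.float32 (one-hot encoded), so that it can be multiplied element-wise with the log-probabilities:
actions = tf.placeholder(tf.float32, [None, action_size], name="actions")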
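
If you want to convince yourself that the explicit version gives the same number as the fused op, here is a minimal pure-Python sketch (no TensorFlow needed). The helper names softmax, neg_log_prob_manual, neg_log_prob_fused and reinforce_loss are made up for illustration and are not part of the notebook or of TensorFlow. The "fused" function only assumes the usual log-sum-exp formulation of softmax cross-entropy; it is not TensorFlow's actual implementation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one row of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def neg_log_prob_manual(logits, one_hot_action):
    """-sum(actions * log(softmax(logits))), mirroring the explicit code above."""
    probs = softmax(logits)
    return -sum(a * math.log(p) for a, p in zip(one_hot_action, probs))

def neg_log_prob_fused(logits, one_hot_action):
    """Same quantity via log-sum-exp (assumed form of a fused cross-entropy)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return sum(a * (log_z - x) for a, x in zip(one_hot_action, logits))

def reinforce_loss(batch_logits, batch_actions, discounted_rewards):
    """Mean over the batch of neg_log_prob * discounted reward."""
    terms = [
        neg_log_prob_manual(logits, action) * reward
        for logits, action, reward in zip(batch_logits, batch_actions, discounted_rewards)
    ]
    return sum(terms) / len(terms)

if __name__ == "__main__":
    logits = [2.0, -1.0]   # hypothetical network output for one state
    action = [1.0, 0.0]    # one-hot float label: action 0 was taken
    print(neg_log_prob_manual(logits, action))
    print(neg_log_prob_fused(logits, action))
```

Both functions should print the same value. One practical difference: tf.log(tf.nn.softmax(...)) can return -inf when a probability underflows to 0, while the fused op works on the logits directly and avoids that, so the explicit version is mainly useful for understanding.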
