missing the initialization of target action value and refreshing the Qhat #13

fi000 · 2018-05-03T08:52:09Z

I have several questions:
1- When I compare this code with the algorithm presented in "Human-level control through deep reinforcement learning", I cannot find the third initialization step (initializing the target action-value function). I also cannot find the last step, "every C steps reset Qhat = Q". Would you please explain where they are, or what is done differently to achieve the same thing? These steps seem essential!
2- I have my own environment. If I want to feed the DQN a state with several components, state=[a,b,c], instead of a single input value, what should I do?

WorksWellWithOthers · 2020-12-05T00:20:25Z

There is a function that updates the target model. Does this answer your question?
For the second question, how about state = [[a, b, c]]?
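
For anyone reading later, here is a minimal sketch of how the two points above might fit together. It is not the repository's code: `TinyQNetwork`, `SYNC_EVERY`, and the example values are hypothetical stand-ins, and the only assumption about the real model is that it exposes Keras-style `get_weights()` / `set_weights()` and a `predict()` that takes a batch. The target network is set equal to the Q-network once at the start (the initialization step from the paper) and again every C steps; the state [a, b, c] is wrapped in an extra dimension so it becomes a batch of one.

```python
import numpy as np


class TinyQNetwork:
    """Hypothetical stand-in for a Q-network: one linear layer, state -> Q-values."""

    def __init__(self, state_size, action_size, rng):
        self.weights = rng.normal(size=(state_size, action_size))

    def predict(self, states):
        # states: (batch, state_size) -> Q-values: (batch, action_size)
        return states @ self.weights

    def get_weights(self):
        return self.weights.copy()

    def set_weights(self, weights):
        self.weights = weights.copy()


rng = np.random.default_rng(0)
state_size, action_size = 3, 2

q_net = TinyQNetwork(state_size, action_size, rng)
target_net = TinyQNetwork(state_size, action_size, rng)

# Initialization: the target network (Qhat) starts with the same weights as Q.
target_net.set_weights(q_net.get_weights())

SYNC_EVERY = 4  # "C" in the paper; the value here is arbitrary
sync_steps = []
for step in range(1, 11):
    # Placeholder for a training update that changes q_net only.
    q_net.weights += 0.01
    # Every C steps, refresh Qhat = Q.
    if step % SYNC_EVERY == 0:
        target_net.set_weights(q_net.get_weights())
        sync_steps.append(step)

# A state with three components, wrapped as a batch of one sample.
a, b, c = 0.5, -1.0, 2.0
state = np.array([[a, b, c]])         # shape (1, 3)
q_values = target_net.predict(state)  # shape (1, 2)
```

With a real network, the input layer would also need to be sized for three features (`state_size = 3` here), so that a `(1, 3)` array is a valid batch.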

fi000 changed the title ~~multi-inputs instead of one input and missing the initialization of target action value and refreshing the Qhat~~ missing the initialization of target action value and refreshing the Qhat May 29, 2018
