About PPO train on the Intersection and DQN train on the highway-v0 #585
Comments
Hi, I would set the observation config to relative coordinates (absolute: False) and try again. For intersection-v0, however, absolute coordinates are more appropriate, since the same locations in the scene are always visited (though relative coordinates may work well too). PPO should definitely be able to learn a mediocre policy, e.g. one that tries to cross the intersection and sometimes collides. The MLP is a poor model for this task because it cannot easily understand and generalise interactions between vehicles, and I got much better results with Transformer models (see paper), but an MLP should at least get off the ground and improve a bit over a random policy.
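As a rough illustration of the suggestion above, the two observation settings could be written as plain config dictionaries. This is only a sketch: the helper name `make_observation_config` is hypothetical, the `Kinematics` observation type is an assumption (the thread does not say which observation type is used), and reading the relative setting as the one meant for highway-v0 is also an assumption.

```python
# Hypothetical helper: builds the "observation" part of a highway-env config.
# Assumes the Kinematics observation type, which accepts an "absolute" flag.
def make_observation_config(absolute: bool) -> dict:
    return {
        "observation": {
            "type": "Kinematics",
            # False: other vehicles are described relative to the ego vehicle.
            # True: positions are given in the fixed scene frame.
            "absolute": absolute,
        }
    }


# Relative coordinates, as suggested in the reply above.
highway_config = make_observation_config(absolute=False)

# Absolute coordinates, described above as more appropriate for intersection-v0.
intersection_config = make_observation_config(absolute=True)

print(highway_config)
print(intersection_config)
```

How the dictionary is passed to the environment depends on the installed highway-env version (for instance through a `config` argument at creation time or a `configure` call on the environment), so check the documentation for the version in use.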
Thank you very much for your help. I will follow your suggestions and modify the code, and thank you for replying to the open-source community. By the way, I don't understand the ego_spacing, "destination": "o1", and "scaling": 5.5 * 1.3 parameters. Could you please explain them? I would appreciate it very much.
o1 is the west outer location.
Hello dear authors, thanks for your contributions to highway-env. I recently ran into some problems when training agents with stable-baselines3:
1. I trained DQN for 20,000 steps in highway-v0, but it only learned to steer to the far right and cannot dodge vehicles or overtake. The code is the one from the official documentation, as follows:
Is there anything wrong with this code?
2. Even after training PPO for 400,000 steps on the intersection environment, the learned policy was very poor. I don't know what went wrong; can you help me? The code is as follows: