Got bad performance when reproducing HACO #6
Comments
Thank you for running our code!
This is expected.
This is also expected and observed during our experiments.
Sure! First, I am really happy that you reproduced our experiments (even the strange behavior). At the beginning of training, we sometimes take full control for 1 or 2 episodes in order to fill the human buffer with more useful data. Then we enter human-AI shared control and intervene when something goes wrong. In our experiments we were very conservative and kept the speed near 15~20 km/h (so the throttle we gave during intervention was also low). We never brake during intervention, because we found that otherwise the policy soon converges to emergency stopping and never moves again. And as you already saw, after training for a long period the policy will collapse. We have some hypotheses behind this:
Hello, how can I solve the problem of sudden emergency braking after several iterations?
That's a good question! We observe that too! The answer is:
Hi, I just attempted to reproduce HACO with the keyboard by running "train_haco_keyboard_easy.py", but encountered unsatisfactory training performance.
At the early stage, I could see the model improving with the help of human interventions. After around 20~40 iterations, the car had learned some driving skills and occasionally managed to reach the destination, albeit with uneven performance. However, after a few more iterations, strange things occurred. The car failed to start normally and would brake suddenly while driving. It seems the model forgot the skills it had previously learned, and its performance worsened.
Could you please explain the reasons behind this issue? Is it related to improper timing for human intervention, an excessive focus on exploration, or some other factor?
The attached screenshot shows the evaluation results obtained by running "eval_haco.py" with EPISODE_NUM_PER_CKPT = 2.
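With only 2 evaluation episodes per checkpoint, each checkpoint's success rate can only be 0, 0.5, or 1, so the raw curve is noisy. A minimal sketch of how one might smooth it with a trailing moving average to tell a real collapse from noise is given here; the function name `smoothed_success`, the `window` parameter, and the sample numbers are hypothetical and are not part of the HACO code or its results.

```python
from statistics import mean


def smoothed_success(success_by_ckpt, window=5):
    """Trailing moving average of per-checkpoint success rates.

    Each output value averages the current checkpoint and up to
    `window - 1` checkpoints before it.
    """
    smoothed = []
    for i in range(len(success_by_ckpt)):
        start = max(0, i - window + 1)
        smoothed.append(mean(success_by_ckpt[start:i + 1]))
    return smoothed


# Hypothetical success rates (not real results): with 2 episodes per
# checkpoint, only the values 0, 0.5, and 1 can occur.
rates = [0.0, 0.5, 1.0, 0.5, 1.0, 0.0, 0.0, 0.0]
print(smoothed_success(rates, window=3))
```

A sustained drop in the smoothed curve is a stronger sign that the policy has collapsed than a single checkpoint scoring 0; raising EPISODE_NUM_PER_CKPT would reduce the noise at the source.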