About the training curve #6
Comments
Why is the limit effort in a1.urdf modified?
Please post the learning rate so we can take a look. Visual inspection suggests there are too many iterations; 1500 iterations is enough.
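For reference, the number of training iterations and the learning rate usually live in the PPO training config. Below is a minimal sketch assuming the legged_gym-style config classes this repo builds on; the class name and values are illustrative, not necessarily the repo's defaults.

```python
# Sketch of a legged_gym-style PPO training config; names and values are illustrative.
from legged_gym.envs.base.legged_robot_config import LeggedRobotCfgPPO

class A1TrainCfgPPO(LeggedRobotCfgPPO):
    class algorithm(LeggedRobotCfgPPO.algorithm):
        learning_rate = 1.0e-3   # legged_gym also supports an adaptive (KL-based) schedule

    class runner(LeggedRobotCfgPPO.runner):
        max_iterations = 1500    # per the reply above, ~1500 iterations is enough
```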
Sorry for the late reply. We did not train for such a long time. However, the significant decrease in the mean reward is probably due to the curriculum of commands and terrains. I suggest training for at most 2000 iterations. We also updated the config for A1 and Go1: high speed conflicts somewhat with the ability to cross difficult terrains for small dogs, so we limited the highest command for small dogs to 2 m/s.
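For illustration, capping the top commanded speed for a small dog could look like the sketch below in a legged_gym-style environment config; the field names follow legged_gym conventions, while the class name and exact values are assumptions.

```python
# Sketch: limiting the command curriculum's top forward speed to 2 m/s.
from legged_gym.envs.base.legged_robot_config import LeggedRobotCfg

class SmallDogCfg(LeggedRobotCfg):
    class commands(LeggedRobotCfg.commands):
        curriculum = True
        max_curriculum = 2.0                 # curriculum never pushes |lin_vel_x| beyond 2 m/s
        class ranges(LeggedRobotCfg.commands.ranges):
            lin_vel_x = [-1.0, 1.0]          # initial range; widened gradually by the curriculum
```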
We use the a1.urdf from the original legged_gym repo. There is also a file named a1.urdf.origin from unitree_ros, which seems hard to train with. However, deployment is built on a joint-position control loop, and the Unitree SDK seems to handle this issue well, so don't worry about it. Position limits, velocity limits, and power penalization are enough.
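As a concrete illustration of "position limits, velocity limits, and power penalization", a legged_gym-style reward config typically exposes them as below; this is a sketch, and the scale values are assumptions rather than the repo's tuned numbers.

```python
# Sketch: soft joint limits and torque (power) penalties in a legged_gym-style config.
from legged_gym.envs.base.legged_robot_config import LeggedRobotCfg

class SmallDogCfg(LeggedRobotCfg):
    class rewards(LeggedRobotCfg.rewards):
        soft_dof_pos_limit = 0.9             # penalize beyond 90% of the URDF position range
        soft_dof_vel_limit = 0.9
        soft_torque_limit = 0.9
        class scales(LeggedRobotCfg.rewards.scales):
            dof_pos_limits = -10.0           # joint position limit penalty
            dof_vel_limits = -1.0            # joint velocity limit penalty
            torques = -0.0002                # penalizes torque, a proxy for power
```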
In other words, the current environment is not suitable for small dogs, but in order not to change the environment, the model is changed instead. Am I right?
@Junfeng-Long I ran the training. If I want to compare different methods in simulation, can I directly compare them using these curves?
Hello, I have also encountered the same problem. Do you know where the problem lies? Thank you for your help.
Sorry for the late reply. There were some bugs and improper configs in the code. They are already fixed; please try the new version.
Thank you for your impressive work on this project. I used it to train a policy and wanted to run it in Gazebo. The policy performed well in Isaac Gym, but in Gazebo the robot shook violently and could not stand properly. Have you done similar work before? Can you give me some suggestions? Thank you for your help.
We have done this test with Aliengo in Gazebo; it works well, though still worse than in Isaac Gym. I would be happy to help if you can offer more information, for example a video, the inference output, or the code. You can send it to me directly or post it here.
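When a policy works in Isaac Gym but shakes in Gazebo, mismatches between training and deployment in PD gains, action scale, or control frequency are frequent suspects; that is my assumption, not something confirmed in this thread. Below is a hypothetical sketch of the joint-position control loop described above, with every hardware-facing name written as a stub.

```python
# Hypothetical joint-position deployment loop; all hardware-facing functions are stubs.
import time
import torch

CONTROL_DT   = 0.02        # policy rate (50 Hz); should match the training decimation
ACTION_SCALE = 0.25        # must match the action scale used during training
KP, KD       = 28.0, 0.7   # PD gains; must match the training config

def read_observation() -> torch.Tensor:
    """Stub: assemble the observation exactly as during training (order and scaling)."""
    return torch.zeros(1, 45)

def send_joint_targets(targets: torch.Tensor, kp: float, kd: float) -> None:
    """Stub: forward PD position targets to Gazebo or the robot's low-level controller."""
    pass

policy = torch.jit.load("policy.pt")      # placeholder path to an exported TorchScript policy
default_dof_pos = torch.zeros(1, 12)      # stub: nominal joint angles used in training

while True:
    obs = read_observation()
    with torch.no_grad():
        actions = policy(obs)
    send_joint_targets(default_dof_pos + ACTION_SCALE * actions, KP, KD)
    time.sleep(CONTROL_DT)
```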
We use PyTorch since there is CUDA on the dog's on-board computer.
It seems that you accidentally opened an issue under my homepage repo. Sorry for not noticing that. But happy to see that you figured out the problem :)
I think this is due to an improper target height configuration. Try a lower target height, for example 0.25 m for A1.
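In a legged_gym-style config, the target base height is a single reward parameter; here is a sketch of the suggested change, with the class name being illustrative only.

```python
# Sketch: lowering the target base height for A1 in a legged_gym-style config.
from legged_gym.envs.base.legged_robot_config import LeggedRobotCfg

class A1Cfg(LeggedRobotCfg):
    class rewards(LeggedRobotCfg.rewards):
        base_height_target = 0.25   # meters; 0.25 m suggested above for A1
```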
Thank you for your reply. I noticed that your code sets a leg-lifting height reward, but its weight is very low, and the leg-lifting height after training is not ideal; it is quite different from the effect in the video you posted. Do you have any other methods to improve this?
I am using the code to train A1 on an RTX 4090. However, I've noticed a significant decrease in the mean reward around the 2800th iteration. Is everything right? Should I continue training?