[HANDS-ON BUG] mlagents-learn in unit 5 not working #571

benbekir · 2024-10-24T11:56:51Z

Describe the bug

The command

!mlagents-learn ./config/ppo/SnowballTarget.yaml --env=./training-envs-executables/linux/SnowballTarget/SnowballTarget --run-id="SnowballTarget1" --no-graphics

doesn't work.
It always results in the following error:

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat1 in method wrapper_CUDA_addmm).

I have tried specifying the device with the --torch-device argument, but that didn't help either.
Maybe this has something to do with the fact that dev versions are used for the ml-agents and ml-agents-envs packages?

  ml-agents: 1.2.0.dev0,
  ml-agents-envs: 1.2.0.dev0,
  Communicator API: 1.5.0,
  PyTorch: 2.5.0+cu121

Material

Did you use Google Colab?
Yes


TPK-MAKG · 2024-10-25T17:08:27Z

Experiencing the same issue. Additionally, the runtime requires a restart after the installation of numpy packages, otherwise it can't find the hyperparameter file in /config/ppo/SnowballTarget.yaml. Maybe it's a package version conflict?

staffanrolfsson · 2024-11-01T09:45:35Z

I get the same errors.
The first problem after the restart (for numpy) is that the change of working directory "% cd ml-agents" is lost, so the following cells will not work as intended. This is easily fixed by adding a cell that repeats the change; the installations that follow will then be correct. But the main issue is still there when you try to start the learning:
"RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)"
The Pyramids learning further down in the same Notebook works fine...
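
For the working-directory part, a cell along these lines could be added after the restart. This is only a sketch: it assumes the repository was cloned into a folder named ml-agents directly below the notebook's current directory, and `enter_repo` is a hypothetical helper name, not something from the course notebook.

```python
import os


def enter_repo(repo_dir: str = "ml-agents") -> str:
    """Change into repo_dir if it exists below the current directory.

    Returns the working directory after the call. Nothing is changed when
    we are already inside the repository or when the folder is missing.
    """
    already_inside = os.path.basename(os.getcwd()) == repo_dir
    if not already_inside and os.path.isdir(repo_dir):
        os.chdir(repo_dir)
    return os.getcwd()


# Run this once after the runtime restart, before the install and training cells.
print(enter_repo())
```

Running the cell twice is harmless, since the second call sees that the current directory is already the repository.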

staffanrolfsson · 2024-11-01T10:43:12Z

I tried changing the config file for SnowballTarget: setting threaded: false made it possible to start the learning...
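
If you would rather not edit the file by hand, something like the following could be run in a cell before mlagents-learn. It is a rough sketch, not an official fix: it assumes the config is at ./config/ppo/SnowballTarget.yaml relative to the current directory (the path from the original command) and that the file already contains a `threaded: true` line. `disable_threading` is a made-up helper name. If the key is absent, the file is left unchanged.

```python
import re
from pathlib import Path


def disable_threading(config_text: str) -> str:
    """Return config_text with every `threaded: true` line set to `threaded: false`."""
    return re.sub(
        r"^([ \t]*threaded:[ \t]*)true\b",
        r"\1false",
        config_text,
        flags=re.MULTILINE | re.IGNORECASE,
    )


# Path taken from the command in the issue; adjust if your layout differs.
config_path = Path("./config/ppo/SnowballTarget.yaml")
if config_path.exists():
    config_path.write_text(disable_threading(config_path.read_text()))
```

A plain text substitution is used here instead of a YAML parser so that the rest of the file, including comments and key order, stays exactly as it was.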

brumocas · 2024-11-04T13:41:15Z

I tried changing the config file for SnowballTarget: setting threaded: false made it possible to start the learning...

This worked for me :)
