Setting device for SapienRenderer doesn't work #170

Open
XYZ-99 opened this issue Aug 5, 2024 · 2 comments

XYZ-99 commented Aug 5, 2024

System:

  • OS version: Ubuntu 22.04
  • Python version (if applicable): Python 3.10
  • SAPIEN version (pip freeze | grep sapien): 2.2.2
  • Environment: Server with xvfb

Describe the bug
#139 mentions setting device for SapienRenderer like this:

sapien_renderer = SapienRenderer(..., device="pci:0")

However, the device apparently cannot be found:

[2024-08-05 16:43:29.704] [svulkan2] [error] GLFW error: X11: The DISPLAY environment variable is missing
[2024-08-05 16:43:29.704] [svulkan2] [warning] Continue without GLFW.
[2024-08-05 16:43:29.944] [svulkan2] [info] Vulkan instance initialized
[2024-08-05 16:43:29.944] [svulkan2] [info] Devices visible to Vulkan
 Id                                    name   Present Supported    PciBus    CudaId     RayTracing
  0                   NVIDIA H100 80GB HBM3         0         1         0         0              1
  1                   NVIDIA H100 80GB HBM3         0         1         0         0              1

[2024-08-05 16:43:29.944] [svulkan2] [info] Devices visible to Cuda
    CudaId    PciBus             PciBusString
         0         0             000A:00:00.0
         1         0             000B:00:00.0

[2024-08-05 16:43:29.944] [svulkan2] [info] Vulkan finished
0it [00:22, ?it/s]
Traceback:
...
File "[...]/ManiSkill2_real2sim/mani_skill2_real2sim/envs/sapien_env.py", line 107, in __init__
    self._renderer = sapien.SapienRenderer(**renderer_kwargs)
RuntimeError: Cannot find cuda device suitable for rendering cuda:1

P.S. The message "[error] GLFW error: X11: The DISPLAY environment variable is missing" is not a real failure: the code still runs if I don't specify a device for SapienRenderer.

I tried both "cuda:1" and "pci:1" but neither worked.

However, my issue shouldn't be the same as #115 because I can run the code without specifying the device.

Could you tell me what the device format should be?
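While the expected format is unclear, one way to narrow it down is to probe candidate device strings one at a time and record the error each one produces. The following is only a minimal sketch: `first_working_device` and `make_renderer` are hypothetical names, and the candidate strings are just the ones mentioned in this issue, not a list of formats SAPIEN is known to accept.

```python
from typing import Any, Callable, Dict, Iterable, Optional, Tuple


def first_working_device(
    make_renderer: Callable[[str], Any],
    candidates: Iterable[str],
) -> Tuple[Optional[str], Any, Dict[str, str]]:
    """Try each device string in order and return the first that works.

    `make_renderer` is a hypothetical callable that builds a renderer for a
    given device string and raises RuntimeError on failure, as the traceback
    above does. Returns (device, renderer, errors), where `errors` maps each
    rejected device string to its error message.
    """
    errors: Dict[str, str] = {}
    for device in candidates:
        try:
            renderer = make_renderer(device)
        except RuntimeError as exc:
            # Keep the message so the rejected formats can be compared later.
            errors[device] = str(exc)
            continue
        return device, renderer, errors
    return None, None, errors


# Assumed usage with SAPIEN 2 (not executed here):
#   import sapien.core as sapien
#   device, renderer, errors = first_working_device(
#       lambda d: sapien.SapienRenderer(device=d),
#       ["pci:0", "pci:1", "cuda:0", "cuda:1"],
#   )
```

Collecting the per-string error messages makes it easier to report exactly which formats were rejected and why.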

Collaborator

fbxiang commented Aug 30, 2024

I have recently been getting many different kinds of issues related to the H100, probably because this GPU does not even include rendering cores for graphics workloads. Your issue looks like a new one.
First, try SAPIEN 3.0.0b1; SAPIEN 2 is a bit too old. Next, try setting the environment variable SAPIEN_DISABLE_RAY_TRACING=1. Somehow, simply enabling ray tracing can break the H100 completely, even when the driver reports that it supports ray tracing. For now, my own workaround is to avoid the H100 altogether, as it is not a good choice for rendering anyway.
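For reference, the environment variable can also be set from Python instead of the shell. This is a minimal sketch: the helper name `disable_sapien_ray_tracing` is hypothetical, and setting the variable before importing sapien is a cautious assumption, since this thread does not say when SAPIEN reads it.

```python
import os

# Variable name and value taken from the suggestion above.
RAY_TRACING_FLAG = "SAPIEN_DISABLE_RAY_TRACING"


def disable_sapien_ray_tracing(env=None):
    """Set the flag in `env` (defaults to os.environ) and return the mapping."""
    target = os.environ if env is None else env
    target[RAY_TRACING_FLAG] = "1"
    return target


disable_sapien_ray_tracing()

# Assumption: import sapien only after this point, so the flag is already
# in the process environment whenever the library looks for it.
# import sapien
```

The shell equivalent is to prefix the launch command with `SAPIEN_DISABLE_RAY_TRACING=1`.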

Collaborator

fbxiang commented Aug 30, 2024

Regarding the X11 error, your observation is correct. SAPIEN logs an "error" when it can successfully work around the problem (in this case, it simply disables on-screen display); otherwise, it throws an exception.
