D3 (RTD3) Power Management not working #755

Closed
1 of 2 tasks
nikonikolov opened this issue Dec 24, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@nikonikolov

NVIDIA Open GPU Kernel Modules Version

565.77

Please confirm this issue does not happen with the proprietary driver (of the same version). This issue tracker is only for bugs specific to the open kernel driver.

  • I confirm that this does not happen with the proprietary driver package.

Operating System and Version

Slackware Linux current

Kernel Release

Linux dell-xps15.niko-pc 6.12.6 #1 SMP PREEMPT_DYNAMIC Thu Dec 19 14:44:30 CST 2024 x86_64 13th Gen Intel(R) Core(TM) i9-13900H GenuineIntel GNU/Linux

Please confirm you are running a stable release kernel (e.g. not a -rc). We do not accept bug reports for unreleased kernels.

  • I am running on a stable kernel release.

Hardware: GPU

GPU 0: NVIDIA GeForce RTX 4060 Laptop GPU (UUID: GPU-89583c91-b9ba-d10e-d31d-93019a0fb626)

Describe the bug

I am running my GUI entirely on the built-in Intel GPU and use the NVIDIA GPU only on demand for CUDA computations. I want to power off the NVIDIA GPU while it is not in use.

According to https://download.nvidia.com/XFree86/Linux-x86_64/565.77/README/dynamicpowermanagement.html, the NVIDIA GPU should power off automatically while it is not in use.
Indeed, running cat /sys/bus/pci/devices/0000:01:00.0/power/runtime_status outputs suspended. However, the output of sudo nvidia-smi -q -d POWER is:

==============NVSMI LOG==============

Timestamp                                 : Tue Dec 24 17:08:56 2024
Driver Version                            : 565.77
CUDA Version                              : 12.7

Attached GPUs                             : 1
GPU 00000000:01:00.0
    GPU Power Readings
        Power Draw                        : 11.72 W
        Current Power Limit               : 35.00 W
        Requested Power Limit             : 35.00 W
        Default Power Limit               : 35.00 W
        Min Power Limit                   : 5.00 W
        Max Power Limit                   : 50.00 W
    Power Samples
        Duration                          : Not Found
        Number of Samples                 : Not Found
        Max                               : Not Found
        Min                               : Not Found
        Avg                               : Not Found
    GPU Memory Power Readings 
        Power Draw                        : N/A
    Module Power Readings
        Power Draw                        : N/A
        Current Power Limit               : N/A
        Requested Power Limit             : N/A
        Default Power Limit               : N/A
        Min Power Limit                   : N/A
        Max Power Limit                   : N/A

This shows the GPU drawing 11.72 W instead of 0 W, which depletes battery life. I expected a GPU in the suspended state to draw 0 W. If that assumption is wrong, how can I dynamically power off the GPU so that it draws 0 W, and then power it back on without rebooting my system?

To Reproduce

  • No applications running on the nvidia GPU
  • Run cat /sys/bus/pci/devices/0000:01:00.0/power/runtime_status and verify it outputs suspended
  • Check the power usage via sudo nvidia-smi -q -d POWER

Bug Incidence

Always

nvidia-bug-report.log.gz

More Info

No response

nikonikolov added the bug (Something isn't working) label on Dec 24, 2024
@aaronp24
Member

If you monitor the runtime status with something like watch -n 0.2 cat /sys/bus/pci/devices/0000:01:00.0/power/runtime_status, do you see it switch to active when you run nvidia-smi? My understanding is that nvidia-smi wakes the GPU from sleep in order to query data from it, so it's expected to see a non-zero power draw when you run it.
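The monitoring suggested above can also be scripted, which leaves a record of the state changes instead of a refreshing screen. The following is a minimal sketch, not part of the driver or its tools: the function names `read_status` and `watch_transitions` are hypothetical, and the PCI address is the one from this report and will differ on other machines.

```python
import time
from pathlib import Path

# Path taken from this issue; the PCI address (0000:01:00.0) is machine-specific.
STATUS_PATH = Path("/sys/bus/pci/devices/0000:01:00.0/power/runtime_status")


def read_status(path):
    """Return the runtime PM state string, e.g. 'suspended' or 'active'."""
    return Path(path).read_text().strip()


def watch_transitions(path, interval=0.2, samples=50, sleep=time.sleep):
    """Poll `path` `samples` times and return each state change in order.

    Consecutive identical readings are collapsed, so a result such as
    ['suspended', 'active', 'suspended'] means the device woke up once
    and went back to sleep during the polling window.
    """
    seen = []
    for _ in range(samples):
        state = read_status(path)
        if not seen or seen[-1] != state:
            seen.append(state)
        sleep(interval)
    return seen
```

Calling `watch_transitions(STATUS_PATH)` in one terminal while running nvidia-smi in another should, if the explanation above holds, return a list that includes `active`. The `sleep` parameter exists only so the polling delay can be replaced when testing.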

@nikonikolov
Author

Yes, I see it changing from suspended to active. What would be the recommended way to confirm the GPU is indeed drawing 0 Watts?

@aaronp24
Member

You'll have to rely on runtime_status and the /proc/driver/nvidia/gpus/*/power files. The power regulator that nvidia-smi queries can't report power usage when it's turned off. :)
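Both sources named in this reply are plain files, so they can be read without invoking nvidia-smi. The sketch below gathers them into one report. It is an illustration under assumptions: `gpu_power_report` is a hypothetical helper, the sysfs path uses the PCI address from this issue, and the contents of the /proc power files are returned as raw text because their format is not described in this thread.

```python
from glob import glob
from pathlib import Path

# Paths named in this thread; the PCI address is specific to the reporter's laptop.
RUNTIME_STATUS = "/sys/bus/pci/devices/0000:01:00.0/power/runtime_status"
PROC_POWER_GLOB = "/proc/driver/nvidia/gpus/*/power"


def gpu_power_report(status_path=RUNTIME_STATUS, proc_glob=PROC_POWER_GLOB):
    """Collect GPU power state from sysfs and procfs.

    Returns a dict with the runtime PM state (or None if the sysfs file
    is missing) and the raw text of every matching power file, keyed by
    its path.
    """
    report = {"runtime_status": None, "power_files": {}}
    status = Path(status_path)
    if status.exists():
        report["runtime_status"] = status.read_text().strip()
    for name in sorted(glob(proc_glob)):
        report["power_files"][name] = Path(name).read_text().strip()
    return report
```

A `runtime_status` of `suspended` in the returned dict, together with whatever the power files report, is the evidence this reply points to. A wattage reading is not available while the GPU is off.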
