Why does /var/log/nv-hostengine.log contain so many ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() entries? #174

Open
13416157913 opened this issue May 24, 2024 · 0 comments

Comments

@13416157913

Hello everyone, why does my /var/log/nv-hostengine.log file contain so many ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() entries? The same error is logged about every 30 seconds, like this:
2024-05-24 10:11:29.243 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:11:59.243 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:12:29.243 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:12:59.243 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:13:29.243 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:13:59.244 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:14:29.244 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:14:59.244 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:15:29.244 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:15:59.244 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:16:29.244 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:16:59.244 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:17:29.244 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:17:59.244 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:18:29.245 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:18:59.245 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:19:29.245 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:19:59.245 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:20:29.245 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:20:59.245 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:21:29.245 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:21:59.245 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:22:29.245 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:22:59.245 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:23:29.246 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:23:59.246 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:24:29.246 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:24:59.246 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:25:29.246 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:25:59.246 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:26:29.246 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:27:59.246 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:28:29.247 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:28:59.247 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:29:29.247 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:29:59.247 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:30:29.247 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:30:59.247 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:31:29.247 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:32:08.069 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:32:38.069 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:33:08.069 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:33:38.069 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
2024-05-24 10:34:08.069 ERROR [5231:5273] [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/workspaces/dcgm-rel_dcgm_2_2-postmerge/modules/nvswitch/DcgmModuleNvSwitch.cpp:388] [DcgmNs::DcgmModuleNvSwitch::RunOnce]
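
To check how regularly the error repeats, the log can be summarized with a short script. This is only a sketch that assumes every line follows the layout shown in the excerpt above (timestamp, level, `[pid:tid]`, message). The names `LINE_RE` and `summarize_errors` are made up for illustration and are not part of DCGM.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed layout, taken from the excerpt above:
# "2024-05-24 10:11:29.243 ERROR [5231:5273] <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<level>[A-Z]+) \[(?P<pid>\d+):(?P<tid>\d+)\] "
    r"(?P<msg>.*)$"
)


def summarize_errors(lines):
    """Return (error_count, Counter of gaps in whole seconds between ERROR lines)."""
    stamps = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("level") == "ERROR":
            stamps.append(
                datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
            )
    # Gap between each ERROR line and the next one, rounded to whole seconds.
    gaps = Counter(
        round((later - earlier).total_seconds())
        for earlier, later in zip(stamps, stamps[1:])
    )
    return len(stamps), gaps


if __name__ == "__main__":
    # Three lines copied from the excerpt above (file/function suffix dropped).
    sample = [
        "2024-05-24 10:11:29.243 ERROR [5231:5273] [[NvSwitch]] "
        "ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state",
        "2024-05-24 10:11:59.243 ERROR [5231:5273] [[NvSwitch]] "
        "ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state",
        "2024-05-24 10:12:29.243 ERROR [5231:5273] [[NvSwitch]] "
        "ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state",
    ]
    count, gaps = summarize_errors(sample)
    print("ERROR lines:", count)
    print("gaps in seconds:", dict(gaps))
```

To run it against the real file, pass the open file object instead of `sample`, for example `summarize_errors(open("/var/log/nv-hostengine.log"))`.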

My nvidia-smi output:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A800-SXM4-80GB          On  |   00000000:5B:00.0 Off |                    0 |
| N/A   31C    P0             64W /  400W |       0MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA A800-SXM4-80GB          On  |   00000000:5E:00.0 Off |                    0 |
| N/A   30C    P0             59W /  400W |       0MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+

dcgmi -v:
Version : 2.2.9
Build ID : 14
Build Date : 2021-07-23
Build Type : Release
Commit ID : 3d9c443e28d491a942d3f0bbad0cf0579a20fdfd
Branch Name : rel_dcgm_2_2
CPU Arch : x86_64
Build Platform : Linux 4.4.0-116-generic #140 SMP Mon Feb 12 21:23:04 UTC 2018 x86_64
CRC : a015d3b885ad821a2424294a80e2366e
