MaxMemPerNode ignored #883

Open
pneerincx opened this issue Nov 3, 2023 · 0 comments

Seen on nb-node-b01:

$> scontrol show node nb-node-b01
NodeName=nb-node-b01 Arch=x86_64 CoresPerSocket=1 
   CPUAlloc=5 CPUEfctv=32 CPUTot=32 CPULoad=2.01
   AvailableFeatures=tmp02,gpu,A40
   ActiveFeatures=tmp02,gpu,A40
   Gres=gpu:a40:8
   NodeAddr=nb-node-b01 NodeHostName=nb-node-b01 Version=22.05.2
   OS=Linux 3.10.0-1160.92.1.el7.x86_64 #1 SMP Tue Jun 20 11:48:01 UTC 2023 
   RealMemory=110122 AllocMem=108544 FreeMem=24082 Sockets=32 Boards=1
   State=MIXED ThreadsPerCore=1 TmpDisk=975 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=gpu_a40 
   BootTime=2023-09-04T14:35:52 SlurmdStartTime=2023-09-04T14:36:45
   LastBusyTime=2023-10-30T01:39:38
   CfgTRES=cpu=32,mem=110122M,billing=35,gres/gpu=8
   AllocTRES=cpu=5,mem=106G,gres/gpu=2
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

AllocMem=108544 (AllocTRES mem=106G) exceeds the partition limit, which should not be possible with MaxMemPerNode=93738 for gpu_a40 in /etc/slurm/slurm.conf:

PartitionName=gpu_a40 Default=False Nodes=nb-node-b[01-02] MaxNodes=1 MaxCPUsPerNode=30 MaxMemPerNode=93738 TRESBillingWeights="CPU=1.0,Mem=0.333G" DenyQos=ds-short,ds-medium,ds-long
....
NodeName=nb-node-b01 Sockets=32 CoresPerSocket=1 ThreadsPerCore=1 State=UNKNOWN RealMemory=110122 TmpDisk=975 Feature=tmp02,gpu,A40 Gres=gpu:a40:8
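
To put a number on the discrepancy, here is a minimal Python sketch that compares the AllocMem value from the scontrol output with the MaxMemPerNode value from the partition line. The helper names `parse_kv` and `excess_mb` are hypothetical, the input strings are shortened copies of the lines above, and both values are assumed to be in megabytes. It only restates the arithmetic; it does not explain why Slurm accepted the allocation.

```python
import re


def parse_kv(text):
    """Collect KEY=VALUE tokens from scontrol / slurm.conf style text."""
    return dict(re.findall(r"(\w+)=(\S+)", text))


def excess_mb(node_text, partition_text):
    """Return how many MB AllocMem lies above MaxMemPerNode (negative if below)."""
    node = parse_kv(node_text)
    partition = parse_kv(partition_text)
    return int(node["AllocMem"]) - int(partition["MaxMemPerNode"])


# Shortened copies of the lines quoted in this issue.
node_line = "RealMemory=110122 AllocMem=108544 FreeMem=24082 Sockets=32 Boards=1"
partition_line = "PartitionName=gpu_a40 MaxNodes=1 MaxCPUsPerNode=30 MaxMemPerNode=93738"

print(excess_mb(node_line, partition_line))
```

With the values above, the allocated memory is 14806 MB over the configured limit.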
