Add the attributes needed in OpenShift i.e. nvidia.com/gpu specifically for RHODS to use GPU resource #111

Milstein · 2023-08-01T17:32:18Z

Set nvidia.com/gpu: 0 by default.
We can repurpose the current OpenStack quota attribute, i.e. OpenStack GPU Quota, as Allocation GPU quota.
PI/Manager(s) will submit a change request to enable the GPU resource in OpenShift as well.


Milstein · 2023-09-12T20:38:57Z

@knikolla: any idea how we can get this feature added to our current OCP approval plugin?

knikolla · 2023-10-31T14:48:41Z

@Milstein

Based on some reading, it seems nvidia.com/gpu is a resource type that a limit can be set on.

First, openshift-acct-mgt needs to be made aware that such a limit exists by adding it to the quotas.json file. The base and coefficient values can be ignored.

":limits.nvidia.com/gpu":         { "base": 0, "coefficient": 0 },

Second, create a new attribute OpenShift GPU Quota in attributes.py here. I don't think repurposing an existing attribute makes things any easier and would suggest creating a new one.

QUOTA_LIMITS_GPU = 'OpenShift Limit on GPUs'

Add the attribute under the static quota section for OpenShift as zero.

{
    attributes.QUOTA_LIMITS_GPU: 0,
}

Add the quota key mapping in openshift.py here. This maps the attribute to the expected entry in the call to openshift-acct-mgt.

attributes.QUOTA_LIMITS_GPU: lambda x: {":limits.nvidia.com/gpu": f"{x}"},

Test
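To make the steps above concrete, here is a minimal, self-contained sketch of how the new attribute, its static default, and the quota key mapping could fit together. The names `STATIC_QUOTA_DEFAULTS`, `QUOTA_KEY_MAPPING`, and `build_quota_payload` are hypothetical stand-ins, not the plugin's actual code, and the real layout of attributes.py and openshift.py may differ. The key used is the one proposed in this comment.

```python
# Hypothetical stand-ins for the plugin's attribute and quota tables;
# the real structure in attributes.py / openshift.py may differ.
QUOTA_LIMITS_GPU = "OpenShift Limit on GPUs"

# Static default: a new allocation starts with no GPU quota.
STATIC_QUOTA_DEFAULTS = {
    QUOTA_LIMITS_GPU: 0,
}

# Maps an allocation attribute to the entry expected by openshift-acct-mgt.
QUOTA_KEY_MAPPING = {
    QUOTA_LIMITS_GPU: lambda x: {":limits.nvidia.com/gpu": f"{x}"},
}


def build_quota_payload(allocation_attributes):
    """Translate allocation attribute values into quota entries.

    Attributes missing from the allocation fall back to the static default.
    """
    payload = {}
    for name, default in STATIC_QUOTA_DEFAULTS.items():
        value = allocation_attributes.get(name, default)
        payload.update(QUOTA_KEY_MAPPING[name](value))
    return payload


print(build_quota_payload({}))                     # {':limits.nvidia.com/gpu': '0'}
print(build_quota_payload({QUOTA_LIMITS_GPU: 2}))  # {':limits.nvidia.com/gpu': '2'}
```

With a default of zero, a project gets no GPU quota unless the allocation attribute is explicitly raised.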

joachimweyl · 2023-11-07T13:46:06Z

@jtriley it looks like @Milstein has assigned this to you. Do you feel you have the details you need to resolve this issue?

jtriley · 2023-11-09T19:40:26Z

@knikolla re: 1) I made a PR here: CCI-MOC/openshift-acct-mgt#100

NOTE: in our testing, the quota has to be set on requests.nvidia.com/gpu, not limits.nvidia.com/gpu; otherwise users are still able to get a GPU.

Similar PR here in the config repo:

OCP-on-NERC/nerc-ocp-config#315

Both have been merged

For 2-4) I have a draft PR here #123

Looking into 5): whether it makes sense to test this from this repo via CI/CD.
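For reference, the note above about requests.nvidia.com/gpu can be illustrated with a rough sketch of a ResourceQuota manifest built as a Python dict. As far as I understand the Kubernetes documentation, quota on extended resources such as nvidia.com/gpu is only supported with the `requests.` prefix, which matches what we saw in testing. The function name, quota name, and namespace below are illustrative assumptions and are not taken from the PRs.

```python
import json


def gpu_resource_quota(namespace, gpus, name="gpu-quota"):
    """Build a ResourceQuota manifest capping GPU requests in a namespace.

    The default quota name is an illustrative placeholder.
    """
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "hard": {
                # The quota is placed on the "requests." key, per the note above.
                "requests.nvidia.com/gpu": str(gpus),
            }
        },
    }


# A quota of 0 means pods in the namespace cannot request any GPU.
print(json.dumps(gpu_resource_quota("example-project", 0), indent=2))
```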

Milstein assigned knikolla Aug 1, 2023

Milstein changed the title ~~Add the attributes needed for RHODS i.e. nvidia.com/gpu~~ Add the attributes needed in OpenShift i.e. nvidia.com/gpu in RHODS Aug 1, 2023

Milstein changed the title ~~Add the attributes needed in OpenShift i.e. nvidia.com/gpu in RHODS~~ Add the attributes needed in OpenShift i.e. nvidia.com/gpu specifically for RHODS Aug 1, 2023

Milstein changed the title ~~Add the attributes needed in OpenShift i.e. nvidia.com/gpu specifically for RHODS~~ Add the attributes needed in OpenShift i.e. nvidia.com/gpu specifically for RHODS to use GPU resource Aug 1, 2023

joachimweyl unassigned knikolla Aug 1, 2023

Milstein assigned knikolla Aug 28, 2023

joachimweyl unassigned knikolla Aug 29, 2023

Milstein assigned jtriley Sep 12, 2023

joachimweyl added AAA Test and removed AAA Test labels Oct 16, 2023

jtriley mentioned this issue Nov 9, 2023

add OpenShift GPU quota support #123

Merged

jtriley closed this as completed in #123 Nov 27, 2023
