enable GPU on demand on rag #272

Open
gsantopaolo opened this issue Jun 24, 2024 · 2 comments

@gsantopaolo
Contributor

Create a script that allows some pods to run on GPU on the rag cluster,
and another script to stop the GPU instance.
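
A minimal sketch of what the two scripts could share, assuming the GPU capacity lives in a separate user node pool that is switched on and off by scaling it with `az aks nodepool scale`. The resource group, cluster, and pool names are hypothetical placeholders, not values from this project, and scaling to zero is assumed to be allowed for that pool.

```python
import subprocess

# Hypothetical placeholders; replace with the real values for the rag cluster.
RESOURCE_GROUP = "rag-rg"
CLUSTER_NAME = "rag-aks"
GPU_POOL_NAME = "gpunp"


def scale_command(node_count: int) -> list[str]:
    """Build the Azure CLI call that resizes the GPU node pool."""
    if node_count < 0:
        raise ValueError("node_count must be >= 0")
    return [
        "az", "aks", "nodepool", "scale",
        "--resource-group", RESOURCE_GROUP,
        "--cluster-name", CLUSTER_NAME,
        "--name", GPU_POOL_NAME,
        "--node-count", str(node_count),
    ]


def set_gpu_nodes(node_count: int) -> int:
    """Run the scale command and return the CLI exit code."""
    return subprocess.run(scale_command(node_count)).returncode
```

Under these assumptions the "start" script would call `set_gpu_nodes(1)` and the "stop" script `set_gpu_nodes(0)`, so no GPU VM is billed while the pool is empty.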

Make sure that K8s handles the GPU passthrough to the pods.

Services to run on GPU when enabled:

  • semantic
  • embedder (going to be renamed AI services)
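
One way the services above could be steered onto GPU nodes, sketched under two assumptions: the GPU node pool carries the `sku=gpu:NoSchedule` taint used in the AKS GPU guide linked in this issue, and the NVIDIA device plugin is installed so that nodes advertise the `nvidia.com/gpu` resource. The container name passed in is a hypothetical placeholder.

```python
import json


def gpu_patch(container_name: str, gpu_count: int = 1) -> dict:
    """Build a Deployment patch that requests GPUs for one container."""
    return {
        "spec": {
            "template": {
                "spec": {
                    # Tolerate the GPU taint so the pod may land on GPU nodes.
                    "tolerations": [
                        {
                            "key": "sku",
                            "operator": "Equal",
                            "value": "gpu",
                            "effect": "NoSchedule",
                        }
                    ],
                    # Requesting the GPU resource is what exposes the device
                    # to the container.
                    "containers": [
                        {
                            "name": container_name,
                            "resources": {
                                "limits": {"nvidia.com/gpu": gpu_count}
                            },
                        }
                    ],
                }
            }
        }
    }


def patch_json(container_name: str, gpu_count: int = 1) -> str:
    """Serialize the patch so it can be passed to kubectl."""
    return json.dumps(gpu_patch(container_name, gpu_count))
```

The resulting JSON could be applied with `kubectl patch deployment <name> --type strategic -p '<json>'`; reverting to CPU would need a second patch that removes the GPU limit.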

https://learn.microsoft.com/en-us/azure/aks/gpu-cluster?tabs=add-ubuntu-gpu-node-pool

@noelhermans
Collaborator

noelhermans commented Jun 30, 2024

A quota increase was requested from Microsoft, but it was denied for the East India region.
@gsantopaolo requested quotas for the EASTUS region, but it is not possible to add a node pool from another region to the Cognix AKS cluster.

@noelhermans
Collaborator

Waiting for Microsoft support to come back to us with a list of GPU instances supported in the East India region.
