
Support Koordinator as one batch scheduler option #2572

Open
wants to merge 4 commits into base: master

Conversation


@kingeasternsun kingeasternsun commented Nov 26, 2024

Why are these changes needed?

Koordinator is a QoS-based scheduling system for efficient orchestration of microservices, AI, and big data workloads on Kubernetes. It aims to improve the runtime efficiency and reliability of both latency-sensitive workloads and batch jobs, simplify the complexity of resource-related configuration tuning, and increase pod deployment density to improve resource utilization.

The integration is straightforward: Koordinator supports an annotation-based way to enable gang scheduling without a PodGroup CR:

gang.scheduling.koordinator.sh/name           
gang.scheduling.koordinator.sh/min-available

Koordinator is also compatible with pod-group.scheduling.sigs.k8s.io, pod-group.scheduling.sigs.k8s.io/name, and pod-group.scheduling.sigs.k8s.io/min-available from the community.

https://koordinator.sh/docs/designs/gang-scheduling
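
For illustration, a minimal sketch of what the annotation-based approach looks like on a pod; the gang name, min-available value, image, and scheduler name below are illustrative assumptions, not taken from this PR:

# Illustrative sketch only; values are examples, not from this PR.
apiVersion: v1
kind: Pod
metadata:
  name: ray-worker-example
  annotations:
    gang.scheduling.koordinator.sh/name: "ray-cluster-gang"
    gang.scheduling.koordinator.sh/min-available: "3"
spec:
  schedulerName: koord-scheduler  # assumed: the name Koordinator's scheduler typically registers
  containers:
    - name: ray-worker
      image: rayproject/ray:2.9.0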

Related issue number

#2573

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

@kevin85421 (Member)

Is this PR ready for review? The PR title is still marked as "draft."

@andrewsykim (Collaborator)

@kingeasternsun can you please add some tests before marking this ready for review? See the other batch scheduler implementations for reference.

@kingeasternsun (Author)

Is this PR ready for review? The PR title is still marked as "draft."

Thanks for your comment. It's still in development and currently lacks test code.

@kingeasternsun (Author)

@kingeasternsun can you please add some tests before marking this ready for review? See the other batch scheduler implementations for reference.

Thanks for your advice. I'll add the tests right away.

Signed-off-by: kingeasternsun <[email protected]>
@kingeasternsun changed the title from "Draft: Support Koordinator as one batch scheduler option" to "Support Koordinator as one batch scheduler option" on Dec 7, 2024
@kingeasternsun (Author)

Is this PR ready for review? The PR title is still marked as "draft."

Hey, it's ready for review now.

@kingeasternsun (Author)

@kingeasternsun can you please add some tests before marking this ready for review? See the other batch scheduler implementations for reference.

Hi, the tests have been added.

}

func analyzeGangGroupsFromApp(app *rayv1.RayCluster) ([]string, map[string]wokerGroupReplicas) {
	gangGroups := make([]string, 1+len(app.Spec.WorkerGroupSpecs))

Collaborator:

nit: len(app.Spec.WorkerGroupSpecs)+1

Author:

nit: len(app.Spec.WorkerGroupSpecs)+1

Thank you for your review! I will fix it as soon as possible.

}

	for i, workerGroupSepc := range app.Spec.WorkerGroupSpecs {
		gangGroups[1+i] = generateGangGroupName(app, workerGroupSepc.Template.Namespace, workerGroupSepc.GroupName)

Collaborator:

nit: gangGroups[i+1]

Author:

nit: gangGroups[i+1]

Thank you for your review! I will fix it as soon as possible.

	for i, workerGroupSepc := range app.Spec.WorkerGroupSpecs {
		gangGroups[1+i] = generateGangGroupName(app, workerGroupSepc.Template.Namespace, workerGroupSepc.GroupName)
		minMemberMap[workerGroupSepc.GroupName] = wokerGroupReplicas{
			Replicas: *(workerGroupSepc.Replicas),

Collaborator:

nit: the brackets are not needed here

Author:

nit: the brackets are not needed here

Thank you for your review! I will fix it as soon as possible.

		gangGroups[1+i] = generateGangGroupName(app, workerGroupSepc.Template.Namespace, workerGroupSepc.GroupName)
		minMemberMap[workerGroupSepc.GroupName] = wokerGroupReplicas{
			Replicas:    *(workerGroupSepc.Replicas),
			MinReplicas: *(workerGroupSepc.MinReplicas),

Collaborator:

nit: the brackets are not needed here

Author:

nit: the brackets are not needed here

Thank you for your review! I will fix it as soon as possible.

	logger := ctrl.LoggerFrom(ctx).WithName(SchedulerName)

	// when gang scheduling is enabled, extra annotations need to be added to all pods
	if y.isGangSchedulingEnabled(app) {

Collaborator:

if !y.isGangSchedulingEnabled(app) {
    return
}

....

Author:

if !y.isGangSchedulingEnabled(app) {
    return
}

....

Thank you for your review! I will fix it as soon as possible.

},
)

setHeadPodNamespace(rayClusterWithGangScheduling, "ns0")

Collaborator:

Why is this call needed? The namespace should be inherited from the RayCluster namespace right?

Author:

Why is this call needed? The namespace should be inherited from the RayCluster namespace right?

What you said is absolutely correct. However, to make this module's code more general, I considered that there might be future scenarios where the namespace for head pods or worker pods needs to be explicitly designated. Therefore, I implemented a fallback here: if the namespace for the head pod or worker pod is empty, it inherits the namespace of the RayCluster; otherwise, the specified namespace is used.

Of course, if you think we don’t need to consider such special cases for now, I completely agree as well.
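
For clarity, the fallback described above amounts to something like this (a hypothetical sketch; podNamespace is an illustrative helper, not the PR's actual code):

// Hypothetical sketch of the fallback described above, not the PR's code:
// prefer the pod template's namespace when set, otherwise inherit the
// RayCluster's namespace.
func podNamespace(app *rayv1.RayCluster, templateNamespace string) string {
	if templateNamespace != "" {
		return templateNamespace
	}
	return app.Namespace
}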


setHeadPodNamespace(rayClusterWithGangScheduling, "ns0")
addWorkerPodSpec(rayClusterWithGangScheduling, "workergroup1", "ns1", 4, 2)
addWorkerPodSpec(rayClusterWithGangScheduling, "workergroup2", "ns2", 5, 3)

Collaborator:

Does Koordinator support having RayCluster pods spread across different namespaces? I don't think that's really supported with KubeRay.

Author:

Does Koordinator support having RayCluster pods spread across different namespaces? I don't think that's really supported with KubeRay.

As mentioned in the documentation at https://koordinator.sh/docs/designs/gang-scheduling/#annotation-way, the Koordinator scheduler supports highly flexible GangGroup scheduling requirements, including pods within one GangGroup belonging to different namespaces.

Of course, if you believe such special scheduling requirements are unlikely to occur in KubeRay, I fully agree as well.
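
For reference, the linked design doc describes a groups annotation that lets gangs in different namespaces be scheduled together; a hedged sketch (the gang and namespace names here are examples, not from this PR):

# Illustrative sketch based on the linked design doc; names are examples.
metadata:
  annotations:
    gang.scheduling.koordinator.sh/name: "gang-a"
    gang.scheduling.koordinator.sh/min-available: "2"
    gang.scheduling.koordinator.sh/groups: '["ns0/gang-a", "ns1/gang-b"]'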

@andrewsykim (Collaborator)

Can you add an example YAML similar to this one https://github.com/ray-project/kuberay/blob/e9f31556c14fae6391fb27a4a96bfbe01f917d46/ray-operator/config/samples/ray-cluster.yunikorn-scheduler.yaml?

@andrewsykim (Collaborator)

Please update main.go

flag.StringVar(&batchScheduler, "batch-scheduler", "",
	"Batch scheduler name, supported values are volcano and yunikorn.")

@andrewsykim (Collaborator)

Please update batch schedulers in helm chart

# Enable customized Kubernetes scheduler integration. If enabled, Ray workloads will be scheduled
# by the customized scheduler.
#  * "enabled" is the legacy option and will be deprecated soon.
#  * "name" is the standard option, expecting a scheduler name, supported values are
#    "default", "volcano", and "yunikorn".
#
# Note: "enabled" and "name" should not be set at the same time. If both are set, an error will be thrown.
#
# Examples:
#  1. Use volcano (deprecated)
#       batchScheduler:
#         enabled: true
#
#  2. Use volcano
#       batchScheduler:
#         name: volcano
#
#  3. Use yunikorn
#       batchScheduler:
#         name: yunikorn
#
batchScheduler:
  # Deprecated. This option will be removed in the future.
  # Note, for backwards compatibility. When it sets to true, it enables volcano scheduler integration.
  enabled: false
  # Set the customized scheduler name, supported values are "volcano" or "yunikorn", do not set
  # "batchScheduler.enabled=true" at the same time as it will override this option.
  name: ""

@kingeasternsun (Author)

Can you add an example YAML similar to this one https://github.com/ray-project/kuberay/blob/e9f31556c14fae6391fb27a4a96bfbe01f917d46/ray-operator/config/samples/ray-cluster.yunikorn-scheduler.yaml?

Thank you for your review! I will add it as soon as possible.
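
For reference, such a sample would presumably look something like the following (a hedged sketch modeled on the yunikorn sample; the label, names, and values are illustrative, not the PR's final sample):

# Hypothetical sketch modeled on the yunikorn sample; values are illustrative.
apiVersion: ray.io/v1
kind: RayCluster
metadata:
  name: test-koordinator-cluster
  labels:
    ray.io/gang-scheduling-enabled: "true"  # assumed to carry over from the yunikorn integration
spec:
  rayVersion: "2.9.0"
  headGroupSpec:
    rayStartParams: {}
    template:
      spec:
        containers:
          - name: ray-head
            image: rayproject/ray:2.9.0
  workerGroupSpecs:
    - groupName: worker
      replicas: 2
      minReplicas: 2
      maxReplicas: 2
      rayStartParams: {}
      template:
        spec:
          containers:
            - name: ray-worker
              image: rayproject/ray:2.9.0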

@kingeasternsun (Author)

Please update main.go

flag.StringVar(&batchScheduler, "batch-scheduler", "",
	"Batch scheduler name, supported values are volcano and yunikorn.")

Thank you for your review! I will fix it as soon as possible.

@kingeasternsun (Author)

Please update batch schedulers in helm chart

# Enable customized Kubernetes scheduler integration. If enabled, Ray workloads will be scheduled
# by the customized scheduler.
#  * "enabled" is the legacy option and will be deprecated soon.
#  * "name" is the standard option, expecting a scheduler name, supported values are
#    "default", "volcano", and "yunikorn".
#
# Note: "enabled" and "name" should not be set at the same time. If both are set, an error will be thrown.
#
# Examples:
#  1. Use volcano (deprecated)
#       batchScheduler:
#         enabled: true
#
#  2. Use volcano
#       batchScheduler:
#         name: volcano
#
#  3. Use yunikorn
#       batchScheduler:
#         name: yunikorn
#
batchScheduler:
  # Deprecated. This option will be removed in the future.
  # Note, for backwards compatibility. When it sets to true, it enables volcano scheduler integration.
  enabled: false
  # Set the customized scheduler name, supported values are "volcano" or "yunikorn", do not set
  # "batchScheduler.enabled=true" at the same time as it will override this option.
  name: ""

Thank you for your review! I will fix it as soon as possible.
