-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Multi-Job][3/N] Distinguish cross_silo msg with the job name #172
Changes from 14 commits
652f2ab
8654dbe
98b348e
30494d8
903f52a
cb58975
e04b7c3
4e60186
d1b3e5c
0c81850
bbd2f03
c4c7434
3d923e4
d643823
2557c53
0b787cf
333f592
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -46,7 +46,10 @@ def __init__( | |
submit_ray_task_func, | ||
options={}, | ||
) -> None: | ||
self._party = fed_config.get_cluster_config().current_party | ||
# Note(NKcqx): FedCallHolder will only be created in driver process, where | ||
# the GlobalContext must has been initialized. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good note. |
||
job_name = get_global_context().job_name() | ||
self._party = fed_config.get_cluster_config(job_name).current_party | ||
self._node_party = node_party | ||
self._options = options | ||
self._submit_ray_task_func = submit_ray_task_func | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -48,23 +48,25 @@ def cross_silo_comm_config_dict(self) -> Dict: | |
_job_config = None | ||
|
||
|
||
def get_cluster_config(): | ||
def get_cluster_config(job_name: str = None): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hum, so this method name should be changed to get_job_config or get_config? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Both "cluster_config" and "job_config" are part of the JOB, see #156. So I think it's fine to retrieve a "cluster config" by "job name" though it's definitely hard to understand 🤣 I think in the near future, we can merge these two configs. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK. Then let us leave a TODO comment on that target? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sure |
||
"""This function is not thread safe to use.""" | ||
global _cluster_config | ||
if _cluster_config is None: | ||
compatible_utils._init_internal_kv() | ||
compatible_utils.kv.initialize() | ||
assert job_name is not None, \ | ||
"Initializing internal kv need to provide job_name." | ||
compatible_utils._init_internal_kv(job_name) | ||
raw_dict = compatible_utils.kv.get(fed_constants.KEY_OF_CLUSTER_CONFIG) | ||
_cluster_config = ClusterConfig(raw_dict) | ||
return _cluster_config | ||
|
||
|
||
def get_job_config(): | ||
def get_job_config(job_name: str = None): | ||
"""This config still acts like cluster config for now""" | ||
global _job_config | ||
if _job_config is None: | ||
compatible_utils._init_internal_kv() | ||
compatible_utils.kv.initialize() | ||
assert job_name is not None, \ | ||
"Initializing internal kv need to provide job_name." | ||
compatible_utils._init_internal_kv(job_name) | ||
raw_dict = compatible_utils.kv.get(fed_constants.KEY_OF_JOB_CONFIG) | ||
_job_config = JobConfig(raw_dict) | ||
return _job_config | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
# Copyright 2023 The RayFed Team | ||
NKcqx marked this conversation as resolved.
Show resolved
Hide resolved
|
||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# https://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let us just restrict the type to
str
only?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But we can't restrict the KV
get
andput
to only use thestr
type key.Do you mean changing all of the current KV accesses to use
str
type only ?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant: