Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[batch] Create Python Ray Runner based on Beam Portability API #2

Open
pdames opened this issue Feb 23, 2022 · 0 comments
Open

[batch] Create Python Ray Runner based on Beam Portability API #2

pdames opened this issue Feb 23, 2022 · 0 comments

Comments

@pdames
Copy link
Member

pdames commented Feb 23, 2022

To support Beam pipeline graph construction, optimization, and submission for execution, we need to build a Ray Runner class based on Beam Portability API's FnApiRunner that supports both local (i.e. single-node) and remote (i.e. multi-node) pipeline execution.

At a high-level, its run_pipeline method should take a Beam pipeline DAG to run as input, and produce a PipelineResult that can describe/manage pipeline execution state as output. The Ray Runner Beam Components Doc provides additional details about the FnApiRunner's end-to-end workflow, and how it relates to Ray.

This runner will also depend on successful implementation of the following components:

  1. Ray Work Item Scheduler: The Ray Work Item Scheduler takes work items from the batch FnApiRunner's topological scheduler as input, and submits them for execution as Ray tasks.
  2. Ray Pipeline State Manager: The Ray Pipeline State Manager is a central service that consolidates the execution state of all scheduled pipeline work items in Ray's object store.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant