The Ray Pipeline State Manager is a central service that consolidates the execution state of all scheduled pipeline work items in Ray's object store. It should be based on the current single-process implementation used in Beam's FnApiRunner.
At a high level, this should be a Ray Actor that worker tasks use for (1) durable persistence of any `ObjectRef` that they have persisted in Ray's object store via `ref = ray.put(obj)` and (2) on-demand retrieval of any persisted `ObjectRef`, which they can materialize via `obj = ray.get(ref)`.
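A minimal sketch of what the actor's interface could look like, written as a plain Python class so it runs without a cluster. The class name `PipelineStateManager` and the methods `register`, `lookup`, and `keys` are hypothetical and are not taken from FnApiRunner or from Ray; the stored references are treated as opaque values.

```python
from typing import Any, Dict, List


class PipelineStateManager:
    """Hypothetical registry mapping a work-item key to its stored references."""

    def __init__(self) -> None:
        # key -> references registered under that key, in insertion order
        self._refs: Dict[str, List[Any]] = {}

    def register(self, key: str, refs: List[Any]) -> int:
        """Record references for a work item; return how many it now holds."""
        self._refs.setdefault(key, []).extend(refs)
        return len(self._refs[key])

    def lookup(self, key: str) -> List[Any]:
        """Return a copy of the references for a key (empty if unknown)."""
        return list(self._refs.get(key, []))

    def keys(self) -> List[str]:
        """Return all known work-item keys in sorted order."""
        return sorted(self._refs)
```

Under Ray, the class could presumably be turned into an actor with `ray.remote(PipelineStateManager).remote()`, with workers calling `manager.register.remote(key, [ref])`. The references are passed inside a list because Ray resolves an `ObjectRef` passed as a top-level argument to its value, whereas a nested one arrives as a reference.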
The state manager should also support efficient, atomic checkpointing and restoration of all state persisted in Ray's in-memory object store to durable storage (e.g. on disk or in a durable cloud storage service).
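One common way to make an on-disk checkpoint atomic is to write to a temporary file and then rename it over the target. The sketch below assumes the state has already been materialized into picklable Python values (for example via `ray.get`) and that the target is a local filesystem; `write_checkpoint` and `read_checkpoint` are hypothetical names, and cloud storage would need a different mechanism.

```python
import os
import pickle
import tempfile
from typing import Any, Dict


def write_checkpoint(state: Dict[str, Any], path: str) -> None:
    """Write state so that readers see either the old or the new file, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(state, f)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # atomically swap in the new checkpoint
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def read_checkpoint(path: str) -> Dict[str, Any]:
    """Load a checkpoint previously written by write_checkpoint."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

This only covers the atomic write itself. It does not address how to capture a consistent snapshot across many workers, which the state manager would still need to coordinate.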
> The state manager should also support efficient, atomic checkpointing and restoration of all state persisted in Ray's in-memory object store to durable storage (e.g. on-disk or to a durable cloud storage service etc.).
This is interesting. We might have to do some work on improving the semantics of ObjectRef serialization (in particular its interaction with ref-counting), since right now exported refs are pinned in memory forever. Hence, checkpointing may cause these objects to be leaked in the object store. cc @jjyao