Keep this open until docs merge as part of RC tomorrow @justinvyu
Docs are all merged!
Implement the following REP: https://github.com/ray-project/enhancements/blob/main/reps/2023-06-06-simplify-sync.md

Subtasks:
- `ExperimentAnalysis` for new persistence mode #38567
- `Checkpoint` #38570
- `Checkpoint` to `train.Checkpoint` #38571
- `Checkpoint` #38574
- `ray.air.Checkpoint` dependency #38575
- `SyncConfig` #38577
- `Result.from_path` w/o `.tune_metadata` file #38579
- `Trainable.save_to_object` to not depend on `air.Checkpoint` #38589

Related PRs:
- `pyarrow.fs` persistence: Introduce `StorageContext` and use it for driver syncing (1/n) #37690
- `pyarrow.fs` persistence: Pass `StorageContext` to Train workers (2/n) #37909
- `pyarrow.fs` persistence (3/n): Introduce new `Checkpoint` API #37925
- `pyarrow.fs` persistence (4/n): Introduce a simplified checkpoint manager #37962
- `pyarrow.fs` persistence (5/n): `ray.train.Checkpoint` save direction #37888
- `pyarrow.fs` persistence (6/n): Fix `Trial` + `Experiment` paths to use the `StorageContext` #38057
- `pyarrow.fs` persistence (7/n): `ray.train.Checkpoint` restore: Auto-recovery fault tolerance #38141
- `pyarrow.fs` persistence (8/n): `ray.train.Checkpoint` restore: `resume_from_checkpoint` #38143
- `pyarrow.fs` persistence (9/n): `ray.train.Checkpoint` restore: Manual restore #38128
- `pyarrow.fs` persistence (10/n): Unify Tune and Train sessions to support new persistence path in `FunctionTrainable` #38284
- `pyarrow.fs` persistence (11/n): Support pausing trials (and certain schedulers) #38355
- `pyarrow.fs` persistence (12/n): Patch new persistence path for Class `Trainable` #38382
- `pyarrow.fs` persistence (13/n): Support the `Result.from_path` API #38617
- `pyarrow.fs` persistence (14/n): Simplified `ExperimentAnalysis` implementation #38648