# ask-metaflow
h
hi, are there any docs on how to use the `@checkpoint` decorator when using `@parallel` steps? The docs I found here say TODO: https://github.com/outerbounds/metaflow-checkpoint-examples/blob/master/documentation/checkpoint_deco/checkpoint_usage.md#saving--loa[…]rallel-steps
specifically, what I would like to know is whether it is possible to get a cloud storage path in the datastore where the checkpoint-associated data could be stored. Ideally something that I could put into the `storage_path` for `ray.train.RunConfig`: https://docs.ray.io/en/latest/train/api/doc/ray.train.RunConfig.html
a
@hallowed-glass-14538 might have stuff handy
h
so the `@checkpoint` decorator gives more explicit control over checkpointing, but that means something has to call the API (`current.checkpoint.save`); there's a rough sketch of that below, after the Ray example. If you are fully using Ray, you don't even need the decorator, since you can let Ray fully control the checkpoint storage/loading. Example:
```python
import os

from metaflow import current
from metaflow.metaflow_config import DATASTORE_SYSROOT_S3
from ray.train import RunConfig

# Build a run-specific S3 prefix inside the Metaflow datastore. Including the
# flow/run/step names and the control task id keeps checkpoints from
# different executions in separate locations.
path_to_checkpoints = os.path.join(
    DATASTORE_SYSROOT_S3,
    "mf.ray_checkpoints",
    current.flow_name,
    current.run_id,
    current.step_name,
    current.parallel.control_task_id,
)

# Hand the path to Ray so it manages checkpoint storage/loading itself.
run_config = RunConfig(
    storage_path=path_to_checkpoints,
    # ...
)
```
This way every execution will store checkpoints in a different location. You can also assign `path_to_checkpoints` to an attribute on `self` (e.g. `self.ray_checkpoint_path = path_to_checkpoints`) to make it a data artifact, so the path can be looked up later through the Client API (short example at the end of this thread).
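for reference, here's a rough sketch of what the explicit `@checkpoint` route could look like in a `@parallel` step. Treat it as a sketch rather than a tested example: it assumes the metaflow-checkpoint extension is installed (so `checkpoint` is importable from `metaflow`), the flow and step names are made up, and the exact arguments accepted by `current.checkpoint.save` may differ between versions (the linked docs section is still marked TODO).
```python
import json
import os

# `checkpoint` is provided by the metaflow-checkpoint extension; importing it
# straight from `metaflow` assumes the extension is installed.
from metaflow import FlowSpec, checkpoint, current, parallel, step


class ParallelCheckpointFlow(FlowSpec):

    @step
    def start(self):
        # Fan out into a gang of parallel workers.
        self.next(self.train, num_parallel=2)

    @checkpoint
    @parallel
    @step
    def train(self):
        # Each worker writes its own checkpoint files to a local directory...
        ckpt_dir = os.path.join("ckpts", f"node_{current.parallel.node_index}")
        os.makedirs(ckpt_dir, exist_ok=True)
        with open(os.path.join(ckpt_dir, "state.json"), "w") as f:
            json.dump({"node": current.parallel.node_index, "epoch": 1}, f)

        # ...and explicitly registers them with the checkpoint datastore.
        # Passing a directory path here is an assumption; check the extension
        # docs for the exact signature of save().
        current.checkpoint.save(ckpt_dir)
        self.next(self.join)

    @step
    def join(self, inputs):
        self.next(self.end)

    @step
    def end(self):
        pass


if __name__ == "__main__":
    ParallelCheckpointFlow()
```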
👍 1
h
oh nice, thanks for explaining this
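as a footnote to the point about making the path a data artifact: once an attribute like `self.ray_checkpoint_path` (a hypothetical artifact name) is set in the step, the stored location can be looked up afterwards with the Metaflow Client API, for example:
```python
from metaflow import Flow

# Hypothetical flow and step names; iterating a Step yields one Task per
# parallel worker, and artifacts are available on task.data.
run = Flow("ParallelCheckpointFlow").latest_run
for task in run["train"]:
    print(task.id, task.data.ray_checkpoint_path)
```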