# ask-metaflow
l
Hi all - Does anyone have a good workflow for partitioning or tagging objects in the Metaflow datastore in order to comply with a data retention policy? For example, let's say this is the default path for artifacts:
```
s3://metaflow-bucket/metaflow/data/HelloFlow/<run_id>/<step_name>/<artifacts>
```
If we were able to save by date
```
.../HelloFlow/<run_date>/<run_id>/...
```
Or by customer
```
.../HelloFlow/<customer_name>/<run_id>/...
```
that would make it easier to comply with potential requests to delete data, although I can imagine other ways to do this as well 🤔. Curious if anyone else has had to delete artifacts in a targeted way?
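(For anyone reading this thread later: run tags don't change the datastore path, but they do make targeted lookups straightforward. A minimal sketch, assuming runs were launched with a hypothetical tag like `python helloflow.py run --tag customer:acme`:)

```python
from metaflow import Flow, namespace

# search across all users' runs, not just the current namespace
namespace(None)

# locate every run tagged for a given customer, so a deletion
# request can at least be mapped to concrete run pathspecs
for run in Flow("HelloFlow").runs("customer:acme"):
    print(run.pathspec)  # e.g. HelloFlow/1234
```

Note that this only finds the runs; because artifacts are content-addressed (see the reply below), a run's pathspec doesn't map one-to-one onto S3 objects.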
a
Hi Sara, all artifacts for a flow are content-addressed - this ensures that your overall storage space doesn't blow up as your usage starts to scale. We have been thinking about introducing first-class support for data retention policies recently, but meanwhile one easy workaround could be to generate flow names that are per-date or per-customer - that should allow for setting up lifecycle policies neatly on S3.
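(A minimal sketch of the lifecycle-policy half of that workaround, assuming a per-customer flow class named `HelloFlowAcme` and the default datastore layout from above - the bucket name and retention window are made up:)

```python
import boto3

s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="metaflow-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "expire-helloflow-acme-artifacts",
                # matches only objects under this customer's flow prefix
                "Filter": {"Prefix": "metaflow/data/HelloFlowAcme/"},
                "Status": "Enabled",
                # delete the customer's artifacts after the retention window
                "Expiration": {"Days": 90},
            }
        ]
    },
)
```

One caveat: `put_bucket_lifecycle_configuration` replaces the bucket's entire lifecycle configuration, so in practice you'd fetch the existing rules and merge the new one in.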