faint-hair-28386
11/09/2022, 3:24 PM
dry-beach-38304
11/09/2022, 4:18 PM
faint-hair-28386
11/09/2022, 4:22 PM
faint-hair-28386
11/09/2022, 4:22 PM
faint-hair-28386
11/09/2022, 4:23 PM
dry-beach-38304
11/09/2022, 7:07 PM
persist here: https://github.com/Netflix/metaflow/blob/master/metaflow/task.py#L665
• this leads to here: https://github.com/Netflix/metaflow/blob/master/metaflow/datastore/task_datastore.py#L672 where we collect all the artifacts to save. Note that some things are excluded (methods, functions, certain names, etc.). In that function you will see that we have a generator over the attributes of the flow to pickle. It’s a destructive iterator, in the sense that it tries to be somewhat memory conscious.
• we then reach the function you were pointing to here: https://github.com/Netflix/metaflow/blob/master/metaflow/datastore/task_datastore.py#L236 which is responsible for encoding and dumping each artifact. This is again a generator function, pickle_iter.
• we then end up here: https://github.com/Netflix/metaflow/blob/master/metaflow/datastore/task_datastore.py#L236 with another generator function, packing_iter, which is responsible for compressing the pickled blob (in the current implementation).
• you then end up in one of the backends. For S3, it would be here: https://github.com/Netflix/metaflow/blob/master/metaflow/datastore/s3_storage.py#L77. In this implementation, you can see that this is the point where all the previous generators basically get “resolved” and things are pushed to S3. For a few artifacts, it’s one at a time; for many artifacts, it’ll happen in parallel.
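To make the "generators get resolved at the end" point concrete, here is a rough sketch of that kind of lazy chain. It is not Metaflow's code: only the names pickle_iter and packing_iter come from the links above, while artifacts_iter, save_all, the gzip compression and the dict standing in for S3 are assumptions made for illustration.

```python
import gzip
import pickle


def artifacts_iter(artifacts):
    # Hypothetical "destructive" iterator: pop each entry so the dict
    # does not keep every object alive while the rest are processed.
    while artifacts:
        yield artifacts.popitem()


def pickle_iter(named_objects):
    # Lazily encode each artifact; nothing is pickled until someone iterates.
    for name, obj in named_objects:
        yield name, pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)


def packing_iter(named_blobs):
    # Lazily compress each pickled blob (gzip is an assumption here).
    for name, blob in named_blobs:
        yield name, gzip.compress(blob)


def save_all(named_packed, store):
    # The backend loop is what finally pulls items through the whole chain.
    for name, packed in named_packed:
        store[name] = packed


artifacts = {"x": 1, "names": ["a", "b"]}
store = {}  # stand-in for a storage backend such as S3

pipeline = packing_iter(pickle_iter(artifacts_iter(artifacts)))
# Building the chain does no work: nothing is stored or consumed yet.
assert store == {} and len(artifacts) == 2

save_all(pipeline, store)
restored = {k: pickle.loads(gzip.decompress(v)) for k, v in store.items()}
```

After save_all runs, the original artifacts dict is empty (the iterator consumed it) and restored holds the same values that went in.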
I hope this helps. Let me know if you have more questions.
ambitious-bird-15073
11/09/2022, 11:38 PM
faint-hair-28386
11/10/2022, 6:46 AM
faint-hair-28386
11/10/2022, 6:47 AM
faint-hair-28386
11/10/2022, 6:51 AM
getattr(flow, attr) returns the object definition, and generators are the tricky ones. At this call it knows that the attribute is a generator, but it cannot restore the generator’s state, which is why it’s empty.
dry-beach-38304
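A small standalone sketch of the generator behaviour discussed here, in plain Python with no Metaflow involved. The numbers function is made up for illustration. It shows that CPython refuses to pickle a generator object, that a consumed generator yields nothing on a second pass, and that materializing the values into a list first gives something that can be pickled.

```python
import pickle


def numbers():
    # Hypothetical generator used only for this illustration.
    yield from range(3)


gen = numbers()

# Generator objects carry execution state that pickle cannot serialize.
try:
    pickle.dumps(gen)
    picklable = True
except TypeError:
    picklable = False

first = next(gen)     # step through one value by hand
rest = list(gen)      # consume whatever is left
leftover = list(gen)  # a second pass over an exhausted generator is empty

# Materializing into a list gives a plain object that pickles fine.
materialized = list(numbers())
roundtrip = pickle.loads(pickle.dumps(materialized))
```

So if a generator needs to survive as an artifact, one option is to store list(gen) (or another concrete container) instead of the generator itself.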
11/10/2022, 7:01 AM
faint-hair-28386
11/10/2022, 7:03 AM
next function, that can work through generators
dry-beach-38304
11/10/2022, 3:26 PM