# ask-metaflow
h
Is there an intuitive way to run a step if any of the previous steps fail? I need to run a final step to tear down infrastructure at the end of the run in case of failure. It looks like the `@catch` decorator is going to need to be added to all previous steps, but there are 18 previous steps in my DAG, so this adds a lot of boilerplate. Is there a simpler way to just say "run this tear-down step regardless of the status of all previous steps"?
h
You can add a catch to all your steps using `--with catch`
Another way I can think of would be to trigger your flow with another one that can then do your cleanup
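For the first suggestion, `--with` applies the decorator to every step at launch time without touching the flow code, e.g. `python my_flow.py run --with catch` (the file name here is illustrative). The same `--with` option should also be accepted when deploying, e.g. with `argo-workflows create`, though that's worth verifying against your Metaflow version.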
a
also curious what is the nature of cleanup? you can add a pure python decorator that wraps the user code in try ... except and does the clean up if except is triggered.
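As an illustration of that suggestion, a minimal sketch of such a wrapper; the `delete_pvc` helper is hypothetical, and the wrapper re-raises so the failure still propagates to Metaflow:

```python
import functools

def delete_pvc():
    ...  # hypothetical helper that deletes the PVC via the Kubernetes API

def teardown_pvc_on_failure(step_func):
    """Run the step's user code; delete the PVC if it raises."""
    @functools.wraps(step_func)
    def wrapper(self):
        try:
            return step_func(self)
        except Exception:
            delete_pvc()  # clean up the shared volume on any uncaught error
            raise         # re-raise so Metaflow still marks the task as failed
    return wrapper
```

Applied on each step that could fail while the PVC exists (it would typically sit under `@step`, closest to the function), any uncaught exception triggers the cleanup and the step still fails as usual.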
h
> also curious what is the nature of cleanup? you can add a pure python decorator that wraps the user code in try ... except and does the clean up if except is triggered.

The use case is standing up a PVC before a run begins, mounting it in all of the nodes in the DAG to share some data, and then destroying the PVC at the end of the run. Currently, if the run fails, the PVC will remain stood up because the final node responsible for deleting the PVC isn't reached.
This pipeline is automated and runs in Argo Workflows.
> Another way I can think of would be to trigger your flow with another one that can then do your cleanup
This approach definitely seems like a good one! Might try this out. Thanks!
h
Given your description, I feel like creating a step decorator to do the setup and teardown would be the cleaner way
h
Yeah, that would definitely feel like a much better approach. Are there concrete examples of how to do this, or should I just follow the approach taken here?
h
You probably want `task_pre_step` and `task_finished`. Caveat: there isn't fine-grained control over decorator ordering at the moment, so if you need your PVC available from within another decorator then things can get tricky due to the order of the setup/teardown: https://netflix.slack.com/archives/C02116BBNTU/p1730834209671669?thread_ts=1730831644.349479&cid=C02116BBNTU
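For orientation, a very rough sketch of a custom step decorator built on those hooks; the class, the helper functions, and the abbreviated hook signatures below are assumptions (custom decorators also have to be shipped as a Metaflow extension before `@pvc_lifecycle` can be used in a flow), so treat this as a shape rather than drop-in code:

```python
from metaflow.decorators import StepDecorator

def create_pvc():
    ...  # hypothetical helper that creates the PVC via the Kubernetes API

def delete_pvc():
    ...  # hypothetical helper that deletes the PVC via the Kubernetes API

class PVCLifecycleDecorator(StepDecorator):
    name = "pvc_lifecycle"  # how the decorator would be referenced in a flow

    def task_pre_step(self, *args, **kwargs):
        # Called before the step's user code runs
        # (the real hook receives the step name, datastore, flow, graph, ...)
        create_pvc()

    def task_finished(self, *args, **kwargs):
        # Called after the task, whether the user code succeeded or failed
        delete_pvc()
```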
h
Ah yeah, so then this wouldn't be a viable approach. My pods will only launch via the `@kubernetes` decorator if the PVC already exists. Otherwise, they'll hang in a `pending` state forever. Ordering is super important here.
This strikes me as the kind of thing that I would want a flow-level decorator for. The PVC is created in the first step, all subsequent nodes mount the same PVC, and it's finally torn down once processing is completed.
h
So in your particular case you need it to run before `@kubernetes` and tear down after? You could just have the decorator wrap `@kubernetes` then?
h
I have a sequence of `@kubernetes` decorators in my flow. The PVC needs to be created before the sequence starts, and deleted after the sequence ends.
One PVC serves every `@kubernetes` decorator in my flow.
h
Oh I see. Then yeah, this would be a good application for `flow_finalize` (which doesn't exist yet). Maybe @ancient-application-36103 has other ideas.
h
Thanks for taking a look! It actually does sound analogous to the approach taken here, so I might just take a stab at replicating that
h
Where do you set up the PVC right now? Also, I just read the link you sent and it does seem like what you want.
h
There's an initial step in the flow where a PVC is created, and an end step in the flow where that PVC is deleted
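That is, roughly this shape, where the failure mode is that `end` is skipped whenever an intermediate step raises (step names and the PVC helpers below are illustrative):

```python
from metaflow import FlowSpec, kubernetes, step

def create_pvc():
    ...  # hypothetical helper that creates the PVC via the Kubernetes API

def delete_pvc():
    ...  # hypothetical helper that deletes the PVC via the Kubernetes API

class SharedVolumeFlow(FlowSpec):

    @step
    def start(self):
        create_pvc()  # PVC must exist before any @kubernetes pod is scheduled
        self.next(self.process)

    @kubernetes
    @step
    def process(self):
        # ... work against files on the mounted PVC ...
        self.next(self.end)

    @step
    def end(self):
        delete_pvc()  # never reached if an earlier step fails

if __name__ == "__main__":
    SharedVolumeFlow()
```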
h
Although if you did catch the exception and tear down the PVC, won't the run continue to the next step, which expects the PVC to exist?
h
If any exception occurs, the intended behaviour would be to terminate the entire flow and delete the PVC
👍 1
a
@handsome-postman-16645 curious - why use a PVC and not use say S3 directly?
h
@square-wire-39606 There are nodes in my flow that run processing over large files by calling out to binaries (without Python bindings) on the OS via `subprocess`. These binaries take input files and produce output files, which are intermediate outputs in my flow. In this case, it doesn't really make sense to load each large intermediate file into memory in Python just to bring it into GCS/S3 via Metaflow.
For this specific case, it's just generally more convenient to have all files on a mounted storage device shared across tasks, rather than pass a bunch of references to input and output URIs in each task
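For illustration, a step along those lines might shell out like this (the binary name and mount path are made up):

```python
import subprocess

# Illustrative paths on the shared PVC mount, plus an illustrative binary name
INPUT_PATH = "/mnt/shared/inputs/sample.dat"
OUTPUT_PATH = "/mnt/shared/outputs/sample.processed.dat"

# The binary reads and writes files directly on the shared volume, so the
# large intermediate files never pass through Python memory or object storage.
subprocess.run(
    ["process-binary", "--input", INPUT_PATH, "--output", OUTPUT_PATH],
    check=True,  # raise CalledProcessError if the binary exits non-zero
)
```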
s
I see. It will likely be faster and cheaper to do it via GCS/S3, but if you want to tear down the PVC in case of any error, wrapping a simple try/except in a pure Python decorator should suffice.
h
Thanks! Definitely going to try out the pure Python decorator approach