# ask-metaflow
q
Hi team, I wanted to know if there's a way to impose conditions on Metaflow's retry behaviour. We use the `@retry` decorator to configure retries for many of our steps to guard against infrastructure failures, OOM issues, and certain retryable application-level errors. But sometimes the application error is known to be non-retryable, and in those cases we end up wasting compute by retrying the step multiple times. Is there a way to mark an exception as non-retryable so that Metaflow does not retry it? I know AWS Batch supports conditional retries in its retry strategy, where you can configure the retry to happen only for certain exit codes. That could in theory be used to build such a feature, but I am not sure if there are any plans to support it.
d
No, this is not supported at this time. One thing you could do, though, especially for an application error, is catch it within your flow (using a try/except or a regular decorator around your step) so that the task itself does not fail.
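A rough sketch of the try/except idea, in case it helps. Everything named here (`NonRetryableError`, `run_business_logic`, `FakeFlow`, `fatal_error`) is made up for illustration, and `FakeFlow` is a plain class standing in for a flow so the snippet runs without Metaflow installed. The point is only that the known-bad error gets recorded instead of raised, while any other exception still propagates and would be retried as usual.

```python
class NonRetryableError(Exception):
    """Hypothetical marker for application errors that a retry cannot fix."""


def run_business_logic(payload):
    """Hypothetical step body: fails permanently on bad input, transiently otherwise."""
    if "id" not in payload:
        raise NonRetryableError("payload has no 'id' field")
    if payload["id"] < 0:
        # Stand-in for a transient failure that is worth retrying.
        raise ConnectionError("upstream service unavailable")
    return payload["id"]


class FakeFlow:
    """Plain stand-in for a flow class (assumption: not a real Metaflow FlowSpec)."""

    def process(self, payload):
        self.fatal_error = None
        try:
            self.result = run_business_logic(payload)
        except NonRetryableError as exc:
            # Record the error and let the task finish normally, so no retry
            # is triggered. Other exceptions are not caught and still fail the task.
            self.fatal_error = str(exc)
            self.result = None
        # In a real step, the transition to the next step would follow here,
        # and a later step could inspect self.fatal_error and decide what to do.


flow = FakeFlow()
flow.process({})  # bad input: recorded on the object, not raised
print(flow.fatal_error)
```

The trade-off is that the task is reported as successful, so a downstream step has to check the recorded error explicitly.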
q
Is this something that’s in the backlog or worth considering for you guys?
s
If the application error is non-retryable, then you wouldn't want to retry the container after an infrastructure failure either, would you?
q
Yes, but only if the container failed due to the application-level error. Concretely, if the container could exit with a special exit code for user-determined non-retryable errors, we could in principle use that code to detect when the retry need NOT be made. Wdyt?
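To make the exit-code idea concrete, here is a minimal sketch. The dictionary mirrors the shape of an AWS Batch `retryStrategy` with `evaluateOnExit` rules as I understand it; the exit code 42, `NonRetryableError`, `exit_code_for`, and `should_retry` are all hypothetical, and `should_retry` is only a local simulation of the decision, not a call into AWS or Metaflow. As noted above, Metaflow does not expose this today.

```python
import fnmatch

NON_RETRYABLE_EXIT_CODE = 42  # hypothetical value, chosen only for illustration


class NonRetryableError(Exception):
    """Hypothetical marker for application errors that a retry cannot fix."""


def exit_code_for(exc):
    """Map an exception to the exit code the container entrypoint would use."""
    return NON_RETRYABLE_EXIT_CODE if isinstance(exc, NonRetryableError) else 1


# Shape of an AWS Batch retry strategy: rules are checked in order, and the
# first matching rule decides between "RETRY" and "EXIT".
RETRY_STRATEGY = {
    "attempts": 3,
    "evaluateOnExit": [
        {"onExitCode": str(NON_RETRYABLE_EXIT_CODE), "action": "EXIT"},
    ],
}


def should_retry(exit_code, attempt, strategy=RETRY_STRATEGY):
    """Simulate the retry decision after attempt number `attempt` (1-based)."""
    if attempt >= strategy["attempts"]:
        return False  # attempts exhausted
    for rule in strategy["evaluateOnExit"]:
        if fnmatch.fnmatchcase(str(exit_code), rule["onExitCode"]):
            return rule["action"] == "RETRY"
    # Assumption: when no rule matches, the job is retried while attempts remain.
    return True


print(should_retry(exit_code_for(NonRetryableError("bad input")), attempt=1))
print(should_retry(exit_code_for(MemoryError()), attempt=1))
```

With this setup, only failures that the application deliberately maps to the special code skip the retry; anything else, including a container killed by the platform, falls through to the normal retry path.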
s
In theory, you might still miss cases where the retry should NOT be made, for example if the container fails due to a platform issue at an inopportune moment.