# dev-metaflow
q
Hi all, when deploying flows to AWS Step Functions (SFN), tags get statically encoded into the SFN definition, so every time the flow runs with SFN as the scheduler, the same (user) tags are applied. Is there any way I missed to modify tags on a per-run (SFN trigger) basis without re-deploying the flow every time? For example, being able to modify tags from within the flow (at runtime), or perhaps having a special `tags` parameter that could be used to apply tags on the fly. Use case: once deployed, we want to trigger the SFN multiple times with different input parameters, and we would then like to differentiate between the runs. One of the input parameters is a business-specific unique ID, so if we could tag the run with that ID, we could filter runs afterwards using that tag.
With some context on which source files are most relevant, I would be happy to contribute the feature myself.
d
Tag mutation is currently in PR form (1049 iirc) and should be out shortly.
With that you will be able to add tags from within your run.
So you can then read whatever value you want from your trigger and store it as a tag when the run starts.
s
@quick-lighter-52296 We should be able to roll this feature out early next week 🙂
q
That’s really great, guys, awesome work!
@square-wire-39606 @dry-beach-38304 I had a look at the PR and have just two questions, if I may: 1. How does this PR help with mutating tags from within the run (as opposed to after the run completes, using the client API)? Do we have access to the metadata provider object within a step, so that we can call the new `mutate_user_tags_for_run` method directly? 2. As I understand it, even if this PR is merged and rolled out, the feature won’t be live until `netflix/metaflow-service` is also updated to implement the new interface, right? Do you have a very rough timeline for when that would be?
d
1. You can use the client within a step to access your current run and then modify it that way. This is a common pattern here. Here is some example code we have for this:
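import uuid
from metaflow import Run, current, namespace

# Runs inside a step method (hence `self`).
# Clear the namespace filter so the current run is visible to the client.
namespace(None)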
me = Run("%s/%s" % (current.flow_name, current.run_id))
my_tags = me.user_tags

# Look for a tag that says "mftest_group"
self.group_tag = None
for t in my_tags:
    if t.startswith("mftest_group:"):
        self.group_tag = t
        break
if self.group_tag is None:
    # Create one
    self.group_tag = "mftest_group:%s" % str(uuid.uuid4())[:8]
    me.add_tags([self.group_tag])
me.add_tags(["mftest_bootstrap"])
2. This is already released in 2.3.0 IIRC.
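For the use case in the original question (tagging each triggered run with a business-specific ID), the same client pattern can be adapted roughly as follows. This is only a sketch: the `business_id:` tag prefix and the helper names `business_id_tag` and `tag_current_run` are made up for illustration, and it assumes a Metaflow version and metadata service that support tag mutation.

```python
def business_id_tag(business_id):
    """Build the tag string for a run (hypothetical naming scheme)."""
    return "business_id:%s" % business_id


def tag_current_run(business_id):
    """Tag the currently executing run; call this from inside a step.

    Assumes tag mutation is available in the installed Metaflow client
    and in the metadata service behind it.
    """
    # Imported lazily so the pure helper above works without Metaflow.
    from metaflow import Run, current, namespace

    # Clear the namespace filter so the current run is visible to the client.
    namespace(None)
    run = Run("%s/%s" % (current.flow_name, current.run_id))

    tag = business_id_tag(business_id)
    # Skip the call if a retried step has already added the tag.
    if tag not in run.user_tags:
        run.add_tags([tag])
    return tag
```

Inside the `start` step this would be called as `tag_current_run(self.business_id)`, where `business_id` is assumed to be a flow `Parameter` supplied at trigger time. Afterwards, runs can be filtered by that tag through the client, e.g. by passing the tag to `Flow(...).runs(...)`.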
u
Our workaround for this (and for other potential ambiguity about deploy-time vs. trigger-time configuration) has been to have our CI/CD system always couple triggering with deployment: users trigger executions via the CI/CD system, which first deploys and then triggers. A follow-on is that we usually use our CI/CD system for cron-scheduled runs instead of relying on Metaflow's `@schedule` decorator.
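A rough sketch of what such a deploy-then-trigger job could run is below. It is not the poster's actual pipeline: the helper names, the `flow.py` file name, and the `business_id` parameter are hypothetical. It relies on the `step-functions create` and `step-functions trigger` commands of a flow's CLI, with `--tag` on `create` and flow parameters passed as options on `trigger`.

```python
import sys


def deploy_then_trigger_commands(flow_file, run_tags, params):
    """Return the two CLI invocations: deploy first, then trigger."""
    # Deploy: tags given here are baked into the state machine definition.
    create = [sys.executable, flow_file, "step-functions", "create"]
    for tag in run_tags:
        create += ["--tag", tag]

    # Trigger: flow parameters are passed as --<name> <value> options.
    trigger = [sys.executable, flow_file, "step-functions", "trigger"]
    for name, value in params.items():
        trigger += ["--%s" % name, str(value)]

    return [create, trigger]


def deploy_then_trigger(flow_file, run_tags, params):
    """Run both commands in order, stopping if the deploy fails."""
    import subprocess

    for cmd in deploy_then_trigger_commands(flow_file, run_tags, params):
        subprocess.run(cmd, check=True)
```

Because every trigger is preceded by a fresh deploy, the statically encoded tags can differ per execution, at the cost of one re-deployment per run.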