# dev-metaflow
q
Hey folks, I am trying to resurrect the discussion on subflow support in Metaflow. I know this community has discussed it in the past, and a few proposals have come up. Have the maintainers blessed any proposal as “accepted” so far? Is any work in progress, or are there experimental patches I could try out? If not, I may have (yet another :p) proposal to bring to the table, and I would absolutely love to get the community’s feedback on it. One of the projects in our org critically requires subgraph support, so in the coming months I would be very keen on contributing upstream if that helps.
v
hey Sarthak! Thanks for resurfacing the topic
in fact we just chatted about it with @dry-beach-38304 and others yesterday. Comprehensive support for flow composition (including subflows) is on our roadmap for next year, though the exact timing is TBD
s
event triggering is the first step in that direction; it will go out in a month
I'd love to hear your thoughts! @quick-lighter-52296
q
Thanks for the update. I saw the doc for event-based triggering. It is indeed exciting, but we were looking for a more synchronous approach, wherein the parent flow waits for (potentially multiple, as in foreach) child flows to complete and can then access their outputs in downstream steps as if they were produced by its own steps. Our motivation for such flow composition is two-fold. First, ease of maintenance: we collect various flow-level statistics such as time taken, failure rate, etc., and collecting them per subflow would be useful. We could also version each subflow separately (for example, if each subflow were maintained by a different team). Second, we want to enable flow-level caching. That is, by deriving a cache key from the inputs and source code of a subflow, we can determine whether a matching subflow run already exists, and if so, save ourselves a trigger. We initially thought of doing this caching at the step level, but that is more complicated because the Metaflow DAG is static by design.
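To make the flow-level caching idea concrete, here is a minimal sketch, assuming the subflow's source is available as a string and its inputs are JSON-serializable. `subflow_cache_key`, `run_or_reuse`, the `completed_runs` dict, and the `trigger` callback are all hypothetical names for illustration; none of them are Metaflow APIs.

```python
import hashlib
import json


def subflow_cache_key(source_code: str, inputs: dict) -> str:
    """Hash the subflow's source text together with its JSON-serializable inputs."""
    digest = hashlib.sha256()
    digest.update(source_code.encode("utf-8"))
    digest.update(b"\x00")  # separator so source and inputs cannot run together
    # sort_keys makes the key independent of dict insertion order
    digest.update(json.dumps(inputs, sort_keys=True).encode("utf-8"))
    return digest.hexdigest()


def run_or_reuse(source_code, inputs, completed_runs, trigger):
    """Return (run_id, reused). Only call trigger() when no matching run is known.

    completed_runs: hypothetical dict mapping cache key -> run id
    trigger: hypothetical callable that starts a subflow run and returns its id
    """
    key = subflow_cache_key(source_code, inputs)
    if key in completed_runs:
        return completed_runs[key], True
    run_id = trigger(inputs)
    completed_runs[key] = run_id
    return run_id, False
```

A real implementation would also need to decide what counts as "source code" (dependencies, environment, base image), which this sketch leaves out.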
s
makes sense. The synchronous mode is exactly what subflows will enable. We have been thinking of supporting subflows by composing a static DAG before the run (or deployment) starts, so much of the current semantics stays the same. The main difference is that you'll be able to compose a static DAG out of static sub-DAGs. By default, this approach doesn't lend itself to caching as you described it: we don't support conditional steps, and hence there would be no conditional subflows if we keep the existing semantics. It's great that you brought this up so we can consider this case in more detail
the caching behavior would be easier to do with event triggering:
• Check if the results R exist:
  ◦ If they do, trigger flow X, which consumes R
  ◦ If not, trigger flow Y to produce R, which upon completion triggers X
so maybe both of your use cases will be covered by a combination of event triggering and subflows in the future 🤔
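The check-then-trigger logic above could look roughly like this. It is only a sketch of the control flow: `results_store`, `trigger`, and both function names are hypothetical stand-ins, not the event-triggering API itself.

```python
def handle_request(results_store: set, result_key: str, trigger) -> None:
    """Decide which flow to trigger depending on whether results R already exist."""
    if result_key in results_store:
        trigger("X")  # R exists: go straight to the consumer flow
    else:
        trigger("Y")  # R is missing: produce it first; X follows on completion


def on_flow_completed(flow_name: str, results_store: set, result_key: str, trigger) -> None:
    """Completion hook: once producer Y finishes, record R and trigger consumer X."""
    if flow_name == "Y":
        results_store.add(result_key)
        trigger("X")
```

In the cache-miss path the sequence is Y, then X; in the cache-hit path only X runs.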
q
Indeed. As for subflows, are you guys in favour of inheritance-based composition, or some other technique? I would like to bring a counter-argument against inheritance-based composition, in that it is limited in the following aspects:
1. An inherited class/step can only be transitioned to once (I presume), though certain use cases may require repeated/multiple transitions to the same sub-flow (as in branching, but not foreach). I am not saying these multiple transitions will always be to the same sub-graph. More formally, there should be some provision for a sub-graph to be embedded at multiple places in a top-level graph.
2. Semantically speaking, inheritance represents an “is-a” relationship, whereas sub-graphs / sub-flows represent more of a “has-a” relationship. That is why it feels a bit awkward to say that a flow embeds its inherited parent.
Though I am by no means well-versed in Metaflow's internals or source code (I am only just starting to dig into it), I am of the opinion that the following syntax should be possible to achieve. As far as implementation is concerned, AWS Step Functions can synchronously trigger and wait for another Step Functions execution (first-class support, aka an optimised integration). Curious to hear your immediate thoughts. (Attaching as a screenshot, as Slack thread-view wrapping hell may render code unreadable.)
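For reference, the synchronous trigger-and-wait integration mentioned above is expressed in Amazon States Language as a Task state with a `startExecution.sync` resource. The snippet below is only a sketch of such a state built as a Python dict; it is not the syntax from the screenshot, the helper name `subflow_task_state` is hypothetical, and the child ARN is a placeholder.

```python
import json


def subflow_task_state(child_state_machine_arn: str, next_state: str) -> dict:
    """Build a Task state that starts a child execution and waits for it to finish."""
    return {
        "Type": "Task",
        # ".sync:2" waits for the child execution and returns its output as JSON
        "Resource": "arn:aws:states:::states:startExecution.sync:2",
        "Parameters": {
            "StateMachineArn": child_state_machine_arn,
            "Input.$": "$",  # pass the parent's state input through to the child
        },
        "Next": next_state,
    }


# Placeholder ARN, for illustration only.
state = subflow_task_state(
    "arn:aws:states:us-east-1:123456789012:stateMachine:ChildFlow", "join"
)
print(json.dumps(state, indent=2))
```

Because the state is parameterized by the child ARN, the same child could be embedded at several places in a parent graph, which matches the “has-a” framing.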