# dev-metaflow
m
Hey everyone 👋 I'm a current Kubeflow user who has been following Metaflow for some time. We're currently running into limitations with workflows with ~500-1000 nodes/a high degree of fan-out, and we're investigating either returning to vanilla Argo or moving to other tools. I'd be curious whether anyone has experience running very large flows and whether there are any special considerations there. Also, looking forward to the Argo support -- once that lands, I'm quite interested in giving it a try for some of our thorniest pipelines.
a
Hey Rahul, welcome to the community!
We are just putting the final touches on the Argo integration - it should allow you to trivially fan out to 1000 parallel tasks.
🙌 1
m
Fantastic -- I'd be happy to follow up with you separately, since we have some pipelines I'd be interested in trying it out on. Do you have an idea when this will land? https://github.com/Netflix/metaflow/pull/992
a
We should be able to get it in this week or early next week.
q
I tried a flow with 2000 parallel tasks on Kubernetes. It took a couple of attempts to finish because some tasks died randomly (if I remember correctly, I forgot to add the retry; and not so randomly, I think it was due to OOM). But all went fine: nodes scaled up and down automatically, etc. All this without Argo, obviously. I'm sure Argo is going to be amazing!
❤️ 1
Can we think of the Argo integration as the "Step Functions of Kubernetes", or is it something totally different?
a
@quiet-motherboard-43023 That's correct - it will be like SFN for k8s
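Side note on the 2000-task run mentioned above: here is a rough back-of-the-envelope sketch of why a missing retry hurts so much at that fan-out width. The per-attempt failure rate (0.1%) is purely hypothetical, and the model assumes failures are independent and transient. The helper name `p_all_succeed` is made up for this illustration and is not part of Metaflow.

```python
# Illustrative numbers only: the per-attempt failure rate is hypothetical,
# and failures are assumed to be independent and transient.

def p_all_succeed(num_tasks: int, p_fail: float, retries: int = 0) -> float:
    """Probability that every task in a fan-out eventually succeeds.

    A task fails for good only if all (1 + retries) attempts fail.
    """
    p_task_fails = p_fail ** (1 + retries)
    return (1.0 - p_task_fails) ** num_tasks


if __name__ == "__main__":
    for retries in (0, 1, 3):
        p = p_all_succeed(num_tasks=2000, p_fail=0.001, retries=retries)
        print(f"retries={retries}: P(all 2000 tasks succeed) = {p:.4f}")
```

Under these assumptions, a run with no retries completes cleanly only around 14% of the time, while a single retry per task pushes that above 99%. Retries only help with transient failures, though. A task that dies from OOM will most likely die again on the next attempt, so the fix there is to request more memory (in Metaflow, the `@retry` and `@resources` decorators cover these two cases).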