I asked another vendor, "What are the weaknesses of Metaflow?" Does anyone want to challenge some of these?
———-
Quoting the vendor:
Based on my understanding:
• Metaflow cannot have custom containers per step (at least I could not find where to push them)
• DAG-only execution, i.e. you cannot have logic-driven flows (see the sketch after this list)
• Cannot connect git repositories to different components in the pipeline
• Visualization of results / artifacts is rather limited
• Only Kubernetes is supported as the underlying provisioning layer
◦ Although plugins for IaaS providers (AWS/GCP/Azure) are available, they do not seem trivial to configure, and they appear to need to be configured as part of the pipeline itself (but I might be wrong here)
• No caching available (i.e. if a component/step was already executed with the same arguments/code, reuse its result)
• I do not believe there is any role-based access control on top (i.e. it seems everyone is an "admin")
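For context, here is a minimal sketch (my own illustration, not from the vendor's message) of what a Metaflow flow looks like: steps are methods on a `FlowSpec` class and the graph is wired up front with `self.next()`, which is what the "DAG only" point above refers to. The flow and step names are made up.

```python
# Minimal Metaflow sketch: the step graph is declared when the flow is
# defined via self.next(), rather than being driven by runtime logic.
from metaflow import FlowSpec, step


class ExampleFlow(FlowSpec):

    @step
    def start(self):
        # Artifacts are plain attributes; Metaflow snapshots them between steps.
        self.numbers = [1, 2, 3]
        self.next(self.process)

    @step
    def process(self):
        self.total = sum(self.numbers)
        self.next(self.end)

    @step
    def end(self):
        print(f"total = {self.total}")


if __name__ == "__main__":
    ExampleFlow()
```

You would run it with `python example_flow.py run`.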
As a rule of thumb, Metaflow was created to build batch inference pipelines, and I think it is very good at that as an alternative to, for example, SageMaker.
It was not, however, designed to be a tool for R&D-to-production acceleration, and this is exactly what ClearML does. ClearML helps you build the pipelines as part of the research and engineering work, not as a standalone "production" process. This means flexibility and visibility are key concepts that seem to be missing from Metaflow, which is designed more with "devops" in mind than ML engineers / data scientists.
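For comparison, a rough sketch of ClearML's decorator-based pipelines as I understand them (the names and arguments here are illustrative, please double-check against the ClearML docs): components are ordinary Python functions, plain control flow can decide what runs, and `cache=True` relates to the caching point in the list above.

```python
# Rough ClearML sketch (assumed typical usage, not from the original post):
# pipeline logic lives next to the research code as decorated functions.
from clearml import PipelineDecorator


@PipelineDecorator.component(return_values=["dataset"], cache=True)
def load_data(source_url):
    # Each component runs as its own ClearML task; cache=True reuses results
    # when the code and arguments are unchanged.
    return {"url": source_url, "rows": 100}


@PipelineDecorator.component(return_values=["score"])
def evaluate(dataset):
    return dataset["rows"]


@PipelineDecorator.pipeline(name="example pipeline", project="examples", version="0.1")
def run_pipeline(source_url="https://example.com/data.csv"):
    dataset = load_data(source_url)
    # Ordinary Python branching drives the flow, not a pre-declared DAG.
    if dataset["rows"] > 0:
        score = evaluate(dataset)
        print(f"score = {score}")


if __name__ == "__main__":
    # Run everything in the local process for quick debugging.
    PipelineDecorator.run_locally()
    run_pipeline()
```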
My two cents of course 🙂 and if anyone feels differently or wants to share their experience, please do!