# ask-metaflow
b
Hey! I'm in DevOps, and I've proposed to the Data Science team at my company that we switch from SageMaker to Metaflow hosted on EKS for better scaling and pipeline orchestration. I'm meeting with the Data Science team soon. Does anybody have a PowerPoint or anything else they can share to increase my chances of convincing them? 😅
l
Oh sick! Been there. We just went under contract with Outerbounds in January and it was a major struggle. Some big downsides of SageMaker are:
• cryptic error messages
• compute is more expensive (I don't remember the premium, but it's like 50% over vanilla EC2 instances)
• you have to build/tag/push/reference Docker images yourself, usually waiting for CI to do that for you each time you make a change
• pretty much zero local development story
• really convoluted SDK; DS will struggle to be productive on it on their own, e.g. picking their own EC2 instance sizes, picking which IAM roles to use, chaining different obscure types of operators together

If you want a wingman, I'd be down to come walk you guys through how we're running it at pattern. We're obviously on the managed version, but the same principles apply. We also looked at ZenML, Dagster, and Airflow, so we were pretty confident about our choice. Although if we were self-hosting, it'd be a hard choice between ZenML and Metaflow for me.