Hello,
Long-time user of Metaflow here.
We've got Metaflow set up with AWS Batch and it's running smoothly with the Datastore on AWS S3.
I can run jobs from a single on-prem node and scale up to AWS Batch as needed. I'm aware that Metaflow can be integrated with Kubernetes for on-prem nodes, as mentioned in
https://youtu.be/3zYK0w7Y6L4
Right now, our on-prem nodes are managed by a SLURM scheduler. Since we need to use these nodes and managing them with Kubernetes isn't an option, I'm wondering what it would take to get Metaflow working with SLURM.
Is this something I could realistically do on my own? I'd appreciate any guidance you can give. If it's going to be a big project, please let me know so we can look into other options.