# ask-metaflow
c
Hi! I am creating a flow that calculates CLIP embeddings along with some other intermediate model outputs. I loop over every image with `foreach='images'`, and each task reloads the CLIP model. Is there a way to prevent the model from being reloaded, e.g. by sharing model weights between processes, or some other approach for inference? Is batch prediction the only way to go?
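For reference, a minimal sketch of the setup described above, assuming the Hugging Face `transformers` CLIP implementation and placeholder image paths (all names here are hypothetical): each foreach task handles a single image and reloads the model weights itself.

```python
from metaflow import FlowSpec, step


class ClipEmbeddingFlow(FlowSpec):
    """Hypothetical flow: one foreach task per image, model reloaded in each task."""

    @step
    def start(self):
        # Placeholder image paths; in practice these would come from a
        # parameter or an upstream step.
        self.images = ["img_0.jpg", "img_1.jpg", "img_2.jpg"]
        self.next(self.embed, foreach="images")

    @step
    def embed(self):
        from PIL import Image
        from transformers import CLIPModel, CLIPProcessor

        # This load happens in every task -- the overhead in question.
        model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
        processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

        inputs = processor(images=Image.open(self.input), return_tensors="pt")
        self.embedding = model.get_image_features(**inputs).detach().numpy()
        self.next(self.join)

    @step
    def join(self, inputs):
        self.embeddings = [task.embedding for task in inputs]
        self.next(self.end)

    @step
    def end(self):
        pass


if __name__ == "__main__":
    ClipEmbeddingFlow()
```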
a
Hey Nir, each of the tasks for `foreach` is independent. There are a few approaches to make the setup more efficient:
1. The simplest is to batch more images into every task so that the number of tasks (and hence model loads) is reduced.
2. You could host the model as a separate service that lives only for the lifecycle of the step/flow, and have the tasks communicate with that service for inference. Such functionality is not available in OSS Metaflow, but it is in Outerbounds.
c
Hi, can you please give me a reference for this functionality in Outerbounds? Thank you!
a
Here is a very timely article (published today) that talks about this pattern: https://developer.nvidia.com/blog/building-llm-powered-production-systems-with-nvidia-nim-and-outerbounds/