# ask-metaflow
c
Hi! I am creating a flow that calculates CLIP embeddings along with some other intermediate model outputs. I loop over every image with `foreach='images'`, and each task reloads the CLIP model. Is there a way to prevent the model from being reloaded, e.g. by sharing model weights between processes, or some other approach for inference? Is batch prediction the only way to go?
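For reference, a minimal sketch of the setup described above, assuming the Hugging Face `transformers` CLIP implementation and placeholder image paths (all names here are hypothetical): each foreach task handles a single image and reloads the model weights itself.

```python
from metaflow import FlowSpec, step


class ClipEmbeddingFlow(FlowSpec):
    """Hypothetical flow: one foreach task per image, model reloaded in each task."""

    @step
    def start(self):
        # Placeholder image paths; in practice these would come from a
        # parameter or an upstream step.
        self.images = ["img_0.jpg", "img_1.jpg", "img_2.jpg"]
        self.next(self.embed, foreach="images")

    @step
    def embed(self):
        from PIL import Image
        from transformers import CLIPModel, CLIPProcessor

        # This load happens in every task -- the overhead in question.
        model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
        processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

        inputs = processor(images=Image.open(self.input), return_tensors="pt")
        self.embedding = model.get_image_features(**inputs).detach().numpy()
        self.next(self.join)

    @step
    def join(self, inputs):
        self.embeddings = [task.embedding for task in inputs]
        self.next(self.end)

    @step
    def end(self):
        pass


if __name__ == "__main__":
    ClipEmbeddingFlow()
```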
a
Hey Nir, each of the tasks for `foreach` is independent. There are a few approaches to make the setup more efficient:
1. The simplest is to batch more images into every task so that the number of tasks (and hence model loads) is reduced.
2. You could host the model as a separate service that lives only for the lifecycle of the step/flow, and have the tasks communicate with that service for inference. Such functionality is not available in OSS Metaflow, but it is in Outerbounds.
c
Hi, can you please give me a reference for this functionality in Outerbounds? Thank you!
a
Here is a very timely article (published today) that talks about this pattern: https://developer.nvidia.com/blog/building-llm-powered-production-systems-with-nvidia-nim-and-outerbounds/