The human-centric platform for production ML & AI

Outerbounds

Hey friends, not strictly a Metaflow question, but some of you smart people must be working with Hugging Face models in your flows. We're seeing unreasonably slow model load times from AWS EBS volumes with 7-20B parameter models. Since we didn't want to download the models from HF on every run, we store them on the volume and use `AutoModel.from_pretrained("path/to/model", local_files_only=True)` to load them when needed. But it takes ages, and the odd part is that downloading (which also writes to the cache) and then loading the model into memory on the spot seems to be faster, so I'm not willing to blame disk speed... so, has anybody seen something similar? :)
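For anyone who wants to reproduce this, here is a rough timing sketch that separates "reading the bytes off the volume" from "building the model". The helper names `time_raw_read` and `time_model_load` are made up for illustration, and the model directory is whatever path you pass on the command line. One caveat: the raw read warms the OS page cache, so the `from_pretrained` call that follows is mostly served from memory. If it is still slow after that, the disk is probably not the bottleneck.

```python
import sys
import time
from pathlib import Path


def time_raw_read(model_dir, chunk_size=16 * 1024 * 1024):
    """Sequentially read every file under model_dir.

    Returns (total_bytes, elapsed_seconds). This only measures how fast
    the bytes come off the volume; nothing is deserialized.
    """
    total = 0
    start = time.perf_counter()
    for path in sorted(Path(model_dir).rglob("*")):
        if path.is_file():
            with open(path, "rb") as f:
                while chunk := f.read(chunk_size):
                    total += len(chunk)
    return total, time.perf_counter() - start


def time_model_load(model_dir):
    """Time the same call as in the post; returns elapsed seconds."""
    # Imported lazily so time_raw_read works without transformers installed.
    from transformers import AutoModel

    start = time.perf_counter()
    AutoModel.from_pretrained(model_dir, local_files_only=True)
    return time.perf_counter() - start


if __name__ == "__main__" and len(sys.argv) > 1:
    model_dir = sys.argv[1]  # hypothetical: path to the model on the EBS volume

    n_bytes, read_s = time_raw_read(model_dir)
    mb = n_bytes / 1e6
    print(f"raw read: {mb:.0f} MB in {read_s:.1f}s ({mb / max(read_s, 1e-9):.0f} MB/s)")

    # The page cache is warm at this point, so this mostly measures
    # deserialization and model construction rather than disk reads.
    load_s = time_model_load(model_dir)
    print(f"from_pretrained: {load_s:.1f}s")
```

Run the raw read on a freshly attached volume to see cold-read throughput. A second run of the same script shows warm-cache numbers for comparison.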