👋 My pipeline processes 200k documents with an LLM. I'm planning to run the Metaflow flow on a single GPU machine. Steps: 1) fetch document data; 2) do information extraction with the LLM; 3) save results to a database.
I need to process in batches of 1000 documents: fetch 1000 documents, run information extraction on all 1000 at once (batched LLM calls), save to the DB, then repeat with the next batch of 1000.
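Ignoring Metaflow for a moment, the loop I have in mind is roughly this (the `fetch_batch` / `extract_with_llm` / `save_to_db` helpers are hypothetical stand-ins for the three real steps):

```python
# Plain-Python sketch of the batch loop: 200k docs, 1000 at a time.
# All three helpers are hypothetical placeholders, not real APIs.
TOTAL_DOCS = 200_000
BATCH_SIZE = 1_000

def fetch_batch(offset, size):
    # hypothetical: would query the document store for one batch
    return [f"doc-{i}" for i in range(offset, offset + size)]

def extract_with_llm(docs):
    # hypothetical: would issue one batched LLM call for all docs
    return [{"doc": d, "entities": []} for d in docs]

def save_to_db(results):
    # hypothetical: would write extraction results to the database
    pass

def run():
    processed = 0
    for offset in range(0, TOTAL_DOCS, BATCH_SIZE):
        docs = fetch_batch(offset, BATCH_SIZE)      # 1) fetch 1000 docs
        results = extract_with_llm(docs)            # 2) batched extraction
        save_to_db(results)                         # 3) persist, then repeat
        processed += len(results)
    return processed
```

The question is how to map this sequential loop onto Metaflow's structure.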
Q: How do I implement this batching? 1) Do I have a flow that processes a single batch and just trigger that flow once per batch? Feels like I'm missing something. 2) I could build the batches in `start` and then use `foreach` over them. But how do I then ensure that the information extraction step runs for only one batch at a time? Do I need some internal semaphore?
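For option 2, this untested sketch is what I have in mind (the `fetch_batch` / `run_llm_extraction` / `save_batch` helpers are hypothetical):

```python
# Sketch of a foreach-based flow: start fans out over batch indices,
# and each extract task handles one batch of 1000 docs end to end.
from metaflow import FlowSpec, step

class DocPipelineFlow(FlowSpec):

    @step
    def start(self):
        # 200k documents / 1000 per batch -> 200 batch indices
        self.batch_ids = list(range(200))
        self.next(self.extract, foreach="batch_ids")

    @step
    def extract(self):
        batch_id = self.input
        # docs = fetch_batch(batch_id)          # hypothetical helper
        # results = run_llm_extraction(docs)    # batched LLM calls
        # save_batch(results)                   # write to the DB
        self.next(self.join)

    @step
    def join(self, inputs):
        self.next(self.end)

    @step
    def end(self):
        pass

if __name__ == "__main__":
    DocPipelineFlow()
```

If I understand correctly, running locally with `python flow.py run --max-workers 1` would cap the scheduler at one concurrent task, so the foreach tasks execute one batch at a time on the single GPU without any semaphore in my code. Does that sound right?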