happy-wolf-7852
02/21/2025, 1:58 PM
from metaflow import FlowSpec, batch, retry, step


class Postprocessing(FlowSpec):

    @retry(times=0)
    @batch(cpu=8, memory=64000, queue='metaflow-684414486554-55ff636')
    @step
    def start(self):
        self.regions = ['reg1', 'reg2', 'reg3']
        self.next(self.post_process_by_region, foreach='regions')

    @batch(cpu=48, memory=384000, queue='metaflow-684414486554-55ff636')
    @step
    def post_process_by_region(self):
        current_region_name = self.input
        # do some really exciting stuff!
        self.next(self.join)

    @step
    def join(self, inputs):
        # do some even more really exciting stuff!
        self.next(self.end)

    @step
    def end(self):
        pass


if __name__ == '__main__':
    Postprocessing()
I run it as a (scheduled) step function like so:
python Postprocessing.py --package-suffixes .sql --environment=conda --with retry step-functions create
python Postprocessing.py --package-suffixes .sql --environment=conda step-functions trigger
Any idea why it does not run in parallel? Thanks!

square-wire-39606
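An aside on the parallelism question (my own back-of-the-envelope reasoning, not from the thread): a `foreach` deployed via Step Functions does fan out into parallel Batch jobs, but actual concurrency is capped by the compute environment. With `@batch(cpu=48)` per task, an environment limited to, say, 96 maximum vCPUs can only run two tasks at once; the rest wait in RUNNABLE. A minimal sketch of that bound (the 96-vCPU figure is illustrative):

```python
# Sketch: CPU-packing bound on AWS Batch concurrency.
# max_vcpus mirrors the compute environment's "Maximum vCPUs" setting;
# cpus_per_task mirrors the @batch(cpu=...) request in the flow above.

def max_concurrent_tasks(max_vcpus: int, cpus_per_task: int) -> int:
    """Upper bound on simultaneously running Batch jobs, CPU-wise."""
    return max_vcpus // cpus_per_task

print(max_concurrent_tasks(96, 48))  # cpu=48 per task -> only 2 in parallel
print(max_concurrent_tasks(96, 8))   # cpu=8 per task  -> up to 12
```

Memory requests constrain packing the same way, so whichever of CPU or memory is scarcer sets the real limit.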
02/21/2025, 3:47 PM
hundreds-rainbow-67050
02/21/2025, 4:06 PM
square-wire-39606
02/21/2025, 4:10 PM
happy-wolf-7852
02/21/2025, 5:42 PM
wonderful-gpu-7095
02/21/2025, 5:43 PM
hundreds-rainbow-67050
02/21/2025, 5:59 PM
happy-wolf-7852
02/24/2025, 11:07 AM
c7i.large, r7i.xlarge, r7i.16xlarge, r7i.12xlarge, r7i.8xlarge, r7i.24xlarge, r7i.large, r7i.4xlarge, r7i.2xlarge
and my Maximum vCPUs is set to 96. I cannot see a maximum-memory setting. Let us say, ideally, I want to run N >> 10 jobs in parallel using foreach steps, each requiring about 32 GB of memory, like my local machine. What would I have to spec? I do not care too much about speed/CPU …
Often/thus far, something like this:

@batch(cpu=4, memory=32000, queue='metaflow-1234')

resulted in N << 10 parallel jobs, with the rest stuck in RUNNABLE.

hundreds-rainbow-67050
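One possible explanation for jobs sticking in RUNNABLE (an assumption on my part; the reserve figure below is illustrative, the real value varies by AMI): AWS Batch places jobs against the memory the ECS agent actually advertises, which is somewhat less than the instance's nominal memory. A request of `memory=32000` MiB therefore may not fit on a nominally 32 GiB instance such as r7i.xlarge:

```python
# Sketch: does a Batch memory request fit on a given instance type?
# Nominal memory for two of the allowed instance types, in MiB.
NOMINAL_MIB = {
    "r7i.xlarge": 32768,    # 32 GiB
    "r7i.2xlarge": 65536,   # 64 GiB
}
ECS_RESERVED_MIB = 1024     # illustrative OS/agent reserve; varies in practice

def fits(instance: str, request_mib: int) -> bool:
    """True if the @batch(memory=...) request fits in advertised memory."""
    return request_mib <= NOMINAL_MIB[instance] - ECS_RESERVED_MIB

print(fits("r7i.xlarge", 32000))   # full 32000 MiB does not fit -> False
print(fits("r7i.xlarge", 30000))   # a trimmed request fits -> True
print(fits("r7i.2xlarge", 32000))  # or use a larger instance -> True
```

If this is the cause, either trimming the memory request slightly below the instance's nominal size or allowing larger instance types should let more jobs schedule.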
02/24/2025, 2:17 PM
happy-wolf-7852
02/24/2025, 2:20 PM
hundreds-rainbow-67050
02/24/2025, 4:25 PM
happy-wolf-7852
02/24/2025, 5:36 PM