brave-fall-84099
02/20/2025, 2:38 PM
An error occurred (TooManyRequestsException) when calling the DescribeJobDefinitions operation (reached max retries: 4): Too Many Requests
when fanning out to ~100 steps.
I see some open PRs about the subject. I am wondering about two things:
• Are there config values (max retries, backoff?) that I can play with?
• Can I catch this issue happening in the Runner API?
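For the second question, one possible stopgap is to wrap the launch command and abort as soon as the throttling error shows up in the console output. The sketch below is only an illustration under stated assumptions: it uses the Python standard library rather than the Runner API, `run_fail_fast` is a hypothetical helper name, and `flow.py` in the usage comment is a placeholder. It also assumes the error text reaches stdout or stderr, and it makes no promise that terminating the launcher cleans up jobs that were already submitted.

```python
import subprocess
import sys

# Error name taken from the message quoted above.
THROTTLE_MARKER = "TooManyRequestsException"


def run_fail_fast(cmd, marker=THROTTLE_MARKER):
    """Run cmd, echo its output, and stop it as soon as marker appears.

    Returns (exit_code, throttled). Hypothetical helper, not a Metaflow API.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so one loop sees every line
        text=True,
        bufsize=1,
    )
    throttled = False
    for line in proc.stdout:
        sys.stdout.write(line)  # keep the normal console output visible
        if marker in line:
            throttled = True
            proc.terminate()  # ask the launcher process to stop
            break
    proc.stdout.close()
    return proc.wait(), throttled


# Usage (placeholder file name):
# exit_code, throttled = run_fail_fast([sys.executable, "flow.py", "run"])
# if throttled:
#     sys.exit("Throttled while registering jobs; aborted early.")
```

For the first question, boto3 itself reads `AWS_MAX_ATTEMPTS` and `AWS_RETRY_MODE` from the environment. Whether the client that issues DescribeJobDefinitions here honors them is an assumption that has not been verified.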
I would be happy if the whole run failed immediately (or if I could make it fail). The annoying thing is that 2 out of 100 jobs fail to even be registered, yet the whole run keeps going and only crashes before the join because those jobs were never registered, wasting a lot of time.
brave-fall-84099
02/20/2025, 2:49 PM
@secrets. How can I catch this / make note of it?
Currently I can only figure out why 2 / 100 tasks are stillborn because I happen to see the error fly by in my console output.
ancient-application-36103
02/20/2025, 4:30 PM
TooManyRequestsException for DescribeJobDefinitions is unfortunately due to a global AWS limit. A better bet would be to get those limits raised by working with your AWS TAM.
ancient-application-36103
02/20/2025, 4:31 PM
@secrets?
ancient-application-36103
02/20/2025, 4:34 PM