bulky-portugal-95315
09/18/2024, 5:09 PM
@subflow decorator that functions similarly to the way the metaflow.Runner functionality executes flows remotely, but we have some flows using our custom decorator that are deployed to Step Functions and run via Batch. We found the issue with our custom decorator by deploying a flow to Step Functions where the subflow being executed uses the @batch decorator: kicking off the job within the flow on Batch failed with an IAM role issue. We validated that the same thing occurs if we call metaflow.Runner within the flow deployed to Step Functions, but it can be remedied by specifying the iam_role for @batch. This tells us that the Batch role passed to the step function is not being passed down to the flow within, I think because the step function passes it as an argument to python flow.py run. We wanted to check whether this is the expected behavior or whether this is a new use case for metaflow.Runner. Thank you, and happy to jump on a call to describe it in more detail if needed.
square-wire-39606
09/18/2024, 5:11 PM
square-wire-39606
09/18/2024, 5:12 PM
bulky-portugal-95315
09/18/2024, 5:14 PM
bulky-portugal-95315
09/18/2024, 5:42 PM
@batch job from step executing a @batch job would be able to share the same iam_role set further up the hierarchy?
bulky-portugal-95315
09/18/2024, 5:47 PM
iam_role is set as part of the python flow.py step start call, which I believe is why the first @batch call works, but then it’s not part of the environment executed by the metaflow.Runner type call within, since there is no ~/.metaflowconfig/config.json?
&& python mock_flow.py --with batch:cpu=1,gpu=0,memory=4096,image=1234567890.dkr.ecr.us-east-1.amazonaws.com/metaflow:latest,queue=arn:aws:batch:us-east-1:1234567890:job-queue/metaflow-queue,iam_role=arn:aws:iam::1234567890:role/metaflowBatchRole,use_tmpfs=False,tmpfs_tempdir=True,tmpfs_path=/metaflow_temp --quiet --metadata=service --environment=local --datastore=s3 --datastore-root=s3://metaflow-bucket/metaflow --event-logger=nullSidecarLogger --monitor=nullSidecarMonitor --no-pylint --with=step_functions_internal step start --run-id sfn-$METAFLOW_RUN_ID --task-id ${AWS_BATCH_JOB_ID} --retry-count $((AWS_BATCH_JOB_ATTEMPT-1)) --max-user-code-retries 0 --input-paths sfn-${METAFLOW_RUN_ID}/_parameters/${AWS_BATCH_JOB_ID}-params)
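A rough sketch of the hypothesis above, not Metaflow's own code: the role travels only inside the `--with batch:...` argument, so a run started from within the step finds it neither in the environment nor in a config file. The helper names are hypothetical, the parser is a simplification of what the CLI does, and the setting name `METAFLOW_ECS_S3_ACCESS_IAM_ROLE` is an assumption about how the default `@batch` role is configured that should be checked against the installed Metaflow version.

```python
import json
import os
from pathlib import Path

# Assumption: the setting Metaflow reads for the default @batch IAM role.
IAM_ROLE_SETTING = "METAFLOW_ECS_S3_ACCESS_IAM_ROLE"


def parse_with_spec(spec):
    """Split a --with value like 'batch:cpu=1,iam_role=arn:...' into (name, attrs).

    Simplified: assumes attribute values contain no commas.
    """
    name, _, rest = spec.partition(":")
    attrs = {}
    for item in filter(None, rest.split(",")):
        key, _, value = item.partition("=")
        attrs[key] = value
    return name, attrs


def configured_iam_role(env=None, config_path=None):
    """Return the role a freshly started run could discover on its own, or None."""
    env = os.environ if env is None else env
    if env.get(IAM_ROLE_SETTING):
        return env[IAM_ROLE_SETTING]
    default_path = Path.home() / ".metaflowconfig" / "config.json"
    path = Path(config_path) if config_path else default_path
    if path.is_file():
        return json.loads(path.read_text()).get(IAM_ROLE_SETTING)
    return None


def nested_run_env(with_spec, env=None, config_path=None):
    """Environment overrides that would forward the outer step's iam_role."""
    _, attrs = parse_with_spec(with_spec)
    role = attrs.get("iam_role")
    if role and configured_iam_role(env, config_path) is None:
        return {IAM_ROLE_SETTING: role}
    return {}
```

If the assumption about the setting name holds, the returned dictionary could be handed to the nested run as extra environment variables, for example through the `env` argument of `metaflow.Runner` where the installed version provides it, as an alternative to hard-coding `iam_role` on every `@batch`.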
square-wire-39606
09/18/2024, 8:00 PM
square-wire-39606
09/18/2024, 8:00 PM
bulky-portugal-95315
09/18/2024, 8:11 PM
config.json in the container image?
square-wire-39606
09/18/2024, 8:11 PM
square-wire-39606
09/18/2024, 8:12 PM