# ask-metaflow
Hi all - we're getting the following error when trying (for the first time) to run our terraform-aws-metaflow instance with batch:
```
botocore.exceptions.ClientError: An error occurred (AccessDeniedException) when calling the DescribeJobQueues operation: User: arn:aws:iam::12345678910:user/our-ds-user is not authorized to perform: batch:DescribeJobQueues on resource:
```
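For reference, we can reproduce the same failure outside Metaflow with a minimal boto3 call (rough sketch; the queue ARN is the one from our config, and boto3 just picks up whatever credentials our environment resolves to):

```python
import boto3

# Uses whatever credentials the local environment resolves to
# (in our case, the .env AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY).
batch = boto3.client("batch", region_name="us-east-1")

# Same DescribeJobQueues call that appears in the traceback above;
# for us this raises AccessDeniedException for our-ds-user.
resp = batch.describe_job_queues(
    jobQueues=["arn:aws:batch:us-east-1:12345678910:job-queue/metaflow-batch-queue"]
)
print(resp["jobQueues"])
```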
Looking at the TF here and in our cluster, it doesn't seem like the `METAFLOW_ECS_S3_ACCESS_IAM_ROLE` is granted any Batch permissions? We are also setting `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` in our .env file to a custom IAM user created to access a non-Metaflow S3 bucket the flow needs. All flows run fine locally, hitting the metadata service and uploading artifacts to S3, all visible in the UI. Our `config_metaflow.json` looks like this:
```json
{
  "METAFLOW_DATASTORE_SYSROOT_S3": "s3://our_bucket/metaflow",
  "METAFLOW_DATATOOLS_S3ROOT": "s3://our_bucket/data",
  "METAFLOW_BATCH_JOB_QUEUE": "arn:aws:batch:us-east-1:12345678910:job-queue/metaflow-batch-queue",
  "METAFLOW_ECS_S3_ACCESS_IAM_ROLE": "arn:aws:iam::12345678910:role/metaflow-batch-s3-task-role",
  "METAFLOW_DEFAULT_DATASTORE": "s3",
  "METAFLOW_DEFAULT_METADATA": "service",
  "METAFLOW_SERVICE_INTERNAL_URL": "http://metaflow-nlb-123.elb.us-east-1.amazonaws.com/",
  "METAFLOW_SERVICE_URL": "https://metaflow-metadata-service.company.com",
  "METAFLOW_SFN_STATE_MACHINE_PREFIX": "metaflow-"
}
```
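To double-check which principal those .env credentials actually resolve to locally, we ran a quick STS check (hypothetical diagnostic snippet, not part of the flow):

```python
import boto3

# Prints the ARN of the IAM principal behind the current environment credentials;
# presumably arn:aws:iam::12345678910:user/our-ds-user, given the error above,
# i.e. the custom user from .env rather than any Metaflow-provisioned role.
print(boto3.client("sts").get_caller_identity()["Arn"])
```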
Wondering how the Batch job queue permissions are supposed to work here, and whether setting the AWS access key and secret creds in the .env is problematic? Thanks! cc: @green-analyst-32514 @billions-city-11284