enough-article-90757
06/14/2025, 12:36 AM
METAFLOW_DEFAULT_DATASTORE and METAFLOW_DATASTORE_SYSROOT_S3), Boto3 defaulted to using us-east-1 when the bucket region was us-west-2.
We isolated it down to Boto3 and not Metaflow by shelling into a worker pod and running these commands:
>>> import boto3, os
>>> boto3.client("s3").download_file("our-bucket--usw2-az1--x-s3", "metaflow/OurFlow/data/4e/4ef8def1d5cc6f655cd71109ee45ef2d6be4f55c", "job.tar")
...
  File "/usr/local/lib/python3.9/site-packages/botocore/httpsession.py", line 493, in send
    raise EndpointConnectionError(endpoint_url=request.url, error=e)
botocore.exceptions.EndpointConnectionError: Could not connect to the endpoint URL: "https://our-bucket--usw2-az1--x-s3.s3express-usw2-az1.us-east-1.amazonaws.com/?session"
Notice that METAFLOW_S3_ENDPOINT_URL is not set or used in the boto3 invocation. We're also using an S3 Express directory bucket, not sure if that changes things.
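To make the failing URL easier to read, here is a minimal sketch of how the zonal hostname in the traceback appears to be composed. It assumes the documented S3 Express naming scheme (bucket names ending in `--<az-id>--x-s3`, hosts of the form `<bucket>.s3express-<az-id>.<region>.amazonaws.com`). The helper `s3express_zonal_host` is hypothetical and written only for illustration; it is not part of boto3 or Metaflow.

```python
def s3express_zonal_host(bucket: str, region: str) -> str:
    """Build the zonal endpoint host for an S3 Express directory bucket.

    Hypothetical helper for illustration only. Assumes the bucket name
    follows the "<base>--<az-id>--x-s3" convention.
    """
    parts = bucket.split("--")
    if len(parts) < 3 or parts[-1] != "x-s3":
        raise ValueError(f"not a directory bucket name: {bucket!r}")
    az_id = parts[-2]  # e.g. "usw2-az1"
    return f"{bucket}.s3express-{az_id}.{region}.amazonaws.com"


bucket = "our-bucket--usw2-az1--x-s3"
# Host built with the fallback region (matches the URL in the traceback).
wrong_host = s3express_zonal_host(bucket, "us-east-1")
# Host built with the bucket's actual region.
right_host = s3express_zonal_host(bucket, "us-west-2")
```

With the wrong region, the availability zone ID (usw2-az1) is paired with us-east-1, which would explain why the endpoint cannot be reached.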
It looks like boto3 defaults to us-east-1 when a region is not set, cf. https://github.com/boto/boto3-legacy/blob/b3091c5c3062c5a8ddd19926069d069cb6957dae/boto3/core/constants.py#L13.
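A rough sketch of one way to pin the region explicitly rather than rely on the fallback. `resolve_region` and `make_s3_client` are hypothetical helpers, and the precedence shown (explicit argument, then `AWS_DEFAULT_REGION`, then us-east-1) is a simplified model of the behavior observed above; botocore's real resolution also consults the shared config file. The only boto3 call used is `boto3.client("s3", region_name=...)`.

```python
import os


def resolve_region(explicit=None, env=None, fallback="us-east-1"):
    """Pick a region: explicit argument, then AWS_DEFAULT_REGION, then fallback.

    Simplified model for illustration; not botocore's actual resolver.
    """
    env = os.environ if env is None else env
    return explicit or env.get("AWS_DEFAULT_REGION") or fallback


def make_s3_client(region=None):
    """Create an S3 client with the region passed explicitly."""
    import boto3  # imported lazily so resolve_region works without boto3

    return boto3.client("s3", region_name=resolve_region(region))
```

Inside the worker pod, `make_s3_client("us-west-2")` or exporting `AWS_DEFAULT_REGION=us-west-2` before creating the client should both avoid the us-east-1 fallback.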
Is it desirable for Metaflow to provide some way to set this region? METAFLOW_S3_ENDPOINT_URL could work, but that seems clunky to me. Thoughts?
enough-article-90757
06/14/2025, 1:54 AM
hundreds-rainbow-67050
06/14/2025, 6:27 AM
AWS_DEFAULT_REGION. Did you try that?
enough-article-90757
06/14/2025, 5:42 PM
hundreds-rainbow-67050
06/14/2025, 7:11 PM
enough-article-90757
06/14/2025, 7:12 PM
hundreds-rainbow-67050
06/14/2025, 7:15 PM
@kubernetes(env={"MY_VAR": "some_value", "OTHER_VAR": "123"})
if not, then @environment would do it: https://docs.outerbounds.com/set-env-vars-with-decorator/
enough-article-90757
06/14/2025, 7:26 PM