# ask-metaflow
g
how can I set different values for `METAFLOW_S3_ENDPOINT_URL` when running a flow locally with `python flow.py run` and when creating an argo workflow using `python flow.py argo-workflows create`?
a
you can set `METAFLOW_S3_ENDPOINT_URL` in your env or metaflow config
g
I feel this is the same issue as needing different URLs for `METAFLOW_SERVICE_URL`, which is solved with an extra `METAFLOW_SERVICE_INTERNAL_URL`, but there appears to be no `METAFLOW_S3_INTERNAL_ENDPOINT_URL`
a
ah i misread your question
g
> you can set `METAFLOW_S3_ENDPOINT_URL` in your env or metaflow config

But this will be used for both `run` and `argo-workflows create`, right? I need it to be different depending on the command. is there already a way to do so? I feel this could be a common problem.
a
is the goal to use different urls when running the task locally vs running it as a kubernetes pod? because `python flow.py run` can also run the workload on kubernetes
g
> is the goal to use different urls when running the task locally vs running it as a kubernetes pod?

correct
a
here is how we have structured it in the dev stack, where workloads running locally target a different endpoint than workloads running inside kubernetes
g
this is also the reason why we have both `METAFLOW_SERVICE_URL` and `METAFLOW_SERVICE_INTERNAL_URL`, is that right? so that when creating an argo workflow, the env var is set as

```python
"METAFLOW_SERVICE_URL": SERVICE_INTERNAL_URL,
```

https://github.com/Netflix/metaflow/blob/27c6aaefb3966fafadda68f2831ab3ca5510c92f/metaflow/plugins/argo/argo_workflows.py#L1677
a
would the secrets-oriented approach not work for you?
g
I'm not familiar with tilt or `devtools` yet. Could you please clarify a bit more how it solves the problem of accessing different s3 endpoints depending on local vs argo?

```python
k8s_yaml(encode_yaml({
    'apiVersion': 'v1',
    'kind': 'Secret',
    'metadata': {'name': 'minio-secret'},
    'type': 'Opaque',
    'stringData': {
        'AWS_ACCESS_KEY_ID': 'rootuser',
        'AWS_SECRET_ACCESS_KEY': 'rootpass123',
        'AWS_ENDPOINT_URL_S3': 'http://minio.default.svc.cluster.local:9000',
    }
}))
```
a
this is the relevant bit - we add a kubernetes secret in the metaflow config that is auto-applied to all pods and has the right env vars for configuring boto3
locally, the value is read from your aws config
this provides you with the option to have two different values depending on the context
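to make that concrete, a minimal sketch of the config side (illustrative values; `METAFLOW_KUBERNETES_SECRETS` is the metaflow config key that attaches the named kubernetes secrets to every pod metaflow launches — here, the `minio-secret` from the snippet above):

```json
{
    "METAFLOW_KUBERNETES_SECRETS": "minio-secret"
}
```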
g
hmm.. not sure i'm following. the way we set up metaflow is via an extension (https://github.com/Netflix/metaflow-extensions-template/blob/master/README.md); we configure most of the env vars automatically so users of metaflow don't need to know about them. users usually iterate locally with `python flow.py run` and, when it feels ready, they create an argo workflow template with `python flow.py argo-workflows create`. I'm not sure how to set different `METAFLOW_S3_ENDPOINT_URL` values depending on `run` or `argo-workflows create`. Am I missing something obvious? (btw, this may not matter much, but we're using azure instead of aws)
a
i am assuming you are also setting up a metaflow config for the user?
also, how are you setting up the S3_ENDPOINT_URL today?
g
> i am assuming you are also setting up a metaflow config for the user?

we set the configs in `METAFLOW_HOME/config_{prod|staging}.json` and `metaflow_extensions/org/config/mfextinit_org.py`

> also, how are you setting up the S3_ENDPOINT_URL today?

we're using the same endpoint for both local and remote currently, but this is not ideal.
a
is it through the config too?
g
yeah, it's through setting `METAFLOW_S3_ENDPOINT_URL` in the `METAFLOW_HOME/config_{prod|staging}.json`.
a
in this scenario, the value wouldn't come from the config
basically boto3 looks at a few specific spots for the endpoint_url. you can set those up correctly on behalf of the user both locally as well as within the kubernetes pod
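a rough sketch of those spots from the python side (this is boto3's standard endpoint resolution, nothing metaflow-specific):

```python
import boto3

# boto3 resolves the S3 endpoint without any code changes, checking in order:
#   1. an explicit endpoint_url= argument to client()/resource(),
#   2. the AWS_ENDPOINT_URL_S3 (or AWS_ENDPOINT_URL) environment variable,
#   3. an endpoint_url entry in the shared config file (~/.aws/config).
s3 = boto3.client("s3")
```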
that's the approach we use to set up the dev environment
we set a different value for endpoint url in aws config and a different value inside the pod through kubernetes secrets
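for instance, the local half could be a service-specific endpoint in `~/.aws/config` (hypothetical section names and port):

```ini
[default]
services = local-s3

[services local-s3]
s3 =
  endpoint_url = http://localhost:9000
```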
g
in your case, are you using `argo-workflows create` too?
a
yep