# ask-metaflow
r
Good afternoon - I wonder if anyone can help me. I've been tasked with setting up Metaflow/Argo Workflows for a POC and I'm stuck with defining the S3 datasource...
```
(metaflow) ➜  metaflow python parameter_flow.py argo-workflows create


Metaflow 2.12.25 executing ParameterFlow for user:home
Validating your flow...
    The graph looks good!
Running pylint...
    Pylint is happy!
Deploying parameterflow to Argo Workflows...
It seems this is the first time you are deploying parameterflow to Argo Workflows.

A new production token generated.

The namespace of this production flow is
    production:parameterflow-0-gtqo
To analyze results of this production flow add this line in your notebooks:
    namespace("production:parameterflow-0-gtqo")
If you want to authorize other people to deploy new versions of this flow to Argo Workflows, they need to call
    argo-workflows create --authorize parameterflow-0-gtqo
when deploying this flow to Argo Workflows for the first time.
See "Organizing Results" at <https://docs.metaflow.org/> for more information about production tokens.

    S3 access failed:
    S3 operation failed.
    Key requested: <s3://REMOVED/ParameterFlow/data/3d/3d432aa822ce96c4ffaa418f182b005792d00589>
    Error: An error occurred (400) when calling the HeadObject operation: Bad Request
```
The bucket policy is fine and I'm able to upload/fetch objects via the awscli using my default AWS profile. I'll slack some more info in a thread
```
➜  .metaflowconfig cat config_default.json
{
    "METAFLOW_DEFAULT_DATASTORE": "s3",
    "METAFLOW_DATASTORE_SYSROOT_S3": "<s3://REMOVED>",
    "METAFLOW_DATATOOLS_S3ROOT": "<s3://REMOVED/data>",
    "METAFLOW_KUBERNETES_NAMESPACE": "default",
    "METAFLOW_KUBERNETES_SERVICE_ACCOUNT": "default",
    "METAFLOW_PROFILE": "default",
    "AWS_DEFAULT_REGION": "eu-west-1",
    "AWS_REGION": "eu-west-1",
    "AWS_PROFILE": "default",
    "METAFLOW_S3_RETRY_COUNT": 0
}
```
Here is the basic Python script I pulled from the net. (Note: I had to add the retry decorator because it took an age just to return the failed response.)
```python
from metaflow import FlowSpec, Parameter, step, retry

class ParameterFlow(FlowSpec):
    alpha = Parameter('alpha',
                      help='Learning rate',
                      default=0.01)

    @retry(times=0)
    @step
    def start(self):
        print('alpha is %f' % self.alpha)
        self.next(self.end)

    @retry(times=0)
    @step
    def end(self):
        print('alpha is still %f' % self.alpha)

if __name__ == '__main__':
    ParameterFlow()
```
The AWS region is defined, including the profile (it's already exported in my shell anyway). I know my AWS profile is fine, because if I change it and run the Python command, Metaflow complains about not being able to connect to Kubernetes to check for Argo Workflows.
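Since the flow's S3 access goes through boto3, a quick sanity check is to ask the default credential chain what it actually resolves. This is only a sketch using standard boto3 calls, not something from the thread; if static key env vars are present, they will usually win over AWS_PROFILE:

```python
# Sketch: inspect which credentials the default boto3/botocore chain resolves.
# Metaflow's S3 tooling uses boto3 under the hood, so whatever prints here is
# what the flow will use. Nothing below is specific to this thread's setup.
import os
import boto3

session = boto3.Session()
creds = session.get_credentials()

print("AWS_PROFILE:", os.environ.get("AWS_PROFILE"))
print("AWS_ACCESS_KEY_ID set:", "AWS_ACCESS_KEY_ID" in os.environ)
if creds is None:
    print("No credentials resolved")
else:
    # 'env' means environment variables won; profile-based resolution shows up
    # as something like 'shared-credentials-file', 'sso', or 'assume-role'.
    print("Resolved via:", creds.method)
```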
To add a bit more info, I haven't followed the recommended Metaflow AWS tutorial... I'm hoping to keep this simple and learn as I go along. All I have at the moment is Metaflow installed locally via `pip` and Argo Workflows running in Kubernetes.
s
do you have permissions locally to make the HEAD call to S3?
```
Error: An error occurred (400) when calling the HeadObject operation: Bad Request
```
the metaflow config looks fine
r
I don't think there is such a permission, it's more of an alias, at least that's my understanding:
```
The following actions are related to HeadObject:

GetObject
GetObjectAttributes
```
```
aws s3 cp parameter_flow.py <s3://REMOVED/ParameterFlow/data/05/05088f624c10273fa6ef6ad23916bce59b4cda17> --profile default
upload: ./parameter_flow.py to <s3://REMOVED/ParameterFlow/data/05/05088f624c10273fa6ef6ad23916bce59b4cda17>

aws s3 cp <s3://REMOVED/ParameterFlow/data/05/05088f624c10273fa6ef6ad23916bce59b4cda17> test.tx --profile default
download: <s3://REMOVED/ParameterFlow/data/05/05088f624c10273fa6ef6ad23916bce59b4cda17> to ./test.tx
```
The 400 response for HeadObject has caused me issues in the past... I think once it was related to KMS. But still, if I can run it via the AWS CLI, I'm totally confused why it wouldn't work with this Python script 🤔 I'm not a Python developer, so maybe I am doing something wrong. I'm using a Python virtualenv.
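One way to take Metaflow out of the equation is to repeat the failing HeadObject call directly with boto3, once with the default credential chain (what the flow uses) and once with an explicitly named profile (roughly what `aws s3 cp ... --profile default` does). This is only a sketch; the bucket name is redacted above, so the value below is a placeholder:

```python
# Sketch: reproduce the HeadObject call outside Metaflow to see whether the
# default credential chain behaves differently from an explicit profile.
import boto3
from botocore.exceptions import ClientError

BUCKET = "REMOVED"  # placeholder for the redacted bucket name
KEY = "ParameterFlow/data/05/05088f624c10273fa6ef6ad23916bce59b4cda17"  # key from the cp test above

for label, session in [
    ("default chain", boto3.Session()),
    ("explicit profile", boto3.Session(profile_name="default")),
]:
    s3 = session.client("s3")
    try:
        s3.head_object(Bucket=BUCKET, Key=KEY)
        print(f"{label}: HeadObject OK")
    except ClientError as exc:
        print(f"{label}: {exc.response['Error']['Code']} {exc}")
```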
OK, I got it working... I unset the AWS_SECRET_ACCESS_KEY, AWS_ACCESS_KEY_ID and AWS_SESSION_TOKEN env vars and suddenly it works. I really don't get it, because those vars are dynamically populated by my SSO session using Leapp... and the CLI was fine.
In case it helps anyone else: https://docs.aws.amazon.com/cli/latest/topic/config-vars.html#id1
```
If AWS_PROFILE environment variable is set and the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables are set, then the credentials provided by AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY will override the credentials located in the profile provided by AWS_PROFILE.
Still don't understand how the CLI worked...
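A plausible explanation, not confirmed in the thread: the CLI tests above passed `--profile default` explicitly, and an explicitly named profile takes precedence over the key env vars, whereas the default boto3 chain that Metaflow uses checks AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY before AWS_PROFILE, so the flow kept picking up the stale Leapp session (a 400 on HeadObject is also a common symptom of an expired session token). A minimal sketch of that difference, assuming a `default` profile exists in ~/.aws:

```python
# Sketch of the precedence difference, with dummy values only. On recent
# botocore versions the default session should report 'env' (the key env vars
# win over AWS_PROFILE), while a session with an explicitly named profile
# should resolve from the profile instead.
import os
import boto3

os.environ["AWS_PROFILE"] = "default"
os.environ["AWS_ACCESS_KEY_ID"] = "AKIAEXAMPLE"        # dummy value
os.environ["AWS_SECRET_ACCESS_KEY"] = "examplesecret"  # dummy value

default_chain = boto3.Session().get_credentials()
print("default chain:", default_chain.method)  # expected: 'env'

named_profile = boto3.Session(profile_name="default").get_credentials()
print("named profile:", named_profile.method)  # e.g. 'shared-credentials-file'
```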