# ask-metaflow
g
is there a reference for how to set up metaflow without having to port-forward? a little confused on which ENV variables need to be configured to what both within the deployed services and then in the .metaflow config file. we are using istio
I'm guessing:
- metaflowconfig: anything referencing localhost -> api.mydomain.com
- METAFLOW_SERVICE_INTERNAL_URL -> should stay the same (metaflow-service.metaflow:8080)
- k8s manifests: localhost -> api.mydomain.com
a
Correct!
g
so that kind of works, the ui-static <-> ui-backend loads no problem. however, not sure how to get past IAP auth when kicking off local runs. currently we have metaflow-service, metaflow-ui and metaflow-ui-static deployed behind a GCP IAP-protected domain
a
What is the issue that you are running into? Any logs or stack traces?
g
tried sending
METAFLOW_SERVICE_HEADERS="{\"Authorization\": \"Bearer $(gcloud auth print-identity-token --impersonate-service-account sa-...@project.iam.gserviceaccount.com  --audiences=....apps.googleusercontent.com --include-email )\"}" \
METAFLOW_DEBUG_CONDA=1 \
GOOGLE_AUTH_TOKEN=$(gcloud auth print-identity-token --impersonate-service-account sa-...@project.iam.gserviceaccount.com  --audiences=....apps.googleusercontent.com --include-email) \
python3 workflows/gen_synthetic_data.py --environment=conda run
with a service account but we get > Metaflow service error: Metadata request (/flows/GenSyntheticDataFlow/run) failed (code 405): 405: Method Not Allowed
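Hand-escaping the JSON inside the shell one-liner above is easy to get wrong, so here is a minimal sketch that builds the same `METAFLOW_SERVICE_HEADERS` value in Python instead. It assumes the token has already been obtained with the `gcloud auth print-identity-token ...` command from the thread; `build_service_headers` and `PLACEHOLDER_TOKEN` are hypothetical names used only for illustration.

```python
import json


def build_service_headers(token: str) -> str:
    """Return a JSON object string in the shape used for METAFLOW_SERVICE_HEADERS above."""
    # json.dumps takes care of the quoting that is fiddly in a shell one-liner.
    return json.dumps({"Authorization": f"Bearer {token}"})


if __name__ == "__main__":
    # "PLACEHOLDER_TOKEN" is hypothetical; substitute the identity token
    # printed by the gcloud command used in the thread.
    print(build_service_headers("PLACEHOLDER_TOKEN"))
```

The printed string can then be exported as `METAFLOW_SERVICE_HEADERS` before launching the flow.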
a
Are you able to curl the metadata service by setting the right headers outside of metaflow?
g
I guess before that, is the metadata service the METAFLOW_SERVICE url ("METAFLOW_SERVICE_URL": "https://my-domain.com") or something else after it (https://my-domain.com/api)?
I printed some of the python logs
> 2024-07-10 11:38:44.555 All packages already cached in gs.
> 2024-07-10 11:38:44.555 All environments already cached in gs.
> RESP IS <PreparedRequest [GET]>
> -----------START-----------
> GET https://metaflow-ui.mydomain.com/api/metadata/flows/GenSyntheticDataFlow
> User-Agent: python-requests/2.32.3
> Accept-Encoding: gzip, deflate
> Accept: */*
> Connection: keep-alive
> Authorization: Bearer TOKEN-FROM-IMPERSONATE-SA
> None
> RESP IS <PreparedRequest [POST]>
> -----------START-----------
> POST https://metaflow-ui.mydomain.com/api/metadata/flows/GenSyntheticDataFlow/run
> User-Agent: python-requests/2.32.3
> Accept-Encoding: gzip, deflate
> Accept: */*
> Connection: keep-alive
> Authorization: Bearer TOKEN
> Content-Length: 212
> Content-Type: application/json
> b'{"flow_id": "GenSyntheticDataFlow", "user_name": "maja", "tags": [], "system_tags": ["python_version:3.10.13", "metaflow_version:2.12.6+netflix-ext(1.2.0)", "runtime:dev", "user:maja"], "ts_epoch": 1720625925354}'
> Metaflow service error: Metadata request (/flows/GenSyntheticDataFlow/run) failed (code 405): 405: Method Not Allowed
a
Outside of metaflow, are you able to curl the endpoint?
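For anyone who would rather script this check than curl by hand, a rough standard-library sketch follows. It replays the GET and POST paths that appear in the printed request log, so the two methods can be compared against the same host. The base URL and paths are taken from the thread; `build_request`, `send`, and `PLACEHOLDER_TOKEN` are hypothetical names, and the empty JSON body is only a stand-in for the real payload.

```python
import urllib.error
import urllib.request


def build_request(url, token, method="GET", body=None):
    """Build a request carrying the bearer token, without sending it."""
    headers = {"Authorization": f"Bearer {token}"}
    if body is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=body, headers=headers, method=method)


def send(request, timeout=10.0):
    """Send the request and return the HTTP status code, including 4xx/5xx."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises on error statuses; the code is what we want to see.
        return err.code


if __name__ == "__main__":
    # Base URL as it appears in the printed request log; adjust as needed.
    base = "https://metaflow-ui.mydomain.com/api/metadata"
    checks = [
        ("GET", "/flows/GenSyntheticDataFlow", None),
        ("POST", "/flows/GenSyntheticDataFlow/run", b"{}"),
    ]
    for method, path, body in checks:
        req = build_request(base + path, "PLACEHOLDER_TOKEN", method, body)
        print(req.get_method(), req.full_url)
        # print(send(req))  # uncomment to actually hit the endpoint
```

If GET succeeds and POST returns 405 on the same host, the token is probably fine and the endpoint itself is rejecting writes.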
g
no
majas-MBP:~ maja$ curl -X POST https://metaflow-ui.mydomain.com/links -H "Authorization: Bearer $(gcloud auth print-identity-token --impersonate-service-account sa-metaflow-user@proj.iam.gserviceaccount.com  --audiences=....apps.googleusercontent.com --include-email)"
WARNING: This command is using service account impersonation. All API calls will be executed as [sa-@proj.iam.gserviceaccount.com].
405: Method Not Allowed
(base) majas-MBP-:~ maja$
within the cluster if I curl
root@dagster-run-57c7de96-33fc-49c8-bb28-590b1e751334-bwbfq:/opt/dagster/app# curl -X GET http://metaflow-ui.metaflow.svc.cluster.local:8083/api/metadata/flows/GenSyntheticDataFlow
curl -X POST http://metaflow-ui.metaflow.svc.cluster.local:8083/api/metadata/flows/GenSyntheticDataFlow
405: Method Not Allowed
root@dagster-run-57c7de96-33fc-49c8-bb28-590b1e751334-bwbfq:/opt/dagster/app#
which seems like the same thing I get when I run locally. does it have anything to do with this allow_get_requests_only middleware?
not sure how to set this up properly then
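To make the question concrete: the sketch below is not the service's actual code (that has not been checked here), only an illustration of the general pattern a name like `allow_get_requests_only` suggests. A guard of this kind rejects every non-GET request with a 405 before the real handler runs, which would match the behaviour seen above. `get_only_guard` and `list_flow` are hypothetical names.

```python
def get_only_guard(handler):
    """Wrap a handler so that anything other than GET is answered with 405."""
    def wrapped(method, path):
        if method != "GET":
            # Rejected before the wrapped handler ever sees the request.
            return 405, "405: Method Not Allowed"
        return handler(method, path)
    return wrapped


@get_only_guard
def list_flow(method, path):
    # Stand-in for a read-only endpoint.
    return 200, f"ok: {path}"


if __name__ == "__main__":
    print(list_flow("GET", "/flows/GenSyntheticDataFlow"))
    print(list_flow("POST", "/flows/GenSyntheticDataFlow/run"))
```

If the endpoint being called sits behind a guard like this, no header or token change will make a POST succeed; the write has to go to a service that accepts it.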
a
Do you have any existing services (outside of metaflow ones) using IAP on GCP?
g
yes
for MLFlow we were able to get a similar approach with just sending the SA token in a header ^
a
And are you able to curl them successfully using the custom headers?
g
yep
a
You should be able to do pretty much the same thing here
g
yeah but we hit the 405 error ^
wondering if this middleware is in the way
a
I would first leave ui stuff aside and just get the service to work, possibly on a different subdomain
g
so that kind of works (I can trigger the flow) but I get
2024-07-10 15:42:48.604 [100/start/375 (pid 53663)] Task failed.
2024-07-10 15:42:48.767 Workflow failed.
2024-07-10 15:42:48.767 Terminating 0 active tasks...
2024-07-10 15:42:48.767 Flushing logs...
    Step failure:
    Step start (task-id 375) failed.
in the UI when I update the urls to have METAFLOW_SERVICE = mf-service.mydomain.com (this is the public.ecr.aws/outerbounds/metaflow_metadata_service:v2.4.9)
a
Hmm is there anything else about that task failure? I'd make sure you can run flows and maybe use metaflow python API, pointing to mf-service.mydomain.com in metaflow config. Ignore anything about the UI for now
g
Nothing else on the task failure, and the only reason I went to the ui is bc I didn’t see any other logs (except for that in the ui :) ) METAFLOW_SERVICE in my local config is indeed metaflow-service.mydomain.com (as in, the metaflow_metadata_service on port 8080 — not the “ui backend”)
a
I wonder if it's something with other config parameters like METAFLOW_DATASTORE_SYSROOT_S3 etc. that makes it fail. You can also probably confirm in metaflow_metadata_service logs if your flow is able to communicate with the service
g
ok cool, running flows seems to be working (turns out there was an error in the flow buried) but the UI part is the last thing to tackle.
in metaflow-ui-static I have
- name: METAFLOW_SERVICE
  value: https://metaflow-ui.mydomain.com/api
which is an IAP-protected domain so getting this in logs > but unsure how to add the SERVICE_HEADERS or how to configure SERVICE_AUTH_KEY for the backend (seems this is an option instead of putting it behind IAP)
I am guessing that's why. the dot in upper right corner is green so at least some parts are working. and the homescreen (list of all flows) works
a
what do you have as METAFLOW_SERVICE_URL on the UI backend container? (not to be confused with METAFLOW_SERVICE on metaflow-ui-static) try setting that to "http://localhost:8083/api/metadata" if you haven't already (it's the default in our helm chart)
g
fabulous! it worked 🙂 yep...i changed over some of the localhost references to the domain but turns out that one needs to stay. tysm
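As a recap of where the thread ended up, here is a small sketch that collects the settings mentioned above in one place. The values are the ones quoted in the thread (with `mydomain.com` as the placeholder domain); the `http://` scheme on the internal URL is an assumption, and the dictionary names are hypothetical, used only to group the settings by where they live.

```python
import json

# Local client config: points at the metadata service (port 8080),
# not at the UI backend.
CLIENT_CONFIG = {
    "METAFLOW_SERVICE_URL": "https://metaflow-service.mydomain.com",
    # Scheme assumed; the thread only gives metaflow-service.metaflow:8080.
    "METAFLOW_SERVICE_INTERNAL_URL": "http://metaflow-service.metaflow:8080",
}

# metaflow-ui-static container: the browser-facing API address.
UI_STATIC_ENV = {
    "METAFLOW_SERVICE": "https://metaflow-ui.mydomain.com/api",
}

# UI backend container: this one stays on localhost.
UI_BACKEND_ENV = {
    "METAFLOW_SERVICE_URL": "http://localhost:8083/api/metadata",
}

if __name__ == "__main__":
    summary = {
        "client": CLIENT_CONFIG,
        "metaflow-ui-static": UI_STATIC_ENV,
        "ui-backend": UI_BACKEND_ENV,
    }
    print(json.dumps(summary, indent=2))
```

The point to notice is that `METAFLOW_SERVICE_URL` appears twice with different meanings: on the client it is the externally reachable metadata service, while on the UI backend container it is the localhost address.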