Hi! Have a question regarding using `metaflow_ray`...
# ask-metaflow
b
Hi! I have a question about using `metaflow_ray` on Argo Workflows. I've installed JobSet in the same namespace as my Argo Workflows runner and added the permissions to my service account, but I'm hitting an error about the webhook service path. Is there any way I can modify this value? I can't seem to find it in the Metaflow configs or anywhere else. Thanks in advance!
error logs from argo workflows:
time="2025-06-30T09:02:18 UTC" level=info msg="Starting Workflow Executor" version=v3.5.8
time="2025-06-30T09:02:18 UTC" level=info msg="Using executor retry strategy" Duration=1s Factor=1.6 Jitter=0.5 Steps=5
time="2025-06-30T09:02:18 UTC" level=info msg="Executor initialized" deadline="0001-01-01 00:00:00 +0000 UTC" includeScriptOutput=false namespace=mlops-metaflow-runtime podName=test.user.mlow.testflow-bbftx-train-1708488516 templateName=train version="&Version{Version:v3.5.8,BuildDate:2024-06-18T03:43:17Z,GitCommit:3bb637c0261f8c08d4346175bb8b1024719a1f11,GitTag:v3.5.8,GitTreeState:clean,GoVersion:go1.21.10,Compiler:gc,Platform:linux/amd64,}"
time="2025-06-30T09:02:18 UTC" level=info msg="Loading manifest to /tmp/manifest.yaml"
time="2025-06-30T09:02:18 UTC" level=info msg="kubectl create -f /tmp/manifest.yaml -o json"
Error from server (InternalError): error when creating "/tmp/manifest.yaml": Internal error occurred: failed calling webhook "mutate-jobset-x-k8s-io-v1alpha2-jobset.x-k8s.io": failed to call webhook: Post "https://jobset-webhook-service.argocd.svc:443/mutate-jobset-x-k8s-io-v1alpha2-jobset?timeout=10s": service "jobset-webhook-service" not found
time="2025-06-30T09:02:19 UTC" level=info msg="sub-process exited" argo=true error="<nil>"
Error: exit status 1
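The webhook URL in the error above encodes which service and namespace the API server tried to reach. A minimal sketch, assuming the usual in-cluster DNS form `<service>.<namespace>.svc`, that pulls both out of the log line (the helper name `webhook_service_target` is hypothetical, not part of Metaflow or JobSet):

```python
import re
from urllib.parse import urlparse


def webhook_service_target(error_text):
    """Return (service, namespace) from the first https URL in error_text, or None."""
    match = re.search(r'https://[^\s"<>]+', error_text)
    if match is None:
        return None
    host = urlparse(match.group(0)).hostname or ""
    parts = host.split(".")
    # Assumption: in-cluster service hosts look like <service>.<namespace>.svc[...]
    if len(parts) >= 3 and parts[2] == "svc":
        return parts[0], parts[1]
    return None


error = (
    'failed to call webhook: Post '
    '"https://jobset-webhook-service.argocd.svc:443/'
    'mutate-jobset-x-k8s-io-v1alpha2-jobset?timeout=10s": '
    'service "jobset-webhook-service" not found'
)
print(webhook_service_target(error))
```

Here the namespace component is `argocd`, while the workflow pod in the same log runs in `mlops-metaflow-runtime`, so the webhook configuration appears to point at a namespace where `jobset-webhook-service` does not exist.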
Okay, the error above seems to be caused by a mixed-up namespace, and I was able to work around it. Now I'm hitting a separate issue:
Error from server (Forbidden): error when creating "/tmp/manifest.yaml": admission webhook "validate-jobset-x-k8s-io-v1alpha2-jobset.x-k8s.io" denied the request: spec.replicatedJobs[0].groupName: Invalid value: "": a DNS-1035 label must consist of lower case alphanumeric characters or '-', start with an alphabetic character, and end with an alphanumeric character (e.g. 'my-name',  or 'abc-123', regex used for validation is '[a-z]([-a-z0-9]*[a-z0-9])?')
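To see why an empty `groupName` is rejected, the regex quoted in the error can be checked locally. This is only a sketch of the rule as stated in the message; the 63-character limit is the standard DNS label maximum and is not mentioned in the error itself, and the helper name `is_dns1035_label` is hypothetical:

```python
import re

# Regex copied from the admission webhook error message.
DNS1035_RE = re.compile(r"[a-z]([-a-z0-9]*[a-z0-9])?")
# Assumption: standard DNS label length limit, not quoted in the error.
DNS1035_MAX_LEN = 63


def is_dns1035_label(value):
    """Return True if value satisfies the DNS-1035 label rule from the error."""
    return len(value) <= DNS1035_MAX_LEN and DNS1035_RE.fullmatch(value) is not None


for candidate in ["", "my-name", "abc-123", "Train", "1abc", "abc-"]:
    print(repr(candidate), is_dns1035_label(candidate))
```

The empty string fails because the pattern requires at least one leading lowercase letter, which matches the `Invalid value: ""` in the log.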
It seems that `spec.replicatedJobs[*].groupName` is not set in the workflow template.
When I manually update the workflow template to set `spec.replicatedJobs[*].groupName`, it throws an error saying invalid fields were provided. I'm kind of stuck on this.
Error when providing `groupName`:
Error from server (BadRequest): error when creating "/tmp/manifest.yaml": JobSet in version "v1alpha2" cannot be handled as a JobSet: strict decoding error: unknown field "spec.replicatedJobs[0].groupName", unknown field "spec.replicatedJobs[1].groupName"
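The two errors pull in opposite directions: the validating webhook demands a non-empty `groupName`, while strict decoding rejects `groupName` as an unknown field. One possible reading (an assumption, not confirmed in this thread) is that the JobSet webhook and the installed JobSet CRD schema come from different releases. A small sketch that lists the rejected field paths from the message, using a hypothetical helper `unknown_fields`:

```python
import re


def unknown_fields(error_text):
    """Return the field paths reported as unknown in a strict decoding error."""
    return re.findall(r'unknown field "([^"]+)"', error_text)


error = (
    'JobSet in version "v1alpha2" cannot be handled as a JobSet: '
    'strict decoding error: unknown field "spec.replicatedJobs[0].groupName", '
    'unknown field "spec.replicatedJobs[1].groupName"'
)
for path in unknown_fields(error):
    print(path)
```

Both replicated jobs are rejected for the same field, which suggests the schema in use does not define `groupName` at all rather than one entry being malformed.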
a
@hundreds-zebra-57629 any thoughts?
b
For context, my deployment versions are:
• Argo Workflows: `v3.5.8`
• JobSet: `v0.8.1`