Question for you guys with the new `@card` goodies...
# dev-metaflow
f
Question for you guys with the new `@card` goodies in the metaflow UI 🚀 I've pulled in the fresh metaflow service image from dockerhub and triggered a new deployment of the UI service, but I've noticed it hits some errors on startup. It seems to be related to the UI server startup logic that preloads 50 runs: it fails if one of those was executed by a user still on an older metaflow version, from before DAGs started being created by default. From the cloudwatch logs for the failed ECS task I'm seeing:
services.ui_backend_service.data.cache.utils.DAGParsingFailed: DAG Parsing failed: 'NoneType' object has no attribute 'flowspec'
More in 🧵
👀 1
There's not much in the ECS task's cloudwatch logs to indicate which flow/run is the underlying problem (lots of intermixed logs from multiprocessing), but my hunch is it's a run (within the 50 most recent) that was executed by a user via local metaflow on version `2.4.3`. FWIW in the UI we can see that run's DAG isn't generated. That's still on the v2.1 UI, since the unhealthy containers prevented the ALB from swapping over.
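A rough sketch of how one might narrow down the suspect runs, assuming each run carries a `metaflow_version:<x.y.z>` system tag. The helper names and the `2.5.0` cutoff below are placeholders for illustration, not something confirmed in this thread:

```python
import re
from typing import Iterable, List, Optional, Tuple

# Assumption: runs expose a system tag of the form "metaflow_version:2.4.3".
VERSION_TAG_PREFIX = "metaflow_version:"


def parse_version(value: str) -> Tuple[int, ...]:
    """Turn '2.4.3' into (2, 4, 3); stops at the first non-numeric piece."""
    parts = []
    for piece in value.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def runs_older_than(
    runs: Iterable[Tuple[str, Iterable[str]]], cutoff: str
) -> List[Tuple[str, Optional[str]]]:
    """Return (run_id, version) for runs below the cutoff or with no version tag.

    `runs` is an iterable of (run_id, tags) pairs. The cutoff is hypothetical:
    substitute whichever release started recording DAG info by default.
    """
    cutoff_version = parse_version(cutoff)
    flagged = []
    for run_id, tags in runs:
        version = next(
            (t[len(VERSION_TAG_PREFIX):] for t in tags if t.startswith(VERSION_TAG_PREFIX)),
            None,
        )
        if version is None or parse_version(version) < cutoff_version:
            flagged.append((run_id, version))
    return flagged


# Made-up sample data, only to show the shape of the input.
sample_runs = [
    ("10", ["metaflow_version:2.4.3", "user:alice"]),
    ("11", ["metaflow_version:2.5.0", "user:bob"]),
    ("12", ["user:carol"]),
]
print(runs_older_than(sample_runs, "2.5.0"))
```

With the Metaflow client, the `(run_id, tags)` pairs could presumably be built from `run.id` and `run.tags` for the most recent runs of each flow, but I haven't verified that against this deployment.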
full error from cloudwatch
ERROR:CacheStore:{'type': 'error', 'message': "DAG Parsing failed: 'NoneType' object has no attribute 'flowspec'", 'id': 'DAGParsingFailed', 'traceback': 'Traceback (most recent call last):\n  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main\n    "__main__", mod_spec)\n  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code\n    exec(code, run_globals)\n  File "/root/services/ui_backend_service/data/cache/client/cache_server.py", line 302, in <module>\n    cli(auto_envvar_prefix=\'MFCACHE\')\n  File "/opt/latest/lib/python3.7/site-packages/click/core.py", line 1128, in __call__\n    return self.main(*args, **kwargs)\n  File "/opt/latest/lib/python3.7/site-packages/click/core.py", line 1053, in main\n    rv = self.invoke(ctx)\n  File "/opt/latest/lib/python3.7/site-packages/click/core.py", line 1395, in invoke\n    return ctx.invoke(self.callback, **ctx.params)\n  File "/opt/latest/lib/python3.7/site-packages/click/core.py", line 754, in invoke\n    return __callback(*args, **kwargs)\n  File "/root/services/ui_backend_service/data/cache/client/cache_server.py", line 298, in cli\n    Scheduler(store, max_actions).loop()\n  File "/root/services/ui_backend_service/data/cache/client/cache_server.py", line 199, in __init__\n    maxtasksperchild=512,  # Recycle each worker once 512 tasks have been completed\n  File "/usr/lib/python3.7/multiprocessing/context.py", line 119, in Pool\n    context=self.get_context())\n  File "/usr/lib/python3.7/multiprocessing/pool.py", line 176, in __init__\n    self._repopulate_pool()\n  File "/usr/lib/python3.7/multiprocessing/pool.py", line 241, in _repopulate_pool\n    w.start()\n  File "/usr/lib/python3.7/multiprocessing/process.py", line 112, in start\n    self._popen = self._Popen(self)\n  File "/usr/lib/python3.7/multiprocessing/context.py", line 277, in _Popen\n    return Popen(process_obj)\n  File "/usr/lib/python3.7/multiprocessing/popen_fork.py", line 20, in __init__\n    self._launch(process_obj)\n  File 
"/usr/lib/python3.7/multiprocessing/popen_fork.py", line 74, in _launch\n    code = process_obj._bootstrap()\n  File "/usr/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap\n    self.run()\n  File "/usr/lib/python3.7/multiprocessing/process.py", line 99, in run\n    self._target(*self._args, **self._kwargs)\n  File "/usr/lib/python3.7/multiprocessing/pool.py", line 121, in worker\n    result = (True, func(*args, **kwds))\n  File "/root/services/ui_backend_service/data/cache/client/cache_worker.py", line 29, in execute_action\n    execute(tempdir, action_cls, request)\n  File "/root/services/ui_backend_service/data/cache/client/cache_worker.py", line 56, in execute\n    invalidate_cache=req.get(\'invalidate_cache\', False))\n  File "/root/services/ui_backend_service/data/cache/generate_dag_action.py", line 97, in execute\n    results[result_key] = json.dumps(dag)\n  File "/usr/lib/python3.7/contextlib.py", line 130, in __exit__\n    self.gen.throw(type, value, traceback)\n  File "/root/services/ui_backend_service/data/cache/utils.py", line 130, in streamed_errors\n    get_traceback_str()\n  File "/root/services/ui_backend_service/data/cache/utils.py", line 124, in streamed_errors\n    yield\n  File "/root/services/ui_backend_service/data/cache/generate_dag_action.py", line 95, in execute\n    dag = generate_dag(run)\n  File "/root/services/ui_backend_service/data/cache/generate_dag_action.py", line 123, in generate_dag\n    raise DAGParsingFailed(f"DAG Parsing failed: {str(ex)}")\n\nservices.ui_backend_service.data.cache.utils.DAGParsingFailed: DAG Parsing failed: \'NoneType\' object has no attribute \'flowspec\'\n', 'key': None}
Update: whatever it was, it seems to have worked itself out after cycling for a while, and a deployment got through 😅
Some oddities though 😬
I'm able to poke around older SFN-deployed flows that had working DAGs before, but I get some errors on them now
a
Thanks for reporting. We are investigating!
❤️ 2
b
@fresh-laptop-72652 we hope to have a fix for this in a release within the next couple of days.
f
Appreciate it! I’ll keep an eye out for it 🙌
b
@fresh-laptop-72652 we have just released `metaflow-service` v2.2.1 and `metaflow-ui` v1.1.1. These releases should fix the DAG issues you were having. Please let us know.
💯 1
f
🏃 🏃 🏃
Fixes look good, DAGs are rendering, and `@cards` are up and running!
also digging the searchable logs + timestamps now ❤️