# ask-metaflow
n
Hello Folks, I'm trying to get access to a sandbox at https://docs.outerbounds.com/sandbox/, but got stuck on the "Warming up your Sandbox" screen with the `user_state_id` being `SandboxProvisioning` in the `nextstate` polls
✅ 1
In the past weeks, creating a new sandbox took up to 1 min, but now it's been stuck like that for about 30 mins. I tried refreshing the page a few times as well.
a
can you try to request one more. i just killed your current one.
thankyou 1
n
thank you! got one or two `SandboxExpired` after your message, but now back to a similar experience like before: it's provisioning for a very long time
it just got ready - thanks for the help again 🙇
among us party 1
but something might be off/overloaded, as e.g. running the `ScaleOutFlow` without any modification fails, see e.g.
$ metaflow-sandbox$ /home/workspace/mambaforge/envs/sandbox-tutorial/bin/python /home/workspace/workspaces/tutorials/scaling/flow.py run

Metaflow 2.11.15.3+ob(v1) executing ScaleOutFlow for user:sandbox
Validating your flow...
    The graph looks good!
Running pylint...
    Pylint is happy!
2025-03-18 21:53:20.559 Workflow starting (run-id 6), see it in the UI at <https://ui-pw-1324319814.outerbounds.dev/ScaleOutFlow/6>
2025-03-18 21:53:21.514 [6/start/19 (pid 1388)] Task is starting.
2025-03-18 21:53:24.713 [6/start/19 (pid 1388)] Foreach yields 3 child steps.
2025-03-18 21:53:24.713 [6/start/19 (pid 1388)] Task finished successfully.
2025-03-18 21:53:25.084 [6/process/20 (pid 1443)] Task is starting.
2025-03-18 21:53:25.188 [6/process/21 (pid 1452)] Task is starting.
2025-03-18 21:53:25.311 [6/process/22 (pid 1457)] Task is starting.
2025-03-18 21:53:26.501 [6/process/20 (pid 1443)] [job t-89fbc64a-fmcn5] Task is starting (Job status is unknown)...
2025-03-18 21:53:26.645 [6/process/21 (pid 1452)] [job t-bc895a79-f5hpk] Task is starting (Job status is unknown)...
2025-03-18 21:53:26.757 [6/process/22 (pid 1457)] [job t-9ae5d623-n8494] Task is starting (Job status is unknown)...
2025-03-18 21:53:38.730 [6/process/20 (pid 1443)] Kubernetes error:
2025-03-18 21:53:38.730 [6/process/20 (pid 1443)] Error (exit code 1). This could be a transient error. Use @retry to retry.
2025-03-18 21:53:38.841 [6/process/20 (pid 1443)] 
2025-03-18 21:53:38.932 [6/process/20 (pid 1443)] Task failed.
2025-03-18 21:53:39.015 Workflow failed.
2025-03-18 21:53:39.015 Terminating 2 active tasks...
2025-03-18 21:53:39.015 [6/process/22 (pid 1457)] [KILLED BY ORCHESTRATOR]
2025-03-18 21:53:39.016 [6/process/22 (pid 1457)] [KILLED BY ORCHESTRATOR]
2025-03-18 21:53:39.016 [6/process/21 (pid 1452)] [KILLED BY ORCHESTRATOR]
2025-03-18 21:53:39.016 [6/process/21 (pid 1452)] [KILLED BY ORCHESTRATOR]
2025-03-18 21:53:44.000 Killing 2 remaining tasks after having waited for 5 seconds -- some tasks may not exit cleanly
2025-03-18 21:53:44.000 Flushing logs...
2025-03-18 21:53:44.160 This failed task will not be retried.
2025-03-18 21:53:44.303 This failed task will not be retried.
    Step failure:
    Step process (task-id 20) failed.
this used to work without problems in the past
a
@limited-tomato-18674 would you happen to know?
l
Hey @nutritious-magazine-38839 - are you running into this consistently?
n
yes, it failed a few times -- but let me try again
receiving different errors now:
/home/workspace/mambaforge/envs/sandbox-tutorial/bin/python /home/workspace/workspaces/tutorials/scaling/flow.py run

Metaflow 2.11.15.3+ob(v1) executing ScaleOutFlow for user:sandbox
Validating your flow...
    The graph looks good!
Running pylint...
    Pylint is happy!
    Metaflow service error:
    Metadata request (/flows/ScaleOutFlow) failed
now back to `KILLED BY ORCHESTRATOR` - I'll stop retrying to avoid making things worse 😊 but let me know if I can help with looking into anything 🙏
l
Do you happen to have some custom extension called `resource_tracker`?
n
yes 🤐 but it's not supposed to be called in the above (`ScaleOutFlow`)
l
I'm seeing this in the logs - looks like it's causing a crash at init
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/runpy.py", line 188, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/usr/local/lib/python3.9/runpy.py", line 111, in _get_module_details
    __import__(pkg_name)
  File "/tmp/metaflow/metaflow/__init__.py", line 115, in <module>
    from .plugins.datatools import S3
  File "/tmp/metaflow/metaflow/plugins/__init__.py", line 150, in <module>
    STEP_DECORATORS = resolve_plugins("step_decorator")
  File "/tmp/metaflow/metaflow/extension_support/plugins.py", line 137, in resolve_plugins
    plugin_module = importlib.import_module(path)
  File "/usr/local/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/tmp/metaflow/metaflow_extensions/resource_tracker/plugins/track_resources/track_resources_decorator.py", line 10, in <module>
    from .resource_tracker import (
  File "/tmp/metaflow/metaflow_extensions/resource_tracker/plugins/track_resources/resource_tracker/__init__.py", line 7, in <module>
    from .tiny_data_frame import TinyDataFrame
  File "/tmp/metaflow/metaflow_extensions/resource_tracker/plugins/track_resources/resource_tracker/tiny_data_frame.py", line 13, in <module>
    class TinyDataFrame:
  File "/tmp/metaflow/metaflow_extensions/resource_tracker/plugins/track_resources/resource_tracker/tiny_data_frame.py", line 56, in TinyDataFrame
    self, data: Optional[dict | list] = None, csv_file_path: Optional[str] = None
TypeError: unsupported operand type(s) for |: 'type' and 'type'
Stream closed EOF for jobs-pw-1324319814/t-89fbc64a-fmcn5-stwxz (process)
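For context on the `TypeError`: the `dict | list` union syntax in annotations (PEP 604) only works at runtime from Python 3.10 onwards, and the task image here runs Python 3.9, so the class body fails as soon as the module is imported. A minimal sketch of a 3.9-compatible signature, assuming the constructor only needs the two parameters visible in the traceback (the class below is a hypothetical stand-in, not the actual `resource_tracker` code):

```python
from typing import Optional, Union


class TinyDataFrame:
    """Hypothetical stand-in for illustration; not the real resource_tracker class."""

    def __init__(
        self,
        # Union[dict, list] is the pre-3.10 spelling of `dict | list`
        data: Optional[Union[dict, list]] = None,
        csv_file_path: Optional[str] = None,
    ):
        # Storing the arguments is an assumption made only to keep the sketch runnable
        self.data = data
        self.csv_file_path = csv_file_path
```

Another option that might fit is adding `from __future__ import annotations` at the top of the module, which defers annotation evaluation so the `|` expression is never executed at import time.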
n
oh, wow, I can take care of that (uninstalling and then supporting 3.9) .. sorry for the trouble 🤦
l
no worries at all! please let us know if you run into anything
thankyou 1