# dev-metaflow
l
Here's an idea we just built internally. If people like this, I'd love to see something more first-class/polished distributed in the `metaflow` project. It was the first thing our DS asked for when we made them start using Metaflow.

It lets you copy/paste your `@step` code into the global scope of a Jupyter notebook so that you can iterate, e.g. on dataframe logic, line by line the way DS like to. Then you copy/paste the new code back into the flow with decent certainty that it works as expected. A note about why we thought this was worth doing, and the code, if you want to weigh in or try it...
👀 1
Metaflow has multiple ways to resume or run flows, but none of them let us interrogate our dataframes straightforwardly in an ipython session (e.g. a notebook):

1. It's not `python myflow.py resume`
   a. which loads your artifacts/parameters/configs from a previous (even unsuccessful) run
2. It's not the upcoming "Metaflow step spin" feature
   a. which allows you to run a single step while mocking its `self` (input) properties to be whatever you like
3. It's not `NBRunner`
   a. which allows you to run a flow from a notebook
4. It's not setting a breakpoint and then step debugging
   a. which lets you preview artifacts, but only explore them in a debug terminal (worse than a notebook for iterating)
      i. The data viewer in VS Code (and similar in PyCharm) is nice, but you write follow-up code in a terminal, not a notebook, which isn't as nice of a DevUX
5. It's not using the Metaflow SDK to fetch run artifacts, e.g. `Flow("Flow").latest_successful_run.data["..."]`
   a. It does this under the hood, so you can directly copy/paste your step code (see the sketch after this list)
   b. The returned `self` object has IDE autocompletion for the attributes that are specific to `YourFlow`
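To make that concrete, here's a minimal sketch of what such a helper boils down to; the names `load_step_self` and `MyFlow` are placeholders of mine, not the actual code. It fetches the latest run's artifacts through the Metaflow client, wraps them in an attribute-access namespace, and binds that as `self` in the notebook's global scope so a pasted `@step` body can run unchanged.

```python
# Minimal sketch (not the actual helper): expose the artifacts of a step from
# the latest successful run as a `self`-like object in the notebook, so a
# copy/pasted @step body can run as-is. `MyFlow` and `load_step_self` are
# placeholder names.
from types import SimpleNamespace

from metaflow import Flow


def load_step_self(flow_name: str, step_name: str = "start") -> SimpleNamespace:
    """Build a `self`-like namespace from the artifacts of `step_name`
    in the latest successful run of `flow_name`."""
    run = Flow(flow_name).latest_successful_run
    task = run[step_name].task
    # Task objects iterate over their DataArtifact children: `.id` is the
    # artifact name, `.data` is the deserialized value.
    artifacts = {artifact.id: artifact.data for artifact in task}
    return SimpleNamespace(**artifacts)


# In a notebook cell:
self = load_step_self("MyFlow", "start")

# Now paste the body of your @step below and iterate line by line, e.g.:
# df = self.raw_df
# df = df.dropna(subset=["user_id"])
```

The real version also pulls parameters/configs and gives `YourFlow`-specific autocompletion on `self`; the sketch above only covers plain artifacts.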
d
hey @lively-lunch-9285, haven't looked at everything in detail but we did do something like this in the Netflix extensions (which I am coincidentally just updating): `metaflow debug task …` allows you to generate all the files to debug your step line by line. It creates the proper conda environment, sets it up, mocks everything you need, and you can execute each line in a notebook. We support iterating before the step (to see what will happen) as well as after (to debug). Internally we even have a button in the UI which you can click, and it launches a new instance sized just the right way and puts you in a notebook. See here: https://github.com/Netflix/metaflow-nflx-extensions?tab=readme-ov-file#debug This was amazing work by @rhythmic-controller-77489 and yes, DS do like it 🙂. You do seem to handle config values better. We didn't do that (yet). The config object is indeed stored as a regular dict (a conscious decision to let clients access it more easily), although we recently discussed storing it directly as ConfigValue (which is the equivalent of your DotDict).
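For anyone who hasn't seen the pattern being compared here, a DotDict/ConfigValue-style wrapper is roughly the following. This is a generic sketch, not code from either project: a dict whose nested values are also reachable with attribute syntax.

```python
# Generic sketch of a DotDict-style wrapper (not the actual ConfigValue or the
# DotDict mentioned above): regular dict access still works, and nested config
# values can also be reached with attribute syntax.
class DotDict(dict):
    def __getattr__(self, name):
        try:
            value = self[name]
        except KeyError as exc:
            raise AttributeError(name) from exc
        # Wrap nested dicts on the fly so chained attribute access works.
        return DotDict(value) if isinstance(value, dict) else value


config = DotDict({"model": {"learning_rate": 0.01}, "seed": 42})
assert config["seed"] == 42                 # plain dict access
assert config.model.learning_rate == 0.01   # attribute-style access
```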
I'm updating it to work on 2.16+. The PR should probably work: https://github.com/Netflix/metaflow-nflx-extensions/pull/59 and I'm working on fixing the tests.
l
Oh wow, it sounds like you basically created a feature the UnionAI distro of Flyte has. You click on a step (potentially a failed one) and it provisions a cloud instance for you with all the libs installed and the environment and artifacts loaded.
👀 1
d
that’s right 🙂
(that part is more internal only — the OSS version will do it on the machine you are launching it from)
👀 1
(it doesn’t take much to put it in place but that’s more specific to your infra)
it’s pretty cool imho (even just the oss part).
👀 1
feel free to check it out and give feedback 🙂.
🙌 1
(it only works with the conda/pypi version in the extension, though. If you want to contribute a version that works with the vanilla one, feel free to open a PR 🙂)
👀 1