For those using the Metaflow Netflix Extensions, v1.2.0 is out and adds new “debug” functionality:
https://github.com/Netflix/metaflow-nflx-extensions/tree/1.2.0?tab=readme-ov-file#debug. You can now easily recreate the environment of a task that has already run and either debug it (i.e., if it crashed, you can re-execute it line by line) or inspect it (see all the artifacts present at the end of the step). Internally, we have it integrated with our cloud workstation infrastructure. It should be very easy to integrate with your own cloud workstations as well, if you want to. Check it out!