# ask-metaflow
d
hello, I'm just starting out with setting up metaflow. as a first step, I'm just executing metaflow flows locally but having the artifacts backed by S3, so I set the following environment variables:
```
METAFLOW_DEFAULT_DATASTORE
METAFLOW_DATASTORE_SYSROOT_S3
METAFLOW_DATATOOLS_S3ROOT
```
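for anyone following along, the values look something like this (the bucket path here is just a placeholder, and they need to be visible before metaflow is imported, both where the flow runs and in the notebook kernel):
```python
import os

# placeholder bucket/prefix -- substitute your own S3 location
S3_ROOT = "s3://my-metaflow-bucket/metaflow"

# set these before importing metaflow, in the flow environment and the notebook
os.environ["METAFLOW_DEFAULT_DATASTORE"] = "s3"
os.environ["METAFLOW_DATASTORE_SYSROOT_S3"] = S3_ROOT
os.environ["METAFLOW_DATATOOLS_S3ROOT"] = f"{S3_ROOT}/data"
```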
and it seems to work okay - I can get `python flow.py card server` to work and show me my results (I'm actually running this on an EC2 instance via ssh and then using ssh tunneling locally to see the page). however, now I'm trying to load the flow run artifacts in a notebook and it says the flow doesn't exist (see screenshot). I did double check that the name in the string is correct... I was wondering how I could go about debugging this. a couple of questions:
• is there any way to list all the namespaces for the results?
• what's the easiest way to check/see where the Client API is trying to look for the artifacts?
oh, I also verified that the jupyter environment had the S3 datastore environment variables set properly
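update for anyone hitting the same thing: the quickest sanity check seems to be printing what the Client API has actually resolved - these are standard metaflow client calls:
```python
from metaflow import get_metadata, get_namespace

# which metadata provider the client resolved,
# e.g. "local@/path/to/.metaflow" for local runs
print(get_metadata())

# which namespace the client filters results by,
# e.g. "user:someuser"
print(get_namespace())
```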
d
The client uses the metadata service. You could get it to work if you relocate your .metaflow directory to somewhere shared between your notebook and where you run the flow.
(It needs to be the first .metaflow directory found when traversing from your cwd up to the root, in both cases.)
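Something like this should then work in the notebook (untested sketch - the shared path and flow name are placeholders):
```python
from metaflow import Flow, metadata

# point the client at an explicit local metadata root instead of relying on
# the first .metaflow directory found above the cwd; the path is a placeholder
metadata("local@/shared/metaflow")

print(Flow("MyFlow").latest_run)  # "MyFlow" is a placeholder flow name
```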
d
thank you, Romain! really appreciate the help 🙂
d
did that work for you?
d
hey @dry-beach-38304, sorry for the delay! I didn't realize this was the case until now, but it looks like I'm able to store the results on S3, yet when someone else tries to inspect flow results in a notebook, they only have access to their own flow runs instead of all of the flow runs on S3. would you happen to know how we can enable loading other people's run results? for reference, we are ssh'ing into EC2 instances to run workflows locally (for now), while setting the environment variables so the flow artifacts are stored on S3. when I look at the S3 bucket, I can see everyone's artifact directories, but we just can't see them through the Client API
note: i also tried setting `namespace(None)` and this doesn't seem to resolve the problem (it's still only retrieving my own flow runs)
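for completeness, this is the pattern I was trying - it should be the right one once the metadata is actually shared (flow name is a placeholder):
```python
from metaflow import Flow, namespace

# drop the per-user namespace filter so runs from all users are visible;
# this only helps if the client can see the shared metadata in the first place
namespace(None)

for run in Flow("MyFlow").runs():  # "MyFlow" is a placeholder
    print(run.id, run.created_at, run.successful)
```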
i guess we aren't able to do this because the local metadata provider doesn't allow using the S3 datastore: https://github.com/Netflix/metaflow/blob/05f9756077fc98d10be47b434863c101829544f9/metaflow/plugins/metadata_providers/local.py#L44 - it's hardcoded to use LocalStorage. one thing that's a little confusing, though, is that my local flow runs seem to be saved both in local storage and on S3...
it seems like my easiest workaround for enabling this is to use the datastore API directly to access artifacts on S3, but i need to spend some time figuring out how this internal API works...
d
sorry for the delay - yes, the local datastore uses the file system. You could (I've never tried it) use it to store your local data on a shared drive (which you could back by S3, for example). That should then work. You would just need to let metaflow know where it is. It's probably easier than the other way.
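something like this, I think (never tried it; the mount path is just an example):
```python
import os

# point the local datastore (and hence the local metadata files) at a shared
# mount so every machine and notebook resolves the same .metaflow contents;
# the path is just an example
os.environ["METAFLOW_DATASTORE_SYSROOT_LOCAL"] = "/mnt/shared/metaflow"
```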
d
yeah, that makes sense! i ended up doing the internal API hacking, since mounting a shared file system ends up being an extra step that every person would need to figure out in order to use metaflow. in case anyone needs this in the future, here is the basic code I wrote - it's hacky but should be okay for temporary use:
```python
from typing import Any

from metaflow.datastore.datastore_set import TaskDataStoreSet
from metaflow.datastore.flow_datastore import FlowDataStore
from metaflow.datastore.task_datastore import TaskDataStore
from metaflow.metaflow_environment import MetaflowEnvironment
from metaflow.plugins.datastores.s3_storage import S3Storage

def list_runs(flow_name: str) -> list[str]:
    """Gets all run IDs vailable on S3

    Parameters
    ----------
    flow_name : str
        Metaflow FlowSpec class name

    Returns
    -------
    list[str]
        Valid Run IDs available for the given FlowSpec
    """
    storage = S3Storage(S3Storage.get_datastore_root_from_config(None))

    flow_content_list = storage.list_content([flow_name])
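    # the flow's prefix holds one directory per run ID plus a shared "data/"
    # prefix with content-addressed artifact blobs, which we skip here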
    return [
        entry.path.split("/")[1]
        for entry in flow_content_list
        if not entry.path.endswith("data/")
    ]

def load_all_steps(flow_name: str, run_id: str) -> dict[str, TaskDataStore]:
    """Loads all data for every step for the given flow run

    Parameters
    ----------
    flow_name : str
        Metaflow FlowSpec class name
    run_id : str
        Valid flow run id

    Returns
    -------
    dict[str, TaskDataStore]
        Dictionary from step name to step artifacts

    """
    environment = MetaflowEnvironment(None)
    flow_datastore = FlowDataStore(
        flow_name,
        environment,
        storage_impl=S3Storage,
        ds_root=S3Storage.get_datastore_root_from_config(None),
    )
    ds_set = TaskDataStoreSet(flow_datastore, run_id)
    return {a.step_name: a for a in ds_set}

def load_step(flow_name: str, run_id: str, step_name: str) -> dict[str, Any]:
    """Load data for a specific step for a specific run

    Parameters
    ----------
    flow_name : str
        Metaflow FlowSpec class name
    run_id : str
        Flow run id
    step_name : str
        Name of the step

    Returns
    -------
    dict[str, Any]
        Dictionary containing all artifacts available for
        the step

    """
    step_datastore = load_all_steps(flow_name, run_id)[step_name]
    return step_datastore.to_dict()

def load_step_end_data(flow_name: str, run_id: str) -> dict[str, Any]:
    """This is the go-to/standard function to use when
    inspecting flow run results.

    Loads the artifacts available at the end of the flow.

    Parameters
    ----------
    flow_name : str
        Metaflow FlowSpec class name
    run_id : str
        Flow run id

    Returns
    -------
    dict[str, Any]
        All artifacts available at the end step,
        keyed by variable name

    """
    return load_step(flow_name, run_id, "end")
```
anyhoodles, thanks for the help! i appreciate you working through the problem with me 🙂
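p.s. using these helpers in a notebook looks roughly like this (the flow name is a placeholder; pick whichever run ID you need):
```python
# "MyFlow" is a placeholder -- substitute the real FlowSpec class name
run_ids = list_runs("MyFlow")

# artifacts at the end step of one run, keyed by variable name
artifacts = load_step_end_data("MyFlow", run_ids[0])
print(sorted(artifacts))
```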