Hey Folks, Another query regarding runner API. Whe...
# ask-metaflow
s
Hey folks, another query regarding the runner API. When I run a sample flow from the CLI, a simple flow executes quickly. However, if I run it via the runner APIs, it takes some time before the flow starts. Are my observations correct? If yes, can we speed this up in any way?
1
b
There is some difference that is to be expected, because we parse the CLI tree (of all args and options) at runtime so as to inject user-defined parameters along with type-checking etc. We also write to a file and read it back to ensure the correct metadata and the ability to get the Run object. But, with all that said, it shouldn't be too slow.
s
Hmm, on the order of a few seconds. For instance, a CLI run completes in less than a second, and for the same run the runner takes 12-15 s.
b
would you mind sharing the flow?
s
from typing import List, Literal

from metaflow import FlowSpec, JSONType, Parameter, step
from pydantic import BaseModel


class ImagePath(BaseModel):
    img_path: str


class SampleFlowResult(ImagePath):
    """
    Sample Flow workflow response
    """

    result: Literal["success", "failed", "in_progress"]


class SampleFlowResponse(BaseModel):
    """
    Sample flow response
    """

    operation: Literal["sample_flow"]
    response: List[SampleFlowResult]


class SampleFlowWorkflowInput(BaseModel):
    request_id: str
    images: list[str]


class SimpleFlow(FlowSpec):
    input_payload = Parameter(
        "input_payload", help="The payload containing the images to be processed.", required=True, type=JSONType
    )

    some_key_1 = Parameter(
        "some_key_1", help="Some Extra Parameter provided to the workflow", type=str, default="key_1"
    )

    some_key_2 = Parameter(
        "some_key_2",
        help="Some Extra Parameter provided to the workflow",
        type=str,
        default="key_2",
    )

    @step
    def start(self) -> None:
        print("This is the start step.")
        self.next(self.process_data)

    @step
    def process_data(self) -> None:
        print("This is the data processing step.")
        self.next(self.end)

    @step
    def end(self) -> None:
        
        self.output_payload = SampleFlowResponse(
            operation="sample_flow",
            response=[
                SampleFlowResult(
                    img_path="path/to/image.jpg",
                    result="success",
                )
            ],
        ).model_dump_json()
        print("This is the end step.")


if __name__ == '__main__':
    SimpleFlow()
I have two cases: a) when I run this file from the CLI, it completes in under a second; b) via the runner, it takes about 12-15 s.
Is it something with python package resolution?
b
what is your CLI command? what does your snippet look like while using the Runner?
s
I'll try and paste an easier one with no input params
When isolating it from my system, I do see slightly faster speeds. Investigating this more; I'll post again with another update. Thanks @brainy-truck-72938
👍 1
Hey @brainy-truck-72938! Hope you're doing well. We were playing around with the slow runs and tried to mimic what CommandManager.run does. It uses subprocess.Popen. However, if we change this to a blocking subprocess.run call, the response times are at least 5 times faster.
b
thanks, will take a look
🙏 1
s
It seems to have something to do with argument formatting. If I change the call in https://github.com/Netflix/metaflow/blob/d4a0ec7fa9c3a076260f7083cda683f20cc64e2a/metaflow/runner/subprocess_manager.py to this:
self.process = subprocess.Popen(
    shlex.join(self.command),
    cwd=self.cwd,
    env=self.env,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    bufsize=1,
    universal_newlines=True,
    shell=True,
)
The response is quite fast
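For anyone who wants to reproduce the comparison outside Metaflow, here is a minimal timing sketch of the three launch styles mentioned in this thread: Popen with an argument list, Popen with shlex.join and shell=True, and a blocking subprocess.run. The child command is a trivial Python one-liner standing in for the flow command, and the helper names are hypothetical, so the numbers only indicate launch overhead on your machine and say nothing definitive about the runner itself.

```python
import shlex
import subprocess
import sys
import time

# Stand-in for the flow command (assumption): a trivial Python child process.
command = [sys.executable, "-c", "print('hello from child')"]


def popen_with_list():
    # Argument list, no shell; communicate() drains both pipes and waits.
    proc = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        bufsize=1,
        universal_newlines=True,
    )
    out, _ = proc.communicate()
    return proc.returncode, out


def popen_with_shell():
    # Same command as one quoted string, run through a POSIX shell.
    proc = subprocess.Popen(
        shlex.join(command),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        bufsize=1,
        universal_newlines=True,
        shell=True,
    )
    out, _ = proc.communicate()
    return proc.returncode, out


def blocking_run():
    # Blocking convenience wrapper around Popen.
    done = subprocess.run(command, capture_output=True, text=True)
    return done.returncode, done.stdout


def best_time(launch, repeats=5):
    # Best-of-N wall-clock time in seconds, to reduce noise from other processes.
    best = float("inf")
    for _ in range(repeats):
        started = time.perf_counter()
        launch()
        best = min(best, time.perf_counter() - started)
    return best


timings = {
    "popen_list": best_time(popen_with_list),
    "popen_shell": best_time(popen_with_shell),
    "blocking_run": best_time(blocking_run),
}
for label, seconds in timings.items():
    print(f"{label}: {seconds:.3f}s")
```

If the three timings come out close to each other here, the gap seen with the runner is more likely to come from how the real command and environment are built than from the Popen call on its own.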
b
Interesting, do you want to contribute a PR, with some snippets that show the improvement? I hope passing shell=True doesn't affect anything else.
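One thing worth checking with shell=True is whether arguments such as a JSON payload survive the shell's quoting rules. The sketch below is only an illustration on a POSIX shell: the payload is a made-up value shaped like SampleFlowWorkflowInput, and the child process is a stand-in that echoes its first argument instead of running the real flow CLI.

```python
import json
import shlex
import subprocess
import sys

# Hypothetical payload with spaces and quotes, the kind of value a JSONType parameter might carry.
payload = json.dumps({"request_id": "abc-123", "images": ["a b.jpg", "it's.png"]})

# Stand-in child (assumption): writes its first argument back to stdout unchanged.
command = [sys.executable, "-c", "import sys; sys.stdout.write(sys.argv[1])", payload]

# shlex.join quotes every argument for a POSIX shell; shlex.split reverses it.
as_string = shlex.join(command)
round_trips = shlex.split(as_string) == command

# Run the quoted string through the shell and compare what the child received.
result = subprocess.run(as_string, shell=True, capture_output=True, text=True)
received = json.loads(result.stdout)

print("round trip ok:", round_trips)
print("payload intact:", received == json.loads(payload))
```

shlex.join targets POSIX shells only, so this check says nothing about Windows, and shell=True also changes which process the returned PID refers to, which may matter for code that signals or kills the child.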
s
Could give it a try! Where can I find the documentation on contributing a PR?
b
I guess the correct doc is here: https://docs.metaflow.org/internals/contributing but ideally, it's the same routine -- fork it, clone the fork, push changes in a separate branch, open a PR, etc.
👍 1
s
@brainy-truck-72938 someone from my team has raised this PR
👀 1