# ask-metaflow
c
Hi, first post here! I am learning how to deploy Metaflow on GCP (my end goal is to learn about K8s, ML orchestration frameworks and distributed training from a lower-level angle). I am essentially trying to deploy the template here one step at a time, and it's great! I was wondering what the purpose of each service is. Initially my understanding was:
• Metadata service -> When an instance of a flow is submitted as a run, I understood this service to be responsible for saving the code snapshot and data artefacts to the datastore (e.g. S3) and for storing metadata on the run itself (e.g. name, status, artefacts and their locations) in a SQL instance. I also assumed the UI sends requests to this service to render a frontend with the info on runs.
• Metaflow static UI service -> The Metaflow UI React frontend has been previously built, and this service hosts the static frontend assets using React's API (e.g. using npm start or something similar).
• Metaflow UI backend service -> Given the above, I am unsure what this is used for. Why is there both a UI backend service and a metadata service?
Would anyone be able to explain the purpose of each of these services? I couldn't figure out from the docs why there is a specific UI backend service. Here is my template in case anyone is interested. Thanks so much!
d
The ui backend service talks to the metadata service and also uses the Metaflow client to provide info displayed in the ui front end.
c
Thanks @dry-beach-38304! And why can’t the ui front end talk directly to the metadata service? Is this more a design choice?
d
Pretty much. @bulky-afternoon-92433 was instrumental in building it so he may have a better answer.
b
The metadata service is a simple read/write API that the Metaflow client uses for recording flow-related metadata. As such, it is also really lightweight in its resource requirements. In comparison, when we started developing the UI for visualizing flows, the data-fetching requirements were way beyond the capabilities of the metadata service, and there was a core requirement that the UI should never impact regular flow operations. These were the main reasons why the UI backend was built completely separately. Also, a note on the UI backend talking to the metadata service: this is true, but for the isolation reasons above it does not communicate with a separate host. Instead, the UI backend hosts its own read-only version of the metadata API under the hood, which it uses when it needs to fetch things with the client library.
In a nutshell: the UI backend does a lot of heavy processing to provide the data the frontend requires. We did not want this to impact regular metadata operations, so it was kept as a separately hosted service.
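To make the separation concrete, here is a toy Python model of the layout described above. It is only a sketch, not Metaflow's actual code: every class and method name (`MetadataStore`, `MetadataService`, `ReadOnlyMetadataApi`, `UiBackend`, `register_run`, `get_run`, `status_counts`) is hypothetical, and it assumes that both services read from the same underlying database, which the thread does not state explicitly.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataStore:
    """Hypothetical stand-in for the SQL database holding run metadata."""
    runs: dict = field(default_factory=dict)


class MetadataService:
    """Lightweight read/write API, used by the client while flows run."""

    def __init__(self, store: MetadataStore) -> None:
        self._store = store

    def register_run(self, run_id: str, status: str) -> None:
        self._store.runs[run_id] = {"status": status}

    def get_run(self, run_id: str) -> dict:
        return self._store.runs[run_id]


class ReadOnlyMetadataApi:
    """Read-only copy of the metadata API, living inside the UI backend.

    It deliberately exposes no write methods.
    """

    def __init__(self, store: MetadataStore) -> None:
        self._store = store

    def get_run(self, run_id: str) -> dict:
        # Return a copy so callers cannot mutate stored metadata.
        return dict(self._store.runs[run_id])


class UiBackend:
    """Does the heavier aggregation work that the frontend needs."""

    def __init__(self, store: MetadataStore) -> None:
        # Assumption: the backend uses its own in-process API instead of
        # calling the MetadataService host, so UI load never hits it.
        self._api = ReadOnlyMetadataApi(store)

    def status_counts(self, run_ids: list) -> dict:
        counts: dict = {}
        for run_id in run_ids:
            status = self._api.get_run(run_id)["status"]
            counts[status] = counts.get(status, 0) + 1
        return counts


store = MetadataStore()
service = MetadataService(store)      # write path, used by flows
service.register_run("1", "completed")
service.register_run("2", "failed")
service.register_run("3", "completed")

backend = UiBackend(store)            # read path, used by the UI
summary = backend.status_counts(["1", "2", "3"])
print(summary)
```

The point of the sketch is that `UiBackend` never calls `MetadataService`, so expensive UI queries cannot slow down the write path that running flows depend on.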
c
Thanks so much both for explaining!