user
02/04/2023, 12:59 AM
{
"model": "svm",
"dataset": "wine"
}
In this case the config is in JSON, but it could be YAML or anything else as well. The config defines which dataset and model family to use; based on this info, the flow trains and evaluates a model using a predefined workflow.
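The config-driven dispatch described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the original flow: the lookup tables DATASETS and MODELS, the run helper, the tiny made-up sample, and the threshold "model" standing in for an SVM are all hypothetical, and Metaflow itself is left out so the snippet stays self-contained.

```python
import json

# Same shape as the config shown above.
CONFIG_TEXT = '{"model": "svm", "dataset": "wine"}'


def dataset_wine():
    """Hypothetical loader: a tiny made-up sample, not the real wine data."""
    features = [[1.0], [1.2], [3.0], [3.1]]
    labels = [0, 0, 1, 1]
    return features, labels


def model_svm(features, labels):
    """Hypothetical trainer: a one-feature threshold, not a real SVM."""
    class0 = [x[0] for x, y in zip(features, labels) if y == 0]
    class1 = [x[0] for x, y in zip(features, labels) if y == 1]
    # Split halfway between the two class means.
    threshold = (sum(class0) / len(class0) + sum(class1) / len(class1)) / 2
    return lambda x: int(x[0] > threshold)


# Assumed lookup tables mapping config values to functions.
DATASETS = {"wine": dataset_wine}
MODELS = {"svm": model_svm}


def run(config_text):
    """Predefined workflow: parse config, load data, train, evaluate."""
    config = json.loads(config_text)
    features, labels = DATASETS[config["dataset"]]()
    predict = MODELS[config["model"]](features, labels)
    # Evaluated on the training sample purely to keep the sketch short.
    correct = sum(predict(x) == y for x, y in zip(features, labels))
    return correct / len(labels)


accuracy = run(CONFIG_TEXT)
```

Swapping the model or dataset then only means editing the config, as long as the named entry exists in the corresponding table.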
In a more complex real-life case, you could have model_ and dataset_ functions be plugins that data scientists can contribute by themselves (see e.g. this example from the Metaflow book for inspiration), and you can have pluggable feature encoders too to ensure offline/online consistency.
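One possible way to make such functions pluggable is a small decorator-based registry. The sketch below is an assumption for illustration, not the approach from the Metaflow book: the PLUGINS table, the plugin decorator, the encode helper, the "encoder" config key, and the scale_by_ten encoder are all hypothetical names. It shows how routing both training and serving through the same registered encoder keeps offline and online features identical.

```python
from typing import Callable, Dict

# Hypothetical registry: one table per plugin kind.
PLUGINS: Dict[str, Dict[str, Callable]] = {"model": {}, "dataset": {}, "encoder": {}}


def plugin(kind: str, name: str):
    """Decorator that registers a function under PLUGINS[kind][name]."""
    def decorator(func):
        if name in PLUGINS[kind]:
            raise ValueError(f"duplicate {kind} plugin: {name}")
        PLUGINS[kind][name] = func
        return func
    return decorator


@plugin("encoder", "scale_by_ten")
def encoder_scale_by_ten(raw):
    """Made-up feature encoder contributed as a plugin."""
    return [value / 10.0 for value in raw]


def encode(config, raw):
    """Single entry point shared by training (offline) and serving (online)."""
    return PLUGINS["encoder"][config["encoder"]](raw)


# The "encoder" key is an assumed extension of the config shown earlier.
config = {"model": "svm", "dataset": "wine", "encoder": "scale_by_ten"}

training_row = encode(config, [10, 25])  # offline path
serving_row = encode(config, [10, 25])   # online path
```

Because both paths resolve the encoder from the same config entry, a contributor who adds a new encoder only has to register it once.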
You still get all the benefits of Metaflow in this example: the config for every run gets persisted automatically thanks to IncludeFile, as do all the other artifacts. You can test it at scale and deploy as usual.