# ask-metaflow
(highlighting this answer since it is a FAQ) If you have a well-defined workflow that data scientists don't need to change often, you can certainly make it all config-driven. Take a look at this simple example that reads a config file like this:
```json
{
  "model": "svm",
  "dataset": "wine"
}
```
In this case the config is in JSON but it could be YAML or anything else as well. It defines which dataset and model family to use, and based on this info the flow trains and evaluates a model using a predefined workflow. In a more complex real-life case you could have the `model_` and `dataset_` functions be plugins that data scientists can contribute by themselves (see e.g. this example from the Metaflow book for inspiration), and you can have pluggable feature encoders too to ensure offline/online consistency. You still get all the benefits of Metaflow in this example: configs for every run get persisted automatically (thanks to `IncludeFile`), along with all the other artifacts, and you can test at scale and deploy as usual.
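For concreteness, here is a minimal sketch of what such a config-driven flow could look like. The flow name, the registry dicts, and the scikit-learn choices are illustrative assumptions, not the original example:

```python
import json

from metaflow import FlowSpec, IncludeFile, step


class ConfigDrivenFlow(FlowSpec):
    # IncludeFile snapshots the file's contents as an artifact of the run,
    # so every execution records exactly which config it used.
    config_file = IncludeFile(
        "config_file",
        default="config.json",
        help="JSON config selecting the dataset and model family",
    )

    @step
    def start(self):
        # The included file arrives as a string; parse it once and store it.
        self.config = json.loads(self.config_file)
        self.next(self.train)

    @step
    def train(self):
        from sklearn.datasets import load_wine
        from sklearn.model_selection import train_test_split
        from sklearn.svm import SVC

        # Hypothetical registries mapping config values to implementations.
        # In a real project these could be plugin entry points that data
        # scientists contribute themselves.
        datasets = {"wine": load_wine}
        models = {"svm": SVC}

        X, y = datasets[self.config["dataset"]](return_X_y=True)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

        model = models[self.config["model"]]()
        model.fit(X_tr, y_tr)
        self.accuracy = model.score(X_te, y_te)
        self.next(self.end)

    @step
    def end(self):
        print(f"accuracy: {self.accuracy:.3f}")


if __name__ == "__main__":
    ConfigDrivenFlow()
```

With the default in place you can just `python config_flow.py run`, or point at a different config with `--config-file other.json`; either way the config contents travel with the run.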