# ask-metaflow
h
Here is some pseudo-code I would like to run in parallel, to a degree. Let us assume I have 10 folds to establish whether some new approach works better, so we may have an incumbent vs. challenger situation or the like:
```python
@step
def train_models(self):

    for fold in self.fold_number_list:

        # train on every fold except the current one
        some_train_data = self.some_data[
            (self.some_data["fold"] != fold)
            # ... further filtering elided
        ]
        # fit incumbent model
        # fit challenger model
        # establish some performance metric(s) on test/validation etc.
```
I used AWS Batch previously, but from what I have observed it still seems to run sequentially (or possibly that is just how our AWS infra is configured?). What other approaches do people tend to use to speed things up, ideally cutting the runtime to roughly a tenth in this example (scaling out, that is)? Thanks.
😃 Anyone?
f
It looks like you have a single model but want to train it in parallel? That is a kind of distributed training. Maybe you can find something here: https://outerbounds.com/blog/distributed-training-with-metaflow/
h
Thanks for the reply. Not sure the above is applicable, as I only want to fit bread-and-butter/basic xgb models. So all 10 models are fitted on (n-1)/n of the data and tested on 1/n of the data, where n is the number of folds. So a foreach should work. I used AWS Batch for this in the past, but as far as I could tell all n jobs were executed as a batch/sequentially … so not in roughly 1/nth of the time …
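The split being described is plain leave-one-fold-out: each model trains on the other n-1 folds ((n-1)/n of the rows) and is scored on the held-out fold. A stdlib-only sketch of that masking (function and variable names are invented for illustration):

```python
import random


def leave_one_fold_out(rows, fold_ids, held_out_fold):
    """Split rows into train (all other folds) and test (the held-out fold)."""
    train = [r for r, f in zip(rows, fold_ids) if f != held_out_fold]
    test = [r for r, f in zip(rows, fold_ids) if f == held_out_fold]
    return train, test


# assign each of 100 rows to one of 10 folds
random.seed(0)
rows = list(range(100))
fold_ids = [random.randrange(10) for _ in rows]

train, test = leave_one_fold_out(rows, fold_ids, held_out_fold=3)
# the two parts are disjoint and together cover the whole data set
assert len(train) + len(test) == len(rows)
```

In a foreach layout, each branch would run this once with its own `held_out_fold`, so no branch depends on another and they can all execute at the same time.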
Having said that, maybe (most likely) I am not using AWS Batch + foreach correctly, and should study this: https://docs.aws.amazon.com/batch/latest/userguide/multi-node-parallel-jobs.html and the @parallel decorator more, which may indeed behave differently from foreach.
h
Thanks, sure, I am aware of foreach, but in our AWS Batch setup the steps still appear to be executed sequentially …