Hi all Could I please get some help on how to do multi node Outerbounds #ask-metaflow


acoustic-van-30942

06/27/2023, 6:49 PM

Hi all, could I please get some help on how to do multi-node multi-GPU distributed training on Metaflow? I tried adapting this PyTorch Lightning ClusterEnvironment for multi-node multi-GPU training. I'm stuck on deriving the LOCAL_RANK, which isn't provided as an environment variable when using @pytorch_parallel. Thanks in advance!
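For context, here is a minimal sketch of the rank arithmetic commonly used when LOCAL_RANK is not set. It assumes every node starts the same number of processes (one per GPU) and that global ranks are assigned contiguously node by node. The function names and the procs_per_node parameter are hypothetical, for illustration only; they are not part of Metaflow or PyTorch Lightning, and how the global rank or node index is obtained under @pytorch_parallel is not covered here.

```python
from typing import Tuple


def split_global_rank(global_rank: int, procs_per_node: int) -> Tuple[int, int]:
    """Return (node_rank, local_rank) for a given global rank.

    Assumes ranks are laid out contiguously per node:
    node 0 holds ranks 0..procs_per_node-1, node 1 holds the next block, etc.
    """
    if procs_per_node <= 0:
        raise ValueError("procs_per_node must be positive")
    if global_rank < 0:
        raise ValueError("global_rank must be non-negative")
    # divmod gives (global_rank // procs_per_node, global_rank % procs_per_node)
    return divmod(global_rank, procs_per_node)


def join_global_rank(node_rank: int, local_rank: int, procs_per_node: int) -> int:
    """Inverse mapping: rebuild the global rank from node and local rank."""
    if not 0 <= local_rank < procs_per_node:
        raise ValueError("local_rank must be in [0, procs_per_node)")
    if node_rank < 0:
        raise ValueError("node_rank must be non-negative")
    return node_rank * procs_per_node + local_rank


if __name__ == "__main__":
    # Example: 2 nodes with 4 GPUs each, so 8 processes in total.
    for rank in range(8):
        node_rank, local_rank = split_global_rank(rank, procs_per_node=4)
        print(f"global={rank} node={node_rank} local={local_rank}")
```

If only the node index is known and one process is spawned per GPU on each node, the spawn index on that node can serve as the local rank, and join_global_rank then gives the global rank.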
