acoustic-van-30942
07/11/2023, 5:40 PM
@parallel decorator. There is the @gpu_profile decorator, which is a good start, but it is not real-time: it profiles usage after training. When training LLMs, you typically want to monitor your cluster during training.
By way of comparison, Ray's UI allows users to monitor their clusters in real time (see screenshot below).
ambitious-bird-15073
07/12/2023, 10:46 AM
ancient-application-36103
07/12/2023, 1:50 PM
acoustic-van-30942
07/12/2023, 5:15 PM
ambitious-bird-15073
07/12/2023, 11:28 PM