sparse-dress-4861
08/16/2024, 3:52 PMancient-application-36103
08/16/2024, 4:09 PMsparse-dress-4861
08/16/2024, 4:29 PM"so your results wouldn't be shareable across users." Ah, we definitely would like to make it shareable across the team. I am assuming it is not too much of an overhead for the metadata service and the UI? The compute footprint for these services is also fairly low (IIRC)?
ancient-application-36103
08/16/2024, 4:51 PMsparse-dress-4861
08/16/2024, 7:29 PMsparse-dress-4861
08/16/2024, 8:57 PMancient-application-36103
08/16/2024, 9:51 PMsparse-dress-4861
08/16/2024, 9:51 PMsparse-dress-4861
08/16/2024, 9:52 PMancient-application-36103
08/16/2024, 9:53 PMancient-application-36103
08/16/2024, 9:53 PMsparse-dress-4861
08/16/2024, 9:56 PMancient-application-36103
08/16/2024, 9:57 PMancient-application-36103
08/16/2024, 9:57 PMsparse-dress-4861
08/16/2024, 9:59 PMancient-application-36103
08/16/2024, 10:00 PMsparse-dress-4861
08/16/2024, 10:02 PMancient-application-36103
08/16/2024, 10:03 PMsparse-dress-4861
08/16/2024, 10:06 PMsparse-dress-4861
08/17/2024, 12:02 AMsparse-dress-4861
08/17/2024, 12:03 AM@batch
how do I specify the type of GPU (p3 vs. p4, etc.)?sparse-dress-4861
08/17/2024, 12:05 AM@batch
?sparse-dress-4861
08/17/2024, 12:08 AMsparse-dress-4861
08/17/2024, 12:14 AMmetaflow-torchrun
but no luck in getting the hello
job to succeed. I see warnings on the AWS Batch node. It seems that downloading from PyPI is not working on that node.
WARNING: Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7f61751a7160>, 'Connection to pypi.org timed out. (connect timeout=15)')': /simple/awscli/
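A minimal sketch of one way to check this from inside the Batch container, assuming pip only needs outbound HTTPS (port 443) to pypi.org. `can_reach` is a hypothetical helper name, not part of Metaflow or pip, and the 15-second default simply mirrors the connect timeout in the warning:

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 15.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Example (run on the node that shows the warning): can_reach("pypi.org")
```

If this returns False on the node, the timeout is more likely a networking issue on that node (for example, no outbound route to the internet) than a problem with the package being installed.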
sparse-dress-4861
08/18/2024, 5:12 PMstale-eve-11739
08/19/2024, 2:17 AMsparse-dress-4861
08/19/2024, 6:05 PMstale-eve-11739
08/19/2024, 8:23 PMsparse-dress-4861
08/20/2024, 6:09 AMcrooked-jordan-29960
08/20/2024, 5:24 PMsparse-dress-4861
08/20/2024, 6:00 PM