# ask-metaflow
Hey community, I have been using Metaflow to run some PyTorch workflows, and I have a question regarding shared_memory. Currently I am running into this issue with the DataLoader:
```
41%|████      | 47/115 [00:31<00:44,  1.52it/s]ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
```
So I updated my workflow to bump up the shared memory:
```python
@resources(shared_memory=16092, memory=14092)
@kubernetes(image='registry.gitlab.com/xxx/xxx/aipilot:latest')
@timeout(minutes=150)
@step
def evaluate(self):
```
but alas, when I run with `--with kubernetes`, I see
```
2023-08-02 18:29:03.819 [539/evaluate/1776 (pid 1041)] [pod t-w89bb-bw2wc] Task is starting (Pod is running, Container is running)...
```
When I exec into the pod, I had hoped the shared memory would have increased from the default 64MiB to 16092MiB:
```
root@t-w89bb-bw2wc:/# df -h | grep shm
shm              64M     0   64M   0% /dev/shm
```
But I don't see any change. Please help!
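For context, here is a minimal sketch of what I would try next: passing shared_memory to the @kubernetes decorator directly instead of @resources. This is an assumption on my part — I am not certain the Kubernetes runtime honors shared_memory coming from @resources, and I believe newer Metaflow releases accept it on @kubernetes itself (mounting a memory-backed emptyDir at /dev/shm). The flow name, step body, and sizes below are just illustrative placeholders:

```python
# Hedged sketch, not confirmed: assumes @kubernetes accepts a
# shared_memory argument (in MiB) in recent Metaflow releases.
from metaflow import FlowSpec, kubernetes, resources, step, timeout

class EvalFlow(FlowSpec):  # hypothetical flow name

    @step
    def start(self):
        self.next(self.evaluate)

    @resources(memory=14092)
    @kubernetes(
        image='registry.gitlab.com/xxx/xxx/aipilot:latest',
        shared_memory=16092,  # assumption: MiB, backs /dev/shm in the pod
    )
    @timeout(minutes=150)
    @step
    def evaluate(self):
        # ... PyTorch DataLoader work goes here ...
        self.next(self.end)

    @step
    def end(self):
        pass

if __name__ == '__main__':
    EvalFlow()
```

If that argument isn't available in your Metaflow version, the underlying Kubernetes-level fix would be a memory-medium emptyDir volume mounted at /dev/shm, which is what larger shm sizes map to on Kubernetes.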