Hi devs, it would be great to see metaflow working...
# dev-metaflow
c
Hi devs, it would be great to see metaflow working with tqdm: currently progress bars are not printed until the very end of the step. There is an old ticket that captures the problem. https://github.com/Netflix/metaflow/issues/32 It appears to be caused by how output streams are buffered and not flushed until the end of the step, so these issues are also related: https://github.com/Netflix/metaflow/issues/166 https://github.com/Netflix/metaflow/issues/221
s
actually a whole new logging subsystem was released a few months ago which should have fixed #166 and #221. We'll review the old tickets and confirm that they are fixed now
correspondingly there's a chance that `tqdm` might work better now. Have you tried it with the latest version?
c
tested with metaflow 2.3.0. I will update to 2.3.1, but changelog does not mention anything about logging...
s
I haven’t tried tqdm recently. I’ll check it too
c
Just tested with 2.3.1 and found the same behaviour.
s
ok, good to know. We’ll take a look
I’ll update this thread tomorrow
👍 1
c
I think tqdm might be messing with buffer flushing in unexpected ways. Running the flow below with and without tqdm wrapping the for loop shows that all print output is withheld until the end of the step only when tqdm is used:
```python
from metaflow import FlowSpec, step
import tqdm
from time import sleep

class TQDMFlow(FlowSpec):

    @step
    def start(self):
        print('Training...')
        for t in tqdm.tqdm(range(100)):
            print(t)
            sleep(0.004)
        sleep(5)
        self.next(self.end)

    @step
    def end(self):
        pass


if __name__ == '__main__':
    TQDMFlow()
```
I will copy this over to issue #32
s
the fundamental issue is that the task that calls `tqdm` doesn't have direct access to the terminal although it may seem like it. The task may even run in the cloud and the output is streamed via S3. Pretty much the only workaround is to have `tqdm` outside the task like in this example:
not exactly a built-in solution but it is easy to implement
the task needs to emit special lines (in my example `<<BEGIN>>` and `<<ITER>>`) that the progress bar captures. You could wrap these special lines in a nice Python abstraction so you don't have to emit them manually, similar to what `tqdm` does internally
👍 1
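A minimal sketch of what such a wrapper inside the task could look like. The marker names `<<BEGIN>>` and `<<ITER>>` come from the message above; the helper name `progress`, its parameters, and the choice to append the total to the `<<BEGIN>>` line are assumptions made for illustration, not part of Metaflow or tqdm.

```python
import sys


def progress(iterable, total=None, out=sys.stdout):
    """Yield items from `iterable`, printing marker lines for an outside progress bar.

    Hypothetical helper: prints "<<BEGIN>> <total>" once (or a bare "<<BEGIN>>"
    when the length is unknown), then "<<ITER>>" after each item is processed.
    """
    if total is None:
        try:
            total = len(iterable)
        except TypeError:
            total = None  # generators and other unsized iterables
    header = "<<BEGIN>>" if total is None else f"<<BEGIN>> {total}"
    # flush=True so each marker leaves the task's buffer right away
    print(header, file=out, flush=True)
    for item in iterable:
        yield item
        print("<<ITER>>", file=out, flush=True)
```

Inside a step this would replace the `tqdm.tqdm(...)` call, for example `for t in progress(range(100)): ...`.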
a downside is that it pollutes logs a bit since `<<BEGIN>>` and `<<ITER>>` messages will get saved in logs but so would the progress bar, if it was inside the task
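The example referred to above did not survive in this thread, so here is a rough sketch of the idea under stated assumptions: a small script outside the task reads the flow's output line by line, drives a `tqdm` bar from the marker lines, and passes everything else through. The function name `relay_progress`, the `make_bar` and `out` parameters, and the assumption that `<<BEGIN>>` is followed by the total count are all hypothetical.

```python
import sys

BEGIN = "<<BEGIN>>"  # marker names from the thread; the payload format is assumed
ITER = "<<ITER>>"


def _default_bar(total):
    # Imported lazily so the line parsing can be used without tqdm installed.
    from tqdm import tqdm
    return tqdm(total=total)


def relay_progress(lines, make_bar=_default_bar, out=sys.stdout):
    """Drive a progress bar from marker lines and pass other lines through.

    Returns the number of <<ITER>> lines seen.
    """
    bar = None
    seen = 0
    for raw in lines:
        line = raw.rstrip("\n")
        # The runner may prefix task output (timestamp, step name), so look
        # for the marker anywhere in the line, not only at the start.
        if BEGIN in line:
            payload = line.split(BEGIN, 1)[1].strip()
            bar = make_bar(int(payload) if payload.isdigit() else None)
        elif ITER in line:
            seen += 1
            if bar is not None:
                bar.update(1)
        else:
            print(line, file=out)
    if bar is not None:
        bar.close()
    return seen
```

One way to use it would be to pipe the run into a script that calls `relay_progress(sys.stdin)`, or to start the flow with `subprocess.Popen(..., stdout=subprocess.PIPE, text=True)` and pass `proc.stdout` as `lines`.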
c
Thanks for doing the legwork on this. I will definitely play around with your workaround to see if we can “ship” it into some sort of easy-to-digest wrapper for the rest of the team. It is a shame that it is a bit more complicated than expected, but I see the point about terminal access in metaflow tasks.
👍 1
i
@creamy-leather-49606 I came up with a solution which is a little less ugly https://stackoverflow.com/questions/68225881/how-to-show-tqdm-progress-in-metaflow