# ask-metaflow
n
Hi all - we're trying to use the `METAFLOW_CLIENT_CACHE_PATH` env var to cache S3 files "locally" (from a Databricks notebook - though I don't think being on Databricks matters here). When we execute this code:
```python
import os

os.environ['METAFLOW_CLIENT_CACHE_PATH'] = './tmp'

# metaflow is imported only after the env var is set
from metaflow import Flow

runs = list(Flow('MyFlow').runs(TAG))
runs[0].data.run_results
```
We get the error:
```
FileNotFoundError: [Errno 2] No such file or directory: '/Workspace/Users/<me>/tmp/s3.s3:/my-s3-bucket.MyFlow/2b/dlobj1sb8hqy3' -> './tmp/<s3.s3://my-s3-bucket.MyFlow/2b/2b19368886545832ce20692d56e4e9285627674b.blob>'
File <command-1625965594877807>, line 1
----> 1 run.data.run_results
File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/metaflow/client/filecache.py:270, in FileCache.create_file(self, path, value)
    268     tmpfile.write(value)
    269     tmpfile.flush()
--> 270     os.rename(tmpfile.name, path)
    271 except:  # noqa E722
    272     os.unlink(tmpfile.name)
```
My understanding is that this is an issue with the `create_file` method in filecache.py itself (here): Metaflow creates the temp cached file and tries to save it into a local directory, calling `os.path.dirname` on the input path first, which gives us the `./tmp/<s3.s3://my-s3-bucket.MyFlow/2b>` path we're seeing in our Databricks workspace. It then tries to save the file to that same path, but `os.path.dirname` is never applied to our S3 path, so we end up with a malformed directory structure and the temp file gets deleted (lines 267-270). Wondering if you've seen this before / if there's a way to get around it?
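To illustrate what I think is happening, here's a toy sketch (not the actual Metaflow code, just mimicking the write/flush/rename/unlink sequence from lines 267-270 of filecache.py):

```python
import os
import tempfile

def create_file_sketch(path, value):
    # Write to a temp file, then rename it into place; on any failure
    # the temp file is removed - roughly the shape of filecache.create_file.
    fd, tmpname = tempfile.mkstemp(dir='.')
    try:
        with os.fdopen(fd, 'wb') as tmpfile:
            tmpfile.write(value)
            tmpfile.flush()
        os.rename(tmpname, path)  # FileNotFoundError if the target directory doesn't exist
    except:  # noqa E722
        os.unlink(tmpname)
        raise

# The target path embeds the raw 's3.s3://...' string as a directory component
# that was never created, so the rename fails exactly like in our traceback.
create_file_sketch('./tmp/s3.s3://my-s3-bucket.MyFlow/2b/example.blob', b'data')
```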
Also, the path generally looks weird. I'm seeing other posts where people are getting a `metaflow.s3.hash` path - wondering why ours is an unformatted S3 path?
a
Hi! I am OOO but can you try two quick changes - can you set the env var before importing metaflow and did you mean to use ./tmp or /tmp?
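Concretely, the ordering I mean is roughly this (a sketch - the cache path is just a placeholder, and I'm assuming the setting is picked up when metaflow is imported):

```python
import os

# Set the cache location *before* metaflow is imported
os.environ['METAFLOW_CLIENT_CACHE_PATH'] = '/tmp/metaflow_cache'  # placeholder path

from metaflow import Flow

runs = list(Flow('MyFlow').runs())
print(runs[0])
```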
n
Hey Savin - hope you're getting some good time off 🙂 For context, we are on a Databricks notebook, so using `/tmp` isn't really possible. Setting the cache path to the full path `/Workspace/Users/<me>/tmp` leads to the same error. We are also setting the env var before importing metaflow.
@square-wire-39606 not sure if you're back, but if you are - any follow-up ideas on this one?
a
yes i am! just working my way through the backlog. also let me loop in @dry-beach-38304 if he has any quick thoughts before we dig deeper into this
🙌 1
n
Hey all - pinging on this again. We're somewhat blocked (from using the Metaflow client in our Databricks notebooks) on this question.
d
Do you have the full traceback? I briefly looked at it and I suspect `create_file` is being called from `store_key` at line 426 (from the `FileBlobCache`). This should ideally be called from here, and the key there should ideally not include the s3 portion, so I suspect something may be going wrong higher up. If you have a more detailed traceback (and even better, the values of the parameters passed to the functions), that would be helpful in understanding where the issue may be coming from. It should definitely not work like this (clearly), but I am not convinced the issue is coming from the filecache atm.
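If it's easier, something along these lines should capture what I'm after - a rough sketch, assuming the module/class layout shown in your traceback, that just logs the arguments before delegating to the real method:

```python
import os
os.environ['METAFLOW_CLIENT_CACHE_PATH'] = '/Workspace/Users/<me>/tmp'

import metaflow.client.filecache as filecache

_orig_create_file = filecache.FileCache.create_file

def create_file_logged(self, path, value):
    # Print the cache path being written right before the failure happens.
    print('create_file path=%r, len(value)=%d' % (path, len(value)))
    return _orig_create_file(self, path, value)

filecache.FileCache.create_file = create_file_logged

from metaflow import Flow

run = list(Flow('MyFlow').runs())[0]
run.data.run_results  # should print the offending path just before the FileNotFoundError
```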
n
Let me DM you and Savin the detailed output