salmon-agency-70336
12/25/2024, 5:49 AM
glibc on it
$ whereis ldd
ldd: /usr/bin/ldd /opt/glibc-2.28/bin/ldd
/usr/bin/ldd is v2.26 and /opt/glibc-2.28/bin/ldd is v2.28. This is for a package I want to use, onnxruntime, which requires glibc>=2.27.
When I try to install this package directly on the image it works. But when done via Metaflow it complains with
onnxruntime-1.19.2-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl is not a supported wheel on this platform.
which is possibly because it does not recognize that 2.28 is available(?). Any ideas how to resolve this? I know Metaflow tinkers with LD_LIBRARY_PATH, but I cannot figure out whether that is interfering.
salmon-agency-70336
12/25/2024, 5:51 AM
cp39-cp39-manylinux_2_28_x86_64 and cp39-cp39-manylinux_2_27_x86_64 are compatible with the Python on the image. I don't know if this is a hard constraint, but I'm putting it here for reference.
python -m pip debug -v
WARNING: This command is only meant for debugging. Do not use this with automation for parsing and getting these details, since the output and options of this command may change without notice.
pip version: pip 23.3.1 from /usr/local/lib/python3.9/site-packages/pip (python 3.9)
sys.version: 3.9.18 (main, Dec 21 2024, 14:21:25)
[GCC 7.3.1 20180712 (Red Hat 7.3.1-17)]
sys.executable: /usr/local/bin/python3.9
...
...
Compatible tags: 600
cp39-cp39-manylinux_2_28_x86_64
cp39-cp39-manylinux_2_27_x86_64
...
...
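A minimal sketch (not from the thread) for comparing the glibc version an interpreter reports with the manylinux tags in a wheel filename. pip bases its manylinux compatibility on the glibc version detected by the interpreter it runs under, so running this with both the image Python and the Python inside the Metaflow-created environment may show whether the two disagree. The helper names are hypothetical, and only manylinux_X_Y style tags are handled.

```python
import os
import re


def detected_glibc():
    """Return (major, minor) of the glibc this interpreter reports, or None."""
    try:
        value = os.confstr("CS_GNU_LIBC_VERSION")  # e.g. "glibc 2.28"
    except (AttributeError, OSError, ValueError):
        # Not a glibc platform, or the name is not recognized here.
        return None
    if not value:
        return None
    match = re.search(r"(\d+)\.(\d+)", value)
    return (int(match.group(1)), int(match.group(2))) if match else None


def manylinux_requirements(wheel_filename):
    """Extract the (major, minor) glibc floor from each manylinux_X_Y tag."""
    pairs = re.findall(r"manylinux_(\d+)_(\d+)_", wheel_filename)
    return [(int(major), int(minor)) for major, minor in pairs]


def glibc_satisfies(wheel_filename, glibc):
    """True if glibc meets at least one manylinux tag of the wheel."""
    return any(glibc >= req for req in manylinux_requirements(wheel_filename))


if __name__ == "__main__":
    wheel = (
        "onnxruntime-1.19.2-cp39-cp39-"
        "manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl"
    )
    glibc = detected_glibc()
    print("detected glibc:", glibc)
    if glibc is not None:
        print("wheel installable by glibc version:", glibc_satisfies(wheel, glibc))
```

If the interpreter inside the environment reports 2.26 while the image Python reports 2.28, that would be consistent with the "not a supported wheel" error, though it does not by itself say how to fix it.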
hundreds-rainbow-67050
12/25/2024, 6:04 AM
CONDA_OVERRIDE_GLIBC=2.28
salmon-agency-70336
12/25/2024, 6:06 AM
CONDA_OVERRIDE_GLIBC=2.28 python /path/to/flow.py --environment=conda run --with kubernetes
but no luck
hundreds-rainbow-67050
12/25/2024, 6:30 AM
salmon-agency-70336
12/25/2024, 6:50 AM
salmon-agency-70336
12/25/2024, 1:18 PM
CONDA_OVERRIDE_GLIBC passed through when bootstrapping the env on the remote machine?
salmon-agency-70336
12/25/2024, 1:18 PM
ancient-application-36103
12/25/2024, 4:05 PM
installing via metaflow - are you using @conda or @pypi? If so, the one included out of the box in Metaflow or the Netflix extensions?
salmon-agency-70336
12/25/2024, 4:07 PM
Metaflow 2.13+netflix-ext(1.2.3) executing
I am using the @pypi decorator on my steps. But I believe that is not compatible with --environment=pypi when using the extensions package, so I ended up using --environment=conda when running the flow.
Also, these are all the env variables I set when running:
CONDA_OVERRIDE_GLIBC=2.28 METAFLOW_CONDA_DEPENDENCY_RESOLVER="conda" CONDA_CHANNELS="conda-forge" METAFLOW_DEBUG_CONDA=1 python ...
ancient-application-36103
12/25/2024, 4:08 PM
--environment=pypi is a symlink to --environment=conda. Do you run into the same issue if you try executing without the extensions (you may have to uninstall them)?
salmon-agency-70336
12/25/2024, 4:11 PM
--environment issue. I get this when I set this to pypi
Incompatible environment:
The pypi_base decorator requires --environment=conda
ancient-application-36103
12/25/2024, 4:11 PM
salmon-agency-70336
12/25/2024, 4:15 PM
typer[all]==0.9.4
pandas==1.5.3
numpy==1.26.4
tqdm==4.66.4
jsons==1.6.3
pyyaml==6.0.2
python-consul2==0.1.5
pydantic==2.8.2
jsonpath-ng==1.6.1
word2number==1.1
schema==0.7.5
envyaml==1.10.211231
boto3==1.35.58
requests==2.32.3
transformers==4.41.2
tokenizers==0.19.1
gliner==0.2.13
salmon-agency-70336
12/25/2024, 4:15 PM
onnxruntime is a transitive dependency of gliner
salmon-agency-70336
12/25/2024, 4:29 PM
word2number==1.1 does not have a .whl file, but we have one in our private repo. So you can drop that.
dry-beach-38304
12/26/2024, 10:25 PM
salmon-agency-70336
12/27/2024, 5:56 AM
LD_LIBRARY_PATH but that does not help for some reason.
[pod t-c7466149-9dbg7-2cc86] ERROR: onnxruntime-1.19.2-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl is not a supported wheel on this platform.
[pod t-c7466149-9dbg7-2cc86]
[pod t-c7466149-9dbg7-2cc86] No STDERR
[pod t-c7466149-9dbg7-2cc86] Traceback (most recent call last):
[pod t-c7466149-9dbg7-2cc86] File "<frozen runpy>", line 198, in _run_module_as_main
[pod t-c7466149-9dbg7-2cc86] File "<frozen runpy>", line 88, in _run_code
[pod t-c7466149-9dbg7-2cc86] File "/home/metaflow/metaflow_extensions/netflix_ext/plugins/conda/remote_bootstrap.py", line 97, in <module>
[pod t-c7466149-9dbg7-2cc86] bootstrap_environment(*sys.argv[1:])
[pod t-c7466149-9dbg7-2cc86] File "/home/metaflow/metaflow_extensions/netflix_ext/plugins/conda/remote_bootstrap.py", line 60, in bootstrap_environment
[pod t-c7466149-9dbg7-2cc86] my_conda.create_for_step(step_name, resolved_env, do_symlink=True),
[pod t-c7466149-9dbg7-2cc86] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[pod t-c7466149-9dbg7-2cc86] File "/home/metaflow/metaflow_extensions/netflix_ext/plugins/conda/conda.py", line 431, in create_for_step
[pod t-c7466149-9dbg7-2cc86] raise CondaStepException(e, [step_name]) from None
[pod t-c7466149-9dbg7-2cc86] metaflow_extensions.netflix_ext.plugins.conda.utils.CondaStepException: Step(s): ['prep_dataset'], Error: Could not install pypi dependencies using '['/home/micromamba/envs/metaflow_fd2cd46435561c7590b794174c4c51a93e2e3c84_99ff578188837cb0de8b11b862cbc89ed4dedd58/bin/python', '-m', 'pip', 'install', '--no-deps', '--no-input', '--no-compile', '-r', '/tmp/tmpsvxxeu2i']' -- got errorcode 1'; see pretty-printed error above
Task failed.
Task is starting (retry).
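The traceback above shows the failing pip install running under the interpreter inside the micromamba environment, not /usr/local/bin/python3.9. One way to check whether that interpreter advertises the manylinux_2_27 or manylinux_2_28 tags is to run pip debug --verbose with it and scan the output. A rough sketch with hypothetical function names; the interpreter path has to be copied from your own error message.

```python
import re
import subprocess


def manylinux_tags(pip_debug_output):
    """Return the sorted, de-duplicated (major, minor) pairs from manylinux_X_Y tags."""
    found = re.findall(r"manylinux_(\d+)_(\d+)_", pip_debug_output)
    return sorted({(int(major), int(minor)) for major, minor in found})


def pip_debug(python_executable):
    """Run `pip debug --verbose` under the given interpreter and return its stdout."""
    result = subprocess.run(
        [python_executable, "-m", "pip", "debug", "--verbose"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Example (path is a placeholder, not a real location):
#   tags = manylinux_tags(pip_debug("/path/to/env/bin/python"))
#   print(max(tags) if tags else "no manylinux_X_Y tags reported")
```

If the highest pair reported is below (2, 27), pip in that environment would be expected to reject the onnxruntime wheel regardless of what the image Python accepts.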
salmon-agency-70336
12/27/2024, 6:14 AM
Dec 27 06:08:13.898 | root [3008] | ERROR | [Errno 2] No such file or directory: 'aws'
Traceback (most recent call last):
File "/Users/narayan/anaconda3/envs/metaflow-env-39/lib/python3.9/site-packages/metaflow/plugins/kubernetes/kubernetes_cli.py", line 271, in step
kubernetes.launch_job(
File "/Users/narayan/anaconda3/envs/metaflow-env-39/lib/python3.9/site-packages/metaflow/plugins/kubernetes/kubernetes.py", line 162, in launch_job
self._job = self.create_job_object(**kwargs).create().execute()
File "/Users/narayan/anaconda3/envs/metaflow-env-39/lib/python3.9/site-packages/metaflow/plugins/kubernetes/kubernetes_job.py", line 326, in execute
raise KubernetesJobException(
metaflow.plugins.kubernetes.kubernetes_job.KubernetesJobException: Unable to launch Kubernetes job.
jobs.batch is forbidden: User "system:anonymous" cannot create resource "jobs" in API group "batch" in the namespace "default"
2024-12-27 11:38:16.012 [3483/prep_dataset/19627 (pid 3008)] Task failed.
I don't know why it has my local env directory, /Users/narayan/anaconda3/envs/metaflow-env-39/, in the stacktrace?
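The paths in this traceback are local because kubernetes_cli.py runs on the submitting machine at this point: the failure is in launching the job, before anything runs in a pod. The first log line, [Errno 2] No such file or directory: 'aws', suggests that the kubeconfig authenticates through the aws CLI and that the executable is not on PATH in the shell running the flow. This is an assumption the thread does not confirm, but it would be consistent with the request arriving as system:anonymous. A quick check, with a hypothetical helper name:

```python
import shutil


def missing_executables(names):
    """Return the names that cannot be resolved on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # "aws" is taken from the error message above; add any other
    # helper binaries your kubeconfig invokes.
    print("missing from PATH:", missing_executables(["aws"]))
```

Run it from the same shell and environment used to start the flow, since PATH can differ between shells and conda environments.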