libcudnn_cnn_train.so.8 issue #2

Closed
sylvain471 opened this issue Jun 25, 2024 · 2 comments

Comments

@sylvain471

Hello,

Florence2 sounds very promising and GPU-poor friendly compared to current VLMs.

I'd love to get the fine-tuning script to work, but once I finally manage to get all the packages installed, I keep getting a complaint about libcudnn_cnn:

Could not load library libcudnn_cnn_train.so.8. Error: /usr/local/cuda-12.1/lib/libcudnn_cnn_train.so.8: undefined symbol: _ZN5cudnn3cnn34layerNormFwd_execute_internal_implERKNS_7backend11VariantPackEP11CUstream_stRNS0_18LayerNormFwdParamsERKNS1_20NormForwardOperationEmb, version libcudnn_cnn_infer.so.8

I tested with a standard venv and with uv. In both cases libcudnn_cnn is present in the environment:

$ find | grep libcudnn_cnn
./.venv/lib/python3.10/site-packages/nvidia/cudnn/lib/libcudnn_cnn_train.so.8
./.venv/lib/python3.10/site-packages/nvidia/cudnn/lib/libcudnn_cnn_infer.so.8

Both configurations gave me the same error 😢

Any idea what might solve this problem?

@sylvain471
Author

Well, after digging through obscure GitHub issues I found pytorch/pytorch#119989. Running the command

unset LD_LIBRARY_PATH

before python train.py solves the problem! At least for now...
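
For anyone hitting the same thing: the error message above points at /usr/local/cuda-12.1/lib rather than the copy inside .venv, which suggests a system-wide cuDNN was being picked up through LD_LIBRARY_PATH ahead of the one installed by pip. Below is a rough sketch for checking that, plus a way to drop the variable for a single child process instead of the whole shell. The function names `cudnn_dirs_on_path` and `env_without_ld_library_path` are made up for illustration and are not part of PyTorch or the fine-tuning script.

```python
import glob
import os


def cudnn_dirs_on_path(ld_library_path):
    """Return the directories in an LD_LIBRARY_PATH-style string that contain libcudnn files."""
    hits = []
    for directory in ld_library_path.split(os.pathsep):
        # Skip empty entries (e.g. from a leading or doubled separator).
        if not directory:
            continue
        pattern = os.path.join(glob.escape(directory), "libcudnn*.so*")
        if glob.glob(pattern):
            hits.append(directory)
    return hits


def env_without_ld_library_path(env):
    """Return a copy of env with LD_LIBRARY_PATH removed; the original mapping is left untouched."""
    cleaned = dict(env)
    cleaned.pop("LD_LIBRARY_PATH", None)
    return cleaned


# Report any directories that could shadow the cuDNN bundled in the venv.
for found in cudnn_dirs_on_path(os.environ.get("LD_LIBRARY_PATH", "")):
    print("cuDNN found via LD_LIBRARY_PATH:", found)
```

If the loop prints a directory such as the CUDA install path, that is a likely source of the mismatch. To run training without touching the current shell, something like `subprocess.run([sys.executable, "train.py"], env=env_without_ld_library_path(os.environ))` should have the same effect as the `unset` above, assuming nothing else on the machine needs that variable.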

@eloise471

Great!
