[WIP] fix Dockerfile.gpu #34
base: master
@Hoda1394 - what command are you using to run the container? i don't have any experience running a docker image with a gpu; i've only used apptainer/singularity with gpu. a few potential problems come to mind (not saying that any of these are present here):
another point -- you can test whether a gpu is available with
another thought -- try validating that the official tensorflow image can use the gpu. so run the
Actually, I was running the singularity conversion of this image with gpu.
As additional info, when I run
there are two versions of TensorFlow installed. Not sure if this can cause this issue...
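One way to confirm which TensorFlow packages are present inside the container is to list the installed distributions from Python. The snippet below is only a sketch: the helper names `find_tensorflow_distributions` and `has_conflict` are made up for illustration, and it assumes Python 3.8 or newer (for `importlib.metadata`), which the older TensorFlow images may not provide. Whether having both `tensorflow` and `tensorflow-gpu` installed is the actual cause here is not established.

```python
from importlib import metadata


def find_tensorflow_distributions():
    """Return {normalized package name: version} for installed TensorFlow-related packages."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if not name:
            # Skip distributions with broken or missing metadata.
            continue
        normalized = name.lower().replace("_", "-")
        if normalized.startswith("tensorflow"):
            found[normalized] = dist.version
    return found


def has_conflict(packages):
    """True if both the CPU and the GPU TensorFlow packages are installed side by side."""
    return "tensorflow" in packages and "tensorflow-gpu" in packages


if __name__ == "__main__":
    packages = find_tensorflow_distributions()
    for name, version in sorted(packages.items()):
        print(name, version)
    if has_conflict(packages):
        print("both tensorflow and tensorflow-gpu are installed")
```

Running this inside the image (or its singularity conversion) shows every package whose name starts with `tensorflow`, so related packages are listed too.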
this is probably the problem (or one of them!). can you try
I tried this with the base image and saw both. when running python and import tf, the
i can reproduce this... it could be a problem with the 1.12.3-gpu-py3 docker image. why are we using such an old image anyway?

docker run --rm tensorflow/tensorflow:1.12.3-gpu-py3 python -c 'import tensorflow as tf; print(tf.test.is_built_with_cuda())'
False

the 1.14.0-gpu-py3 image works.

docker run --rm tensorflow/tensorflow:1.14.0-gpu-py3 python -c 'import tensorflow as tf; print(tf.test.is_built_with_cuda())'
True

we should probably use a newer image. i realize we used 1.12 in the project, but we can test if everything works correctly with 1.15 (the last release of the 1.x series).
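`tf.test.is_built_with_cuda()` only says whether the installed build includes CUDA support, not whether a GPU is visible at runtime. A slightly fuller check could look like the sketch below. The function name `gpu_report` is hypothetical, and the choice between `tf.config.list_physical_devices` (newer releases) and `tf.test.is_gpu_available` (1.x) is an assumption about which call the installed version offers.

```python
def gpu_report():
    """Collect basic facts about the TensorFlow install and GPU visibility."""
    try:
        import tensorflow as tf
    except ImportError:
        return {"tensorflow_installed": False}

    report = {
        "tensorflow_installed": True,
        "version": tf.__version__,
        # Reports how the package was compiled, not whether a device is present.
        "built_with_cuda": bool(tf.test.is_built_with_cuda()),
    }

    config = getattr(tf, "config", None)
    if config is not None and hasattr(config, "list_physical_devices"):
        # Newer TensorFlow releases expose the device list directly.
        report["gpu_visible"] = len(config.list_physical_devices("GPU")) > 0
    else:
        # Fallback for the 1.x series.
        report["gpu_visible"] = bool(tf.test.is_gpu_available())
    return report


if __name__ == "__main__":
    print(gpu_report())
```

If `built_with_cuda` is false, the package itself is the problem. If it is true but `gpu_visible` is false, the container runtime or driver setup is the more likely place to look.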
I tried removing tensorflow during the build, and tensorflow-gpu doesn't work properly without it. I already tested TensorFlow 1.15 and got some other errors due to the version mismatch, so if we want to use TensorFlow 1.15 we may need to update the code.
I will try version
feel free to post any errors you get when trying newer versions. paste the entire traceback and i can take a look.
I have tried many different things to address issue #33. Among them, this Dockerfile builds without error, but when I run the container, TensorFlow does not see the GPU and runs on the CPU!
This container image is available on Docker Hub as hodadock/kwyk:gpu_test
@satra, @kaczmarj - Any idea how we can fix it?