Attempt to build M1/M2/M3 images from source fails miserably #16

Open
dolfs opened this issue Dec 19, 2023 · 17 comments

@dolfs

dolfs commented Dec 19, 2023

Due to the absence of "linux-aarch64" docker images, I set out to build them myself. Naively, I started with the docker base image. I had to make quite a few modifications to the script to get it to work on macOS/Apple Silicon, but finally got it to build with OpenCV, although of course with CUDA disabled. For now I would be satisfied with CPU-only operation, but I was going to look later for ways to use Apple's latest tech so that its Neural Engine and GPUs can be used as well. But ...
I found out this image is not even being used when I next tried to build the Laypa image.

CORRECTION: This image is being used by loghi-tooling, and I was able to build that on top of this image, although there were test errors that caused the overall build to fail.

Building that failed initially because I misunderstood which directory needed to be passed as the argument, but once I figured that out, copying the source failed due to a -T argument in a cp command, which is not supported on macOS. Once I worked around that, the build started, but I ended up with the errors below.
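
For reference, `-T` is a GNU coreutils extension that makes `cp` treat the destination as the target itself rather than as a directory to copy into; the BSD `cp` that ships with macOS does not accept it. Below is a minimal sketch of a portable equivalent, assuming the build step only needs to copy the contents of one directory into another. The function name is hypothetical and not taken from the Loghi scripts.

```python
import shutil
from pathlib import Path


def copy_tree_contents(src: str, dst: str) -> None:
    """Copy the contents of src into dst, roughly like GNU `cp -rT src dst`.

    Hypothetical helper for illustration; not part of the Loghi build scripts.
    """
    src_path = Path(src)
    if not src_path.is_dir():
        raise NotADirectoryError(f"source is not a directory: {src}")
    # dirs_exist_ok=True (Python 3.8+) merges into an existing destination
    # instead of failing, so dst mirrors src rather than containing a
    # nested copy of it.
    shutil.copytree(src_path, dst, dirs_exist_ok=True)
```

In a shell script, `cp -R src/. dst/` is a commonly used form that behaves the same way with both GNU and BSD `cp`.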

The first problem appears to be with CUDA missing. This is logical, but it should be possible to create an image that does not rely on CUDA so that the GPU=0 flag can be used in the pipeline script.

So, at this point I have been "stung" by too many problems, and I am just reporting this in the hope that the authors will consider generating ARM images and/or fixing up these scripts.

173.6 Successfully built detectron2 panopticapi fvcore antlr4-python3-runtime
173.6 Failed to build MultiScaleDeformableAttention
173.6 Pip subprocess error:
173.6   Running command git clone --filter=blob:none --quiet https://github.com/facebookresearch/detectron2.git /tmp/pip-req-build-2ne647iv
173.6   Running command git clone --filter=blob:none --quiet https://github.com/cocodataset/panopticapi.git /tmp/pip-req-build-xn4q0jdm
173.6   error: subprocess-exited-with-error
173.6   
173.6   × python setup.py bdist_wheel did not run successfully.
173.6   │ exit code: 1
173.6   ╰─> [146 lines of output]
173.6       No CUDA runtime is found, using CUDA_HOME='/opt/conda/envs/laypa'
173.6       running bdist_wheel
173.6       running build
173.6       running build_py
173.6       creating build
173.6       creating build/lib.linux-aarch64-cpython-311
173.6       creating build/lib.linux-aarch64-cpython-311/functions
173.6       copying functions/__init__.py -> build/lib.linux-aarch64-cpython-311/functions
173.6       copying functions/ms_deform_attn_func.py -> build/lib.linux-aarch64-cpython-311/functions
173.6       creating build/lib.linux-aarch64-cpython-311/modules
173.6       copying modules/__init__.py -> build/lib.linux-aarch64-cpython-311/modules
173.6       copying modules/ms_deform_attn.py -> build/lib.linux-aarch64-cpython-311/modules
173.6       running build_ext
173.6       building 'MultiScaleDeformableAttention' extension
173.6       creating /src/laypa/models/pixel_decoder/ops/build/temp.linux-aarch64-cpython-311
173.6       creating /src/laypa/models/pixel_decoder/ops/build/temp.linux-aarch64-cpython-311/src
173.6       creating /src/laypa/models/pixel_decoder/ops/build/temp.linux-aarch64-cpython-311/src/laypa
173.6       creating /src/laypa/models/pixel_decoder/ops/build/temp.linux-aarch64-cpython-311/src/laypa/models
173.6       creating /src/laypa/models/pixel_decoder/ops/build/temp.linux-aarch64-cpython-311/src/laypa/models/pixel_decoder
173.6       creating /src/laypa/models/pixel_decoder/ops/build/temp.linux-aarch64-cpython-311/src/laypa/models/pixel_decoder/ops
173.6       creating /src/laypa/models/pixel_decoder/ops/build/temp.linux-aarch64-cpython-311/src/laypa/models/pixel_decoder/ops/src
173.6       creating /src/laypa/models/pixel_decoder/ops/build/temp.linux-aarch64-cpython-311/src/laypa/models/pixel_decoder/ops/src/cpu
173.6       Emitting ninja build file /src/laypa/models/pixel_decoder/ops/build/temp.linux-aarch64-cpython-311/build.ninja...
173.6       Compiling objects...
173.6       Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
173.6       [1/2] c++ -MMD -MF /src/laypa/models/pixel_decoder/ops/build/temp.linux-aarch64-cpython-311/src/laypa/models/pixel_decoder/ops/src/cpu/ms_deform_attn_cpu.o.d -pthread -B /opt/conda/envs/laypa/compiler_compat -Wsign-compare -DNDEBUG -fwrapv -O3 -Wall -fPIC -O3 -isystem /opt/conda/envs/laypa/include -fPIC -O3 -isystem /opt/conda/envs/laypa/include -fPIC -I/src/laypa/models/pixel_decoder/ops/src -I/opt/conda/envs/laypa/lib/python3.11/site-packages/torch/include -I/opt/conda/envs/laypa/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/laypa/lib/python3.11/site-packages/torch/include/TH -I/opt/conda/envs/laypa/lib/python3.11/site-packages/torch/include/THC -I/opt/conda/envs/laypa/include/python3.11 -c -c /src/laypa/models/pixel_decoder/ops/src/cpu/ms_deform_attn_cpu.cpp -o /src/laypa/models/pixel_decoder/ops/build/temp.linux-aarch64-cpython-311/src/laypa/models/pixel_decoder/ops/src/cpu/ms_deform_attn_cpu.o -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1017"' -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -D_GLIBCXX_USE_CXX11_ABI=1 -std=c++17
173.6       FAILED: /src/laypa/models/pixel_decoder/ops/build/temp.linux-aarch64-cpython-311/src/laypa/models/pixel_decoder/ops/src/cpu/ms_deform_attn_cpu.o
173.6       c++ -MMD -MF /src/laypa/models/pixel_decoder/ops/build/temp.linux-aarch64-cpython-311/src/laypa/models/pixel_decoder/ops/src/cpu/ms_deform_attn_cpu.o.d -pthread -B /opt/conda/envs/laypa/compiler_compat -Wsign-compare -DNDEBUG -fwrapv -O3 -Wall -fPIC -O3 -isystem /opt/conda/envs/laypa/include -fPIC -O3 -isystem /opt/conda/envs/laypa/include -fPIC -I/src/laypa/models/pixel_decoder/ops/src -I/opt/conda/envs/laypa/lib/python3.11/site-packages/torch/include -I/opt/conda/envs/laypa/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/laypa/lib/python3.11/site-packages/torch/include/TH -I/opt/conda/envs/laypa/lib/python3.11/site-packages/torch/include/THC -I/opt/conda/envs/laypa/include/python3.11 -c -c /src/laypa/models/pixel_decoder/ops/src/cpu/ms_deform_attn_cpu.cpp -o /src/laypa/models/pixel_decoder/ops/build/temp.linux-aarch64-cpython-311/src/laypa/models/pixel_decoder/ops/src/cpu/ms_deform_attn_cpu.o -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1017"' -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -D_GLIBCXX_USE_CXX11_ABI=1 -std=c++17
173.6       In file included from /opt/conda/envs/laypa/lib/python3.11/site-packages/torch/include/c10/cuda/CUDADeviceAssertionHost.h:3,
173.6                        from /opt/conda/envs/laypa/lib/python3.11/site-packages/torch/include/c10/cuda/CUDAException.h:3,
173.6                        from /opt/conda/envs/laypa/lib/python3.11/site-packages/torch/include/c10/cuda/CUDAFunctions.h:12,
173.6                        from /opt/conda/envs/laypa/lib/python3.11/site-packages/torch/include/c10/cuda/CUDAStream.h:10,
173.6                        from /opt/conda/envs/laypa/lib/python3.11/site-packages/torch/include/ATen/cuda/CUDAContext.h:19,
173.6                        from /src/laypa/models/pixel_decoder/ops/src/cpu/ms_deform_attn_cpu.cpp:19:
173.6       /opt/conda/envs/laypa/lib/python3.11/site-packages/torch/include/c10/cuda/CUDAMacros.h:8:10: fatal error: c10/cuda/impl/cuda_cmake_macros.h: No such file or directory
173.6           8 | #include <c10/cuda/impl/cuda_cmake_macros.h>
173.6             |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
173.6       compilation terminated.
173.6       [2/2] c++ -MMD -MF /src/laypa/models/pixel_decoder/ops/build/temp.linux-aarch64-cpython-311/src/laypa/models/pixel_decoder/ops/src/vision.o.d -pthread -B /opt/conda/envs/laypa/compiler_compat -Wsign-compare -DNDEBUG -fwrapv -O3 -Wall -fPIC -O3 -isystem /opt/conda/envs/laypa/include -fPIC -O3 -isystem /opt/conda/envs/laypa/include -fPIC -I/src/laypa/models/pixel_decoder/ops/src -I/opt/conda/envs/laypa/lib/python3.11/site-packages/torch/include -I/opt/conda/envs/laypa/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/laypa/lib/python3.11/site-packages/torch/include/TH -I/opt/conda/envs/laypa/lib/python3.11/site-packages/torch/include/THC -I/opt/conda/envs/laypa/include/python3.11 -c -c /src/laypa/models/pixel_decoder/ops/src/vision.cpp -o /src/laypa/models/pixel_decoder/ops/build/temp.linux-aarch64-cpython-311/src/laypa/models/pixel_decoder/ops/src/vision.o -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1017"' -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -D_GLIBCXX_USE_CXX11_ABI=1 -std=c++17
173.6       In file included from /src/laypa/models/pixel_decoder/ops/src/vision.cpp:16:
173.6       /src/laypa/models/pixel_decoder/ops/src/ms_deform_attn.h: In function ‘at::Tensor ms_deform_attn_forward(const at::Tensor&, const at::Tensor&, const at::Tensor&, const at::Tensor&, const at::Tensor&, int)’:
173.6       /src/laypa/models/pixel_decoder/ops/src/ms_deform_attn.h:34:20: warning: ‘at::DeprecatedTypeProperties& at::Tensor::type() const’ is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [-Wdeprecated-declarations]
173.6          34 |     if (value.type().is_cuda())
173.6             |                    ^
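
The include chain in the log above shows that the CPU source file `ms_deform_attn_cpu.cpp` pulls in `ATen/cuda/CUDAContext.h`, which in turn needs a generated header that this PyTorch install does not provide (presumably because it is a CPU-only build). The following is a small sketch of how such includes could be located before attempting a build. The function name and the list of header prefixes are assumptions made for illustration; they are not part of Laypa or PyTorch.

```python
import re
from pathlib import Path

# Header prefixes treated as CUDA-only in this sketch (an assumption).
CUDA_HEADER_PREFIXES = ("ATen/cuda/", "c10/cuda/", "cuda", "THC/")

INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]')


def find_cuda_includes(source_dir: str) -> list[tuple[str, int, str]]:
    """Return (file, line number, header) for each CUDA-looking include."""
    hits = []
    for path in sorted(Path(source_dir).rglob("*")):
        if not path.is_file() or path.suffix not in {".cpp", ".cc", ".h", ".hpp"}:
            continue
        lines = path.read_text(errors="replace").splitlines()
        for lineno, line in enumerate(lines, 1):
            match = INCLUDE_RE.match(line)
            if match and match.group(1).startswith(CUDA_HEADER_PREFIXES):
                hits.append((str(path), lineno, match.group(1)))
    return hits
```

Run against the `src/cpu` directory of the extension, any hit would be a candidate for removal or for a preprocessor guard in a CPU-only build.
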
@rvankoert
Collaborator

Hi Dolf,

Thank you for reporting this. We are considering adding support for running Loghi on arm64/macOS, but we currently don't have a machine available for this. I will look into this a bit further, but I doubt that we will officially support macOS in the very near future.

@bjarman

bjarman commented Jan 23, 2024

Hi,
stoked that this is under way. Is there an eta or a hunch on when this might be ready?
Best regards/Fredrik

@rvankoert
Collaborator

@bjarman
Hi Fredrik,

No ETA yet. I hope to have something in a few months, as the changes needed for CPU seem to be small. The biggest obstacle is getting a recent MacBook.

@bjarman

bjarman commented Jan 25, 2024

I am currently testing whether OrbStack could be a solution for Mac owners. I have successfully created an amd64 Ubuntu machine on which I am now running Loghi. It seems to be working, albeit very slowly, since I am running with gpu -1 right now. I can see some mask images and XML files in the page directory, but the image directory only contains zero-byte files with the suffix .done. Python is still running though. I also see a lot of files created in /tmp/tmp.nIoS3J4YxU/

root@ubuntuintel:/mnt/machines/ubuntuintel/loghi# ls -la /tmp/tmp.nIoS3J4YxU/
total 56
drwx------ 5 root root 140 Jan 25 20:06 .
drwxrwxrwt 9 root root 180 Jan 25 20:53 ..
drwxr-xr-x 12 root root 240 Jan 25 20:06 imagesnippets
drwxr-xr-x 2 root root 40 Jan 25 20:03 linedetection
-rw-r--r-- 1 root root 51143 Jan 25 20:06 lines.txt
-rw-r--r-- 1 root root 4039 Jan 25 21:03 log.txt
drwxr-xr-x 2 root root 40 Jan 25 20:03 output

Are files supposed to end up in /tmp/ ?

@rvankoert
Collaborator

So far it seems to be working as expected.

First it does layout analysis which should result in xml files and mask images in the page folder. These are converted in the next step to polygons which are stored as pagexml in the page folder.
Then it extracts text lines, which are stored in tmp.
These are passed to the actual htr and stored in an intermediate format in the tmp dir as well.
Then this intermediate format is merged with the pagexml again and stored in the page folder.
After that some more processing is done to calculate reading order.
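
The steps above can be summarised as an ordered list of stages and where each one writes its output. This is only a sketch of the description in this comment: the stage names are informal labels chosen here, not Loghi identifiers, and nothing in it is taken from the pipeline script itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    output: str  # where the stage writes its results, per the description above


# Stage names are informal labels for this sketch, not Loghi identifiers.
PIPELINE = [
    Stage("layout analysis", "page folder (XML files and mask images)"),
    Stage("baseline extraction", "page folder (polygons as PageXML)"),
    Stage("text line extraction", "tmp dir (text lines)"),
    Stage("HTR", "tmp dir (intermediate format)"),
    Stage("merge results into PageXML", "page folder"),
    Stage("reading order", "not stated in this thread"),
]


def stages_writing_to(location: str) -> list[str]:
    """Names of the stages whose output description starts with `location`."""
    return [s.name for s in PIPELINE if s.output.startswith(location)]
```

Calling `stages_writing_to("tmp dir")` returns the text line extraction and HTR stages, which is why files appearing under /tmp during a run are expected.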

I am very curious whether you can get it to work on a Mac.

@bjarman

bjarman commented Jan 25, 2024

This is what is running:

root@ubuntuintel:/mnt/machines/ubuntuintel/loghi# ps aux|grep python3
root 161 0.0 0.3 1228100 29312 ? Ss 19:07 0:00 [rosetta] /usr/bin/python3 /usr/bin/python3 /usr/bin/networkd-dispatcher --run-startup-triggers
root 6375 0.0 0.4 3407448 33040 pts/2 Sl+ 20:06 0:00 [rosetta] /usr/bin/docker docker run -u 0:0 --rm -m 32000m --shm-size 10240m -ti -v /tmp:/tmp -v /tmp/tmp.nIoS3J4YxU:/tmp/tmp.nIoS3J4YxU -v /mnt/machines/ubuntuintel/loghi/public-models/loghi-htr:/mnt/machines/ubuntuintel/loghi/public-models/loghi-htr loghi/docker.htr:1.3.9 bash -c LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libtcmalloc_minimal.so.4 python3 /src/loghi-htr/src/main.py --do_inference --existing_model /mnt/machines/ubuntuintel/loghi/public-models/loghi-htr/float32-generic-2023-02-15/ --batch_size 64 --use_mask --inference_list /tmp/tmp.nIoS3J4YxU/lines.txt --results_file /tmp/tmp.nIoS3J4YxU/results.txt --charlist /mnt/machines/ubuntuintel/loghi/public-models/loghi-htr/float32-generic-2023-02-15//charlist.txt --gpu -1 --output /tmp/tmp.nIoS3J4YxU/output/ --config_file_output /tmp/tmp.nIoS3J4YxU/output/config.json --beam_width 1
root 6423 100 2.0 2753972 167968 pts/0 Rsl+ 20:06 75:50 [rosetta] /usr/bin/python3 python3 /src/loghi-htr/src/main.py --do_inference --existing_model /mnt/machines/ubuntuintel/loghi/public-models/loghi-htr/float32-generic-2023-02-15/ --batch_size 64 --use_mask --inference_list /tmp/tmp.nIoS3J4YxU/lines.txt --results_file /tmp/tmp.nIoS3J4YxU/results.txt --charlist /mnt/machines/ubuntuintel/loghi/public-models/loghi-htr/float32-generic-2023-02-15//charlist.txt --gpu -1 --output /tmp/tmp.nIoS3J4YxU/output/ --config_file_output /tmp/tmp.nIoS3J4YxU/output/config.json --beam_width 1

I will let python continue until it is done and we'll see if it actually works!

Running with gpu -1 is very slow. If this works it still would not be great for large datasets. My test is with 10 handwritten images and it has been running for about 2 hours now. At some point making use of the silicon M1, M2, M3 gpu would be preferred :)

I have not seen any output to the log for a while. Is there any way of knowing if things are still running as it should?

@rvankoert
Collaborator

This is the HTR step. It will take up the most time. There should be output in $tmpdir/log.txt containing transcriptions for each line. On other CPU-based systems this is definitely the slowest step.
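
One rough way to tell whether the step is still making progress is to check how long ago the log file was last written to. The sketch below assumes that a log which has not changed for a while indicates a stall; the function names and the 10-minute threshold are arbitrary choices for illustration, not Loghi settings.

```python
import os
import time


def seconds_since_modified(path: str) -> float:
    """Seconds since the file at `path` was last written to."""
    return time.time() - os.path.getmtime(path)


def looks_stalled(log_path: str, max_idle_seconds: float = 600.0) -> bool:
    """True if the log has not been modified for longer than the threshold.

    The 10-minute default is an arbitrary assumption, not a Loghi setting.
    """
    return seconds_since_modified(log_path) > max_idle_seconds
```

For example, `looks_stalled("/tmp/tmp.nIoS3J4YxU/log.txt")` would check the log of the run shown earlier in this thread.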

@bjarman

bjarman commented Jan 25, 2024

Fingers crossed and patience then :)

@bjarman

bjarman commented Jan 26, 2024

I let the Python process run overnight, but nothing new happened. This is the output in /tmp/tmpXXXX/log.txt from this morning, when I ran ./na-pipeline.sh with a directory containing only one scanned image named image.jpg:

root@ubuntuintel:/mnt/machines/ubuntuintel/loghi# ./na-pipeline.sh /mnt/machines/ubuntuintel/loghi/k62/
/tmp/tmp.ZlIrWRlcoE
starting Laypa baseline detection
docker run --rm -it -u 0:0 -m 32000m --shm-size 10240m -v /mnt/machines/ubuntuintel/loghi/public-models/laypa/general/baseline:/mnt/machines/ubuntuintel/loghi/public-models/laypa/general/baseline -v /mnt/machines/ubuntuintel/loghi/k62:/mnt/machines/ubuntuintel/loghi/k62 -v /mnt/machines/ubuntuintel/loghi/k62:/mnt/machines/ubuntuintel/loghi/k62 loghi/docker.laypa:1.3.9 python run.py -c /mnt/machines/ubuntuintel/loghi/public-models/laypa/general/baseline/config.yaml -i /mnt/machines/ubuntuintel/loghi/k62 -o /mnt/machines/ubuntuintel/loghi/k62 --opts MODEL.WEIGHTS TEST.WEIGHTS /mnt/machines/ubuntuintel/loghi/public-models/laypa/general/baseline/model_best_mIoU.pth
DeprecationWarning PREPROCESS.RESIZE.USE is losing support; please switch to PREPROCESS.RESIZE.RESIZE_MODE
INPUT.SCALING_TEST is not set, inferring from INPUT.SCALING_TRAIN and PREPROCESS.RESIZE.SCALING to be 0.5
[01/26 09:31:29 laypa.page_xml.output_pageXML]: Could not find page dir (/mnt/machines/ubuntuintel/loghi/k62/page), creating one at specified location
[01/26 09:31:30 detectron2.checkpoint.detection_checkpoint]: [DetectionCheckpointer] Loading from /mnt/machines/ubuntuintel/loghi/public-models/laypa/general/baseline/model_best_mIoU.pth ...
[01/26 09:31:30 fvcore.common.checkpoint]: [Checkpointer] Loading from /mnt/machines/ubuntuintel/loghi/public-models/laypa/general/baseline/model_best_mIoU.pth ...
/opt/conda/envs/laypa/lib/python3.11/site-packages/torch/utils/data/dataloader.py:557: UserWarning: This DataLoader will create 16 worker processes in total. Our suggested max number of worker in current system is 12, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
warnings.warn(_create_warning_msg(
Predicting PageXML: 0% 0/1 [00:00<?, ?it/s][W NNPACK.cpp:64] Could not initialize NNPACK! Reason: Unsupported hardware.
Predicting PageXML: 100% 1/1 [00:12<00:00, 12.13s/it]
110 [pool-1-thread-1] INFO nl.knaw.huc.di.images.minions.MinionExtractBaselines - /mnt/machines/ubuntuintel/loghi/k62/page/image.png
236 [pool-1-thread-1] INFO nl.knaw.huc.di.images.minions.MinionExtractBaselines - FOUND LABELS:210
469 [pool-1-thread-1] INFO nl.knaw.huc.di.images.minions.MinionExtractBaselines - mergedLineDetected: /mnt/machines/ubuntuintel/loghi/k62/page/image.png
557 [pool-1-thread-1] INFO nl.knaw.huc.di.images.minions.MinionExtractBaselines - mergedLineDetected: /mnt/machines/ubuntuintel/loghi/k62/page/image.png
564 [pool-1-thread-1] INFO nl.knaw.huc.di.images.minions.MinionExtractBaselines - mergedLineDetected: /mnt/machines/ubuntuintel/loghi/k62/page/image.png
575 [pool-1-thread-1] INFO nl.knaw.huc.di.images.minions.MinionExtractBaselines - mergedLineDetected: /mnt/machines/ubuntuintel/loghi/k62/page/image.png
576 [pool-1-thread-1] INFO nl.knaw.huc.di.images.minions.MinionExtractBaselines - mergedLineDetected: /mnt/machines/ubuntuintel/loghi/k62/page/image.png
661 [pool-1-thread-1] INFO nl.knaw.huc.di.images.minions.BaselinesMapper - Mapping lines took: 15.36 ms
679 [pool-1-thread-1] INFO nl.knaw.huc.di.images.minions.MinionExtractBaselines - textlines to match: 143 /mnt/machines/ubuntuintel/loghi/k62/page/image.png
errors: 0
warnings: 0
1705 [main] INFO nl.knaw.huc.di.images.minions.MinionExtractBaselines - Finished all threads
starting Loghi HTR
141 [main] INFO nl.knaw.huc.di.images.minions.MinionCutFromImageBasedOnPageXMLNew - Ignore file '/mnt/machines/ubuntuintel/loghi/k62/page', not an image
728 [pool-1-thread-1] INFO nl.knaw.huc.di.images.layoutanalyzer.layoutlib.LayoutProc - nl.knaw.huc.di.images.minions.MinionCutFromImageBasedOnPageXMLNew$$Lambda$4/0x0000000840066840@47156cd2 interline distance: 78.44743462982075
1172 [pool-1-thread-1] INFO nl.knaw.huc.di.images.layoutanalyzer.layoutlib.LayoutProc - nl.knaw.huc.di.images.minions.MinionCutFromImageBasedOnPageXMLNew$$Lambda$4/0x0000000840066840@47156cd2 textlines: 144
1179 [pool-1-thread-1] INFO nl.knaw.huc.di.images.layoutanalyzer.layoutlib.LayoutProc - nl.knaw.huc.di.images.minions.MinionCutFromImageBasedOnPageXMLNew$$Lambda$4/0x0000000840066840@47156cd2 average textline took: 3
errors: 0
warnings: 0
/bin/bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)

==========
== CUDA ==

CUDA Version 12.2.0

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

WARNING: The NVIDIA Driver was not detected. GPU functionality will not be available.
Use the NVIDIA Container Toolkit to start this container with GPU support; see
https://docs.nvidia.com/datacenter/cloud-native/ .

bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)

Inside the image directory the directory page is created and contains image.jpg and image.xml. The image directory contains the original image.jpg and a file named image.jpg.done (zero bytes).

A lot of stuff is created in /tmp/tmp.../. For instance a number of files called image-line_xxxxxx.

The directory /tmp/tmp.../output is empty.

Either the HTR process is ridiculously slow on my laptop using cpu or the process is stalled for some reason.

I see no errors in the log just a few warnings.

Do you have any thoughts on how to move forward?

@rvankoert
Collaborator

You should have gotten some more output fairly quickly. Even my 12 year old laptop takes less than 5 minutes per scan. My guess is that something is wrong and the process is stalled. It is strange because the layoutanalysis uses pytorch and seems to work fine. The htr uses tensorflow and fails somehow. I'll try to look into this a bit more this weekend. Hopefully next week I'll get my hands on a mac so I can at least try to reproduce this. If I am to take a guess: it tries to load cuda, even though it shouldn't.
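
If the guess about CUDA being loaded is right, one common way to rule it out is to hide all CUDA devices from the process before TensorFlow is imported. The sketch below is an assumption about how that could be wired to the `--gpu` value; the function name is hypothetical, and whether loghi-htr already does something similar is not known from this thread.

```python
import os


def hide_gpus_if_requested(gpu: int) -> None:
    """Make CUDA devices invisible to this process when gpu is negative.

    This must run before TensorFlow (or any other CUDA-using library) is
    imported, because the variable is read when the CUDA runtime initializes.
    """
    if gpu < 0:
        # "-1" is not a valid device index, so no CUDA device is visible.
        os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
```

The same effect can be tested without changing any code by setting `CUDA_VISIBLE_DEVICES=-1` in the environment of the container.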

@bjarman

bjarman commented Jan 26, 2024

Tensorflow should be able to utilise Apple silicon or AMD GPUs by using the tensorflow-metal plugin. The HTR should of course respect the flag gpu -1 though. Have a nice weekend and I hope you get that mac and that this is an easy fix!
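
On the Docker side, the `docker run` lines quoted in this thread suggest that the flag is at least reflected in the container arguments: the CPU run has no `--gpus` option, while the GPU run passes `--gpus device=0`. A minimal sketch of that mapping follows; the function name is hypothetical and this is not the code of the pipeline script.

```python
def docker_gpu_args(gpu: int) -> list[str]:
    """Extra `docker run` arguments for the requested GPU index.

    A negative index means CPU only, so no --gpus option is emitted.
    """
    if gpu < 0:
        return []
    return ["--gpus", f"device={gpu}"]
```

Note that `--gpus` only works when the host has the NVIDIA Container Toolkit installed, so on a Mac-hosted virtual machine the CPU path is the only one that can start.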

@bjarman

bjarman commented Jan 26, 2024

Just for fun I set gpu to "0" and this is what happened:

root@ubuntuintel:/mnt/machines/ubuntuintel/loghi# ./na-pipeline.sh k62
/tmp/tmp.Lr9KmCH4pW
using GPU 0
starting Laypa baseline detection
docker run --gpus device=0 --rm -it -u 0:0 -m 32000m --shm-size 10240m -v /mnt/machines/ubuntuintel/loghi/public-models/laypa/general/baseline:/mnt/machines/ubuntuintel/loghi/public-models/laypa/general/baseline -v /mnt/machines/ubuntuintel/loghi/k62:/mnt/machines/ubuntuintel/loghi/k62 -v /mnt/machines/ubuntuintel/loghi/k62:/mnt/machines/ubuntuintel/loghi/k62 loghi/docker.laypa:1.3.9 python run.py -c /mnt/machines/ubuntuintel/loghi/public-models/laypa/general/baseline/config.yaml -i /mnt/machines/ubuntuintel/loghi/k62 -o /mnt/machines/ubuntuintel/loghi/k62 --opts MODEL.WEIGHTS TEST.WEIGHTS /mnt/machines/ubuntuintel/loghi/public-models/laypa/general/baseline/model_best_mIoU.pth
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].
100 [pool-1-thread-1] INFO nl.knaw.huc.di.images.minions.MinionExtractBaselines - /mnt/machines/ubuntuintel/loghi/k62/page/image.png
245 [pool-1-thread-1] INFO nl.knaw.huc.di.images.minions.MinionExtractBaselines - FOUND LABELS:210
555 [pool-1-thread-1] INFO nl.knaw.huc.di.images.minions.MinionExtractBaselines - mergedLineDetected: /mnt/machines/ubuntuintel/loghi/k62/page/image.png
606 [pool-1-thread-1] INFO nl.knaw.huc.di.images.minions.MinionExtractBaselines - mergedLineDetected: /mnt/machines/ubuntuintel/loghi/k62/page/image.png
616 [pool-1-thread-1] INFO nl.knaw.huc.di.images.minions.MinionExtractBaselines - mergedLineDetected: /mnt/machines/ubuntuintel/loghi/k62/page/image.png
625 [pool-1-thread-1] INFO nl.knaw.huc.di.images.minions.MinionExtractBaselines - mergedLineDetected: /mnt/machines/ubuntuintel/loghi/k62/page/image.png
628 [pool-1-thread-1] INFO nl.knaw.huc.di.images.minions.MinionExtractBaselines - mergedLineDetected: /mnt/machines/ubuntuintel/loghi/k62/page/image.png
3076 [pool-1-thread-1] INFO nl.knaw.huc.di.images.minions.BaselinesMapper - Mapping lines took: 2.384 s
3088 [pool-1-thread-1] INFO nl.knaw.huc.di.images.minions.MinionExtractBaselines - textlines to match: 143 /mnt/machines/ubuntuintel/loghi/k62/page/image.png
errors: 29
warnings: 0
localString: Line 626, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_89d7114d-b371-42c1-9187-00d7d29f5458'.
localString: Line 626, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'ccec253a-fb96-4b32-a58a-ede099a3b495'.
localString: Line 626, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'f9046d9b-8424-403a-b6aa-89e78a90aaf8'.
localString: Line 626, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_19215798-c2d8-4edd-9717-9b3165f36656'.
localString: Line 626, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_8efa880c-305f-4d74-900e-1a335e730652'.
localString: Line 626, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'ab49b7e1-127d-4999-9d51-0e9a95f0c170'.
localString: Line 626, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_29b8baf3-6329-41f2-a814-06a29a222262'.
localString: Line 626, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_5ccd09a3-e8da-48db-8fc2-32aa77ecab30'.
localString: Line 626, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_6ae60f8b-d479-4002-b71c-77dd9a0b0dc8'.
localString: Line 626, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_4fdd5858-6b46-4da1-b4aa-b4b56778457c'.
localString: Line 626, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'bd5015b6-f9e3-4829-a1b6-2eef897bf7e8'.
localString: Line 626, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'd82fd08f-346d-43f8-9a97-ca62cab18f45'.
localString: Line 626, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'a62e7b36-0bac-4daa-8dd0-c9ead8fdf1aa'.
localString: Line 626, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_9ea71f73-0873-4d6c-b0f9-89249e478d00'.
localString: Line 626, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_57384b8c-d068-4519-9bae-960b84b2f7dc'.
localString: Line 626, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'e39dd52b-ef6d-496d-843d-2f5d73dcd6fd'.
localString: Line 626, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_7dcdc118-bab9-4205-9dfe-835ddb26549c'.
localString: Line 626, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_4ebc9a53-9953-4b45-970b-3165efc2d265'.
localString: Line 626, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_404806ec-d97b-4447-9f98-06cd1c479198'.
localString: Line 626, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'fdc973c6-32f6-40f8-9724-77b3321c11f1'.
localString: Line 626, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_41890653-a012-4761-a745-e80bd736bc5d'.
localString: Line 626, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_9ec881ff-e3af-4355-ada3-4cb54e49d794'.
localString: Line 626, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'bd1b9d1d-61aa-4425-bc12-27da9befb0e6'.
localString: Line 626, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_44d0f5e4-d594-432c-83fc-202e14d4afca'.
localString: Line 626, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_04f9f989-4c03-4ad2-a071-9c2a1e223597'.
localString: Line 626, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'd0cae1f0-226e-4a7c-a32c-94e52d483dcb'.
localString: Line 626, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_0992297d-0106-40a8-9379-65f27c1df215'.
localString: Line 626, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'e5c4b98c-7904-42e2-9bc1-042443b90de8'.
localString: Line 626, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_518c36cc-3900-4b6b-9257-73dd83611c08'.
Errors: 29
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_89d7114d-b371-42c1-9187-00d7d29f5458'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'ccec253a-fb96-4b32-a58a-ede099a3b495'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'f9046d9b-8424-403a-b6aa-89e78a90aaf8'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_19215798-c2d8-4edd-9717-9b3165f36656'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_8efa880c-305f-4d74-900e-1a335e730652'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'ab49b7e1-127d-4999-9d51-0e9a95f0c170'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_29b8baf3-6329-41f2-a814-06a29a222262'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_5ccd09a3-e8da-48db-8fc2-32aa77ecab30'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_6ae60f8b-d479-4002-b71c-77dd9a0b0dc8'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_4fdd5858-6b46-4da1-b4aa-b4b56778457c'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'bd5015b6-f9e3-4829-a1b6-2eef897bf7e8'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'd82fd08f-346d-43f8-9a97-ca62cab18f45'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'a62e7b36-0bac-4daa-8dd0-c9ead8fdf1aa'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_9ea71f73-0873-4d6c-b0f9-89249e478d00'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_57384b8c-d068-4519-9bae-960b84b2f7dc'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'e39dd52b-ef6d-496d-843d-2f5d73dcd6fd'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_7dcdc118-bab9-4205-9dfe-835ddb26549c'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_4ebc9a53-9953-4b45-970b-3165efc2d265'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_404806ec-d97b-4447-9f98-06cd1c479198'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'fdc973c6-32f6-40f8-9724-77b3321c11f1'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_41890653-a012-4761-a745-e80bd736bc5d'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_9ec881ff-e3af-4355-ada3-4cb54e49d794'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'bd1b9d1d-61aa-4425-bc12-27da9befb0e6'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_44d0f5e4-d594-432c-83fc-202e14d4afca'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_04f9f989-4c03-4ad2-a071-9c2a1e223597'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'd0cae1f0-226e-4a7c-a32c-94e52d483dcb'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_0992297d-0106-40a8-9379-65f27c1df215'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'e5c4b98c-7904-42e2-9bc1-042443b90de8'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_518c36cc-3900-4b6b-9257-73dd83611c08'.
4140 [main] INFO nl.knaw.huc.di.images.minions.MinionExtractBaselines - Finished all threads
starting Loghi HTR
151 [main] INFO nl.knaw.huc.di.images.minions.MinionCutFromImageBasedOnPageXMLNew - Ignore file '/mnt/machines/ubuntuintel/loghi/k62/page', not an image
661 [pool-1-thread-1] INFO nl.knaw.huc.di.images.layoutanalyzer.layoutlib.LayoutProc - nl.knaw.huc.di.images.minions.MinionCutFromImageBasedOnPageXMLNew$$Lambda$4/0x0000000840066840@480c5d46 interline distance: 78.44743462982075
1142 [pool-1-thread-1] INFO nl.knaw.huc.di.images.layoutanalyzer.layoutlib.LayoutProc - nl.knaw.huc.di.images.minions.MinionCutFromImageBasedOnPageXMLNew$$Lambda$4/0x0000000840066840@480c5d46 textlines: 144
1155 [pool-1-thread-1] INFO nl.knaw.huc.di.images.layoutanalyzer.layoutlib.LayoutProc - nl.knaw.huc.di.images.minions.MinionCutFromImageBasedOnPageXMLNew$$Lambda$4/0x0000000840066840@480c5d46 average textline took: 3
errors: 29
localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_89d7114d-b371-42c1-9187-00d7d29f5458'.
localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'ccec253a-fb96-4b32-a58a-ede099a3b495'.
localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'f9046d9b-8424-403a-b6aa-89e78a90aaf8'.
localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_19215798-c2d8-4edd-9717-9b3165f36656'.
localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_8efa880c-305f-4d74-900e-1a335e730652'.
localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'ab49b7e1-127d-4999-9d51-0e9a95f0c170'.
localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_29b8baf3-6329-41f2-a814-06a29a222262'.
localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_5ccd09a3-e8da-48db-8fc2-32aa77ecab30'.
localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_6ae60f8b-d479-4002-b71c-77dd9a0b0dc8'.
localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_4fdd5858-6b46-4da1-b4aa-b4b56778457c'.
localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'bd5015b6-f9e3-4829-a1b6-2eef897bf7e8'.
localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'd82fd08f-346d-43f8-9a97-ca62cab18f45'.
localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'a62e7b36-0bac-4daa-8dd0-c9ead8fdf1aa'.
localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_9ea71f73-0873-4d6c-b0f9-89249e478d00'.
localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_57384b8c-d068-4519-9bae-960b84b2f7dc'.warnings: 0

localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'e39dd52b-ef6d-496d-843d-2f5d73dcd6fd'.
localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_7dcdc118-bab9-4205-9dfe-835ddb26549c'.
localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_4ebc9a53-9953-4b45-970b-3165efc2d265'.
localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_404806ec-d97b-4447-9f98-06cd1c479198'.
localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'fdc973c6-32f6-40f8-9724-77b3321c11f1'.
localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_41890653-a012-4761-a745-e80bd736bc5d'.
localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_9ec881ff-e3af-4355-ada3-4cb54e49d794'.
localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'bd1b9d1d-61aa-4425-bc12-27da9befb0e6'.
localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_44d0f5e4-d594-432c-83fc-202e14d4afca'.
localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_04f9f989-4c03-4ad2-a071-9c2a1e223597'.
localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'd0cae1f0-226e-4a7c-a32c-94e52d483dcb'.
localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_0992297d-0106-40a8-9379-65f27c1df215'.
localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'e5c4b98c-7904-42e2-9bc1-042443b90de8'.
localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'region_518c36cc-3900-4b6b-9257-73dd83611c08'.
Errors: 29
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_89d7114d-b371-42c1-9187-00d7d29f5458'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'ccec253a-fb96-4b32-a58a-ede099a3b495'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'f9046d9b-8424-403a-b6aa-89e78a90aaf8'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_19215798-c2d8-4edd-9717-9b3165f36656'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_8efa880c-305f-4d74-900e-1a335e730652'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'ab49b7e1-127d-4999-9d51-0e9a95f0c170'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_29b8baf3-6329-41f2-a814-06a29a222262'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_5ccd09a3-e8da-48db-8fc2-32aa77ecab30'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_6ae60f8b-d479-4002-b71c-77dd9a0b0dc8'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_4fdd5858-6b46-4da1-b4aa-b4b56778457c'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'bd5015b6-f9e3-4829-a1b6-2eef897bf7e8'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'd82fd08f-346d-43f8-9a97-ca62cab18f45'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'a62e7b36-0bac-4daa-8dd0-c9ead8fdf1aa'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_9ea71f73-0873-4d6c-b0f9-89249e478d00'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_57384b8c-d068-4519-9bae-960b84b2f7dc'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'e39dd52b-ef6d-496d-843d-2f5d73dcd6fd'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_7dcdc118-bab9-4205-9dfe-835ddb26549c'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_4ebc9a53-9953-4b45-970b-3165efc2d265'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_404806ec-d97b-4447-9f98-06cd1c479198'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'fdc973c6-32f6-40f8-9724-77b3321c11f1'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_41890653-a012-4761-a745-e80bd736bc5d'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_9ec881ff-e3af-4355-ada3-4cb54e49d794'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'bd1b9d1d-61aa-4425-bc12-27da9befb0e6'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_44d0f5e4-d594-432c-83fc-202e14d4afca'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_04f9f989-4c03-4ad2-a071-9c2a1e223597'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'd0cae1f0-226e-4a7c-a32c-94e52d483dcb'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_0992297d-0106-40a8-9379-65f27c1df215'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'e5c4b98c-7904-42e2-9bc1-042443b90de8'.
cvc-id.1: There is no ID/IDREF binding for IDREF 'region_518c36cc-3900-4b6b-9257-73dd83611c08'.
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].
Exception in thread "main" java.io.FileNotFoundException: /tmp/tmp.Lr9KmCH4pW/results.txt (No such file or directory)
at java.base/java.io.FileInputStream.open0(Native Method)
at java.base/java.io.FileInputStream.open(FileInputStream.java:219)
at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157)
at java.base/java.io.FileInputStream.<init>(FileInputStream.java:112)
at java.base/java.io.FileReader.<init>(FileReader.java:103)
at nl.knaw.huc.di.images.minions.MinionLoghiHTRMergePageXML.fillDictionary(MinionLoghiHTRMergePageXML.java:336)
at nl.knaw.huc.di.images.minions.MinionLoghiHTRMergePageXML.main(MinionLoghiHTRMergePageXML.java:247)
recalculating reading order
62 [main] INFO nl.knaw.huc.di.images.minions.MinionRecalculateReadingOrderNew - input_dir: /mnt/machines/ubuntuintel/loghi/k62/page/
71 [main] INFO nl.knaw.huc.di.images.minions.MinionRecalculateReadingOrderNew - /mnt/machines/ubuntuintel/loghi/k62/page/image.xml
520 [pool-1-thread-1] INFO nl.knaw.huc.di.images.minions.MinionRecalculateReadingOrderNew - /mnt/machines/ubuntuintel/loghi/k62/page/image.xml interlinemedian: 78.44743462982075
errors: 0
warnings: 0
detecting language...
59 [main] INFO nl.knaw.huc.di.images.minions.MinionDetectLanguageOfPageXml - training languages: [English, Italian, French, Latin, German, Dutch]
956 [pool-1-thread-1] INFO nl.knaw.huc.di.images.minions.MinionDetectLanguageOfPageXml - image.xml: processing file...
errors: 0
warnings: 0
MinionSplitPageXMLTextLineIntoWords...
53 [main] INFO nl.knaw.huc.di.images.minions.MinionSplitPageXMLTextLineIntoWords - /mnt/machines/ubuntuintel/loghi/k62/page/image.xml: processing
57 [pool-1-thread-1] INFO nl.knaw.huc.di.images.minions.MinionSplitPageXMLTextLineIntoWords - /mnt/machines/ubuntuintel/loghi/k62/page/image.xml
errors: 0
warnings: 0

The program does not seem to exit even though it cannot find a GPU.
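
One way to get fail-fast behaviour, as a minimal sketch only: run each stage through a small wrapper that stops as soon as a stage exits non-zero or its expected output file is missing. The helper name `run_stage` and its parameters are hypothetical and are not taken from the Loghi scripts; in the log above, the file that went missing was `results.txt`.

```python
import subprocess
from pathlib import Path


def run_stage(name, command, expected_output=None):
    """Run one pipeline stage and raise if it fails or leaves no output.

    `name`, `command` and `expected_output` are illustrative names; they
    are assumptions for this sketch, not part of the real pipeline script.
    """
    completed = subprocess.run(command)
    if completed.returncode != 0:
        raise RuntimeError(
            f"stage '{name}' exited with code {completed.returncode}"
        )
    # A stage can exit 0 and still produce nothing, so check the file too.
    if expected_output is not None and not Path(expected_output).is_file():
        raise RuntimeError(f"stage '{name}' did not produce {expected_output}")
```

A caller would wrap the HTR step in `run_stage(...)` and exit with a non-zero status on `RuntimeError`, so the later merge step never runs against a results file that was not written.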

I also tried passing -e CUDA_VISIBLE_DEVICES=-1 to every docker run command, but it still looks like it wants to use CUDA/GPU.
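
A possible explanation, offered as an assumption rather than a confirmed diagnosis: the `could not select device driver "" with capabilities: [[gpu]]` message comes from the Docker daemon, which typically reports it when a container is started with `--gpus` on a host without a usable GPU runtime. An environment variable set inside the container, such as `CUDA_VISIBLE_DEVICES=-1`, would not change that, because the request is rejected before the container starts. The sketch below builds the `docker run` arguments so that `--gpus` is only passed when a GPU is requested. The function name, the `use_gpu` flag, and the image name are hypothetical.

```python
def docker_run_args(image, use_gpu, env=None):
    """Build a `docker run` argument list (hypothetical helper).

    `--gpus all` is added only when `use_gpu` is true. On the CPU path,
    CUDA_VISIBLE_DEVICES=-1 is set so CUDA-aware code sees no devices.
    """
    env = dict(env or {})
    args = ["docker", "run", "--rm"]
    if use_gpu:
        args += ["--gpus", "all"]
    else:
        env["CUDA_VISIBLE_DEVICES"] = "-1"
    # Sort for a stable, reproducible argument order.
    for key, value in sorted(env.items()):
        args += ["-e", f"{key}={value}"]
    args.append(image)
    return args


# "example/htr-image" is a placeholder, not a real image name.
print(docker_run_args("example/htr-image", use_gpu=False))
# ['docker', 'run', '--rm', '-e', 'CUDA_VISIBLE_DEVICES=-1', 'example/htr-image']
```

Whether the actual pipeline script exposes a switch like this is not established in this thread; the point is only that the `--gpus` option and the environment variable are separate controls.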

@bjarman

bjarman commented Feb 9, 2024

Hi! Just wanted to check whether you got hold of a Mac. I would also be very happy to help with any testing.

@rvankoert
Collaborator

No, I haven't gotten one yet. I checked again today and will probably be able to borrow an M3 for a few weeks. Earlier I did some research, and my best guess is that the problem has something to do with the different CPU architectures, which should be a fairly easy fix.

@rvankoert
Collaborator

Tomorrow I can borrow an M3 for about a month. I will check in here again once I have more news.

@bjarman

bjarman commented Feb 15, 2024

Let me know if/when there is anything to test. I would be happy to do testing on my laptop as well.
My specs:
Chip: Apple M2 Max
Memory: 96 GB
macOS: 14.3

@bjarman

bjarman commented Feb 28, 2024

Hi!
How is this coming along?
Testing offer still stands should you have anything to share!

rvankoert added a commit that referenced this issue Oct 8, 2024