Attempt to build M1/M2/M3 images from source fails miserably #16
Hi Dolf, thank you for reporting this. We are considering adding support for running Loghi on arm64/macOS, but we currently don't have a machine available for this. I will look into this a bit further, but I doubt that we will officially support macOS in the very near future.
Hi,
@bjarman No ETA yet. I hope to have something in a few months, as the changes for CPU seem to be small. The biggest obstacle is getting a recent MacBook.
I am currently testing whether OrbStack could be a solution for Mac owners. I have successfully created an amd64 Ubuntu machine where I am now running Loghi. It seems to be working, albeit very slowly, since I am running with gpu -1 right now. I can see some masked images and XML files in the page directory, but in the image directory only images with the suffix .done that are zero bytes. Python is still running, though. I also see a lot of files created in /tmp/tmp.nIoS3J4YxU/:

root@ubuntuintel:/mnt/machines/ubuntuintel/loghi# ls -la /tmp/tmp.nIoS3J4YxU/

Are files supposed to end up in /tmp/?
So far it seems to be working as expected. First it does layout analysis, which should result in XML files and mask images in the page folder. These are converted in the next step to polygons, which are stored as PageXML in the page folder. I am very curious whether you can get it to work on a Mac.
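A quick way to follow along is to list the page folder after the layout step; the path below is an assumption based on the input directory mentioned later in this thread, not a documented location:

```
# List the layout-analysis output (XML + masks) for the test set
ls -la /mnt/machines/ubuntuintel/loghi/k62/page/
```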
This is what is running:

root@ubuntuintel:/mnt/machines/ubuntuintel/loghi# ps aux|grep python3

I will let Python continue until it is done and we'll see if it actually works! Running with gpu -1 is very slow; even if this works, it still would not be great for large datasets. My test is with 10 handwritten images and it has been running for about 2 hours now. At some point, making use of the Apple silicon (M1/M2/M3) GPU would be preferred :) I have not seen any output in the log for a while. Is there any way of knowing whether things are still running as they should?
This is the HTR step. It will take up the most time. There should be output in $tmpdir/log.txt containing transcriptions for each line. On other CPU-based systems this is definitely the slowest step.
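A minimal way to watch for progress, assuming the tmp directory from the earlier ls (the exact name differs per run):

```
# Follow the HTR log as lines are transcribed
tail -f /tmp/tmp.nIoS3J4YxU/log.txt
# or, more generally, pick the most recent tmp dir:
tail -f "$(ls -td /tmp/tmp.*/ | head -1)log.txt"
```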
Fingers crossed and patience then :)
I let the Python process run overnight but nothing new happened. This is the output in /tmp/tmpXXXX/log.txt from this morning, when I ran ./na-pipeline.sh with a directory containing only one scanned image named image.jpg:

root@ubuntuintel:/mnt/machines/ubuntuintel/loghi# ./na-pipeline.sh /mnt/machines/ubuntuintel/loghi/k62/ ==========
You should have gotten some more output fairly quickly. Even my 12-year-old laptop takes less than 5 minutes per scan. My guess is that something is wrong and the process has stalled. It is strange, because the layout analysis uses PyTorch and seems to work fine, while the HTR uses TensorFlow and fails somehow. I'll try to look into this a bit more this weekend. Hopefully next week I'll get my hands on a Mac so I can at least try to reproduce this. If I am to take a guess: it tries to load CUDA, even though it shouldn't.
TensorFlow should be able to utilise Apple silicon or AMD GPUs by using the tensorflow-metal plugin. The HTR should of course respect the gpu -1 flag, though. Have a nice weekend, and I hope you get that Mac and that this is an easy fix!
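For reference, a minimal sketch of how tensorflow-metal is typically installed and verified on Apple silicon, in a host Python environment outside the Loghi containers (not something the current images do):

```
# Install TensorFlow plus the Metal plugin on an Apple silicon Mac
pip install tensorflow tensorflow-metal
# The Metal plugin should then register a GPU device
python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
```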
Just for fun I set gpu to "0" and this is what happened:

root@ubuntuintel:/mnt/machines/ubuntuintel/loghi# ./na-pipeline.sh k62
localString: Line 769, Column: 9: cvc-id.1: There is no ID/IDREF binding for IDREF 'e39dd52b-ef6d-496d-843d-2f5d73dcd6fd'.

The program does not seem to exit even though it cannot find a GPU. I also tried using -e CUDA_VISIBLE_DEVICES=-1 in all the docker run commands, but it still looks like it wants to use CUDA/GPU.
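Roughly, that change amounted to adding the environment variable to every docker run in na-pipeline.sh. A hedged sketch, where the image name, command, and mount are placeholders rather than the script's actual values:

```
# Placeholder example: hide CUDA devices from a containerized pipeline step
docker run --rm \
  -e CUDA_VISIBLE_DEVICES=-1 \
  -v /mnt/machines/ubuntuintel/loghi/k62:/data \
  some/loghi-step-image some-command /data
```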
Hi! Just wanted to check whether you got hold of a Mac computer? Also, I would be very happy to help with any testing.
No, I haven't gotten one yet. I checked again today and will probably be able to borrow an M3 for a few weeks. Earlier I did some research, and my best guess is that the problem has something to do with the different CPU architectures, which should be a fairly easy fix.
Tomorrow I can borrow an M3 for about a month. I'll check in here again once I have more news.
Let me know if/when there is anything to test. I would be happy to do testing on my laptop as well.
Hi!
Due to the absence of linux-aarch64 Docker images, I set off to attempt to build them myself. Naively, I started with the Docker base image. I had to make quite a few modifications to the script to get it to work on macOS/Apple silicon, but I finally got it to build with OpenCV, though of course with CUDA disabled. For now I would be satisfied with CPU-only operation, but I was going to look later for ways to use Apple's latest tech so that its Neural Engine and GPUs can be used as well. But ...
I found out this image is not even being used when I next tried to build the layer image.
CORRECTION: this image is being used by loghi-tooling, and I was able to build that, based on that image, although there were test errors, causing the overall build to fail.
Building that failed, initially, due to my misunderstanding of which directory needed to be passed as the argument, but once I figured that out, copying the source failed due to a -T argument in a cp command, which is not supported on macOS (a possible workaround is sketched below). Once I worked around that, things started, but I ended up with the errors below. The first problem appears to be with CUDA missing. This is logical, but it should be possible to create an image that does not rely on CUDA, so that the GPU=0 flag can be used in the pipeline script.
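On the cp -T issue: a possible portable replacement, assuming the intent of the GNU-only -T flag was to copy the source directory onto the destination rather than into it (src and dst are placeholders for whatever the build script actually uses):

```
# GNU coreutils form that fails on macOS/BSD cp:
#   cp -rT src dst
# Portable equivalent: copy the *contents* of src into an existing dst
mkdir -p dst && cp -r src/. dst/
```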
So, at this point I have been "stung" by too many problems, and I am just reporting this in the hope that the authors will consider generating ARM images and/or fixing up these scripts.
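For anyone else attempting this, a rough sketch of what building an arm64 variant of a base image could look like with Docker buildx; the tag and Dockerfile location are assumptions, not the repo's actual layout:

```
# Set up a buildx builder once, then build for arm64 (natively on Apple silicon, or via emulation)
docker buildx create --use 2>/dev/null || true
docker buildx build --platform linux/arm64 -t loghi-base:arm64-test --load .
```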