make models page in setup guide more prominent and rearrange #348

Merged
merged 2 commits into from
Apr 6, 2023


lena-hinrichsen
Member

…the models page more suitable for Docker users

part of #318

site/en/models.md (outdated)
```sh
docker run --user $(id -u) --workdir /data \
--volume $PWD/data:/data \
--volume $PWD/models:/usr/local/share/ocrd-resources \
```
Member

Is that still correct for ocrd-tesserocr-*?

docker run --rm -it ocrd/all:latest ocrd-tesserocr-recognize -D
/usr/local/share/tessdata/

Member Author

I copied and pasted that, and I am not sure about it.
Since this PR is mainly about the structure of the page and making Docker more prominent, I'll open a separate issue for this question so we don't forget to look at it.

Member

OK, then let's merge this first and track the ocrd_tesserocr question in an issue.

Collaborator

*sigh* this question keeps being asked...

https://github.com/OCR-D/ocrd_all/blob/b36cec819c67ccbdbed8033007ea27596e02f972/Makefile#L724
https://github.com/OCR-D/ocrd_all/blob/b36cec819c67ccbdbed8033007ea27596e02f972/Makefile#L756
https://github.com/OCR-D/ocrd_all/blob/b36cec819c67ccbdbed8033007ea27596e02f972/Dockerfile#L32-L33

So, no: /usr/local/share/ocrd-resources will be ignored by ocrd_tesserocr when installed via ocrd_all. The module resource location is /usr/local/share/tessdata, as @kba wrote.
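
For anyone who wants to re-check this per processor, here is a minimal sketch built on the `-D` call quoted earlier in this thread. The function name `show_module_dir` and the `DOCKER` and `IMAGE` override variables are illustrative assumptions, not part of ocrd_all.

```sh
#!/bin/sh
# Print the module resource directory that a processor reports inside the image.
# DOCKER and IMAGE are hypothetical override variables: setting DOCKER=echo
# previews the command without needing Docker at all.
show_module_dir() {
    processor=$1
    ${DOCKER:-docker} run --rm "${IMAGE:-ocrd/all:latest}" "$processor" -D
}

# Usage (requires Docker and the image to be available):
#   show_module_dir ocrd-tesserocr-recognize
```

Running it with `DOCKER=echo` only prints the command line, which is handy for checking the invocation before pulling the image.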

Collaborator

So we do still have a problem with our model volume logic here. Modules like ocrd_tesserocr or workflow-configuration (ocrd-page-transform ...) want to have their stuff under /usr/local/share/XYZ, while others use /usr/local/share/ocrd-resources.

Everything is OK for the preinstalled resources (tool JSON files for bashlib processors, preset files for ocrd-page-transform, minimal models for Tesseract). But as soon as you want to install additional models persistently, we cannot offer anything at the moment.

(So this is not just about finding the right recipe for the volume mapping, but about reconciling a single module location inside the Docker image, which is fixed because the image is prebuilt, with the persistent updates we usually do via the data location...)
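
To make the path split concrete, here is a rough sketch that picks the container-side directory per processor. It covers only the two locations named in this thread. The helper names `resource_dir_for` and `volume_option_for` are made up for illustration, and other modules with their own location under /usr/local/share are not handled. It does not solve the persistence problem: a host directory mounted over /usr/local/share/tessdata would hide the preinstalled minimal models in the image.

```sh
#!/bin/sh
# Pick the container-side resource directory for a processor.
# Only the two locations mentioned in this thread are covered (assumption:
# everything that is not ocrd-tesserocr-* uses ocrd-resources).
resource_dir_for() {
    case "$1" in
        ocrd-tesserocr-*) echo /usr/local/share/tessdata ;;
        *)                echo /usr/local/share/ocrd-resources ;;
    esac
}

# Build (but do not run) the --volume option for a host model directory.
volume_option_for() {
    processor=$1
    host_dir=$2
    printf '%s\n' "--volume $host_dir:$(resource_dir_for "$processor")"
}

# Usage:
#   volume_option_for ocrd-tesserocr-recognize "$PWD/models"
```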

Co-authored-by: Konstantin Baierer
lena-hinrichsen merged commit 82bc2d9 into master Apr 6, 2023
lena-hinrichsen deleted the models-in-docu branch April 6, 2023 15:53
lena-hinrichsen removed the request for review from bertsky April 6, 2023 15:53