Tools for Visualizing (intermediate) OCR-D results

PAGE Tools

JPageViewer

Installation

download a pre-built release from GitHub
unzip somewhere
copy/symlink the startup script from your platform's subdirectory to a directory on your search PATH, ideally adding --resolve-dir $PWD (or similar) to the arguments. This makes JPageViewer resolve relative image paths against the current working directory instead of the location of the XML file, which is more useful for OCR-D workspaces.
For example, on Linux, add this to your ~/.bash_aliases or ~/.bashrc:

alias jpageviewer='java -jar ~/path/to/JPageViewer\ 1.4\ \(Linux\,\ 64\ bit\)/JPageViewer.jar --resolve-dir $PWD'

Usage

# cd into workspace directory
jpageviewer OCR-D-SEG-TESS/PAGE1.xml

(Then continue with the Open button and navigate to the next PAGE file, or close the UI and start a new instance from the shell.)
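Because each invocation opens a single PAGE file, stepping through a whole fileGrp can be scripted from the shell. The following is a minimal sketch, not part of JPageViewer: the function name `pageview_all` is hypothetical, and it assumes that `jpageviewer` is callable as a command (for example a small script on your PATH, or a shell function wrapping the `java -jar` call shown above) and that the viewer blocks until its window is closed.

```shell
# Hypothetical helper: open every PAGE file of one fileGrp directory in turn.
# Assumes a callable `jpageviewer` command (script on PATH or shell function).
pageview_all() {
    dir=${1:?usage: pageview_all FILEGRP_DIR}
    for page in "$dir"/*.xml; do
        # skip the literal pattern when the glob matched nothing
        [ -e "$page" ] || continue
        # assumed to block until the viewer window is closed,
        # after which the loop moves on to the next page
        jpageviewer "$page"
    done
}
```

Called from the workspace directory as `pageview_all OCR-D-SEG-TESS`, this opens the pages one after another in file name order.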

Advantages

Schema support: all PAGE versions, but also ALTO
Shows fully recursive regions, including reading order
Shows all hierarchy levels from Border to Glyph
Platforms: Win, Linux, Mac
Recommended usage: viewing

Drawbacks

Bugs related to zooming (which breaks tooltips)
Does not show AlternativeImage content
Does not rotate image according to annotated skew
Fixed colour scheme
No METS or directory navigation (pages have to be opened individually)

Aletheia

Aletheia is an advanced system for accurate and yet cost-effective analysis, recognition and annotation of scanned documents. It aids the user with a number of automated and semi-automated tools which were developed and fine-tuned based on feedback from major libraries across Europe and from their digitisation service providers which are using it in a production environment.

Cutting-edge features are, among others, the support of top-down ground truthing with sophisticated split and shrink tools as well as bottom-up ground truthing supporting the aggregation of lower-level elements to more complex structures. The integrated rules and guidelines validator, in combination with powerful correction tools, enable efficient production of highly accurate ground truth as well as standardised electronic renditions of digitised documents.

In addition, special features such as a customisable virtual keyboard and the Aletheia Sans font with extensive coverage of special characters in Unicode have been developed to support working with the complexities of historical documents. (https://www.primaresearch.org/tools/Aletheia)

Aletheia is available either as a free Lite version (only requires registration via Email) or as a Pro version (annual paid subscription, added features and support).

See also the feature comparison for both versions.

Installation

unzip somewhere
run Aletheia.exe

Advantages

Schema support: all PAGE versions, but also ALTO
Shows fully recursive regions, including reading order
Shows all hierarchy levels from Border to Glyph
Offers lots of check/fixup tools for consistency
Platforms: Win
Recommended usage: editing and viewing
Some directory navigation (pages have to be opened collectively)

Drawbacks

Does not show AlternativeImage content
Does not rotate image according to annotated skew
Fixed colour scheme
No METS navigation

Transkribus

Installation

https://transkribus.eu/wiki/index.php/Download_and_Installation

Advantages

Drawbacks

Does not support recent PAGE versions
Not free

LAREX

Installation

native: as described in the README
Docker: docker pull bertsky/larex and then as described here, e.g. docker run --rm -u 0:$GROUPS -v path/to/workspace:/data bertsky/larex

Usage

go to http://localhost:8080/Larex with your browser (preferably Chrome/Chromium)

Advantages

Very efficient for large amounts of pages (fast, has keyboard shortcuts for everything), esp. for text correction
Offers custom auto-segmentation, including reading order
Variable colour scheme
Platforms: Linux or Docker-capable
Recommended usage: editing and viewing

Drawbacks

Does not show Border or hierarchy levels below TextLine
Does not show recursive regions
Does not show AlternativeImage content
Does not rotate image according to annotated skew
No direct METS navigation (custom, flat bookpath directory structure which needs to be exported from OCR-D fileGrps via ocrd-export-larex)

nw-page-editor

nw-page-editor is an application for editing ground truth information for diverse purposes related to the areas of document processing and text recognition. The edition is done interactively and visually on top of images of scanned documents. Additionally the app supports many keyboard shortcuts to allow more efficient editing, see section Application usage shortcuts.

The app is available in two variants. The first variant is as a desktop application based on the NW.js framework thus making it cross-platform. The second variant is as a web application that allows remote editing by multiple users and can be easily setup via a docker container. (https://github.com/mauvilsa/nw-page-editor)

Installation

Advantages

Schema support: PAGE XML Version [http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15] and property extensions (https://github.com/omni-us/pageformat)
Platforms: Win, Linux, Mac
Recommended usage: ~~editing and~~ viewing

Drawbacks

Custom PAGE extensions when editing

METS Tools

OCRD Browser

An extensible viewer for OCRD mets.xml files (https://github.com/hnesk/browse-ocrd)

Installation

sudo apt install libcairo2-dev libgtk-3-dev libglib2.0-dev libgtksourceview-3.0-dev libgirepository1.0-dev
pip install browse-ocrd

Usage

# cd into workspace directory
browse-ocrd mets.xml

Advantages

Schema support: OCR-D METS conventions (https://ocr-d.de/en/spec/mets)
Shows pages on all fileGrps, including AlternativeImages
Platforms: Linux
Recommended usage: viewing

Drawbacks

Only shows page-level (but not region/line/word) AlternativeImage
Slow on large documents with many/large pages
No zooming currently

Image Viewer and Tools

Feh

feh is an X11 image viewer aimed mostly at console users. Unlike most other viewers, it does not have a fancy GUI, but simply displays images. It is controlled via commandline arguments and configurable key/mouse actions. (https://feh.finalrewind.org/)

Installation

sudo apt install feh

Usage

# cd into workspace directory
feh OCR-D-IMG-BIN/

Advantages

Exact zoom interpolation
Extensive keyboard shortcuts
Allows keeping zoom level across pages
Very versatile and fast
Can browse multiple files, including thumbnail mode
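
Two of the points above map directly to feh command-line options: `--keep-zoom-vp` keeps zoom level and viewport when switching images, and `--thumbnails` shows an overview of all files. A small sketch of wrappers around them follows; the function names `feh_pages` and `feh_thumbs` are this example's own and not part of feh.

```shell
# Hypothetical wrappers; the function names are this sketch's own.

# Browse the given files or directories, keeping zoom level and viewport
# when switching to the next image (feh option --keep-zoom-vp).
feh_pages() {
    feh --keep-zoom-vp "$@"
}

# Show the given files or directories as a thumbnail overview
# (feh option --thumbnails).
feh_thumbs() {
    feh --thumbnails "$@"
}
```

From the workspace directory, `feh_pages OCR-D-IMG-BIN/` then pages through a binarized fileGrp at a constant zoom, which helps when checking the same image detail on every page.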

Drawbacks

No multi-page TIFF display

Evince

Installation

sudo apt install evince

Usage

# cd into workspace directory
evince OCR-D-IMG-BIN/PAGE1.png

Advantages

Has multi-page TIFF display

Drawbacks

Artefacts and/or decreased sharpness in zoom interpolation
Cannot browse multiple files

ImageMagick

Use ImageMagick® to create, edit, compose, or convert bitmap images. It can read and write images in a variety of formats (over 200) including PNG, JPEG, GIF, HEIC, TIFF, DPX, EXR, WebP, Postscript, PDF, and SVG. ImageMagick can resize, flip, mirror, rotate, distort, shear and transform images, adjust image colors, apply various special effects, or draw text, lines, polygons, ellipses and Bézier curves.

Installation

sudo apt install imagemagick

Usage

# cd into workspace directory
identify -verbose OCR-D-IMG/*.tiff
compare OCR-D-IMG-BIN1/PAGE1.png OCR-D-IMG-BIN2/PAGE1.png PAGE1-BIN1-BIN2.png
display OCR-D-IMG-BIN1/PAGE1.png OCR-D-IMG-BIN2/PAGE1.png PAGE1-BIN1-BIN2.png
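
The single-page comparison above can be extended to whole fileGrps. The following is a sketch under stated assumptions, not an ImageMagick or OCR-D feature: the function name `compare_groups` and the `-diff.png` naming are hypothetical, and it assumes that both directories hold PNG files with identical base names.

```shell
# Hypothetical helper: for every PNG in fileGrp directory $1 that also exists
# in directory $2, ask ImageMagick's compare to write a difference image to $3.
compare_groups() {
    grp1=${1:?first fileGrp directory}
    grp2=${2:?second fileGrp directory}
    out=${3:?output directory}
    mkdir -p "$out"
    for img1 in "$grp1"/*.png; do
        name=$(basename "$img1")
        img2=$grp2/$name
        # skip an unmatched glob and pages missing from the second fileGrp
        [ -e "$img1" ] && [ -e "$img2" ] || continue
        # compare exits non-zero when the images differ (or on error),
        # so its status is deliberately ignored here
        compare "$img1" "$img2" "$out/${name%.png}-diff.png" || true
        # report the path of the requested difference image
        echo "$out/${name%.png}-diff.png"
    done
}
```

For instance, `compare_groups OCR-D-IMG-BIN1 OCR-D-IMG-BIN2 diffs` followed by `display diffs/*.png` shows where two binarization results disagree, page by page.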

Advantages

Query images with identify
Compare images with compare
View images with display
Process images with convert

Drawbacks
