processor-ocr-pero

CLI processor for the SoDUCo project that performs OCR. It takes a series of JSON files from a given directory and writes updated versions of these files to another directory. In each file, it looks for specific regions and runs text line detection and recognition on them. Other output formats, PAGE XML and ALTO XML, can be requested with the -f / --export-format option.
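The directory-to-directory flow described above can be sketched as follows. This is a minimal illustration, not the processor's actual code: `process_directory` and `update_document` are hypothetical names, the `ocr_done` field is invented for the example, and the sketch assumes each JSON file holds a top-level object. The real region lookup and the PERO OCR call are replaced by a placeholder.

```python
import json
from pathlib import Path


def update_document(doc):
    """Hypothetical stand-in for the OCR step (assumption, not the real logic)."""
    updated = dict(doc)          # assumes the top-level JSON value is an object
    updated["ocr_done"] = True   # illustrative field, not part of the real schema
    return updated


def process_directory(input_dir, output_dir):
    """Read every *.json file in input_dir and write an updated copy to output_dir."""
    in_path = Path(input_dir)
    out_path = Path(output_dir)
    out_path.mkdir(parents=True, exist_ok=True)

    written = []
    for src in sorted(in_path.glob("*.json")):
        with src.open(encoding="utf-8") as fh:
            doc = json.load(fh)
        dst = out_path / src.name  # same file name, different directory
        with dst.open("w", encoding="utf-8") as fh:
            json.dump(update_document(doc), fh, ensure_ascii=False, indent=2)
        written.append(dst)
    return written
```

Keeping the file names identical between the two directories makes it easy to match each output to its input; whether the real processor does this is not stated in this README.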

Install and tests

pipenv install
pipenv run python -m pero-cli -i ./tests/input -o ./tests/output -f json -f image

Usage

usage: __main__.py [-h] -i INPUT_DIR -o OUTPUT_DIR -f {json,alto,page,image} [--pero-config-file PERO_CONFIG_FILE]

PERO OCR command line argument

options:
  -h, --help            show this help message and exit
  -i INPUT_DIR, --input-dir INPUT_DIR
  -o OUTPUT_DIR, --output-dir OUTPUT_DIR
  -f {json,alto,page,image}, --export-format {json,alto,page,image}
  --pero-config-file PERO_CONFIG_FILE
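For readers who want to see how such an interface might be declared, here is a sketch using Python's standard `argparse` module that produces a similar usage message. It is an assumption about the shape of the parser, not a copy of the project's source: `build_parser` is a hypothetical name, and `action="append"` is a guess based on the test command passing `-f` twice.

```python
import argparse

EXPORT_FORMATS = ("json", "alto", "page", "image")


def build_parser():
    """Build a parser mirroring the options listed in the usage message above."""
    parser = argparse.ArgumentParser(description="PERO OCR command line argument")
    parser.add_argument("-i", "--input-dir", required=True)
    parser.add_argument("-o", "--output-dir", required=True)
    # Assumption: -f may be repeated, so values are collected into a list.
    parser.add_argument(
        "-f",
        "--export-format",
        required=True,
        action="append",
        choices=EXPORT_FORMATS,
    )
    parser.add_argument("--pero-config-file")
    return parser


# Parse the same arguments as the test command in "Install and tests".
args = build_parser().parse_args(
    ["-i", "./tests/input", "-o", "./tests/output", "-f", "json", "-f", "image"]
)
print(args.export_format)  # ['json', 'image']
```

With `action="append"`, each `-f` value is still checked against `choices`, so an unsupported format is rejected before any processing starts.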
