air-yan/InvoiceOCR

Invoice-Receipt-OCR

This project aims to automate the receipt/invoice parsing process.

Installation and Prerequisites

Python Modules

# string similarity, used to rate the text extraction results
pip install python-Levenshtein

# images and preprocessing
pip install Wand
pip install opencv-python

# ocr engine
pip install pytesseract

# PDF text extraction tool -> not required for now
pip install pdfminer.six
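
The rating step that python-Levenshtein supports can be illustrated with a minimal sketch. It assumes "rating" means comparing OCR output against a known-good string and scoring the result between 0 and 1; the names `edit_distance` and `rate_extraction` are hypothetical and not taken from this repository. The distance is written in plain Python so the example runs without extra packages; `Levenshtein.distance` from python-Levenshtein should be a faster replacement for the helper.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via row-by-row dynamic programming."""
    if len(a) < len(b):
        a, b = b, a  # keep the stored row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def rate_extraction(extracted: str, expected: str) -> float:
    """Hypothetical score in [0, 1]; 1.0 means an exact match."""
    longest = max(len(extracted), len(expected))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - edit_distance(extracted, expected) / longest
```

Normalizing by the longer string keeps the score comparable across fields of different lengths, which is one possible design choice rather than the project's defined metric.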

Environments

If you are using Windows, add the ImageMagick and Tesseract install directories to your PATH.
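
A quick way to confirm the PATH setup is to look the executables up from Python. This is a sketch under assumptions: the executable names `tesseract` and `magick` are guesses about a typical install (older ImageMagick releases ship `convert` instead), and `find_missing_tools` is a hypothetical helper. Wand loads the ImageMagick library rather than the command-line tool, so finding the executable is only a rough proxy for a working install.

```python
import shutil


def find_missing_tools(tools):
    """Return the names from `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    # Assumed executable names; adjust them to match your installation.
    missing = find_missing_tools(["tesseract", "magick"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All tools found on PATH.")
```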

TODO

  • Add tests
  • Core Functions:
    • amount
    • invoice #
    • bill date vs due date
    • address
    • vendor name
  • Optimize the rating process
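
Several of the planned core functions could start as regular-expression lookups over the OCR text. The sketch below is an illustration only, not the project's implementation: the patterns, the ISO date format, and the names `parse_invoice_text` and `_first_group` are all assumptions, and real invoices will need broader patterns. Address and vendor name are left out because they rarely fit a single pattern.

```python
import re
from typing import Optional

# Hypothetical patterns; real invoices vary widely and will need tuning.
AMOUNT_RE = re.compile(
    r"\btotal\b[^\d\n]*(\d{1,3}(?:,\d{3})*\.\d{2}|\d+\.\d{2})", re.IGNORECASE
)
INVOICE_NO_RE = re.compile(
    r"\binvoice\s*(?:#|no\.?|number)\s*:?\s*([A-Z0-9-]+)", re.IGNORECASE
)
BILL_DATE_RE = re.compile(
    r"\b(?:bill|invoice)\s+date\s*:?\s*(\d{4}-\d{2}-\d{2})", re.IGNORECASE
)
DUE_DATE_RE = re.compile(r"\bdue\s+date\s*:?\s*(\d{4}-\d{2}-\d{2})", re.IGNORECASE)


def _first_group(pattern: "re.Pattern[str]", text: str) -> Optional[str]:
    """Return the first capture group of the first match, or None."""
    match = pattern.search(text)
    return match.group(1) if match else None


def parse_invoice_text(text: str) -> dict:
    """Extract a few fields from OCR text; missing fields are None."""
    amount = _first_group(AMOUNT_RE, text)
    return {
        "amount": float(amount.replace(",", "")) if amount else None,
        "invoice_number": _first_group(INVOICE_NO_RE, text),
        "bill_date": _first_group(BILL_DATE_RE, text),
        "due_date": _first_group(DUE_DATE_RE, text),
    }
```

Keeping the bill date and due date as separate patterns keyed on their labels is one way to handle the "bill date vs due date" item, since both values usually share the same format.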
