Vietnamese Handwriting Text Recognition (vnhtr package)

This project deploys and improves on two foundational models, one from TrOCR and one from VietOCR.

Proposed Architecture

VGG Transformer with Rethinking Head

TrOCR with Rethinking Head

Usage

`vnhtr` package

pip install vnhtr

from PIL import Image
from vnhtr.vnhtr_script.tools import *

vta_predictor = VGGTransformer("cuda:0")
tra_predictor = TrOCR("cuda:0")

image = Image.open("/content/out_sample_2.jpg")
vta_predictor.predict([image])  # predict() takes a list of PIL images
tra_predictor.predict([image])
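
For larger image collections it can help to feed the predictors in fixed-size batches rather than one long list. The following is a minimal sketch, not part of the `vnhtr` package: `chunked` and `predict_in_batches` are hypothetical helper names, and the code assumes only that `predict()` accepts a list of images and returns one result per image in the same order, which the package does not document here.

```python
from typing import Any, Iterator, List, Sequence


def chunked(items: Sequence[Any], size: int) -> Iterator[Sequence[Any]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def predict_in_batches(predictor: Any, images: Sequence[Any], batch_size: int = 8) -> List[Any]:
    """Call predictor.predict() on each batch and concatenate the results.

    Assumption: predict() returns one result per input image, in input order.
    """
    results: List[Any] = []
    for batch in chunked(images, batch_size):
        results.extend(predictor.predict(list(batch)))
    return results
```

Under those assumptions, a call such as `predict_in_batches(vta_predictor, images, batch_size=4)` would process a list of opened PIL images four at a time, which keeps GPU memory use bounded by the batch size.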

Full implementation

git clone https://github.com/nguyenhoanganh2002/vnhtr
cd ./vnhtr/vnhtr/source
pip install -r requirements.txt

Pretrain/Finetune VGG Transformer/TrOCR (pretraining on a large dataset and then finetuning on a wild dataset)

python VGGTransformer/train.py
python VisionEncoderDecoder/train.py

Pretrain VGG Transformer/TrOCR with Rethinking Head (large dataset)

python VGGTransformer/adapter_trainer.py
python VisionEncoderDecoder/adapter_trainer.py

Finetune VGG Transformer/TrOCR with Rethinking Head (wild dataset)

python VGGTransformer/finetune.py
python VisionEncoderDecoder/finetune.py
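
The three stages above can also be chained from a small driver script. This is a sketch only: the script paths come from the commands above, but the stage order (`train.py`, then `adapter_trainer.py`, then `finetune.py`) and the helper names `build_commands` and `run_pipeline` are assumptions, and the repository may expect a different workflow. It must be run from `vnhtr/vnhtr/source` and defaults to a dry run that only prints the commands.

```python
import subprocess
import sys
from typing import List

# Script names are taken from the commands above; the stage order is an assumption.
STAGES = ["train.py", "adapter_trainer.py", "finetune.py"]
MODEL_DIRS = ("VGGTransformer", "VisionEncoderDecoder")


def build_commands(model_dir: str) -> List[List[str]]:
    """Return one command per stage for the given model directory."""
    if model_dir not in MODEL_DIRS:
        raise ValueError(f"unknown model directory: {model_dir!r}")
    return [[sys.executable, f"{model_dir}/{stage}"] for stage in STAGES]


def run_pipeline(model_dir: str, dry_run: bool = True) -> List[List[str]]:
    """Print each stage command and, unless dry_run is set, execute it."""
    commands = build_commands(model_dir)
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing stage.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_pipeline("VGGTransformer", dry_run=True)
```

Calling `run_pipeline("VisionEncoderDecoder", dry_run=False)` would launch the TrOCR scripts in the same assumed order.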

Instantiate the models without going through the training or finetuning phases.

from VGGTransformer.config import config as vggtransformer_cf
from VGGTransformer.models import VGGTransformer, AdapterVGGTransformer
from VisionEncoderDecoder.config import config as trocr_cf
from VisionEncoderDecoder.model import VNTrOCR, AdapterVNTrOCR

vt_base = VGGTransformer(vggtransformer_cf)
vt_adapter = AdapterVGGTransformer(vggtransformer_cf)
tr_base = VNTrOCR(trocr_cf)
tr_adapter = AdapterVNTrOCR(trocr_cf)

For access to the full dataset and pretrained weights, please contact: [email protected]

Experimental Results

| Model | CER | Δ(CER) | WER | Δ(WER) | Inference time (ms) | Δ(normalized) |
|---|---|---|---|---|---|---|
| VGG Transformer | 17.1 | | 33.03 | | 211.5 | |
| VGG Transformer + Rethinking Head | 13.25 | +3.85 | 27.9 | +5.13 | 227.4 | +0.075 |
| TrOCR | 8.2 | | 19.25 | | 104.6 | |
| TrOCR + Rethinking Head | 7.87 | +0.33 | 18.32 | +0.93 | 113.2 | +0.082 |
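
For readers reproducing these numbers, the sketch below shows the usual definitions of CER and WER (Levenshtein edit distance divided by reference length, over characters and whitespace-separated words respectively) and the arithmetic behind the Δ columns. The repository's own evaluation code is not shown here, so treating these as the exact metrics used is an assumption; the function names are illustrative. The Δ(CER) and Δ(WER) values in the table are consistent with baseline minus Rethinking Head (a positive value means lower error), and Δ(normalized) is consistent with the relative increase in inference time.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance counting insertions, deletions and substitutions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: character edits divided by reference length."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def wer(ref: str, hyp: str) -> float:
    """Word error rate: word edits divided by the number of reference words."""
    ref_words = ref.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp.split()) / len(ref_words)


def relative_overhead(base_ms: float, new_ms: float) -> float:
    """Relative increase in inference time with respect to the baseline."""
    return (new_ms - base_ms) / base_ms


if __name__ == "__main__":
    # Values copied from the table above.
    print(round(17.1 - 13.25, 2))                     # Δ(CER), VGG Transformer
    print(round(relative_overhead(211.5, 227.4), 3))  # Δ(normalized), VGG Transformer
    print(round(relative_overhead(104.6, 113.2), 3))  # Δ(normalized), TrOCR
```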
