The application is supposed to capture a region of interest (ROI) containing a unique banknote ID, which consists of two letters and seven digits (0-9), and run OCR on it to recognize the ID, so that the code can be used later on.
Russian currency is used as the working example in this project (1000, 2000, and 5000 ruble notes).
Important
The project has NOT been completed yet. At the moment the app does not work correctly: only 67% of the planned functionality is implemented, and the OCR does not yet work properly.
- Optical character recognition
- Object detection
- Data annotation
- Model architecture fine-tuning
- Learning curves visualization
- Relative coordinates
- Image resize
- Augmentation implementation
The main problem this application addresses is banknote theft. If your money is stolen and you have the ID of each bill, you can hand those numbers to the police, which greatly increases the chance of getting your money back.
People will use the application through a Telegram bot: a person sends a picture of a banknote to the bot, and the bot saves the ID to an IDs database. Each person has a private database.
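The bot itself is not implemented yet; here is a minimal sketch of the intended flow, assuming python-telegram-bot (v20+) and a per-user SQLite table. The `recognize_id` helper is hypothetical and stands in for the detection-plus-OCR pipeline described below.

```python
import sqlite3

from telegram import Update
from telegram.ext import ApplicationBuilder, ContextTypes, MessageHandler, filters

def recognize_id(image_path):
    # Hypothetical stand-in for the pipeline below: detect the banknote,
    # crop the ID region, run OCR.
    raise NotImplementedError

async def handle_photo(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    # Download the highest-resolution version of the photo the user sent.
    photo_file = await update.message.photo[-1].get_file()
    image_path = await photo_file.download_to_drive()
    banknote_id = recognize_id(image_path)
    # Keying rows by the Telegram user id keeps each person's IDs private.
    with sqlite3.connect("ids.db") as db:
        db.execute("CREATE TABLE IF NOT EXISTS ids (user_id INTEGER, banknote_id TEXT)")
        db.execute("INSERT INTO ids VALUES (?, ?)", (update.effective_user.id, banknote_id))
    await update.message.reply_text(f"Saved banknote ID: {banknote_id}")

app = ApplicationBuilder().token("YOUR_BOT_TOKEN").build()
app.add_handler(MessageHandler(filters.PHOTO, handle_photo))
app.run_polling()
```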
At the moment the OCR achieves 12.5% accuracy, while the object detection model crops the region of interest with 98% accuracy. To improve OCR performance, it should be fine-tuned on the font used on Russian banknotes, and that is the problem being solved right now.
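This README does not name the OCR engine, so purely as an illustration, here is how the cropped ID region could be fed to Tesseract via pytesseract; `--psm 7` treats the crop as a single line of text, which matches the ID layout.

```python
import pytesseract
from PIL import Image

# Assumption: Tesseract via pytesseract; the actual OCR engine is not named here.
roi = Image.open("id_crop.png")  # the cropped region containing the banknote ID
text = pytesseract.image_to_string(roi, lang="rus", config="--psm 7")
print(text.strip())  # expected shape: two letters followed by seven digits
```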
It's easy to see that the model does not work quite well yet: the test loss is higher than the training loss, and there is a huge gap between them, which indicates that the model is overfitting. That is the other problem being solved right now.
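Such a gap is easiest to spot by plotting both losses together; a minimal sketch of the learning-curves visualization, assuming the model is trained with Keras and `model.fit()` returns a `History` object:

```python
import matplotlib.pyplot as plt

def plot_learning_curves(history):
    # history is the object returned by model.fit(..., validation_data=...)
    plt.plot(history.history["loss"], label="training loss")
    plt.plot(history.history["val_loss"], label="validation loss")
    plt.xlabel("epoch")
    plt.ylabel("MSE loss")
    plt.legend()
    plt.show()
```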
After many tries with different architectures, this is the best result the project has achieved so far.
First, the ROI has to be detected, because the OCR should see only the text we are interested in, without any other distracting characters and signs. Banknotes have very little visual variety (one 1000-ruble bill looks like the other million of 1000-ruble bills, and so forth), so it would be really difficult to train a model on a pattern as small as the ID text, especially when banknote pictures are resized to 170x128.
So it was decided to train the model to detect the whole banknote and then slice that region so as to keep roughly the top-right quarter of the banknote, where the ID is located (see the sketch after the note below).
Note
Check the picture below, where the blue region is the object's annotation and the red one is the piece with the ID, obtained by cropping the image.
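A minimal sketch of that slicing step, assuming the image is a NumPy array (as with OpenCV) and the bounding box is given in absolute pixel coordinates:

```python
def crop_id_region(image, bbox):
    """Keep roughly the top-right quarter of the detected banknote,
    where the ID is located (the red region in the picture above)."""
    top_x, top_y, bottom_x, bottom_y = bbox  # absolute pixel coordinates
    mid_x = (top_x + bottom_x) // 2
    mid_y = (top_y + bottom_y) // 2
    return image[top_y:mid_y, mid_x:bottom_x]  # NumPy indexing: rows, then columns
```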
label-studio was used to annotate the samples. Out of 904 available samples only 860 were suitable.
VGG-16 is used as the foundation to experiment with. The activation function of the output layer was changed from softmax to sigmoid, and the last layer contains 4 neurons, because the model predicts 4 values: top-left x, top-left y, bottom-right x, bottom-right y. The input shape was changed as well.
The deep learning part was fine-tuned, so the architecture differs a bit from stock VGG-16. The categorical cross-entropy loss function was replaced with MSE, because this is now a regression task, not a classification one.
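A minimal Keras sketch of those modifications; the input shape (assuming 170x128 means width x height) and the size of the dense head are assumptions, since the exact fine-tuned layers are not listed here:

```python
import tensorflow as tf

# VGG-16 backbone without its classification head; input shape is assumed
# to be (height, width, channels) = (128, 170, 3) from the 170x128 resize.
base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                   input_shape=(128, 170, 3))

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),  # assumed head size
    # 4 sigmoid outputs: relative top-left x/y and bottom-right x/y in [0, 1]
    tf.keras.layers.Dense(4, activation="sigmoid"),
])

# Regression task, so MSE replaces categorical cross-entropy.
model.compile(optimizer="adam", loss="mse")
```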
At first there was no idea how to work with relative coordinates. Absolute coordinates had to be transformed into relative ones, because the model uses a sigmoid activation function, whose output ranges from 0 to 1.
This was solved by dividing each coordinate by the corresponding image dimension. Here is the function that transforms the coordinates:
```python
def relative_coords(bbox, cols, rows):
    """Map absolute pixel coordinates to [0, 1] by dividing by the image size."""
    return [bbox[0] / cols,   # top-left x
            bbox[1] / rows,   # top-left y
            bbox[2] / cols,   # bottom-right x
            bbox[3] / rows]   # bottom-right y
```
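At prediction time the sigmoid outputs have to be mapped back to pixels before cropping; a sketch of the inverse transform (the function name is illustrative, not from the repo):

```python
def absolute_coords(bbox, cols, rows):
    """Inverse of relative_coords: map [0, 1] predictions back to pixels."""
    return [int(bbox[0] * cols),   # top-left x
            int(bbox[1] * rows),   # top-left y
            int(bbox[2] * cols),   # bottom-right x
            int(bbox[3] * rows)]   # bottom-right y
```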
The first versions of the model overfitted a lot more, mostly because there were only 860 data samples.
This was partly solved by implementing augmentation. Of all the augmentation types tried, cut-out gave the best result: it reduced overfitting roughly by half (see the sketch below).
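A minimal NumPy sketch of cut-out; the patch size is an assumed hyperparameter, not taken from the repo:

```python
import numpy as np

def cutout(image, size=32, rng=None):
    """Cut-out: zero a random square patch so the model cannot rely
    on any single local region of the banknote."""
    if rng is None:
        rng = np.random.default_rng()
    out = image.copy()
    h, w = out.shape[:2]
    y = rng.integers(0, max(h - size, 1))
    x = rng.integers(0, max(w - size, 1))
    out[y:y + size, x:x + size] = 0
    return out
```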
At the moment the application works only for the owner, not for other users. It uses a pretrained regression convolutional neural network.
This project is licensed under the MIT License - see the LICENSE.md file for details.