Add code for detecting blank spaces in certificates -- approach discussion. #1

Open
nvinayvarma189 opened this issue Oct 29, 2019 · 1 comment
Labels: good first issue (Good for newcomers), thread (discussion thread)

Comments

nvinayvarma189 commented Oct 29, 2019

Let us take these two images for reference: image1 and image2.

My approach is to

  1. Convert the input image to a binary (black-and-white) image.
  2. Resize it to a predefined constant size.
  3. Imagine a horizontal rectangle of length A and width B (its vertical extent), placed so that its top edge is aligned with the top edge of the image. For reference: it would sit on the design border in image1 and above the "name of school" text in image2.
  4. While (the pixels inside the rectangle area are not all the same)
    4.1 move the rectangle down by one row.
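
Steps 1 to 4 could be sketched roughly as follows. This is only an illustration, not a settled implementation: the function names, the threshold of 128, and the use of plain nested lists (instead of an image library) are all assumptions made to keep the example self-contained. `band_w` stands for length A and `band_h` for width B.

```python
def binarize(gray, threshold=128):
    """Step 1: map each grayscale value (0-255) to 0 (black) or 255 (white).

    The threshold of 128 is an assumed default, not a tuned value.
    """
    return [[255 if px >= threshold else 0 for px in row] for row in gray]


def resize_nearest(img, new_h, new_w):
    """Step 2: nearest-neighbour resize to a fixed (new_h, new_w) size."""
    old_h, old_w = len(img), len(img[0])
    return [
        [img[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


def is_uniform(img, top, left, band_h, band_w):
    """True if every pixel inside the rectangle has the same value."""
    first = img[top][left]
    return all(
        img[r][c] == first
        for r in range(top, top + band_h)
        for c in range(left, left + band_w)
    )


def first_uniform_row(img, left, band_h, band_w, start=0):
    """Steps 3-4: slide the rectangle down one row at a time until uniform.

    Returns the top row of the first uniform position, or None if the
    rectangle reaches the bottom of the image without finding one.
    """
    for top in range(start, len(img) - band_h + 1):
        if is_uniform(img, top, left, band_h, band_w):
            return top
    return None
```

Note that, as in step 4, a solid black area would also count as "all the same"; the sketch does not try to tell the two cases apart.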

After doing this, the top edge of the rectangle would end up just below the "awarded to" line in image1 (assuming the width of the rectangle is large enough that it does not fit in the gap between "certificate of achievement" and the top border). So it is important to estimate the average width needed for the rectangle.
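
One possible way to get a feel for that width, offered only as a sketch: list the heights of all runs of fully blank rows in a binarized image and compare them across sample certificates. The helper name `blank_row_runs` and the convention that white is 255 are assumptions for illustration; looking only at whole rows is a simplification of the rectangle idea.

```python
def blank_row_runs(img, blank=255):
    """Return (start_row, height) for each run of consecutive all-blank rows."""
    runs = []
    start = None  # first row of the run currently being tracked, if any
    for r, row in enumerate(img):
        if all(px == blank for px in row):
            if start is None:
                start = r
        elif start is not None:
            runs.append((start, r - start))
            start = None
    if start is not None:  # the image ends inside a blank run
        runs.append((start, len(img) - start))
    return runs
```

A rectangle width larger than the small gaps in this list but no larger than the gap meant for the name would be a reasonable starting guess.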

Now that the rectangle has encountered a white region (empty space), it knows that the certificate receiver's name should be placed nearby.

  5. While (the bottom edge of the rectangle has not encountered a pixel of a different color (black in this case) from what the rectangle currently contains (white in this case))
    5.1 move the rectangle down by one row.

After this, the rectangle would be positioned so that its bottom edge just touches the "for" text in image1. So we can be sure to place the name in this rectangle region.
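
A minimal sketch of this downward slide, assuming the image is a nested list of 0/255 values and the rectangle already sits on a uniform region. The name `slide_to_gap_bottom` and its parameters are hypothetical; `band_h` is the rectangle's width B and `band_w` its length A.

```python
def slide_to_gap_bottom(img, top, left, band_h, band_w):
    """Move the rectangle down while the row just below it matches its colour.

    Returns the final top row; the rectangle's bottom edge then touches the
    first row containing a different pixel (or the bottom of the image).
    """
    fill = img[top][left]  # colour the rectangle currently contains
    height = len(img)
    while top + band_h < height and all(
        img[top + band_h][c] == fill for c in range(left, left + band_w)
    ):
        top += 1
    return top
```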

There is a catch here. If we assign a large width to the rectangle, chances are that it will miss the gap between "awarded to" and "for" as well. When this happens, once the rectangle reaches the bottom of the image, place it back at the top of the image and reduce the width.

  6. If the rectangle reaches the bottom of the image and no gap is found, repeat from step 4 with a smaller width for the rectangle.
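
The retry could look something like the sketch below. The function name `find_gap`, the shrink amount of one row per retry, and the `min_h` lower bound are assumptions chosen for illustration; the issue does not say how much to reduce the width by.

```python
def find_gap(img, left, band_w, band_h, min_h=1, shrink=1):
    """Rescan from the top with a smaller rectangle until a uniform area is found.

    Returns (top_row, band_h) for the first uniform position, or None if
    even the smallest allowed rectangle never fits.
    """
    rows = len(img)
    while band_h >= min_h:
        for top in range(rows - band_h + 1):
            pixels = {
                img[r][c]
                for r in range(top, top + band_h)
                for c in range(left, left + band_w)
            }
            if len(pixels) == 1:  # all pixels in the rectangle are the same
                return top, band_h
        band_h -= shrink  # reached the bottom: retry from the top, smaller
    return None
```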

Now, while doing step 6, chances are that the rectangle might find a gap between "certificate of achievement" and "awarded to". So, assuming the name of the recipient is always placed in the middle of the image, we can check whether the rectangle is around the middle area of the certificate whenever it finds a gap.

If both conditions are satisfied (a gap is found and the gap is situated in the middle area of the image), the algorithm terminates.
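
Putting the two termination conditions together might look like this. It is a sketch under stated assumptions: "around the middle" is interpreted here as the rectangle's vertical centre lying within a `tolerance` fraction (15% by default, an arbitrary choice) of the image's vertical centre, and the names `is_near_middle` and `find_name_region` are hypothetical.

```python
def is_near_middle(top, band_h, img_h, tolerance=0.15):
    """True if the rectangle's vertical centre is close to the image centre."""
    centre = top + band_h / 2
    return abs(centre - img_h / 2) <= tolerance * img_h


def find_name_region(img, left, band_w, band_h, tolerance=0.15):
    """Return (top_row, band_h) of the first uniform gap near the middle.

    Gaps found away from the middle are skipped. If the scan reaches the
    bottom without success, it restarts from the top with a smaller band_h.
    """
    rows = len(img)
    while band_h >= 1:
        for top in range(rows - band_h + 1):
            pixels = {
                img[r][c]
                for r in range(top, top + band_h)
                for c in range(left, left + band_w)
            }
            if len(pixels) == 1 and is_near_middle(top, band_h, rows, tolerance):
                return top, band_h
        band_h -= 1
    return None
```

With a 20-row test image whose text rows are 0, 4, 8, 13 and 18, a rectangle of width 3 skips the gaps at rows 1-3 and 5-7 and stops at row 9.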


This is, of course, a naive algorithm. As we see more test images, we can change the algorithm accordingly. There will always be exceptional cases for this kind of problem because certificate formats can be very inconsistent, so this algorithm can be expected to work for most certificate formats, but not necessarily all.

Please provide suggestions and let me know if anything is unclear.

@sakethramanujam
Member

@V1NAY8 if you want to add any ideas to this, we shall be working on it over the weekend. Otherwise, we drop this approach and go ahead with something like a headless Chrome that generates screenshots of the media; see puppeteer.js.
