OCR model example using two stage pipeline #563

Draft
wants to merge 6 commits into base: gen3

aljazkonec1

This PR adds an OCR model example using a two stage pipeline.

klemen1999 (Contributor) left a comment:

Generally LGTM, left some comments

python3 main.py
# Installation
Running this example requires a **Luxonis OAK4 device** connected to your computer. You can find more information about the supported devices and the setup instructions in our [Documentation](https://rvc4.docs.luxonis.com/hardware).
Moreover, you need to prepare a **Python 3.10** environment with [DepthAI](https://pypi.org/project/depthai/) and [DepthAI Nodes](https://pypi.org/project/depthai-nodes/) packages installed. You can do this by running:
Contributor

Where does the Python 3.10 requirement come from? Both depthai and depthai-nodes should also work with 3.8.

Author

I kept it the same as in the general README.

gen3/neural-networks/advanced-examples/ocr/ocr/README.md (two outdated review threads, resolved)
replay_node.setLoop(True)

video_resize_node = pipeline.create(dai.node.ImageManipV2)
video_resize_node.initialConfig.setOutputSize(1728, 960)
Contributor

Is the reason we request 3x the input size just for nicer visualization at the end?

Author

The OCR model accepts 320x48 input, so if we cropped from the small 576x320 image we would have to upsample each cropped detection by a lot, and we would lose accuracy in the second stage.
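
To make the trade-off concrete, here is a rough sketch of the arithmetic behind this reply. It is not code from the PR: the helper names and the 5% box height are hypothetical, and only the three sizes (576x320, 1728x960, and the OCR input height of 48) come from this thread. It estimates how much a cropped text box would need to be enlarged to reach the OCR input height, depending on which frame it is cropped from.

```python
# Sizes quoted in this thread; everything else below is illustrative.
SMALL_SIZE = (576, 320)      # the "small" frame mentioned in the reply
RESIZED_SIZE = (1728, 960)   # value passed to setOutputSize in the diff
OCR_INPUT_HEIGHT = 48        # the OCR model accepts 320x48


def crop_height_px(norm_box_height, frame_height):
    """Pixel height of a box whose height is given as a fraction of the frame."""
    return max(1, round(norm_box_height * frame_height))


def upsample_factor(norm_box_height, frame_height, target_height=OCR_INPUT_HEIGHT):
    """How much the crop must be enlarged vertically to reach the OCR input height."""
    return target_height / crop_height_px(norm_box_height, frame_height)


if __name__ == "__main__":
    box_height = 0.05  # hypothetical: a text line covering 5% of the frame height
    for width, height in (SMALL_SIZE, RESIZED_SIZE):
        px = crop_height_px(box_height, height)
        factor = upsample_factor(box_height, height)
        print(f"{width}x{height}: crop is {px} px tall, upsample factor {factor:.1f}")
```

With this hypothetical box, the crop is 16 px tall in the 576x320 frame (a 3.0x upsample to reach 48 px) and 48 px tall in the 1728x960 frame (no upsampling), which is the effect described above.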
