No, PaddleOCR does not natively support batch inference.

## How to Handle Batch Inference in PaddleOCR

Although PaddleOCR doesn't process batches natively, you can implement batch inference by iterating over the batch and processing each image individually. Alternatively, you can modify the PaddleOCR codebase to support batch processing. Below are the recommended approaches.

### 1. Process Images Sequentially

Modify your code to process each image in the batch one at a time. Here is an example:

```python
from paddleocr import PaddleOCR
from tqdm import tqdm

# Initialize PaddleOCR
ocr = PaddleOCR(use_angle_cls=True, lang="en")

# Example batch of images (np_dataset):
# np_dataset is a list of dictionaries, each containing one image, e.g.
# np_dataset = [{'image': np_image1}, {'image': np_image2}, ...]

results = []  # To store OCR results for the batch
for item in tqdm(np_dataset, desc="Processing images"):
    img = item['image']              # Extract a single image from the batch
    result = ocr.ocr(img, cls=True)  # Perform OCR on that image
    results.append(result)           # Append the result to the list
```

This approach processes each image in the dataset sequentially and stores the OCR results in the `results` list.

### 2. Use Multi-threading or Multi-processing for Parallelism

If you want to speed up the process, you can use multi-threading or multi-processing to process images in parallel. Here's an example using `ThreadPoolExecutor` (a multiprocessing variant is sketched after the code):
```python
from concurrent.futures import ThreadPoolExecutor

from paddleocr import PaddleOCR
from tqdm import tqdm

# Initialize PaddleOCR
ocr = PaddleOCR(use_angle_cls=True, lang="en")

# Example batch of images (np_dataset):
# np_dataset = [{'image': np_image1}, {'image': np_image2}, ...]

def process_image(item):
    img = item['image']
    return ocr.ocr(img, cls=True)

# Use ThreadPoolExecutor for parallel processing
results = []
with ThreadPoolExecutor(max_workers=4) as executor:  # Adjust max_workers as needed
    futures = [executor.submit(process_image, item) for item in np_dataset]
    for future in tqdm(futures, desc="Processing images"):
        results.append(future.result())
```

This approach processes images in parallel, which can significantly speed up batch inference.
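If the workload is CPU-bound, or threads hit the errors discussed below, `ProcessPoolExecutor` is a multiprocessing alternative. Here is a minimal sketch, with the caveat that each worker process must build its own PaddleOCR instance, since the predictor cannot be pickled and shipped between processes (`_init_worker` and `_process_image` are illustrative helper names, not PaddleOCR APIs):

```python
from concurrent.futures import ProcessPoolExecutor

from paddleocr import PaddleOCR

_ocr = None  # one PaddleOCR instance per worker process

def _init_worker():
    # Build the model inside each worker; it cannot be shared
    # with or copied from the parent process.
    global _ocr
    _ocr = PaddleOCR(use_angle_cls=True, lang="en")

def _process_image(item):
    return _ocr.ocr(item['image'], cls=True)

if __name__ == "__main__":
    # np_dataset = [{'image': np_image1}, {'image': np_image2}, ...]
    with ProcessPoolExecutor(max_workers=2, initializer=_init_worker) as executor:
        results = list(executor.map(_process_image, np_dataset))
```

Keep `max_workers` small: every worker loads a full copy of the model, so memory usage grows linearly with the pool size.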
### 3. Modify PaddleOCR for Batch Inference

If batch inference is a strict requirement, you can modify the PaddleOCR codebase to handle batches. However, this approach requires a deep understanding of the PaddleOCR codebase and might not be straightforward.
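Before going that far, note that recent PaddleOCR releases already batch part of the pipeline internally: after detection, the cropped text lines are run through the recognition model in groups. If the goal is throughput rather than a batch-shaped API, tuning that internal batch size may be enough. A minimal sketch, assuming a 2.x release that exposes the `rec_batch_num` constructor argument:

```python
from paddleocr import PaddleOCR

# rec_batch_num controls how many detected text lines the recognition
# model processes per forward pass; larger values can improve GPU
# utilization at the cost of memory.
ocr = PaddleOCR(use_angle_cls=True, lang="en", rec_batch_num=16)

# The top-level call still takes one image at a time:
# result = ocr.ocr(np_image, cls=True)
```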
## Related Discussion: Multi-threading Issues

The error you encountered during multi-threading is a known issue. For more information, refer to the GitHub discussion "多线程识别报错" ("multi-threaded recognition error"), which discusses multi-threading errors in PaddleOCR.
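A common workaround for such errors (an assumption here, not taken from the linked discussion) is to stop sharing one predictor across threads and instead keep one PaddleOCR instance per thread, for example via `threading.local`:

```python
import threading

from paddleocr import PaddleOCR

_tls = threading.local()

def get_ocr():
    # Lazily create one PaddleOCR instance per thread so that
    # threads never share a predictor.
    if not hasattr(_tls, "ocr"):
        _tls.ocr = PaddleOCR(use_angle_cls=True, lang="en")
    return _tls.ocr

def process_image(item):
    return get_ocr().ocr(item['image'], cls=True)
```

As with multiprocessing, each instance holds its own copy of the model, so this trades memory for thread safety.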
## Final Recommendation

For most use cases, sequential processing (approach 1) or a small thread/process pool (approach 2) is the simplest and most reliable way to handle a batch. Modifying the codebase (approach 3) is only worth the effort if true batched inference is a hard requirement.

Response generated by feifei-bot | chatgpt-4o-latest
When I used the code snippet above, I got an error: `item` in the for loop is a batch of images, so it has four dimensions: [batch size, height, width, channel]. It seems like `PaddleOCR()` cannot handle batches. Could you please tell me how I can do batch inference? Thanks!
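If `item` is itself a 4-D array rather than a single image, one way around this is to iterate over its first (batch) axis and feed each 3-D slice to `ocr.ocr()` separately. A minimal sketch, assuming the batch is a numpy `uint8` array of shape (batch size, height, width, channel); the shapes and dummy data below are illustrative:

```python
import numpy as np
from paddleocr import PaddleOCR

ocr = PaddleOCR(use_angle_cls=True, lang="en")

# Illustrative stand-in for one batch from a data pipeline,
# shaped (batch_size, height, width, channel).
batch = np.zeros((4, 64, 256, 3), dtype=np.uint8)

# Iterating over a 4-D array yields its (height, width, channel)
# slices, which is the single-image shape ocr.ocr() expects.
results = [ocr.ocr(img, cls=True) for img in batch]
```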