support batch inference during testing (#310)
* support batch inference during testing

* fix unittest

* update docs using url

* set cfg for train, val and test

* update docs

* update docs and test.py

* samples_per_gpu as global setting

* changes revert
cuhk-hbsun authored Jun 23, 2021
1 parent e6cb750 commit 82f64a5
Showing 30 changed files with 263 additions and 95 deletions.
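Most of the config changes below follow a single pattern: two lines are added to each `data` dict so that validation and testing default to a batch size of 1 while the per-phase batch size becomes configurable. A representative hunk (batch sizes vary per config):

```python
data = dict(
    samples_per_gpu=16,  # global setting, used for training
    workers_per_gpu=8,
    val_dataloader=dict(samples_per_gpu=1),   # added: val-specific batch size
    test_dataloader=dict(samples_per_gpu=1),  # added: test-specific batch size
    ...)
```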
2 changes: 2 additions & 0 deletions configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py
@@ -74,6 +74,8 @@
data = dict(
samples_per_gpu=16,
workers_per_gpu=8,
val_dataloader=dict(samples_per_gpu=1),
test_dataloader=dict(samples_per_gpu=1),
train=dict(
type=dataset_type,
ann_file=data_root + '/instances_training.json',
2 changes: 2 additions & 0 deletions configs/textdet/dbnet/dbnet_r50dcnv2_fpnc_1200e_icdar2015.py
@@ -83,6 +83,8 @@
data = dict(
samples_per_gpu=8,
workers_per_gpu=4,
val_dataloader=dict(samples_per_gpu=1),
test_dataloader=dict(samples_per_gpu=1),
train=dict(
type=dataset_type,
ann_file=data_root + '/instances_training.json',
2 changes: 2 additions & 0 deletions configs/textdet/drrg/drrg_r50_fpn_unet_1200e_ctw1500.py
@@ -91,6 +91,8 @@
data = dict(
samples_per_gpu=4,
workers_per_gpu=4,
val_dataloader=dict(samples_per_gpu=1),
test_dataloader=dict(samples_per_gpu=1),
train=dict(
type=dataset_type,
ann_file=f'{data_root}/instances_training.json',
2 changes: 2 additions & 0 deletions configs/textdet/fcenet/fcenet_r50_fpn_1500e_icdar2015.py
@@ -96,6 +96,8 @@
data = dict(
samples_per_gpu=8,
workers_per_gpu=2,
val_dataloader=dict(samples_per_gpu=1),
test_dataloader=dict(samples_per_gpu=1),
train=dict(
type=dataset_type,
ann_file=data_root + '/instances_training.json',
2 changes: 2 additions & 0 deletions configs/textdet/fcenet/fcenet_r50dcnv2_fpn_1500e_ctw1500.py
@@ -95,6 +95,8 @@
data = dict(
samples_per_gpu=6,
workers_per_gpu=2,
val_dataloader=dict(samples_per_gpu=1),
test_dataloader=dict(samples_per_gpu=1),
train=dict(
type=dataset_type,
ann_file=data_root + '/instances_training.json',
2 changes: 2 additions & 0 deletions configs/textdet/maskrcnn/mask_rcnn_r50_fpn_160e_ctw1500.py
@@ -47,6 +47,8 @@
data = dict(
samples_per_gpu=8,
workers_per_gpu=4,
val_dataloader=dict(samples_per_gpu=1),
test_dataloader=dict(samples_per_gpu=1),
train=dict(
type=dataset_type,
ann_file=data_root + '/instances_training.json',
2 changes: 2 additions & 0 deletions configs/textdet/maskrcnn/mask_rcnn_r50_fpn_160e_icdar2015.py
@@ -46,6 +46,8 @@
data = dict(
samples_per_gpu=8,
workers_per_gpu=4,
val_dataloader=dict(samples_per_gpu=1),
test_dataloader=dict(samples_per_gpu=1),
train=dict(
type=dataset_type,
ann_file=data_root + '/instances_training.json',
2 changes: 2 additions & 0 deletions configs/textdet/maskrcnn/mask_rcnn_r50_fpn_160e_icdar2017.py
@@ -47,6 +47,8 @@
data = dict(
samples_per_gpu=8,
workers_per_gpu=4,
val_dataloader=dict(samples_per_gpu=1),
test_dataloader=dict(samples_per_gpu=1),
train=dict(
type=dataset_type,
ann_file=data_root + '/instances_training.json',
2 changes: 2 additions & 0 deletions configs/textdet/panet/panet_r18_fpem_ffm_600e_ctw1500.py
@@ -82,6 +82,8 @@
data = dict(
samples_per_gpu=2,
workers_per_gpu=2,
val_dataloader=dict(samples_per_gpu=1),
test_dataloader=dict(samples_per_gpu=1),
train=dict(
type=dataset_type,
ann_file=data_root + '/instances_training.json',
2 changes: 2 additions & 0 deletions configs/textdet/panet/panet_r18_fpem_ffm_600e_icdar2015.py
@@ -80,6 +80,8 @@
data = dict(
samples_per_gpu=8,
workers_per_gpu=2,
val_dataloader=dict(samples_per_gpu=1),
test_dataloader=dict(samples_per_gpu=1),
train=dict(
type=dataset_type,
ann_file=data_root + '/instances_training.json',
2 changes: 2 additions & 0 deletions configs/textdet/panet/panet_r50_fpem_ffm_600e_icdar2017.py
@@ -75,6 +75,8 @@
data = dict(
samples_per_gpu=4,
workers_per_gpu=4,
val_dataloader=dict(samples_per_gpu=1),
test_dataloader=dict(samples_per_gpu=1),
train=dict(
type=dataset_type,
ann_file=data_root + '/instances_training.json',
2 changes: 2 additions & 0 deletions configs/textdet/psenet/psenet_r50_fpnf_600e_ctw1500.py
@@ -89,6 +89,8 @@
data = dict(
samples_per_gpu=2,
workers_per_gpu=2,
val_dataloader=dict(samples_per_gpu=1),
test_dataloader=dict(samples_per_gpu=1),
train=dict(
type=dataset_type,
ann_file=data_root + '/instances_training.json',
2 changes: 2 additions & 0 deletions configs/textdet/psenet/psenet_r50_fpnf_600e_icdar2015.py
@@ -89,6 +89,8 @@
data = dict(
samples_per_gpu=8,
workers_per_gpu=2,
val_dataloader=dict(samples_per_gpu=1),
test_dataloader=dict(samples_per_gpu=1),
train=dict(
type=dataset_type,
ann_file=data_root + '/instances_training.json',
2 changes: 2 additions & 0 deletions configs/textdet/psenet/psenet_r50_fpnf_600e_icdar2017.py
@@ -84,6 +84,8 @@
data = dict(
samples_per_gpu=8,
workers_per_gpu=4,
val_dataloader=dict(samples_per_gpu=1),
test_dataloader=dict(samples_per_gpu=1),
train=dict(
type=dataset_type,
ann_file=data_root + '/instances_training.json',
@@ -94,6 +94,8 @@
data = dict(
samples_per_gpu=4,
workers_per_gpu=4,
val_dataloader=dict(samples_per_gpu=1),
test_dataloader=dict(samples_per_gpu=1),
train=dict(
type=dataset_type,
ann_file=f'{data_root}/instances_training.json',
2 changes: 2 additions & 0 deletions configs/textrecog/crnn/crnn_academic_dataset.py
@@ -148,6 +148,8 @@
data = dict(
samples_per_gpu=64,
workers_per_gpu=4,
val_dataloader=dict(samples_per_gpu=1),
test_dataloader=dict(samples_per_gpu=1),
train=dict(type='ConcatDataset', datasets=[train1]),
val=dict(
type='ConcatDataset',
2 changes: 2 additions & 0 deletions configs/textrecog/nrtr/nrtr_r31_1by16_1by8_academic.py
@@ -152,6 +152,8 @@
data = dict(
samples_per_gpu=128,
workers_per_gpu=4,
val_dataloader=dict(samples_per_gpu=1),
test_dataloader=dict(samples_per_gpu=1),
train=dict(type='ConcatDataset', datasets=[train1, train2]),
val=dict(
type='ConcatDataset',
2 changes: 2 additions & 0 deletions configs/textrecog/nrtr/nrtr_r31_1by8_1by4_academic.py
@@ -152,6 +152,8 @@
data = dict(
samples_per_gpu=128,
workers_per_gpu=4,
val_dataloader=dict(samples_per_gpu=1),
test_dataloader=dict(samples_per_gpu=1),
train=dict(type='ConcatDataset', datasets=[train1, train2]),
val=dict(
type='ConcatDataset',
@@ -182,6 +182,8 @@
data = dict(
samples_per_gpu=64,
workers_per_gpu=2,
val_dataloader=dict(samples_per_gpu=1),
test_dataloader=dict(samples_per_gpu=1),
train=dict(
type='ConcatDataset',
datasets=[
2 changes: 2 additions & 0 deletions configs/textrecog/sar/sar_r31_parallel_decoder_academic.py
@@ -204,6 +204,8 @@
data = dict(
samples_per_gpu=64,
workers_per_gpu=2,
val_dataloader=dict(samples_per_gpu=1),
test_dataloader=dict(samples_per_gpu=1),
train=dict(
type='ConcatDataset',
datasets=[
2 changes: 2 additions & 0 deletions configs/textrecog/sar/sar_r31_parallel_decoder_chinese.py
@@ -119,6 +119,8 @@
data = dict(
samples_per_gpu=40,
workers_per_gpu=2,
val_dataloader=dict(samples_per_gpu=1),
test_dataloader=dict(samples_per_gpu=1),
train=dict(type='ConcatDataset', datasets=[train]),
val=dict(type='ConcatDataset', datasets=[test]),
test=dict(type='ConcatDataset', datasets=[test]))
52 changes: 26 additions & 26 deletions configs/textrecog/sar/sar_r31_parallel_decoder_toy_dataset.py
@@ -30,24 +30,19 @@
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='MultiRotateAugOCR',
rotate_degrees=[0, 90, 270],
transforms=[
dict(
type='ResizeOCR',
height=48,
min_width=48,
max_width=160,
keep_aspect_ratio=True),
dict(type='ToTensorOCR'),
dict(type='NormalizeOCR', **img_norm_cfg),
dict(
type='Collect',
keys=['img'],
meta_keys=[
'filename', 'ori_shape', 'img_shape', 'valid_ratio',
'img_norm_cfg', 'ori_filename'
]),
type='ResizeOCR',
height=48,
min_width=48,
max_width=160,
keep_aspect_ratio=True),
dict(type='ToTensorOCR'),
dict(type='NormalizeOCR', **img_norm_cfg),
dict(
type='Collect',
keys=['img'],
meta_keys=[
'filename', 'ori_shape', 'img_shape', 'valid_ratio',
'img_norm_cfg', 'ori_filename'
])
]

@@ -66,7 +61,7 @@
keys=['filename', 'text'],
keys_idx=[0, 1],
separator=' ')),
pipeline=train_pipeline,
pipeline=None,
test_mode=False)

train_anno_file2 = 'tests/data/ocr_toy_dataset/label.lmdb'
@@ -82,7 +77,7 @@
keys=['filename', 'text'],
keys_idx=[0, 1],
separator=' ')),
pipeline=train_pipeline,
pipeline=None,
test_mode=False)

test_anno_file1 = 'tests/data/ocr_toy_dataset/label.lmdb'
@@ -92,20 +87,25 @@
ann_file=test_anno_file1,
loader=dict(
type='LmdbLoader',
repeat=1,
repeat=10,
parser=dict(
type='LineStrParser',
keys=['filename', 'text'],
keys_idx=[0, 1],
separator=' ')),
pipeline=test_pipeline,
pipeline=None,
test_mode=True)

data = dict(
samples_per_gpu=16,
workers_per_gpu=2,
train=dict(type='ConcatDataset', datasets=[train1, train2]),
val=dict(type='ConcatDataset', datasets=[test]),
test=dict(type='ConcatDataset', datasets=[test]))
samples_per_gpu=8,
train=dict(
type='UniformConcatDataset',
datasets=[train1, train2],
pipeline=train_pipeline),
val=dict(
type='UniformConcatDataset', datasets=[test], pipeline=test_pipeline),
test=dict(
type='UniformConcatDataset', datasets=[test], pipeline=test_pipeline))

evaluation = dict(interval=1, metric='acc')
2 changes: 2 additions & 0 deletions configs/textrecog/sar/sar_r31_sequential_decoder_academic.py
@@ -204,6 +204,8 @@
data = dict(
samples_per_gpu=64,
workers_per_gpu=2,
val_dataloader=dict(samples_per_gpu=1),
test_dataloader=dict(samples_per_gpu=1),
train=dict(
type='ConcatDataset',
datasets=[
60 changes: 56 additions & 4 deletions docs/getting_started.md
@@ -202,7 +202,12 @@ To support the tasks of `text detection`, `text recognition` and `key information`

Here we show some examples of using different combinations of `loader` and `parser`.

#### Encoder-Decoder-Based Text Recognition Task
#### Text Recognition Task

##### OCRDataset

<small>*Dataset for encoder-decoder-based recognizers*</small>

```python
dataset_type = 'OCRDataset'
img_prefix = 'tests/data/ocr_toy_dataset/imgs'
@@ -225,7 +230,7 @@ train = dict(
You can check the content of the annotation file in `tests/data/ocr_toy_dataset/label.txt`.
The combination of `HardDiskLoader` and `LineStrParser` will return a dict for each file by calling `__getitem__`: `{'filename': '1223731.jpg', 'text': 'GRAND'}`.
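Conceptually, `LineStrParser` splits each annotation line on `separator` and maps the configured `keys` onto the fields selected by `keys_idx`. A minimal sketch of that behavior (simplified; not the library implementation):

```python
line = '1223731.jpg GRAND'
keys, keys_idx, separator = ['filename', 'text'], [0, 1], ' '

parts = line.split(separator)
info = {key: parts[idx] for key, idx in zip(keys, keys_idx)}
assert info == {'filename': '1223731.jpg', 'text': 'GRAND'}
```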

##### Optional Arguments:
**Optional Arguments:**

- `repeat`: The number of times to repeat the annotation lines. For example, if the annotation file has `10` lines, setting `repeat=10` yields an effective dataset of size `100`, as sketched below.
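A sketch of the resulting indexing, assuming the repeated lines are simply concatenated:

```python
num_lines, repeat = 10, 10
dataset_len = num_lines * repeat  # 100 samples exposed to the dataloader

def line_for_sample(i):
    # sample i maps back to annotation line i % num_lines
    return i % num_lines
```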

@@ -254,7 +259,10 @@ train = dict(
test_mode=False)
```

#### Segmentation-Based Text Recognition Task
##### OCRSegDataset

<small>*Dataset for segmentation-based recognizers*</small>

```python
prefix = 'tests/data/ocr_char_ann_toy_dataset/'
train = dict(
@@ -277,6 +285,11 @@ The combination of `HardDiskLoader` and `LineJsonParser` will return a dict for
```
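Conceptually, `LineJsonParser` decodes each annotation line as JSON and keeps only the configured keys. A minimal sketch (simplified; the annotation line is a hypothetical example):

```python
import json

line = '{"file_name": "txt_1.png", "text": "Hello"}'
keys = ['file_name', 'text']

obj = json.loads(line)
info = {key: obj[key] for key in keys}
assert info == {'file_name': 'txt_1.png', 'text': 'Hello'}
```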

#### Text Detection Task

##### TextDetDataset

<small>*Dataset with annotation file in line-json txt format*</small>

```python
dataset_type = 'TextDetDataset'
img_prefix = 'tests/data/toy_dataset/imgs'
@@ -302,7 +315,10 @@ The combination of `HardDiskLoader` and `LineJsonParser` will return a dict for
```


### COCO-like Dataset
##### IcdarDataset

<small>*Dataset with annotation file in coco-like json format*</small>

For text detection, you can also use an annotation file in the COCO format defined in [mmdet](https://github.com/open-mmlab/mmdetection/blob/master/mmdet/datasets/coco.py):
```python
dataset_type = 'IcdarDataset'
@@ -325,4 +341,40 @@ You can check the content of the annotation file in `tests/data/toy_dataset/inst
```shell
python tools/data_converter/ctw1500_converter.py ${src_root_path} -o ${out_path} --split-list training test
```

#### UniformConcatDataset

To apply a universal pipeline to multiple datasets, we design `UniformConcatDataset`. For example, to apply `train_pipeline` to both `train1` and `train2`:

```python
data = dict(
...
train=dict(
type='UniformConcatDataset',
datasets=[train1, train2],
pipeline=train_pipeline))
```
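Conceptually, `UniformConcatDataset` behaves like `ConcatDataset` except that the shared pipeline is pushed into every sub-dataset before concatenation. A rough sketch of the idea (hypothetical; not the actual implementation):

```python
from torch.utils.data import ConcatDataset

class UniformConcatSketch(ConcatDataset):
    """Toy sketch: apply one `pipeline` uniformly to all sub-datasets.

    Assumes each sub-dataset exposes a mutable `pipeline` attribute.
    """

    def __init__(self, datasets, pipeline=None):
        if pipeline is not None:
            for ds in datasets:
                ds.pipeline = pipeline
        super().__init__(datasets)
```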

Meanwhile, we have

- `train_dataloader`
- `val_dataloader`
- `test_dataloader`

to give phase-specific settings. They override the general settings in the `data` dict. For example:

```python
data = dict(
workers_per_gpu=2, # global setting
train_dataloader=dict(samples_per_gpu=8, drop_last=True), # train-specific setting
val_dataloader=dict(samples_per_gpu=8, workers_per_gpu=1), # val-specific setting
test_dataloader=dict(samples_per_gpu=8), # test-specific setting
...
```
`workers_per_gpu` is a global setting that all three dataloaders inherit; `val_dataloader` overrides it with `workers_per_gpu=1`. In the example above, validation therefore runs with `samples_per_gpu=8, workers_per_gpu=1`, while train and test fall back to the global `workers_per_gpu=2`.

To activate batch inference for `val` and `test`, set `val_dataloader=dict(samples_per_gpu=8)` and `test_dataloader=dict(samples_per_gpu=8)` as above, or simply set `samples_per_gpu=8` as the global setting.
See [config](/configs/textrecog/sar/sar_r31_parallel_decoder_toy_dataset.py) for an example.
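Putting the pieces together, a config that enables batch inference might look like this (a sketch; the batch sizes are illustrative, and the dataset dicts are the ones from the toy config above):

```python
data = dict(
    samples_per_gpu=16,  # global: training batch size per GPU
    workers_per_gpu=4,   # global: inherited by all dataloaders
    val_dataloader=dict(samples_per_gpu=8),   # batch inference during validation
    test_dataloader=dict(samples_per_gpu=8),  # batch inference during testing
    train=dict(
        type='UniformConcatDataset',
        datasets=[train1, train2],
        pipeline=train_pipeline),
    val=dict(
        type='UniformConcatDataset', datasets=[test], pipeline=test_pipeline),
    test=dict(
        type='UniformConcatDataset', datasets=[test], pipeline=test_pipeline))
```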