[Bug] The code takes a long time to start running #78

Closed
Fan123456-c opened this issue Sep 23, 2024 · 4 comments

Comments

@Fan123456-c
Prerequisite

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmdetection3d

Environment

/

Reproduces the problem - code sample

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch \
    --nproc_per_node=8 --master_port=29523 \
    tools/train.py configs/grounding/mv-grounding_8xb12_embodiedscan-vg-9dof-full.py \
    --work-dir=work_dirs/new-mv-gd --launcher="pytorch"

Reproduces the problem - command or script

"Why does the code take a long time to start running after I launch it? What is the reason for this, and is there any solution?"

Reproduces the problem - error message

It took a long time: 09/23 03:32:13 - mmengine - WARNING - Failed to search registry with scope "embodiedscan" in the "loop" registry tree. As a workaround, the current "loop" registry in "mmengine" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "embodiedscan" is a correct scope, or whether the registry is initialized.
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 234014/234014, 14313.0 task/s, elapsed: 16s, ETA: 0s

Additional information

No response

@Fan123456-c
Author

It takes about 5 minutes from the time I launch the code until it starts running. What could be the reason for this delay, and is there any solution to reduce the startup time?

@henryzhengr

henryzhengr commented Sep 24, 2024

Same here, the full version indeed takes a while to start. This is most likely caused by dataloader initialization, where preprocessing steps such as matching the language descriptions with the bbox annotations are performed.

I believe they are already working on this: #71

If you need it urgently, you can preprocess this step offline and then modify the loader to load the preprocessed result. Or, in the case of distributed training, split the file into chunks and have multiple workers process separate chunks concurrently, as mentioned in #29. A rough sketch of the offline/chunked idea follows below.
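For reference, here is a minimal sketch of that idea in plain Python, not EmbodiedScan's actual code: the function name match_description_to_boxes, the cache file name, and the worker count are hypothetical placeholders, and the real per-sample matching logic would come from the dataset's own loading code.

# Sketch of offline, chunked preprocessing with a worker pool.
# All names below are placeholders; adapt them to the dataset's real loading code.
import pickle
from multiprocessing import Pool

def match_description_to_boxes(item):
    # Placeholder for the expensive per-sample step
    # (e.g. matching a language description to its bbox annotation).
    return item

def preprocess_chunk(chunk):
    return [match_description_to_boxes(item) for item in chunk]

def preprocess_offline(language_anns, num_workers=8, cache_path="vg_full_cache.pkl"):
    # Split the annotation list into roughly equal chunks, one per worker.
    chunk_size = (len(language_anns) + num_workers - 1) // num_workers
    chunks = [language_anns[i:i + chunk_size]
              for i in range(0, len(language_anns), chunk_size)]

    # Process the chunks concurrently and flatten the results.
    with Pool(num_workers) as pool:
        results = pool.map(preprocess_chunk, chunks)
    merged = [item for chunk in results for item in chunk]

    # Cache to disk so the dataloader can skip this step on every launch.
    with open(cache_path, "wb") as f:
        pickle.dump(merged, f)
    return merged

At startup, the dataset would then simply unpickle the cached file instead of redoing the matching on every launch.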

@Fan123456-c
Author

Thank you for your attention. If I want to debug this model and reduce the dataset loading time, is there any other method, such as using only a part of the dataset?

@Tai-Wang
Contributor

We provide a mini set for lightweight training. You can follow that benchmark result to reduce the experimental burden.
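As a further option for quick debugging, and only as a hedged sketch: if the grounding dataset inherits mmengine's BaseDataset, its indices argument can restrict loading to the first N samples. Whether this actually shortens startup depends on where the dataset applies indices relative to the expensive matching step, so treat the override below as illustrative rather than a confirmed recipe; the field layout assumes the usual mmengine train_dataloader/dataset nesting.

# debug_subset_config.py -- illustrative override, not an official config
_base_ = ['./mv-grounding_8xb12_embodiedscan-vg-9dof-full.py']

train_dataloader = dict(
    dataset=dict(
        # Assumption: the dataset accepts a BaseDataset-style `indices` argument,
        # which loads only the first 2000 samples for a quick debug run.
        indices=2000,
    ),
)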

Tai-Wang closed this as completed Oct 9, 2024