VisibleDeprecationWarning and train failure #48

Open
CassieMai opened this issue Mar 20, 2017 · 6 comments

@CassieMai

Hello, I ran into a problem when training MNC using ./experiments/scripts/mnc_5stage.sh. Can anyone help me? Thanks in advance.

I0320 15:46:29.860514  2121 net.cpp:270] This network produces output seg_cls_loss
I0320 15:46:29.860517  2121 net.cpp:270] This network produces output seg_cls_loss_ext
I0320 15:46:29.862728  2121 net.cpp:283] Network initialization done.
I0320 15:46:29.862998  2121 solver.cpp:60] Solver scaffolding done.
Loading pretrained model weights from data/imagenet_models/VGG16.mask.caffemodel
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:537] Reading dangerously large protocol message.  If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 1024780411
I0320 15:46:30.187852  2121 net.cpp:810] Ignoring source layer rpn_conv/3x3
I0320 15:46:30.187872  2121 net.cpp:810] Ignoring source layer rpn_relu/3x3
I0320 15:46:30.187875  2121 net.cpp:810] Ignoring source layer rpn/output_rpn_relu/3x3_0_split
I0320 15:46:30.244598  2121 net.cpp:810] Ignoring source layer drop6
I0320 15:46:30.253931  2121 net.cpp:810] Ignoring source layer drop7
I0320 15:46:30.310539  2121 net.cpp:810] Ignoring source layer drop6_mask
I0320 15:46:30.319871  2121 net.cpp:810] Ignoring source layer drop7_mask
Solving...
/MNC/tools/../lib/pylayer/proposal_target_layer.py:152: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  cur_inds = npr.choice(cur_inds, size=cur_rois_this_image, replace=False)
/MNC/tools/../lib/transform/bbox_transform.py:201: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]
/MNC/tools/../lib/transform/bbox_transform.py:202: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  bbox_inside_weights[ind, start:end] = cfg.TRAIN.BBOX_INSIDE_WEIGHTS
/MNC/tools/../lib/pylayer/proposal_target_layer.py:190: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  gt_box = scaled_gt_boxes[gt_assignment[val]]
/MNC/tools/../lib/pylayer/proposal_target_layer.py:193: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  gt_mask = gt_masks[gt_assignment[val]]
/MNC/tools/../lib/pylayer/proposal_target_layer.py:194: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  gt_mask_info = mask_info[gt_assignment[val]]
/MNC/tools/../lib/pylayer/proposal_target_layer.py:195: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  gt_mask = gt_mask[0:gt_mask_info[0], 0:gt_mask_info[1]]
/MNC/tools/../lib/pylayer/proposal_target_layer.py:201: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  top_mask_info[i, 0] = gt_assignment[val]
F0320 15:46:43.415727  2121 smooth_L1_loss_layer.cpp:54] Not Implemented Yet
*** Check failure stack trace: ***
./experiments/scripts/mnc_5stage.sh: line 35:  2121 Aborted                 (core dumped) ./tools/train_net.py --gpu ${GPU_ID} --solver models/${NET}/mnc_5stage/solver.prototxt --weights ${NET_INIT} --imdb ${DATASET_TRAIN} --iters ${ITERS} --cfg experiments/cfgs/${NET}/mnc_5stage.yml ${EXTRA_ARGS}

@hgaiser

hgaiser commented Mar 21, 2017

The warnings are not the issue; it looks like you are running in CPU mode, which (as the error says) is not implemented.
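
For what it's worth, those VisibleDeprecationWarning lines come from NumPy complaining about floats being used where integer sizes or indices are expected; they are separate from the crash. A minimal sketch of the usual fix, casting to int before the call, with the variable names borrowed from the proposal_target_layer.py traceback and stand-in values:

import numpy as np
import numpy.random as npr

cur_inds = np.arange(10)      # stand-in for the real proposal indices
cur_rois_this_image = 4.0     # a float count like this is what triggers the warning

# Original form warns: npr.choice(cur_inds, size=cur_rois_this_image, replace=False)
# Casting the count to int silences the warning and is forward-compatible.
cur_inds = npr.choice(cur_inds, size=int(cur_rois_this_image), replace=False)
print(cur_inds)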

@CassieMai
Author

Thanks. I am actually using GPU mode. Maybe something is misconfigured in my setup; I will try debugging MNC from the beginning.

@CassieMai
Author

@hgaiser I still can't solve this problem. I really am using GPU mode (in Makefile.config, the # CPU_ONLY := 1 line is commented out). Do you have any idea?

@hgaiser

hgaiser commented Mar 21, 2017

You can try a basic Caffe tutorial and make sure it runs on the GPU. Does the command nvidia-smi give sensible output, or does it print an error?
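
A minimal sanity check, assuming the pycaffe build bundled with MNC (e.g. caffe-mnc/python) is on your PYTHONPATH, would be something like:

import caffe   # the Caffe Python bindings built alongside MNC

print(caffe.__file__)   # confirm which Caffe build is actually imported
caffe.set_device(0)     # same id as the --gpu argument in mnc_5stage.sh
caffe.set_mode_gpu()    # failing here usually means a CPU-only or broken CUDA build
print("GPU mode set")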

@CassieMai
Author

CassieMai commented Mar 21, 2017

@hgaiser I ran training on a freshly downloaded copy of MNC, and the problem changed to the following.


I0321 16:31:14.548213 21241 net.cpp:270] This network produces output rpn_loss_bbox
I0321 16:31:14.548214 21241 net.cpp:270] This network produces output seg_cls_loss
I0321 16:31:14.548216 21241 net.cpp:270] This network produces output seg_cls_loss_ext
I0321 16:31:14.631436 21241 net.cpp:283] Network initialization done.
I0321 16:31:14.631700 21241 solver.cpp:60] Solver scaffolding done.
Loading pretrained model weights from data/imagenet_models/VGG16.mask.caffemodel
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:537] Reading dangerously large protocol message.  If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 1024780411
I0321 16:31:14.956207 21241 net.cpp:810] Ignoring source layer rpn_conv/3x3
I0321 16:31:14.956226 21241 net.cpp:810] Ignoring source layer rpn_relu/3x3
I0321 16:31:14.956228 21241 net.cpp:810] Ignoring source layer rpn/output_rpn_relu/3x3_0_split
I0321 16:31:15.013999 21241 net.cpp:810] Ignoring source layer drop6
I0321 16:31:15.023555 21241 net.cpp:810] Ignoring source layer drop7
I0321 16:31:15.081140 21241 net.cpp:810] Ignoring source layer drop6_mask
I0321 16:31:15.090597 21241 net.cpp:810] Ignoring source layer drop7_mask
Solving...
./experiments/scripts/mnc_5stage.sh: line 35: 21241 Segmentation fault      (core dumped) ./tools/train_net.py --gpu ${GPU_ID} --solver models/${NET}/mnc_5stage/solver.prototxt --weights ${NET_INIT} --imdb ${DATASET_TRAIN} --iters ${ITERS} --cfg experiments/cfgs/${NET}/mnc_5stage.yml ${EXTRA_ARGS}

@CassieMai
Author

@hgaiser Sorry, it seems I didn't have cuDNN set up correctly. I checked the CUDA paths in ~/.bashrc, and the problem is now solved. Thank you for your help.
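
For anyone hitting the same thing, a minimal check that the CUDA/cuDNN paths exported in ~/.bashrc are actually visible to Python, assuming a /usr/local/cuda style install and an unversioned libcudnn.so (the soname may be versioned on your system, e.g. libcudnn.so.5):

import os
import ctypes

print("PATH:", os.environ.get("PATH", ""))
print("LD_LIBRARY_PATH:", os.environ.get("LD_LIBRARY_PATH", ""))

# Raises OSError if the dynamic loader cannot find cuDNN, which usually
# means the cuDNN lib directory is missing from LD_LIBRARY_PATH.
ctypes.CDLL("libcudnn.so")
print("cuDNN found by the loader")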
