VisibleDeprecationWarning and train failure #48

Open
CassieMai opened this issue Mar 20, 2017 · 6 comments

@CassieMai

Hello, I ran into a problem when training MNC using ./experiments/scripts/mnc_5stage.sh. Can anyone help me? Thanks in advance.

I0320 15:46:29.860514  2121 net.cpp:270] This network produces output seg_cls_loss
I0320 15:46:29.860517  2121 net.cpp:270] This network produces output seg_cls_loss_ext
I0320 15:46:29.862728  2121 net.cpp:283] Network initialization done.
I0320 15:46:29.862998  2121 solver.cpp:60] Solver scaffolding done.
Loading pretrained model weights from data/imagenet_models/VGG16.mask.caffemodel
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:537] Reading dangerously large protocol message.  If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 1024780411
I0320 15:46:30.187852  2121 net.cpp:810] Ignoring source layer rpn_conv/3x3
I0320 15:46:30.187872  2121 net.cpp:810] Ignoring source layer rpn_relu/3x3
I0320 15:46:30.187875  2121 net.cpp:810] Ignoring source layer rpn/output_rpn_relu/3x3_0_split
I0320 15:46:30.244598  2121 net.cpp:810] Ignoring source layer drop6
I0320 15:46:30.253931  2121 net.cpp:810] Ignoring source layer drop7
I0320 15:46:30.310539  2121 net.cpp:810] Ignoring source layer drop6_mask
I0320 15:46:30.319871  2121 net.cpp:810] Ignoring source layer drop7_mask
Solving...
/MNC/tools/../lib/pylayer/proposal_target_layer.py:152: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  cur_inds = npr.choice(cur_inds, size=cur_rois_this_image, replace=False)
/MNC/tools/../lib/transform/bbox_transform.py:201: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]
/MNC/tools/../lib/transform/bbox_transform.py:202: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  bbox_inside_weights[ind, start:end] = cfg.TRAIN.BBOX_INSIDE_WEIGHTS
/MNC/tools/../lib/pylayer/proposal_target_layer.py:190: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  gt_box = scaled_gt_boxes[gt_assignment[val]]
/MNC/tools/../lib/pylayer/proposal_target_layer.py:193: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  gt_mask = gt_masks[gt_assignment[val]]
/MNC/tools/../lib/pylayer/proposal_target_layer.py:194: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  gt_mask_info = mask_info[gt_assignment[val]]
/MNC/tools/../lib/pylayer/proposal_target_layer.py:195: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  gt_mask = gt_mask[0:gt_mask_info[0], 0:gt_mask_info[1]]
/MNC/tools/../lib/pylayer/proposal_target_layer.py:201: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  top_mask_info[i, 0] = gt_assignment[val]
F0320 15:46:43.415727  2121 smooth_L1_loss_layer.cpp:54] Not Implemented Yet
*** Check failure stack trace: ***
./experiments/scripts/mnc_5stage.sh: line 35:  2121 Aborted                 (core dumped) ./tools/train_net.py --gpu ${GPU_ID} --solver models/${NET}/mnc_5stage/solver.prototxt --weights ${NET_INIT} --imdb ${DATASET_TRAIN} --iters ${ITERS} --cfg experiments/cfgs/${NET}/mnc_5stage.yml ${EXTRA_ARGS}

@hgaiser

hgaiser commented Mar 21, 2017

The warnings are not the issue; it looks like you are running in CPU mode, which (as the error says) is not implemented.
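
For what it's worth, those VisibleDeprecationWarning lines come from NumPy complaining about floats being used where integer sizes or indices are expected; they are separate from the crash. A minimal sketch of the usual fix, casting to int before the call, with the variable names borrowed from the proposal_target_layer.py traceback and stand-in values:

import numpy as np
import numpy.random as npr

cur_inds = np.arange(10)      # stand-in for the real proposal indices
cur_rois_this_image = 4.0     # a float count like this is what triggers the warning

# Original form warns: npr.choice(cur_inds, size=cur_rois_this_image, replace=False)
# Casting the count to int silences the warning and is forward-compatible.
cur_inds = npr.choice(cur_inds, size=int(cur_rois_this_image), replace=False)
print(cur_inds)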

@CassieMai
Author

Thanks. I am actually using GPU mode. Maybe something is misconfigured in my setup; I will try debugging MNC from the beginning.

@CassieMai
Author

@hgaiser I still can't solve this problem. I really am using GPU mode (in Makefile.config, the # CPU_ONLY := 1 line is commented out). Do you have any idea?

@hgaiser

hgaiser commented Mar 21, 2017

You can try a basic Caffe tutorial and make sure it runs on the GPU. Does the command nvidia-smi give sensible output, or does it print an error?
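
A minimal sanity check, assuming the pycaffe build bundled with MNC (e.g. caffe-mnc/python) is on your PYTHONPATH, would be something like:

import caffe   # the Caffe Python bindings built alongside MNC

print(caffe.__file__)   # confirm which Caffe build is actually imported
caffe.set_device(0)     # same id as the --gpu argument in mnc_5stage.sh
caffe.set_mode_gpu()    # failing here usually means a CPU-only or broken CUDA build
print("GPU mode set")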

@CassieMai
Author

CassieMai commented Mar 21, 2017

@hgaiser I ran training on a freshly downloaded copy of MNC, and the problem changed to the following.


I0321 16:31:14.548213 21241 net.cpp:270] This network produces output rpn_loss_bbox
I0321 16:31:14.548214 21241 net.cpp:270] This network produces output seg_cls_loss
I0321 16:31:14.548216 21241 net.cpp:270] This network produces output seg_cls_loss_ext
I0321 16:31:14.631436 21241 net.cpp:283] Network initialization done.
I0321 16:31:14.631700 21241 solver.cpp:60] Solver scaffolding done.
Loading pretrained model weights from data/imagenet_models/VGG16.mask.caffemodel
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:537] Reading dangerously large protocol message.  If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 1024780411
I0321 16:31:14.956207 21241 net.cpp:810] Ignoring source layer rpn_conv/3x3
I0321 16:31:14.956226 21241 net.cpp:810] Ignoring source layer rpn_relu/3x3
I0321 16:31:14.956228 21241 net.cpp:810] Ignoring source layer rpn/output_rpn_relu/3x3_0_split
I0321 16:31:15.013999 21241 net.cpp:810] Ignoring source layer drop6
I0321 16:31:15.023555 21241 net.cpp:810] Ignoring source layer drop7
I0321 16:31:15.081140 21241 net.cpp:810] Ignoring source layer drop6_mask
I0321 16:31:15.090597 21241 net.cpp:810] Ignoring source layer drop7_mask
Solving...
./experiments/scripts/mnc_5stage.sh: line 35: 21241 Segmentation fault      (core dumped) ./tools/train_net.py --gpu ${GPU_ID} --solver models/${NET}/mnc_5stage/solver.prototxt --weights ${NET_INIT} --imdb ${DATASET_TRAIN} --iters ${ITERS} --cfg experiments/cfgs/${NET}/mnc_5stage.yml ${EXTRA_ARGS}

@CassieMai
Author

@hgaiser Sorry, it seems I didn't have cuDNN set up correctly. I checked the CUDA paths in ~/.bashrc, and the problem is now solved. Thank you for your help.
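
For anyone hitting the same thing, a minimal check that the CUDA/cuDNN paths exported in ~/.bashrc are actually visible to Python, assuming a /usr/local/cuda style install and an unversioned libcudnn.so (the soname may be versioned on your system, e.g. libcudnn.so.5):

import os
import ctypes

print("PATH:", os.environ.get("PATH", ""))
print("LD_LIBRARY_PATH:", os.environ.get("LD_LIBRARY_PATH", ""))

# Raises OSError if the dynamic loader cannot find cuDNN, which usually
# means the cuDNN lib directory is missing from LD_LIBRARY_PATH.
ctypes.CDLL("libcudnn.so")
print("cuDNN found by the loader")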
