
function 'feval' in model.lua get bad argument #1 error #9

Open
sun5219 opened this issue Feb 16, 2017 · 14 comments

sun5219 commented Feb 16, 2017

I used the example command:
th src/train.lua -phase train -gpu_id 1 -input_feed -model_dir model -image_dir data/images -data_path data/train.txt -val_data_path data/validate.txt -label_path data/labels.txt -vocab_file data/vocab.txt -batch_size 8 -beam_size 1 -max_num_tokens 150 -max_image_width 500 -max_image_height 160
but received this error:

/home/kxx/torch/install/bin/luajit: /home/kxx/.luarocks/share/lua/5.1/torch/Tensor.lua:462: bad argument #1 to 'set' (expecting number or Tensor or Storage)
stack traceback:
    [C]: in function 'set'
    /home/kxx/.luarocks/share/lua/5.1/torch/Tensor.lua:462: in function 'view'
    /home/kxx/.luarocks/share/lua/5.1/onmt/translate/Beam.lua:127: in function 'func'
    /home/kxx/.luarocks/share/lua/5.1/onmt/utils/Tensor.lua:12: in function 'recursiveApply'
    /home/kxx/.luarocks/share/lua/5.1/onmt/utils/Tensor.lua:7: in function 'selectBeam'
    /home/kxx/.luarocks/share/lua/5.1/onmt/translate/Beam.lua:350: in function '_nextState'
    /home/kxx/.luarocks/share/lua/5.1/onmt/translate/Beam.lua:339: in function '_nextBeam'
    .../.luarocks/share/lua/5.1/onmt/translate/BeamSearcher.lua:98: in function '_findKBest'
    .../.luarocks/share/lua/5.1/onmt/translate/BeamSearcher.lua:68: in function 'search'
    ./src/model.lua:246: in function 'feval'
    ./src/model.lua:313: in function 'step'
    src/train.lua:159: in function 'run'
    src/train.lua:253: in function 'main'
    src/train.lua:259: in main chunk
    [C]: in function 'dofile'
    .../kxx/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
    [C]: at 0x00405d50

Is it a bug?

sun5219 changed the title from "How to solve this problem?" to "function 'feval' in model.lua get bad argument #1 error" on Feb 16, 2017

da03 commented Feb 16, 2017

That's indeed a bug. Did you observe it after the first epoch finished?

@dmas-at-wiris

I have the same problem. And yes, that happens after one training epoch, when validation starts. The same happens if you try to "translate" images only (without training) using your pretrained model.


da03 commented Feb 16, 2017

Oh thanks for reporting that! I'll look into it.


da03 commented Feb 25, 2017

Sorry for the delay. I suspect that's because OpenNMT updated their library. I just updated the repo to be compatible with the current OpenNMT library. Can you try reinstalling the OpenNMT library with

luarocks remove onmt && luarocks install --local https://raw.githubusercontent.com/OpenNMT/OpenNMT/master/rocks/opennmt-scm-1.rockspec

and then using the latest Im2Text code?


da03 commented Feb 25, 2017

Oh sorry, my bad: training is broken now although decoding works fine... Working to solve this.

@dmas-at-wiris

Thanks! It works now


da03 commented Feb 28, 2017

Great!


acmx2 commented Jun 2, 2017

Hello, I'm trying to get Im2Text to work alongside the latest OpenNMT and Torch installations. I run into a lot of errors, apparently because both have been updated since the latest Im2Text commit.

So I rolled them back to match the date of the latest Im2Text commit, and the following configuration seems to work (see the checkout sketch after the list):

  • Torch: ed2b0f48a9f3b4aa47ec5fab5abcabcedac4f97d (with adjusted submodule revisions)
  • OpenNMT: a2e96b03694b6d656dec327efdbfece3c29b417a
  • Im2Text latest: cafc5eb (Apr 19)
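For anyone reproducing this, a rough sketch of pinning those revisions. The clone URLs for Torch and OpenNMT are the standard ones; the rebuild steps and the --local flag are assumptions based on the usual setup, and the checkout paths are only examples:

    # Torch distro at the given revision, with matching submodule revisions
    git clone --recursive https://github.com/torch/distro.git ~/torch
    cd ~/torch
    git checkout ed2b0f48a9f3b4aa47ec5fab5abcabcedac4f97d
    git submodule update --init --recursive
    ./install.sh

    # OpenNMT at the given revision, installed as the onmt rock
    git clone https://github.com/OpenNMT/OpenNMT.git ~/OpenNMT
    cd ~/OpenNMT
    git checkout a2e96b03694b6d656dec327efdbfece3c29b417a
    luarocks make --local rocks/opennmt-scm-1.rockspec

    # Im2Text at cafc5eb (Apr 19)
    cd ~/Im2Text
    git checkout cafc5eb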

However, I still get the following error for some test images:

argument #2 to '?' (out of range)
stack traceback:
    [C]: at 0x7f0deeb54a40
    [C]: in function '__newindex'
    ./src/model.lua:260: in function 'feval'

@da03 To be certain whether this is a program or a configuration error, could you please share your working configuration (i.e. the Im2Text, OpenNMT, and Torch commit hashes, etc.)?


da03 commented Jun 2, 2017

Oh sure. I plan to directly include onmt in this project.


da03 commented Jun 2, 2017

@acmx2 Updated. This is a working version that I myself am using. Since onmt is now included in the repo, make sure the currently installed onmt has been removed. Note that due to the model changes, the pretrained model won't work anymore. I'll train and upload a new model.
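For example, a minimal sketch of that cleanup (luarocks remove onmt is the same command as in the earlier comment; the grep line is only there to confirm nothing is left installed):

    luarocks list | grep -i onmt    # see whether an onmt rock is still installed
    luarocks remove onmt            # remove it so the onmt copy bundled with Im2Text is used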


acmx2 commented Jun 3, 2017

Thanks, it runs without crashes. However, having trained and tested it, I see the following results:
7944775fc9.png _ { 2 } } ^ { 2 } ^ { 2 } } ^ { 2 } ^ { 2 } } ^ { 2 } ^ { 2 } } ^ { 2 } ^ { 2 } } ^ { 2 } ^ { 2 } } ^ { 2 } ^ { 2 } } ^ { 2 } ^ { 2 } } ^ { 2 } ^ { 2 } } ^ { 2 } ^ { 2 } } ^ { 2 } ^ { 2 } } ^ { 2 } ^ { 2 } } ^ { 2 } ^ { 2 } } ^ { 2 } ^ { 2 } } ^ { 2 } ^ { 2 } } ^ { 2 } ^ { 2 } } ^ { 2 } ^ { 2 } } ^ {
The same happens for the other images.
I use the 'train' and 'test' options provided on the main page, i.e., first I run the train phase and then the test phase. Is this an expected result for such a small training set?


da03 commented Jun 5, 2017

@acmx2 Thanks for reporting that! Can you test it on the training set as well?
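For reference, a rough sketch of what this suggests: rerun decoding with the training file as -data_path. The options below are copied from the training command at the top of this issue, and -phase test is an assumption about how the main page's test command selects the decoding phase, so the actual README command may differ:

    th src/train.lua -phase test -gpu_id 1 -model_dir model \
      -image_dir data/images -data_path data/train.txt \
      -label_path data/labels.txt -vocab_file data/vocab.txt \
      -batch_size 8 -beam_size 1 -max_num_tokens 150 \
      -max_image_width 500 -max_image_height 160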


acmx2 commented Jun 5, 2017

Sure. I want to train on the 100k set, but I need to upgrade my video card for that, because it might take a couple of weeks to train on my current GTX 950M 4GB; that said, training does seem to run on it with a reduced batch size (20 -> 5).


acmx2 commented Jun 24, 2017

@da03 I finally trained the latest Im2Text using the latest Torch on the 100k training set. The result is the same {2{2{2... output. It looks like something is broken in the latest commits.

Details: my configuration is a GTX 1080 Ti with 11 GB RAM. I use the command line provided on the main page, with an additional option that disables the stress test on startup. After 70 hours of training, the perplexity definitely doesn't go below 20.
Varying the command-line options or downgrading the Torch version doesn't improve anything.

Nevertheless, the following configuration seems to be working (see the sketch below):

  • The latest Torch
  • Im2Text at cafc5eb with the 'opennmt' folder just copied from the latest commit.

Great! I have trained it on the 100k set.
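A rough sketch of that workaround, assuming the Im2Text checkout lives in ~/Im2Text (the path is just an example), the latest code is on the master branch, and the folder is named 'opennmt' as described above:

    cd ~/Im2Text
    git checkout cafc5eb             # the older revision that still trains correctly
    git checkout master -- opennmt   # pull only the 'opennmt' folder from the latest commit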
