Saving model crashes when training multipathnet #21

samson-wang · 2016-09-26T01:53:30Z

While training the multipathnet model, the process crashes at the end of an epoch when it reaches the following code:

   print("Saving model to "..model_path)
   torch.save(model_path, utils.checkpoint(model))

The stack trace:

/home/samson/torch/install/bin/luajit: ./modules/ModelParallelTable.lua:357: ModelParallelTable only supports CudaTensor, not torch.FloatTensor
stack traceback:
    [C]: in function 'error'
    ./modules/ModelParallelTable.lua:357: in function 'type'
    /home/samson/torch/install/share/lua/5.1/nn/utils.lua:45: in function 'recursiveType'
    /home/samson/torch/install/share/lua/5.1/nn/utils.lua:41: in function 'recursiveType'
    /home/samson/torch/install/share/lua/5.1/nn/Module.lua:126: in function 'float'
    /data/home/samson/Repo/multipathnet/utils.lua:487: in function 'checkpoint'
    train.lua:196: in function 'save'
    train.lua:340: in function 'hooks'
    ./engines/fboptimengine.lua:79: in function 'train'
    train.lua:364: in main chunk
    [C]: in function 'dofile'
    ...mson/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00405d50

FYI, saving the model directly, without going through utils.checkpoint, works fine:

   torch.save(model_path, model)
