When I follow the command in the README to train the unsupervised model, I get the following error message:
2022-04-17 22:32:50,496 [INFO]: Chunk # 4900 ppl: 26.1486, nll: 128.3368
2022-04-17 22:33:01,088 [INFO]: Chunk # 5000 ppl: 27.0186, nll: 128.0882
2022-04-17 22:33:01,200 [INFO]: Total data-chunks: 5000 and data-units: 270000.
2022-04-17 22:33:01,200 [INFO]: Epoch training time elapsed: 528.27 (s).
2022-04-17 22:33:01,200 [INFO]: Data-unit/sec: 511.101.
2022-04-17 22:33:01,281 [INFO]: Evaluation data source: {'data_path': 'artifacts/amazon/reviews/val/', 'early_term': 500}.
2022-04-17 22:33:57,546 [INFO]: Total data-chunks: 500 and data-units: 90000.
2022-04-17 22:33:57,547 [INFO]: Evaluation time elapsed: 55.24 (s).
2022-04-17 22:33:57,547 [INFO]: Data-unit/sec: 1629.164.
2022-04-17 22:33:57,547 [INFO]: Validation ppl: 27.2019, nll: 128.7210
2022-04-17 22:33:57,891 [INFO]: Saved the model's and optimizer's state to: '/home/nlplab/harry/review_rec/FewSum/runs/amazon/unsup/5/ep1_checkpoint.tar'.
2022-04-17 22:33:57,901 [INFO]: Performing summary evaluation on {'data_path': 'artifacts/amazon/gold_summs/val.csv'}.
/home/nlplab/harry/review_rec/FewSum/fewsum/utils/tools/beam_search.py:226: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
prev_k = best_scores_id // num_words
2022-04-17 22:33:58,801 [ERROR]: The shape of the 2D attn_mask is torch.Size([1, 1]), but should be (1, 2).
Traceback (most recent call last):
File "fewsum/workflow.py", line 263, in <module>
raise e
File "fewsum/workflow.py", line 260, in <module>
after_epoch_func=after_ep_func)
File "/home/nlplab/harry/review_rec/FewSum/fewsum/modelling/interfaces/i_dev_summ.py", line 90, in standard_workflow
after_epoch_func(epoch)
File "/home/nlplab/harry/review_rec/FewSum/fewsum/modelling/interfaces/i_dev_summ.py", line 214, in after_ep_func
**summ_eval_kwargs)
File "/home/nlplab/harry/review_rec/FewSum/fewsum/modelling/interfaces/i_dev_summ.py", line 110, in summ_eval
eval_proc.eval(data_source, out_file_path=out_file_path)
File "/home/nlplab/harry/review_rec/FewSum/fewsum/eval/procedures/summ_eval_proc.py", line 73, in eval
gen_summ, pred_props = self.summs_gen_func(batch)
File "/home/nlplab/harry/review_rec/FewSum/fewsum/modelling/interfaces/i_dev_summ.py", line 222, in summ_gen_wrapper
gen_summ, pred_props = self.imodel.generate(batch=batch, **kwargs)
File "/home/nlplab/harry/review_rec/FewSum/fewsum/modelling/interfaces/i_summ.py", line 71, in generate
minimum=1, **prop_vals)
File "/home/nlplab/harry/review_rec/FewSum/fewsum/modelling/generators/beamer.py", line 109, in __call__
out = self.decoding_func(prev_word_ids, **new_kwargs)
File "/home/nlplab/harry/review_rec/FewSum/fewsum/modelling/models/basesum.py", line 149, in decode
pos_offset=pos_offset, **kwargs)
File "/home/nlplab/harry/review_rec/FewSum/fewsum/modelling/models/basesum.py", line 124, in _decode
decode=True)
File "/home/nlplab/harry/anaconda3/envs/fewsum/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/nlplab/harry/review_rec/FewSum/fewsum/modelling/modules/transformer_stack.py", line 90, in forward
mem_key_padding_mask=mem_key_padding_mask)
File "/home/nlplab/harry/anaconda3/envs/fewsum/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/nlplab/harry/review_rec/FewSum/fewsum/modelling/modules/transformer_decoder_layer.py", line 86, in forward
key_padding_mask=tgt_key_padding_mask)
File "/home/nlplab/harry/anaconda3/envs/fewsum/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/nlplab/harry/anaconda3/envs/fewsum/lib/python3.7/site-packages/torch/nn/modules/activation.py", line 1045, in forward
attn_mask=attn_mask, average_attn_weights=average_attn_weights)
File "/home/nlplab/harry/anaconda3/envs/fewsum/lib/python3.7/site-packages/torch/nn/functional.py", line 5268, in multi_head_attention_forward
raise RuntimeError(f"The shape of the 2D attn_mask is {attn_mask.shape}, but should be {correct_2d_size}.")
RuntimeError: The shape of the 2D attn_mask is torch.Size([1, 1]), but should be (1, 2).
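For context, the shape check that raises this error can be reproduced in isolation. The following is a minimal sketch (hypothetical, not code from the FewSum repository): during step-by-step beam decoding, the decoder attention sees a 1-token query against 2 key positions, but is handed a (1, 1) mask where PyTorch expects (tgt_len, src_len) = (1, 2).

```python
import torch
import torch.nn as nn

# Illustrative module, not FewSum's actual decoder layer
mha = nn.MultiheadAttention(embed_dim=8, num_heads=1)

query = torch.randn(1, 1, 8)   # (tgt_len=1, batch=1, embed_dim)
keys = torch.randn(2, 1, 8)    # (src_len=2, batch=1, embed_dim)

# Wrong mask shape: (1, 1) instead of the required (1, 2)
bad_mask = torch.zeros(1, 1, dtype=torch.bool)

try:
    mha(query, keys, keys, attn_mask=bad_mask)
except RuntimeError as e:
    print(e)  # same shape complaint as in the traceback above
```

This suggests the mask built for the decoder's incremental step is not being widened as cached positions accumulate, which is the kind of breakage often seen when code written for an older PyTorch release runs under a newer one.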
I have already traced the code in the BaseSum class, but I have no idea how to fix the bug. I did not change the code.
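Separately, the UserWarning in the log suggests its own remedy. A hedged sketch of applying it to the `prev_k` line in beam_search.py (values here are made up for illustration); note this only silences the deprecation warning and may not resolve the attn_mask crash itself:

```python
import torch

# Before (per the warning): tensor floor-division via the deprecated operator
#   prev_k = best_scores_id // num_words
# After: explicit rounding mode, as the warning recommends
best_scores_id = torch.tensor([5, 9])  # illustrative values
num_words = 4
prev_k = torch.div(best_scores_id, num_words, rounding_mode='floor')
print(prev_k)  # tensor([1, 2])
```

Both symptoms point to a PyTorch version newer than the one the repository was written against; checking the PyTorch version pinned in the project's requirements may be the quickest path to a fix.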