
report a big PPL score on yelp #71

Open
hejunqing opened this issue Oct 14, 2019 · 1 comment

hejunqing commented Oct 14, 2019

Hello. Thanks for sharing your work.
I trained a model following the steps in the README and ran the evaluation using run_all_evaluator.sh.
Most of the metrics match the results reported in your paper, except PPL.
The results for my trained model are:
ll_scores: [(-9.701861720617387, 106.5074394250216), (-10.269295644873736, 120.9065905583248)]
The mean PPL is 113.7, but according to the paper it should be around 32.
I suspect this comes from a different vocabulary, or from training KenLM on a different corpus. I used the yelp_corpus_adapter directly for data preparation and yelp/reviews-train.txt to train KenLM.
Did I miss something?
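
For reference, here is a minimal sketch of how those per-sentence numbers relate, assuming the evaluator uses the standard kenlm Python bindings (the model path and sentences below are hypothetical placeholders):

```python
import kenlm

# Hypothetical path; the real model is the one trained on yelp/reviews-train.txt.
model = kenlm.Model("yelp_reviews.arpa")

sentences = ["the food was great", "service was slow"]  # generated output

ll_scores, ppl_scores = [], []
for sentence in sentences:
    # kenlm returns the total log10 probability of the sentence (with BOS/EOS).
    ll = model.score(sentence, bos=True, eos=True)
    n_tokens = len(sentence.split()) + 1  # +1 for the </s> token
    ppl = 10.0 ** (-ll / n_tokens)  # equivalent to model.perplexity(sentence)
    ll_scores.append(ll)
    ppl_scores.append(ppl)

print(sum(ll_scores) / len(ll_scores), sum(ppl_scores) / len(ppl_scores))
```

A vocabulary or corpus mismatch inflates the OOV penalty in the log10 scores, which in turn blows up a perplexity computed this way, so that would be consistent with the gap you are seeing.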


vrublack commented Dec 3, 2019

I have the same issue. I tried training the language model on the dev and test splits as well, but got a similar PPL. Note that `overall_evaluator.py` needs two changes: line 62 should become `ll_score, ppl_score = language_fluency.score_generated_sentences(generated_text_file_path, options.language_model_path)`, and line 68 should become `ll_scores.append(ppl_score)`, because `score_generated_sentences` returns a tuple of negative log likelihood and perplexity and the script previously appended the whole tuple (this might have something to do with KenLM versions).
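
For context, a sketch of what the patched region of `overall_evaluator.py` would look like; everything outside the two quoted lines (the loop variable and the elided lines 63–67) is my assumption about the surrounding code:

```python
ll_scores = []
for generated_text_file_path in generated_text_file_paths:  # assumed loop variable
    # line 62: unpack the (log-likelihood, perplexity) tuple that
    # score_generated_sentences returns, instead of storing the tuple itself
    ll_score, ppl_score = language_fluency.score_generated_sentences(
        generated_text_file_path, options.language_model_path)
    # ... lines 63-67 unchanged ...
    # line 68: append only the perplexity, not the whole tuple
    ll_scores.append(ppl_score)
```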
