Reproducing probabilistic forecasting experiment results #58
Unanswered
ConstantinaNicolaou asked this question in Q&A
-
There is randomness due to sampling from the mixture distribution, but that should not cause so large a difference. Comparing the results you replicated, there is no big difference between a single random run and the average over 20 runs. I also want to point out that these are error metrics, so lower is better; the results you are getting are better than those reported in the paper. For the paper we reported results from a single run for Moirai, and averaged over 5 runs for the baselines. I would need more time to investigate this discrepancy...
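As for pinning down the stochasticity: below is a minimal sketch of fixing the usual RNG sources before evaluation. Note this only makes your own repeated runs deterministic; it won't necessarily reproduce the paper's exact numbers, since the seed used there was not recorded.

```python
import random

import numpy as np
import torch

def set_seed(seed: int = 0) -> None:
    """Pin the common sources of randomness so that repeated eval runs match."""
    random.seed(seed)                 # Python's built-in RNG
    np.random.seed(seed)              # NumPy's global RNG
    torch.manual_seed(seed)           # PyTorch CPU RNG
    torch.cuda.manual_seed_all(seed)  # PyTorch CUDA RNGs (no-op without a GPU)
```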
-
Hi,
I am trying to reproduce the results in Table 21 of the paper. The model I am using is MOIRAI_small and the datasets are electricity and Walmart. I ran the script as defined here for both datasets, but I am not able to get the results reported in the paper. This is still the case even after averaging over 20 repetitions.
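Concretely, I average the metric dictionaries returned by each run, along these lines (`run_eval` here is just a stand-in for one invocation of the eval script):

```python
from collections import defaultdict

def average_metrics(run_eval, n_reps: int = 20) -> dict:
    """Average metric dicts over repeated stochastic evaluation runs.

    `run_eval` is a hypothetical callable performing one evaluation run and
    returning a dict like {"test_metrics/MSIS": 7.5, "test_metrics/MASE[0.5]": 0.94, ...}.
    """
    totals = defaultdict(float)
    for _ in range(n_reps):
        metrics = run_eval()  # each call re-samples from the predictive distribution
        for key, value in metrics.items():
            totals[key] += value
    return {key: total / n_reps for key, total in totals.items()}
```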
Here are my results.
For electricity:
Single random eval run:
test_metrics/mean_weighted_sum_quantile_loss: 0.07266858965158463
test_metrics/MSIS: 7.508491516113281
test_metrics/sMAPE[0.5]: 0.1335689127445221
test_metrics/MASE[0.5]: 0.9439684152603149
test_metrics/ND[0.5]: 0.09246590733528137
test_metrics/NRMSE[mean]: 2.2414095401763916
Averaged over 20 repetitions:
test_metrics/mean_weighted_sum_quantile_loss: 0.07226627618074417
test_metrics/MSIS: 7.500214052200318
test_metrics/sMAPE[0.5]: 0.13359345570206643
test_metrics/MASE[0.5]: 0.9438354432582855
test_metrics/ND[0.5]: 0.09195553995668888
test_metrics/NRMSE[mean]: 0.9117850214242935
Compared with the paper's results: for MSIS, for example, I get around 7.5, whereas the paper reports 7.999. MASE and NRMSE are also discrepant.
For Walmart:
Single random eval run:
test_metrics/mean_weighted_sum_quantile_loss: 0.09697504341602325
test_metrics/MSIS: 8.7696533203125
test_metrics/sMAPE[0.5]: 0.17258785665035248
test_metrics/MASE[0.5]: 0.9929932355880737
test_metrics/ND[0.5]: 0.12082848697900772
test_metrics/NRMSE[mean]: 0.307935893535614
Averaged over 20 repetitions:
test_metrics/mean_weighted_sum_quantile_loss: 0.09698229804635047
test_metrics/MSIS: 8.756616592407227
test_metrics/sMAPE[0.5]: 0.17264488562941552
test_metrics/MASE[0.5]: 0.9931747078895569
test_metrics/ND[0.5]: 0.12086529061198234
test_metrics/NRMSE[mean]: 0.3032348841428757
Compared with the paper's results: for CRPS (mean_weighted_sum_quantile_loss), for example, I get 0.097, whereas the paper reports 0.103. In this case all of the metrics are discrepant.
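For clarity, the CRPS number here is the test_metrics/mean_weighted_sum_quantile_loss value above; assuming the usual GluonTS-style definition, it approximates CRPS by averaging the weighted quantile loss over a grid of quantile levels:

$$\mathrm{CRPS} \approx \frac{1}{|Q|}\sum_{q\in Q} \frac{2\sum_{i,t}\Lambda_q\big(\hat{y}^{(q)}_{i,t},\, y_{i,t}\big)}{\sum_{i,t}\lvert y_{i,t}\rvert}, \qquad \Lambda_q(\hat{y}, y) = \big(q - \mathbf{1}\{y < \hat{y}\}\big)\,(y - \hat{y})$$

with $Q = \{0.1, 0.2, \ldots, 0.9\}$, where $\hat{y}^{(q)}_{i,t}$ is the predicted $q$-quantile and $y_{i,t}$ the target value.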
Is this expected behaviour? If so, what is the difference attributed to? Is there a way to fix the stochasticity (e.g. by setting a seed) so as to obtain the results reported in the paper?
Thank you very much for your efforts. Looking forward to your response.