allow missing data for "ts_forecast_panel" task #878

Open
wants to merge 5 commits into main

Conversation

int-chaos (Collaborator) commented Jan 9, 2023

Why are these changes needed?

  • allow missing data for the "ts_forecast_panel" task
  • fix a typo in hcrystalball_model
  • add a test for the missing-data scenario in test_forecast.py
  • fix an issue where fit_kwargs were not actually passed into the model's fit() (see the sketch below)
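
A minimal sketch of the fit_kwargs bug class described in the last bullet, assuming a wrapper-style estimator; the class name PanelForecastEstimator and its attributes are illustrative, not FLAML's actual code.

```python
# Minimal sketch (not FLAML's actual code): a wrapper's fit() accepts
# **fit_kwargs but never forwards them to the underlying estimator, so
# caller-supplied options are silently dropped.
class PanelForecastEstimator:  # illustrative name
    def __init__(self, model):
        self.model = model

    def fit(self, X_train, y_train, **fit_kwargs):
        # Buggy version: self.model.fit(X_train, y_train)   # kwargs dropped
        # Fixed version: forward the keyword arguments to the model's fit().
        return self.model.fit(X_train, y_train, **fit_kwargs)
```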

Related issue number

int-chaos (Collaborator, Author) commented:

There is currently an error with the missing-timestamps test; please advise if you can.

Missing timestamps cause an error during cross validation.

With the current seeding, the 04/01/2017 row is removed for Agency_13 SKU_04, but the range 01/01/2017 - 06/01/2017 is used as the testing portion for holdout validation. Even though 04/01/2017 for Agency_13 SKU_04 is not part of the testing data (the testing data covers 01/01/2017 - 06/01/2017 for every agency/SKU group except that one missing row), the model still predicts a value for 04/01/2017 Agency_13 SKU_04, which results in a ValueError because the testing length and the prediction length differ.

I've been trying to come up with a way to remove the prediction for 04/01/2017 Agency_13 SKU_04, but there is no easy way to locate it since the raw prediction is just a tensor.
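
One possible direction (a hedged sketch, not a tested fix): if the full prediction index of (timestamp, agency, SKU) combinations is available alongside the raw tensor, the predictions can be joined against the test frame so that rows missing from the test data are dropped before the lengths are compared. The helper name align_predictions and the column names date/agency/sku below are illustrative, not FLAML's.

```python
import numpy as np
import pandas as pd

def align_predictions(test_df, pred_values, pred_index_df,
                      keys=("date", "agency", "sku")):
    """Keep only the predictions whose (timestamp, group) keys appear in test_df.

    pred_index_df must have one row per predicted point, in the same order as
    the flattened pred_values (e.g. a tensor converted to a NumPy array).
    """
    pred_df = pred_index_df.copy()
    pred_df["y_pred"] = np.asarray(pred_values).ravel()
    # Left-join on the panel keys: rows absent from the test data (such as the
    # removed 04/01/2017 Agency_13 SKU_04 point) simply never match, so the
    # returned prediction length equals the test length.
    aligned = test_df.merge(pred_df, on=list(keys), how="left")
    return aligned["y_pred"].to_numpy()
```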

sonichi (Contributor) commented Jan 9, 2023

> With the current seeding, 04/01/2017 data is removed for Agency_13 SKU_04 [...]

What does "With the current seeding, 04/01/2017 data is removed for Agency_13 SKU_04" mean? Why is it removed?

Development

Successfully merging this pull request may close these issues.

Time series gap detection for TFT tasks