-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Models/evaluate year #124
base: master
Are you sure you want to change the base?
Models/evaluate year #124
Conversation
@tommylees112 , do you want me to take a look at this now, or once the tests are passing? |
when tests are passing please buddy |
This currently isn't working for me even though it is passing the test. will need to have a play! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey Tommy - I know things aren't working yet, but took a quick look and have some initial comments.
I'll also take a look once everything is working!
@@ -79,6 +79,13 @@ class DataLoader: | |||
Whether to load testing or training data. This also affects the way the data is | |||
returned; for train, it is a concatenated array, but for test it is a dict with dates | |||
so that the netcdf file can easily be reconstructed | |||
>>>>>>>>>CHANGE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: to remove
|
||
data_folder = data_path / f'features/{experiment}/{mode}' | ||
if (test_year is None) and (test_month is None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This API is confusing - there is a year and month associated with the test data too (i.e. if I pass test_year=2018
in our current setup, it wouldn't work).
Perhaps something that makes sense would be a function which determines which years are in the test folder, and which years are in the train folder (which shouldn't be too hard, since that information is in the folder names).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, if test_year
is passed, but test_month
is not (which would be useful to predict all of 2018), things wouldn't work.
Maybe limiting the granularity to years? And allowing a range of years to be passed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I see now that this is handled in the base training class, but since the DataLoader
is something we would want to expose to users, I think it makes sense to clean up here)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we chat through how to improve this API ? So far I have the ideas:
- Have a function to check whether the file exists in the train datafolder
- Expose this to the user from the
DATALOADER
not the baseneural_network
class - Require making predictions of whole years (don't expose months to users)
Did i miss any?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for 3., would it be possible to expose both, but if test_month
is None
, then do the whole year?
def predict(self) -> Tuple[Dict[str, Dict[str, np.ndarray]], Dict[str, np.ndarray]]: | ||
|
||
test_arrays_loader = DataLoader(data_path=self.data_path, batch_file_size=self.batch_size, | ||
def predict(self, test_year: Optional[int] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above regarding the confusing interface
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, this functionality needs to be added for the regression and parsimonious model
else: | ||
preds_xr.to_netcdf(self.model_dir / f'preds_{key}.nc') | ||
|
||
def evaluate_train_timesteps(self, years: List[int], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comments in the DataLoader
class
We are trying here to use the TRAINED model to evaluate specific months.
The major NEW functions are:
src/models/base.py
:evaluate_train_timesteps
And changes to:
src/models/neural_networks/base.py
:predict
NOTE: need to update all of the models (non-neural networks too)