Resampling techniques for the train/test/validation data splits used in modeling.
Basic resampling of training data and the effect of increasing training dataset size.
Investigate how to use a large data source to build training datasets without drastically increasing model training/tuning times.
A comparison between:
- All of the data
- Balanced data
- Randomly adding in data
- Adding in data by lowest class precision * (main test)
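
To make the compared strategies concrete, here is a minimal sketch of how the balanced, random, and lowest-precision selections might be drawn from a large labeled pool. It is an illustration under assumptions, not this project's implementation: the function names (`class_precision`, `balanced_sample`, `add_random`, `add_by_lowest_precision`), the `batch` size, and the reading of "adding in data by lowest class precision" as "draw the next batch from the class the current model predicts with the lowest precision" are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def class_precision(y_true, y_pred):
    """Per-class precision TP / (TP + FP); 0.0 if a class is never predicted."""
    predicted = Counter(y_pred)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    classes = sorted(set(y_true) | set(y_pred))
    return {c: (correct[c] / predicted[c] if predicted[c] else 0.0) for c in classes}


def balanced_sample(pool_labels, rng):
    """Indices of a balanced subset: downsample each class to the rarest class size."""
    by_class = defaultdict(list)
    for i, label in enumerate(pool_labels):
        by_class[label].append(i)
    n = min(len(idx) for idx in by_class.values())
    return sorted(i for idx in by_class.values() for i in rng.sample(idx, n))


def add_random(pool_labels, used, batch, rng):
    """Indices of up to `batch` unused pool rows, chosen uniformly at random."""
    candidates = [i for i in range(len(pool_labels)) if i not in used]
    return rng.sample(candidates, min(batch, len(candidates)))


def add_by_lowest_precision(pool_labels, used, batch, precision, rng):
    """Indices of up to `batch` unused pool rows from the lowest-precision class."""
    worst = min(precision, key=precision.get)
    candidates = [i for i, y in enumerate(pool_labels) if y == worst and i not in used]
    return rng.sample(candidates, min(batch, len(candidates)))


# Toy data, invented for illustration only.
rng = random.Random(0)
pool_labels = ["a"] * 5 + ["b"] * 50 + ["c"] * 20
precision = class_precision(
    ["a", "a", "b", "b", "c", "c"],  # true labels on a held-out set
    ["a", "b", "b", "b", "c", "a"],  # predictions from the current model
)
print(precision)
print(add_by_lowest_precision(pool_labels, set(), 3, precision, rng))
```

In a full experiment, each selection step would presumably be followed by retraining and re-scoring on the held-out split, so that the per-class precision is refreshed before the next batch is added.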