
[FeatureRequest] Dataset Metrics #85

Open
justusschock opened this issue Apr 8, 2019 · 5 comments

@justusschock
Member

justusschock commented Apr 8, 2019

Find a lazy way to calculate dataset metrics, since they currently break the intention of lazy datasets (see #66 )

@justusschock
Member Author

justusschock commented Jun 13, 2019

To move this forward, we have to think of a way to calculate dataset metrics lazily.

The only thing I can currently think of is to save intermediate values to disk and reload them afterwards (although this may slow things down due to I/O bottlenecks).

A (probably) good hybrid solution places some requirements on the metric for good performance (although it should also work, just more slowly, when these requirements are not met):

For better performance, the metric must be divisible into two sub-steps, where one sub-step works on a per-sample/per-batch basis and reduces the amount of temporary data and thus the I/O bottleneck.

The (reduced) temporary data is then stored on disk and loaded all together after all predictions are done (so that the intermediate memory, now freed, can be reallocated for the metric calculation).
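
A minimal sketch of what such a two-stage metric could look like (class and method names are hypothetical and not part of the current API; assumes numpy is available):

```python
import os
import tempfile

import numpy as np


class LazyConfusionCounts:
    """Accumulates per-batch reductions on disk and computes the final
    metric (here: accuracy) only after all predictions are done."""

    def __init__(self, cache_dir=None):
        self._cache_dir = cache_dir or tempfile.mkdtemp()
        self._files = []

    def partial_reduce(self, preds, labels, batch_idx):
        # first sub-step: reduce a whole batch to two scalars
        # (correct and total), which keeps the temporary data tiny
        correct = int((preds.argmax(axis=1) == labels).sum())
        total = int(labels.shape[0])
        path = os.path.join(self._cache_dir, "batch_%d.npy" % batch_idx)
        np.save(path, np.array([correct, total]))
        self._files.append(path)

    def finalize(self):
        # second sub-step: load all cached reductions and aggregate them
        counts = np.sum([np.load(f) for f in self._files], axis=0)
        return counts[0] / counts[1]


# usage with dummy data:
metric = LazyConfusionCounts()
for i in range(3):
    preds = np.random.rand(8, 5)          # e.g. softmax outputs
    labels = np.random.randint(0, 5, 8)   # ground-truth classes
    metric.partial_reduce(preds, labels, batch_idx=i)
print("accuracy:", metric.finalize())
```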

Do you know any metrics fulfilling these requirements or a better solution for lazy dataset metric calculation at all @mibaumgartner ?

@mibaumgartner
Member

mibaumgartner commented Jun 16, 2019

My original goal was to compute the AUROC on the validation dataset. Memory consumption of the classification results isn't that high, so it might not be necessary to introduce a caching system.

I think there are two options for the general lazy case:

  1. We introduce a pseudo batch size which caches the results of multiple batches (typically the batch size is limited by GPU memory, not by RAM). This makes it possible to compute the metric over more samples and mimics behaviour similar to what we have right now (a special case of this would be the entire dataset); see the sketch after this list.
  2. We cache the predictions, but I do not know what benefit we would gain from that: if they do not fit into memory during training, they probably won't fit into memory after training either.
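
A minimal sketch of the pseudo batch size idea from option 1 (class and method names are hypothetical and not part of any existing API; assumes numpy and scikit-learn are available):

```python
import numpy as np
from sklearn.metrics import roc_auc_score


class PseudoBatchMetric:
    """Pools the (CPU) results of several GPU batches and evaluates the
    metric once the pseudo batch is full."""

    def __init__(self, metric_fn=roc_auc_score, pseudo_batch_size=1024):
        self._metric_fn = metric_fn
        self._pseudo_batch_size = pseudo_batch_size
        self._scores, self._labels = [], []

    def update(self, scores, labels):
        # cache the results of this batch
        self._scores.append(np.asarray(scores))
        self._labels.append(np.asarray(labels))
        # evaluate only once enough samples have been pooled
        if sum(len(s) for s in self._scores) >= self._pseudo_batch_size:
            return self._evaluate()
        return None  # not enough samples pooled yet

    def _evaluate(self):
        scores = np.concatenate(self._scores)
        labels = np.concatenate(self._labels)
        self._scores, self._labels = [], []
        return self._metric_fn(labels, scores)
```

Flushing on a sample threshold keeps RAM usage bounded while still evaluating the metric over many more samples than a single GPU batch.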

@justusschock
Member Author

I like the first approach, but regarding the outputs: That highly depends on the task. If you're doing something like GANs or segmentation and have some image-like outputs, you can go OOM very fast.

If you're doing this after training, there might be more RAM available, although in most cases this does not change whether it fits or not.

@mibaumgartner
Member

I like the first approach, too :)

@justusschock
Member Author

Go ahead and implement it
