
[BUG] Classification - forced to calculate all metrics #953

Open
Linardos opened this issue Oct 2, 2024 · 4 comments

Linardos (Contributor) commented Oct 2, 2024

Describe the bug

The current codebase forces the calculation of all possible classification metrics, even when I explicitly request just one metric in my config file.

To Reproduce

Steps to reproduce the behavior:

  1. In config.yaml, set problem_type to "classification"
  2. Define your own metrics in config.yaml
  3. Run GaNDLF
  4. You will likely see all possible metrics being calculated, even ones you did not define in the config
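
For illustration, a minimal sketch of the config shape described above. problem_type is the key named in the steps; the assumption that user metrics live under a top-level metrics key, and the choice of "accuracy", are just placeholders:

```yaml
# config.yaml (minimal sketch; key layout and metric choice are examples)
problem_type: classification
metrics:
  - accuracy   # the only metric explicitly requested
```

Even with a single entry under metrics, the logs end up showing the full set of per-cohort classification metrics.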

Expected behavior

I expected only the metrics I defined to be calculated.

Environment information

GaNDLF version: current master (0.1.0)

Solution

In my case, I solved it by setting the calculate_overall_metrics parameter to False in both scripts where it appears (training_loop and forward_pass; see https://github.com/search?q=repo%3Amlcommons%2FGaNDLF%20calculate_overall_metrics&type=code).
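
For anyone applying the same stop-gap, the change amounts to hard-coding the flag where it is defined in those two files. The sketch below is paraphrased from memory, not verbatim from the repo; see the code search link above for the exact locations:

```python
params = {"problem_type": "classification"}  # stand-in for the parsed config

# Original behavior (paraphrased): the flag is derived from the problem
# type, so classification and regression runs always compute the
# per-cohort ("overall") metrics, regardless of the metrics in the config.
calculate_overall_metrics = params["problem_type"] in ("classification", "regression")

# Stop-gap: hard-code the flag to False at both sites.
calculate_overall_metrics = False
```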

sarthakpati (Collaborator) commented

So, there are 2 types of metrics:

  1. Per-subject metrics (all of the metrics here [ref]) - these are calculated at the batch level [ref]
  2. Per-cohort metrics [ref] - these are calculated at the epoch level (since we need all targets and predictions) [ref]

Currently, only the per-subject metrics are user-controlled, and the per-cohort metrics are always calculated. I keep hearing that for regression and classification problems, the per-subject metrics are kind of meaningless (since we always end up calculating averages over the epoch). And from a statistical perspective, the per-cohort metrics are much more important.
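
To make the distinction concrete, here is a minimal sketch of the two computation patterns (illustrative only, not GaNDLF's actual code):

```python
import numpy as np

# Toy stand-in for one epoch of (predictions, targets) batches.
loader = [([1, 0, 1], [1, 1, 1]), ([0, 0], [0, 1])]

def accuracy(preds, targets):
    return float(np.mean(np.asarray(preds) == np.asarray(targets)))

# Per-subject (batch-level): one score per batch, averaged over the epoch.
batch_scores = [accuracy(p, t) for p, t in loader]
per_subject_avg = float(np.mean(batch_scores))  # (2/3 + 1/2) / 2 ~= 0.583

# Per-cohort (epoch-level): accumulate all targets/predictions, score once.
all_preds = [x for preds, _ in loader for x in preds]
all_targets = [x for _, targets in loader for x in targets]
per_cohort = accuracy(all_preds, all_targets)   # 3/5 = 0.6
```

The two numbers differ whenever batch sizes are unequal, and metrics like AUC cannot be meaningfully averaged per batch at all, which is why the per-cohort pass needs all targets and predictions.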

That being said, how would you propose we update this so that this can be accommodated?

Linardos (Contributor, Author) commented Oct 2, 2024

Hmm, I understand that per-cohort metrics are considered more important, but even so I would leave the flexibility to the user, in case they want to run sanity checks on per-subject metrics or other preliminary experiments before moving on to the per-cohort ones. In my case, the per-cohort metrics were triggering a bug in the FeTS Challenge code that remains to be traced, so I needed to disable them, at least for now.

I see two options:

  • Give the user the flexibility, and add a WARNING message presenting what you explained here whenever the user skips the per-cohort metrics (a minimal sketch of this option follows the list).
  • If you want to keep forcing the calculation of all per-cohort metrics as the expected behavior, there should still be a WARNING that all per-cohort metrics are calculated regardless of what the user passes in the config.
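
As a sketch of the first option (the config key, helper name, and wording below are invented for illustration, not an existing GaNDLF parameter):

```python
import logging

logger = logging.getLogger(__name__)

def resolve_overall_metrics(params: dict) -> bool:
    # Hypothetical key: honor the user's choice, but warn loudly on opt-out.
    enabled = params.get("calculate_overall_metrics", True)
    if not enabled:
        logger.warning(
            "Per-cohort (overall) metrics are disabled via the config. For "
            "classification/regression, per-subject metrics averaged over "
            "an epoch are statistically weaker than per-cohort metrics."
        )
    return enabled
```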

The main thing here is that the user should not be left in the dark about this. It seems unintuitive for the code to override settings that the user passes in the config, and it causes confusion.

sarthakpati (Collaborator) commented

> The main thing here is that the user should not be left in the dark about this. It seems unintuitive for the code to override settings that the user passes in the config, and it causes confusion.

I agree with you 💯%. I am fine with either suggestion. Do you want to put the PR together so I can review it?


github-actions bot commented Dec 2, 2024

Stale issue message
