
Feature request: cutoff/threshold value for classification #1267

Closed
wibrt opened this issue May 18, 2016 · 4 comments · Fixed by #3881
wibrt commented May 18, 2016

When doing classification, a probability is usually calculated to determine the class of the target; by default Orange uses 0.5 as the cutoff, or threshold value if you like. Depending on this cutoff value, the model performs differently (a minimal sketch of what the cutoff does follows below).

  • It would be nice to obtain these cutoff value(s) as output from a widget such as ROC Analysis or Lift Curve.
  • It would also be nice to be able to change the cutoff value, for example in the learner, in Test and Score, or elsewhere.
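
For reference, here is a minimal, library-agnostic sketch of what such a cutoff does (the function name and numbers are illustrative, not Orange's API): it turns the predicted probability of the positive class into a hard label, and moving the cutoff away from 0.5 changes which predictions flip.

```python
import numpy as np

# Illustrative only, not Orange's API: apply an adjustable cutoff to the
# predicted probabilities of the positive class instead of the default 0.5.
def predict_with_threshold(probs_pos, threshold=0.5):
    """Return 1 where the positive-class probability reaches the cutoff, else 0."""
    return (np.asarray(probs_pos) >= threshold).astype(int)

probs = np.array([0.15, 0.40, 0.55, 0.72, 0.91])
print(predict_with_threshold(probs))        # default 0.5 -> [0 0 1 1 1]
print(predict_with_threshold(probs, 0.7))   # stricter cutoff -> [0 0 0 1 1]
```
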
janezd commented May 18, 2016

Like the Calibration Plot and the Calibrated Classifier from Orange 2? (Screenshots of both Orange 2 widgets were attached.)

wibrt commented May 18, 2016

Indeed, the Calibrated Classifier widget is a good option for optimizing CA, but it really depends on which performance metric you want to optimize, hence the reference to Test and Score.

janezd commented Jun 13, 2019

The improved Calibration Plot (#3881) now allows exploring different thresholds and their effect on CA, F1, sensitivity, specificity, PPV and so on. It can also output a model with the user-defined threshold, but only if the data comes from a single training/testing run rather than from cross-validation (otherwise there is no single model to output).

(Screenshot of the improved Calibration Plot was attached.)

I initially planned to add a widget that would take a learning algorithm as input and output a learning algorithm that calls the wrapped algorithm to compute a model and then imposes a threshold that optimizes CA, F1 or MCC (I wouldn't include anything exotic). The widget wouldn't be complicated: a single input and output, and a single combo box for choosing the score to optimize. But I am no longer convinced that such a widget would actually make sense.

  • How do you optimize? Internal cross-validation to find the optimal threshold, and then the mean threshold over all folds? I have strong doubts about this.
  • Even setting the above point aside, I doubt this optimization would give more than cosmetic improvements, if any.
  • A tree model with an optimized threshold wouldn't be recognized as a tree by the Tree Viewer (sorry, deriving such classes would be too hacky). The resulting model would only be accepted by the Predictions widget.

So, a threshold-optimization widget wouldn't be very useful and it would be rather incompatible with other widgets. We can discuss it some more (I would appreciate @BlazZupan's opinion), but I lean towards considering this issue closed when #3881 is merged.
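
For concreteness, the wrapper described above (wrap a learner, fit the model, then impose the threshold that maximizes a chosen score) could look roughly like the sketch below. This is an illustration under assumed names, not Orange's implementation, and it simply searches candidate thresholds on whatever labelled data it is given.

```python
import numpy as np

# Rough illustration (not the Orange implementation): pick the cutoff on the
# positive-class probabilities `probs` that maximizes CA or F1 against labels `y`.
def best_threshold(probs, y, score="CA"):
    probs, y = np.asarray(probs), np.asarray(y)
    candidates = np.unique(probs)              # each observed probability is a candidate cutoff

    def evaluate(threshold):
        pred = (probs >= threshold).astype(int)
        if score == "CA":                      # classification accuracy
            return (pred == y).mean()
        tp = np.sum((pred == 1) & (y == 1))    # otherwise F1, from confusion counts
        fp = np.sum((pred == 1) & (y == 0))
        fn = np.sum((pred == 0) & (y == 1))
        return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

    return max(candidates, key=evaluate)
```
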

janezd added the "needs discussion" label (core developers need to discuss the issue) on Jun 13, 2019
janezd commented Jun 14, 2019

I discussed this with @BlazZupan. We'll add a "calibrator" that takes a learning algorithm and optimizes the threshold as well as calibrates probabilities, especially the latter. This will of course be done on the training data (I don't know what I was thinking yesterday :)).

The widget can be connected to Test and Score, which is useful enough.
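
For readers landing here later: I believe the functionality agreed on above is what eventually shipped in Orange's Orange.classification.calibration module (a threshold-optimizing wrapper and a probability-calibrating wrapper around a learner). The exact class names and the dataset used below are assumptions and should be checked against the current documentation.

```python
# Hedged usage sketch; ThresholdLearner, CalibratedLearner and the
# "heart_disease" dataset name are assumptions, check the current Orange docs.
from Orange.data import Table
from Orange.classification import LogisticRegressionLearner
from Orange.classification.calibration import ThresholdLearner, CalibratedLearner

data = Table("heart_disease")

# Wrap a base learner so the fitted model applies a threshold chosen on training data.
thresholded_model = ThresholdLearner(LogisticRegressionLearner())(data)

# Or wrap it so the fitted model outputs calibrated probabilities.
calibrated_model = CalibratedLearner(LogisticRegressionLearner())(data)
```
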

janezd removed the "needs discussion" label on Jun 14, 2019