
Feature request: cutoff/threshold value for classification #1267

Closed
wibrt opened this issue May 18, 2016 · 4 comments · Fixed by #3881
wibrt commented May 18, 2016

When doing classification, a probability is usually calculated to determine the class of the target; by default Orange uses 0.5 as the cutoff, or threshold value if you like. Depending on this cutoff value, the model performs differently (a minimal sketch of what the cutoff does follows below).

  • It would be nice to obtain these cutoff value(s) as output from a widget such as ROC Analysis or Lift Curve.
  • It would also be nice to be able to change the cutoff value, for example in the learner, in Test and Score, or elsewhere.
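
For reference, here is a minimal, library-agnostic sketch of what such a cutoff does (the function name and numbers are illustrative, not Orange's API): it turns the predicted probability of the positive class into a hard label, and moving the cutoff away from 0.5 changes which predictions flip.

```python
import numpy as np

# Illustrative only, not Orange's API: apply an adjustable cutoff to the
# predicted probabilities of the positive class instead of the default 0.5.
def predict_with_threshold(probs_pos, threshold=0.5):
    """Return 1 where the positive-class probability reaches the cutoff, else 0."""
    return (np.asarray(probs_pos) >= threshold).astype(int)

probs = np.array([0.15, 0.40, 0.55, 0.72, 0.91])
print(predict_with_threshold(probs))        # default 0.5 -> [0 0 1 1 1]
print(predict_with_threshold(probs, 0.7))   # stricter cutoff -> [0 0 0 1 1]
```
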
janezd commented May 18, 2016

Like the Calibration Plot and the Calibrated Classifier from Orange 2? (Screenshots of both Orange 2 widgets were attached.)

wibrt commented May 18, 2016

Indeed, the Calibrated Classifier widget is a good option for optimizing CA, but it really depends on which performance metric you want to optimize, hence the reference to Test and Score.

janezd commented Jun 13, 2019

The improved Calibration Plot (#3881) now allows exploring different thresholds and their effect on CA, F1, sensitivity, specificity, PPV and so on. It can also output a model with the user-defined threshold, but only if the data comes from a single training/testing run rather than from cross-validation (otherwise there is no single model to output).

(Screenshot of the improved Calibration Plot was attached.)

I initially planned to add a widget that would take a learning algorithm as input and output a learning algorithm that calls the wrapped algorithm to compute a model and then imposes a threshold that optimizes CA, F1 or MCC (I wouldn't include anything exotic). The widget wouldn't be complicated: a single input and output, and a single combo box for choosing the score to optimize. But I am no longer convinced that such a widget would actually make sense.

  • How do you optimize? Internal cross-validation to find the optimal threshold, and then the mean threshold over all folds? I have strong doubts about this.
  • Even setting the above point aside, I doubt this optimization would give more than cosmetic improvements, if any.
  • A tree model with an optimized threshold wouldn't be recognized as a tree by the Tree Viewer (sorry, deriving such classes would be too hacky). The resulting model would only be accepted by the Predictions widget.

So, a threshold-optimization widget wouldn't be very useful and it would be rather incompatible with other widgets. We can discuss it some more (I would appreciate @BlazZupan's opinion), but I lean towards considering this issue closed when #3881 is merged.
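
For concreteness, the wrapper described above (wrap a learner, fit the model, then impose the threshold that maximizes a chosen score) could look roughly like the sketch below. This is an illustration under assumed names, not Orange's implementation, and it simply searches candidate thresholds on whatever labelled data it is given.

```python
import numpy as np

# Rough illustration (not the Orange implementation): pick the cutoff on the
# positive-class probabilities `probs` that maximizes CA or F1 against labels `y`.
def best_threshold(probs, y, score="CA"):
    probs, y = np.asarray(probs), np.asarray(y)
    candidates = np.unique(probs)              # each observed probability is a candidate cutoff

    def evaluate(threshold):
        pred = (probs >= threshold).astype(int)
        if score == "CA":                      # classification accuracy
            return (pred == y).mean()
        tp = np.sum((pred == 1) & (y == 1))    # otherwise F1, from confusion counts
        fp = np.sum((pred == 1) & (y == 0))
        fn = np.sum((pred == 0) & (y == 1))
        return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

    return max(candidates, key=evaluate)
```
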

janezd added the "needs discussion" label (core developers need to discuss the issue) on Jun 13, 2019
janezd commented Jun 14, 2019

I discussed this with @BlazZupan. We'll add a "calibrator" that takes a learning algorithm and optimizes the threshold as well as calibrates probabilities, especially the latter. This will of course be done on the training data (I don't know what I was thinking yesterday :)).

The widget can be connected to Test and Score, which is useful enough.
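
For readers landing here later: I believe the functionality agreed on above is what eventually shipped in Orange's Orange.classification.calibration module (a threshold-optimizing wrapper and a probability-calibrating wrapper around a learner). The exact class names and the dataset used below are assumptions and should be checked against the current documentation.

```python
# Hedged usage sketch; ThresholdLearner, CalibratedLearner and the
# "heart_disease" dataset name are assumptions, check the current Orange docs.
from Orange.data import Table
from Orange.classification import LogisticRegressionLearner
from Orange.classification.calibration import ThresholdLearner, CalibratedLearner

data = Table("heart_disease")

# Wrap a base learner so the fitted model applies a threshold chosen on training data.
thresholded_model = ThresholdLearner(LogisticRegressionLearner())(data)

# Or wrap it so the fitted model outputs calibrated probabilities.
calibrated_model = CalibratedLearner(LogisticRegressionLearner())(data)
```
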

janezd removed the "needs discussion" label on Jun 14, 2019