
[FIX] Learner: Generalize params and used_vals to all learners #2128

Closed

Conversation

pavlin-policar
Collaborator

Issue

#2125, item 6.

Description of changes

The fix I've provided does essentially one thing: we now expect every learner to have a params attribute, and every learner to output a model with the used_vals field. These attributes are general to all learners and models, and there is no reason whatsoever for scikit-wrapped learners to be special and keep their own separate attributes on their public API. That makes everything confusing and difficult to work with. Moving everything up to the base learner makes the API of all learners simpler and more uniform.
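A minimal sketch of the idea (the class names and the toy fit below are illustrative, not Orange's actual implementation): the base learner stores params uniformly, and every model it produces carries used_vals.

```python
class Model:
    def __init__(self, used_vals, params):
        # `used_vals` records the class values the model was trained on;
        # `params` is the configuration of the learner that produced it.
        self.used_vals = used_vals
        self.params = params


class Learner:
    def __init__(self, **kwargs):
        # Keyword arguments are stored uniformly on the base class instead
        # of each sklearn wrapper keeping its own separate attribute.
        self.params = kwargs

    def fit(self, X, y):
        # Real subclasses would do actual fitting; this stub only shows
        # that every returned model carries `used_vals` and `params`.
        return Model(used_vals=sorted(set(y)), params=self.params)


learner = Learner(alpha=0.01, max_iter=100)
model = learner.fit([[0], [1], [2]], [0, 1, 1])
print(learner.params)   # {'alpha': 0.01, 'max_iter': 100}
print(model.used_vals)  # [0, 1]
```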

In case this is not a desired solution, I can also provide a quick and dirty workaround that also fixes the main issue with SGD and other wrapped learners.

Includes
  • Code changes
  • Tests
  • Documentation

@pavlin-policar pavlin-policar changed the title [FIX] Learner: Generalize params and used_vals to all learners [WIP] Learner: Generalize params and used_vals to all learners Mar 21, 2017
@codecov-io

codecov-io commented Mar 21, 2017

Codecov Report

Merging #2128 into master will increase coverage by <.01%.
The diff coverage is 93.75%.

@@            Coverage Diff             @@
##           master    #2128      +/-   ##
==========================================
+ Coverage   71.48%   71.49%   +<.01%     
==========================================
  Files         318      318              
  Lines       54404    54409       +5     
==========================================
+ Hits        38892    38897       +5     
  Misses      15512    15512

Powered by Codecov. Last update ba393c0...e5b9ca4.

@pavlin-policar pavlin-policar changed the title [WIP] Learner: Generalize params and used_vals to all learners [FIX] Learner: Generalize params and used_vals to all learners Mar 21, 2017
@janezd
Contributor

janezd commented Mar 22, 2017

I have a dream that one day Orange will be free of scikit-learn.

I have an unpleasant feeling that this makes Orange's learners more scikitish. If these parameters will no longer be needed once we replace scikit's learners with well-behaved, properly implemented algorithms, I'd prefer not to put them into the base learners.

@kernc
Contributor

kernc commented Mar 22, 2017

Indeed, sklearn is a mature, stable, reliable, responsible, well-defined, disciplined and hygienic, well-tested, reasonably working, pragmatic, go-to, state-of-the-art, and industry-standard machine learning library with huge developer and user communities, and, as all such, evidently awfully incompatible with what we're doing here. (Not sarcasm, actually.)

@pavlin-policar
Collaborator Author

pavlin-policar commented Mar 23, 2017

Tbh, I think we should remove the params attribute altogether, since it is only ever useful in scikit learners and should be "protected" at the very least. Also, it's only ever used in tests, in WidgetLearnerTestMixin; I can probably rewrite those tests so it wouldn't be needed anymore. It might, however, be reasonable to have attributes on learners like in TreeLearner, without the params of course. What do you think?

I am also unsure of why the model somehow needs the parameters...

@kernc
Contributor

kernc commented Mar 23, 2017

The reason sklearn tracks params is the aforementioned discipline. Looking forward, it's definitely not something we want to be bound by at all costs.

I am also unsure of why the model somehow needs the parameters...

So it knows what set of learner params led to it. So it can be reproduced. In theory.
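The reproducibility argument can be sketched roughly like this (SGDLearner, the toy coefficient, and the round-trip below are hypothetical illustrations, not Orange or sklearn code): if a model stores the params of the learner that produced it, an equivalent learner can be rebuilt and re-fit.

```python
class Model:
    def __init__(self, coef, params):
        self.coef = coef
        self.params = params  # the learner configuration that led to this model


class SGDLearner:
    def __init__(self, alpha=1e-4, max_iter=1000):
        self.params = {"alpha": alpha, "max_iter": max_iter}

    def fit(self, data):
        # Dummy deterministic "fit" so the round-trip below is checkable.
        coef = sum(data) * self.params["alpha"]
        return Model(coef, dict(self.params))


data = [1, 2, 3]
model = SGDLearner(alpha=0.5).fit(data)

# Reproduce: rebuild an equivalent learner from the model's stored params
# and re-fit on the same data.
rebuilt = SGDLearner(**model.params).fit(data)
assert rebuilt.coef == model.coef
print(model.params)  # {'alpha': 0.5, 'max_iter': 1000}
```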

@pavlin-policar
Collaborator Author

So it knows what set of learner params led to it. So it can be reproduced. In theory.

Well, yes, but then why do only sklearn learners have this? And wouldn't that also be an argument for the learners themselves to have their own params?

I'd like to remove params from the public API altogether and let the sklearn wrappers use it internally, because the only place it's actually used is in one or two tests that can be changed.

@pavlin-policar pavlin-policar deleted the fitter-skllearner-call branch October 22, 2017 07:50