
[FIX] Learner: Generalize params and used_vals to all learners #2128

Closed

Conversation

pavlin-policar
Collaborator

Issue

#2125, item 6.

Description of changes

The fix I've provided does essentially one thing: we now expect every learner to have a params attribute, and every learner to output a model with the used_vals field. These attributes are general to all learners and models, and there is no reason whatsoever for scikit-wrapped learners to be special and keep their own separate attributes on their public API. That makes everything confusing and difficult to work with. Moving everything up to the base learner makes the API of all learners simpler and more uniform.
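A minimal sketch of the idea (the class names and the toy fit below are illustrative, not Orange's actual implementation): the base learner stores params uniformly, and every model it produces carries used_vals.

```python
class Model:
    def __init__(self, used_vals, params):
        # `used_vals` records the class values the model was trained on;
        # `params` is the configuration of the learner that produced it.
        self.used_vals = used_vals
        self.params = params


class Learner:
    def __init__(self, **kwargs):
        # Keyword arguments are stored uniformly on the base class instead
        # of each sklearn wrapper keeping its own separate attribute.
        self.params = kwargs

    def fit(self, X, y):
        # Real subclasses would do actual fitting; this stub only shows
        # that every returned model carries `used_vals` and `params`.
        return Model(used_vals=sorted(set(y)), params=self.params)


learner = Learner(alpha=0.01, max_iter=100)
model = learner.fit([[0], [1], [2]], [0, 1, 1])
print(learner.params)   # {'alpha': 0.01, 'max_iter': 100}
print(model.used_vals)  # [0, 1]
```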

In case this is not a desired solution, I can also provide a quick and dirty workaround that also fixes the main issue with SGD and other wrapped learners.

Includes
  • Code changes
  • Tests
  • Documentation

@pavlin-policar pavlin-policar changed the title [FIX] Learner: Generalize params and used_vals to all learners [WIP] Learner: Generalize params and used_vals to all learners Mar 21, 2017
@codecov-io

codecov-io commented Mar 21, 2017

Codecov Report

Merging #2128 into master will increase coverage by <.01%.
The diff coverage is 93.75%.

@@            Coverage Diff             @@
##           master    #2128      +/-   ##
==========================================
+ Coverage   71.48%   71.49%   +<.01%     
==========================================
  Files         318      318              
  Lines       54404    54409       +5     
==========================================
+ Hits        38892    38897       +5     
  Misses      15512    15512

Powered by Codecov. Last update ba393c0...e5b9ca4.

@pavlin-policar pavlin-policar changed the title [WIP] Learner: Generalize params and used_vals to all learners [FIX] Learner: Generalize params and used_vals to all learners Mar 21, 2017
@janezd
Contributor

janezd commented Mar 22, 2017

I have a dream that one day Orange will be free of scikit-learn.

I have an unpleasant feeling that this makes Orange's learners more scikitish. If these parameters will no longer be needed once we replace scikit's learners with well-behaved, properly implemented algorithms, I'd prefer not to put them into the base learners.

@kernc
Contributor

kernc commented Mar 22, 2017

Indeed, sklearn is a mature, stable, reliable, responsible, well-defined, disciplined and hygienic, well-tested, reasonably working, pragmatic, go-to, state-of-the-art, and industry-standard machine learning library with huge developer and user communities, and, as all such, evidently awfully incompatible with what we're doing here. (Not sarcasm, actually.)

@pavlin-policar
Collaborator Author

pavlin-policar commented Mar 23, 2017

Tbh, I think we should remove the params attribute altogether, since it is only ever useful in scikit learners and should be "protected" at the very least. Also, it's only ever used in tests, in WidgetLearnerTestMixin; I can probably rewrite those tests so it wouldn't be needed anymore. It might, however, be reasonable to have attributes on learners like in TreeLearner, without the params of course. What do you think?

I am also unsure of why the model somehow needs the parameters...

@kernc
Contributor

kernc commented Mar 23, 2017

The reason sklearn tracks params is the aforementioned discipline. Looking forward, it's definitely not something we want to be bound by at all costs.

I am also unsure of why the model somehow needs the parameters...

So it knows what set of learner params led to it. So it can be reproduced. In theory.
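The reproducibility argument can be sketched roughly like this (SGDLearner, the toy coefficient, and the round-trip below are hypothetical illustrations, not Orange or sklearn code): if a model stores the params of the learner that produced it, an equivalent learner can be rebuilt and re-fit.

```python
class Model:
    def __init__(self, coef, params):
        self.coef = coef
        self.params = params  # the learner configuration that led to this model


class SGDLearner:
    def __init__(self, alpha=1e-4, max_iter=1000):
        self.params = {"alpha": alpha, "max_iter": max_iter}

    def fit(self, data):
        # Dummy deterministic "fit" so the round-trip below is checkable.
        coef = sum(data) * self.params["alpha"]
        return Model(coef, dict(self.params))


data = [1, 2, 3]
model = SGDLearner(alpha=0.5).fit(data)

# Reproduce: rebuild an equivalent learner from the model's stored params
# and re-fit on the same data.
rebuilt = SGDLearner(**model.params).fit(data)
assert rebuilt.coef == model.coef
print(model.params)  # {'alpha': 0.5, 'max_iter': 1000}
```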

@pavlin-policar
Collaborator Author

So it knows what set of learner params led to it. So it can be reproduced. In theory.

Well, yes, but then why do only sklearn learners have this? And wouldn't that also be an argument for the learners themselves to have their own params?

I'd like to remove params from the public API altogether and let the sklearn wrappers use it internally, because the only place it's actually used is in one or two tests that can be changed.

@pavlin-policar pavlin-policar deleted the fitter-skllearner-call branch October 22, 2017 07:50