Speed up statsample-glm #34

Open
lokeshh opened this issue Jul 15, 2016 · 4 comments
Comments

@lokeshh
Member

lokeshh commented Jul 15, 2016

Statsample-GLM currently takes a long time to fit models. For a dataset with 1000 rows, fitting can take hours.

@v0dro
Member

v0dro commented Jul 21, 2016

What do you suggest should be done? Let's first profile the code and see the slowest places. Then we can optimize these and see if we should write C extensions or port things to NMatrix.
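As a starting point for the profiling step, here is a minimal sketch that times labelled stages with Ruby's standard `Benchmark` module and lists the slowest first. The helper `time_stages` and the two toy stages are hypothetical placeholders, not part of statsample-glm; in practice each lambda would wrap one step of a real model fit.

```ruby
require 'benchmark'

# Hypothetical helper: run each labelled stage once and return
# [label, seconds] pairs sorted from slowest to fastest.
def time_stages(stages)
  stages.map { |label, work| [label, Benchmark.realtime { work.call }] }
        .sort_by { |_, seconds| -seconds }
end

# Toy stand-ins for the real steps of a fit (assumptions for illustration).
stages = {
  'build design matrix' => -> { Array.new(200) { |i| [1.0, i.to_f] } },
  'toy inner loop'      => -> { 200_000.times.reduce(0.0) { |acc, i| acc + Math.sqrt(i) } }
}

time_stages(stages).each do |label, seconds|
  puts format('%-22s %.4fs', label, seconds)
end
```

Coarse timings like these only show which stage dominates. A sampling profiler would be needed to find the hot methods inside that stage.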

@envp
Member

envp commented Sep 14, 2016

I'm interested in this. Where should I start?

@agisga

agisga commented Sep 16, 2016

One way to start might be to look at how other tools, such as R and Python, handle this.
If it helps, I recently found these notes on high-performance GLM solvers (in particular, on variants of the IRLS algorithm): http://bwlewis.github.io/GLM/ (the example code is in R, but it should be easy to translate into Ruby).
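For orientation, here is a rough Ruby sketch of textbook IRLS for logistic regression, using only the standard `matrix` library. It is not statsample-glm's implementation and is not taken from the linked notes. The function name `irls_logistic`, its parameters, and the toy data are assumptions made for illustration.

```ruby
require 'matrix'

# Sketch of IRLS (iteratively reweighted least squares) for a logit link.
# x: Matrix (n x p, with an intercept column), y: Array of 0/1 responses.
# Each iteration solves (X'WX) beta = X'(W eta + y - mu).
def irls_logistic(x, y, max_iter: 25, tol: 1e-8)
  n = x.row_count
  beta = Vector.elements(Array.new(x.column_count, 0.0))

  max_iter.times do
    eta = x * beta                                    # linear predictor
    mu  = eta.map { |e| 1.0 / (1.0 + Math.exp(-e)) }  # fitted probabilities
    w   = mu.map { |m| m * (1.0 - m) }                # IRLS weights

    # Scale rows of X by the weights instead of building an n x n diagonal.
    weighted_rows = Array.new(n) { |i| (x.row(i) * w[i]).to_a }
    xtwx = x.transpose * Matrix.rows(weighted_rows)
    rhs  = x.transpose *
           Vector.elements(Array.new(n) { |i| w[i] * eta[i] + y[i] - mu[i] })

    new_beta  = xtwx.lup.solve(rhs)                   # weighted least squares step
    converged = (new_beta - beta).norm < tol
    beta = new_beta
    break if converged
  end

  beta
end

# Toy, non-separable data (made up for this example).
xs = [0, 1, 2, 3, 4, 5, 6, 7]
y  = [0, 0, 1, 0, 1, 1, 0, 1]
x  = Matrix.rows(xs.map { |v| [1.0, v.to_f] })
puts irls_logistic(x, y)
```

The pure-Ruby `Matrix` class keeps the sketch dependency-free but is slow on large problems, so a faster linear algebra backend such as NMatrix would be the natural thing to swap in for the solve step.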

@envp
Member

envp commented Sep 20, 2016

Thanks! I'll go through the recommended reading for now. I have a good deal of schoolwork coming up right now.

I'll report back ASAP!


4 participants