sprand as samples from sparse distribution #3

Open
abraunst opened this issue Apr 3, 2019 · 3 comments

abraunst commented Apr 3, 2019

I was thinking that, in the spirit of this package, rand(Normal(),SparseMatrixCSC,p,m,n) might be better expressed as rand(Bernoulli(p,Normal()),SparseMatrixCSC,m,n), where Bernoulli(p, Normal()) would be the "Gauss-Bernoulli" or "Spike-and-Slab" mixture distribution:

P(x) = (1-p) delta(x) + p Normal(x)

It seems to make things a bit more generic.
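
To make the proposal concrete, here is a minimal sketch of the mixture as a plain Julia type. The names `SpikeAndSlab`, `draw` and `sample_sparse` are hypothetical and invented for this illustration; they are not part of RandomExtensions or the stdlib, and the slab is hard-coded to a standard normal for simplicity.

```julia
using Random, SparseArrays

# Hypothetical type, for illustration only: with probability p a value is
# drawn from the "slab" (here a standard normal), otherwise it is an exact zero.
struct SpikeAndSlab
    p::Float64
end

# Draw one scalar from P(x) = (1-p) delta(x) + p Normal(x).
draw(rng::AbstractRNG, d::SpikeAndSlab) = rand(rng) < d.p ? randn(rng) : 0.0

# Naive matrix sampler: draw every entry, then let `sparse` keep only the
# non-zero ones. This does O(m*n) work regardless of how small p is.
function sample_sparse(rng::AbstractRNG, d::SpikeAndSlab, m::Integer, n::Integer)
    dense = [draw(rng, d) for i in 1:m, j in 1:n]
    return sparse(dense)
end

A = sample_sparse(MersenneTwister(0), SpikeAndSlab(0.1), 100, 50)
```

In this naive form the sparsity pattern is only discovered after every value has been drawn, which is why a real implementation would probably want a smarter strategy when p is small.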

abraunst changed the title from "sparse arrays as samples from sparse distribution" to "sprand as samples from sparse distribution" on Apr 4, 2019

@rfourquet
Member

Sounds like a very interesting idea. It would require the non-zero structure to depend on the values (i.e. testing whether each produced value is zero), so I wonder whether the possible multiple allocations would hurt performance. But it is definitely worth exploring. It may even be possible to support both APIs (for now I would prefer not to get rid of the current API, as it feels closer to the sprand API and probably makes it easier to switch).

@abraunst
Author

abraunst commented Apr 7, 2019

> Sounds like a very interesting idea. It would require the non-zero structure to depend on the values (i.e. testing whether each produced value is zero), so I wonder whether the possible multiple allocations would hurt performance. But it is definitely worth exploring. It may even be possible to support both APIs (for now I would prefer not to get rid of the current API, as it feels closer to the sprand API and probably makes it easier to switch).

For sure, if the p value is small enough, the sampler should do what the current sprand does, i.e. draw the non-zero indices first and then fill them. I confess that I tried to implement it in RandomExtensions, but I am still a bit lost in the design 😅.

In any case, even if RandomExtensions makes its way into stdlib (which I would love to see), I think that the current interface in stdlib should be kept as convenience functions (though perhaps without the rfn parameter and other bells and whistles).
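
The index-first strategy mentioned above (pick the non-zero positions, then fill them) could look roughly like the sketch below. It selects each position independently with probability p by jumping over geometrically distributed gaps, so the work scales with the number of non-zeros rather than with m*n. The function name `sample_sparse_index_first` is hypothetical, the slab is again assumed to be a standard normal, and no claim is made that this matches the actual sprand internals.

```julia
using Random, SparseArrays

# Hypothetical helper, for illustration only.
function sample_sparse_index_first(rng::AbstractRNG, p::Real, m::Integer, n::Integer)
    0 < p < 1 || throw(ArgumentError("p must be strictly between 0 and 1"))
    rows = Int[]
    cols = Int[]
    N = m * n
    L = log1p(-p)          # log(1 - p), a negative number
    k = 0                  # linear (column-major) index of the last selected entry
    while true
        # Number of skipped entries before the next selected one: geometric
        # with success probability p. `1 - rand(rng)` lies in (0, 1], so the
        # logarithm is always finite.
        gap = floor(log(1 - rand(rng)) / L)
        gap >= N - k && break          # next selection would fall past the end
        k += Int(gap) + 1
        push!(rows, (k - 1) % m + 1)
        push!(cols, (k - 1) ÷ m + 1)
    end
    vals = randn(rng, length(rows))    # fill the selected positions from the slab
    return sparse(rows, cols, vals, m, n)
end

B = sample_sparse_index_first(MersenneTwister(1), 0.01, 200, 300)
```

A sampler for the mixture distribution could dispatch to something like this when p is small and fall back to drawing every entry when p is close to 1; where that threshold should sit is an open question here.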

@rfourquet
Member

> I tried to implement it in RandomExtensions

Cool!!

> I tried to implement it in RandomExtensions but I am still a bit lost in the design

Sorry about that: the internals evolved quite a bit the last time I worked on them, and I haven't documented them yet. Feel free to open an issue to ask for help, and I will answer there or write documentation (but I will have very little time in the upcoming week).

> I think that the current interface in stdlib should be left

I don't have much hope that sprand will go away. I agree that sprand is more convenient than rand([T], SparseVector, p, n, m); that's why I initially (in the Base PR) added the short version rand([T], p, n, m), to give it a chance to compete favorably against sprand. But IIRC, someone noted somewhere that this short version is not very clear, so it is not a unanimous solution!
