feat(inference): add nuts sampler #462
base: master
Conversation
Cool Tony! Looks interesting -- I converted this to a draft PR; when you're ready to put it in the pipeline, you should be able to convert it back.
ping @femtomc :)
@chentoast Hi! I'll likely get to take a look at this sometime next week -- thank you for your work on this PR!
for (name, val) in get_values_shallow(a)
    out[name] = val + b[name]
end
I'm not sure how this function is intended to work if the choice maps have different shapes. Should we treat missing addresses in one choice map as "zero" for the value type of the choice map which does have the address?
It's unclear to me what the correct thing is here -- or whether you'd ever be adding choice maps with different address shapes.
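One possible reading of the "treat missing addresses as zero" semantics is sketched below, using Gen's documented choice map API (choicemap, get_values_shallow, has_value). The name add_padded is illustrative only and is not part of the PR; this is just one way the question above could be resolved.

```julia
# Hypothetical sketch: add two shallow choice maps, treating an address that is
# missing from one map as zero for that map. `add_padded` is an illustrative name.
function add_padded(a, b)
    out = choicemap()
    # addresses in `a`: add the matching value from `b` when it exists
    for (name, val) in get_values_shallow(a)
        out[name] = has_value(b, name) ? val + b[name] : val
    end
    # addresses only in `b`: the corresponding value in `a` is treated as zero
    for (name, val) in get_values_shallow(b)
        if !has_value(a, name)
            out[name] = val
        end
    end
    return out
end
```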
If this invariant is enforced implicitly, it would be nice to document the code to indicate that.
Hmm, that makes sense. I can't see a way for the invariant to be violated - we are always applying these functions to value and momentum choicemaps that are defined to have the same shape at the start of the function. Violating that invariant would involve somehow changing the selection while in the middle of a leapfrog step, which doesn't seem to be possible.
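If it helps to make that invariant explicit in the code, one hedged option is a docstring plus a cheap assertion; the function and argument names below are illustrative, not the PR's actual definitions.

```julia
# Illustrative sketch only: document and assert the shared-address invariant.
"""
Add the top-level values of `b` to those of `a`.

Assumes `a` and `b` contain exactly the same top-level addresses, as is the
case for value and momentum choice maps built from the same selection.
"""
function add_choicemaps(a, b)
    out = choicemap()
    for (name, val) in get_values_shallow(a)
        @assert has_value(b, name) "choice maps must share the same addresses"
        out[name] = val + b[name]
    end
    return out
end
```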
    return (new_trace, SamplerStats(depth, tree.n, tree.diverging, true))
else
    return (trace, SamplerStats(depth, tree.n, tree.diverging, false))
end
One small comment -- here, the normal interface for kernels is slightly violated (but in a way which I'm not unhappy about). I think normally, primitive kernels have a return signature of (Trace, accept_or_reject).
One interesting idea (to improve the information we can extract from our kernels) would be to create a new abstract type for the second element of the returned tuple, and allow definition of new types to be returned from kernels.
This change is a bit more drastic -- so I recommend that (for now) we remove SamplerStats from this PR, and set up a new issue to handle the bigger question of how to better profile kernel statistics.
Makes sense - I'll do that in a follow-up PR.
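As a rough illustration of the abstract-return-type idea above (not an existing Gen interface; KernelStats and NUTSStats are hypothetical names), the second tuple element could look something like this:

```julia
# Hypothetical sketch of a richer kernel return value; not part of Gen or this PR.
abstract type KernelStats end

struct NUTSStats <: KernelStats
    depth::Int        # tree depth reached by the trajectory
    n::Int            # number of states in the final tree
    diverging::Bool   # whether the trajectory diverged
    accepted::Bool    # whether the proposal was accepted
end
```

A kernel could then return (trace, stats), with the usual accept/reject flag recoverable from the stats object.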
    end

    return out
end
I don't really like that this allocates a new choice map each time you call it -- but I need to examine and think about the code below further to suggest a better solution.
Even if we created a view-like structure, we'd still need to eat the allocations when we decide to modify (scale) the values.
I actually implemented a version of this function that operated in place, but I didn't see noticeable performance benefits (at least, on my tiny toy problem). But I don't see any downsides of making it work in place, at least for now.
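For reference, an in-place variant along the lines described above could look like the sketch below, using Gen's set_value! and get_values_shallow; scale_values! is an illustrative name, not necessarily what the PR uses.

```julia
# Hypothetical sketch: scale a choice map's top-level values in place.
function scale_values!(cm, factor)
    # collect first so we are not mutating the map while iterating over it
    for (name, val) in collect(get_values_shallow(cm))
        set_value!(cm, name, val * factor)
    end
    return cm
end
```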
This PR is getting closer, but I would hold off on merging until I've had a chance to test it out on some more non-trivial models.
Hi, it seems like this PR has been abandoned. Are there plans to merge this? I would like to have the NUTS sampler included in Gen. Willing to pitch in if there's any work left before it's ready to merge.
I think @chentoast hasn't been working on this for a while, so @hedonhermdev you are welcome to pick up where the PR was left off! It'd be great if you could:
Work in progress. Implements a No-U-Turn Sampler, but does not implement a dual-averaging adaptation process for setting epsilon, which can be done as follow-up work (a sketch of the standard dual-averaging scheme is included below). The code generally needs to be cleaned up a little as well.
todo:
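For reference, the dual-averaging step-size adaptation mentioned above usually follows Hoffman and Gelman (2014); a minimal, self-contained sketch is below. It is not part of this PR, and the names and defaults (target acceptance 0.8, gamma = 0.05, t0 = 10, kappa = 0.75) are just the commonly used ones.

```julia
# Hypothetical sketch of dual-averaging adaptation of the leapfrog step size.
mutable struct DualAveraging
    mu::Float64           # shrinkage target, typically log(10 * initial step size)
    log_eps::Float64      # current log step size
    log_eps_bar::Float64  # running average of the log step size
    h_bar::Float64        # running average of (target - observed) acceptance
    m::Int                # adaptation iteration counter
    delta::Float64        # target acceptance statistic
    gamma::Float64
    t0::Float64
    kappa::Float64
end

DualAveraging(eps0; delta=0.8) =
    DualAveraging(log(10 * eps0), log(eps0), 0.0, 0.0, 0, delta, 0.05, 10.0, 0.75)

# Update the adapter with the acceptance statistic from one NUTS iteration and
# return the step size to use for the next iteration.
function adapt!(da::DualAveraging, accept_stat::Float64)
    da.m += 1
    w = 1.0 / (da.m + da.t0)
    da.h_bar = (1 - w) * da.h_bar + w * (da.delta - accept_stat)
    da.log_eps = da.mu - sqrt(da.m) / da.gamma * da.h_bar
    eta = da.m^(-da.kappa)
    da.log_eps_bar = eta * da.log_eps + (1 - eta) * da.log_eps_bar
    return exp(da.log_eps)
end

# After warmup, exp(da.log_eps_bar) is the step size to freeze for sampling.
```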