information bottleneck #50
To dict pretty print
…tener initialization of weights st all color languages have identical informativity
A followup: I'm on the fence about whether to include the colors/ example at all, tbh. There is nothing novel in that notebook that is not covered by https://github.com/nogazs/ib-color-naming/tree/master. This doesn't mean there can't be: we can look at how to convert encoders into ULTK languages, and then measure informativity differently. However, (1) right now
What if we replaced the images in the README with the two trade-off plots in the modals example? 🤔 The latter are both original to ULTK, instead of pulled from other papers which we have not exactly replicated in ULTK.
Amazing stuff, thanks Nathaniel! I will do a detailed code review soon once time permits, but wanted to respond to and mention a few high-level things first:
Great stuff Nathaniel! Only major comments are the ones I mentioned already in the long comment; otherwise, only extremely minor things. Let's walk through the notebook(s) in person in our meeting next week as well :)
# self.xhat_size = len(self.ln_px)

self.result = None  # namedtuple
self.results: list[Any] = []  # list of namedtuples
Why not `list[namedtuple]` then? And same for above.
Lol good catch
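One way to act on this suggestion, as a minimal sketch: `collections.namedtuple` is a factory function rather than a type, so static checkers generally want the concrete tuple class in the annotation. The diff mentions an `IBResult` return type; the field names below and the `Optimizer` holder class are hypothetical stand-ins, not the PR's actual definitions.

```python
from typing import NamedTuple, Optional


class IBResult(NamedTuple):
    """Hypothetical result tuple; the field names are assumptions."""

    beta: float
    rate: float
    distortion: float


class Optimizer:
    """Hypothetical holder showing the annotated attributes."""

    def __init__(self) -> None:
        # Annotate with the concrete tuple class instead of Any.
        self.result: Optional[IBResult] = None
        self.results: list[IBResult] = []

    def record(self, result: IBResult) -> None:
        # Keep the latest result and append it to the history.
        self.result = result
        self.results.append(result)
```

With this shape, `opt.results[0].beta` is visible to a type checker as a `float`.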
else:
    converged = (
        it == self.max_it
        or np.sum(np.abs(np.exp(self.ln_qxhat_x) - np.exp(prev_q)))
I'd prob just express the bound in log-space instead of exp'ing, but not a huge deal
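A rough sketch of what a log-space check could look like, assuming both iterates are arrays of log-probabilities. The function name, the `tol` default, and the choice of a max-norm are assumptions for illustration; note this bounds the change in log-probability, which is a different criterion from the summed absolute difference of probabilities in the diff.

```python
import numpy as np


def converged_log_space(ln_q, prev_ln_q, tol=1e-6):
    """Hypothetical check: True if no log-probability moved by tol or more."""
    with np.errstate(invalid="ignore"):
        # (-inf) - (-inf) yields nan; those entries are handled below.
        diff = np.abs(ln_q - prev_ln_q)
    # Entries with zero probability in both iterates count as unchanged.
    both_zero = np.isneginf(ln_q) & np.isneginf(prev_ln_q)
    diff = np.where(both_zero, 0.0, diff)
    return bool(np.max(diff) < tol)
```

An entry that is `-inf` in only one iterate gives an infinite difference, so the check reports no convergence in that case.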
def ib_kl(py_x: np.ndarray, qy_xhat: np.ndarray) -> np.ndarray:
    """Compute the IB distortion matrix, the KL divergence between p(y|x) and q(y|xhat), in nats."""
    # D[p(y|x) || q(y|xhat)],
    # input shape `(x, xhat, y)`, output shape `(x, xhat)`
I'd be curious to get your thoughts on using `jaxtyping` to annotate the shape of numpy arrays for methods like these: https://docs.kidger.site/jaxtyping/
It was originally written for jax, but the type annotations work with other libraries like numpy and pytorch
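For illustration, here is how shape annotations on a function like `ib_kl` might look. The signature follows the diff, but the body is a sketch written for this example, not the PR's implementation, and the per-argument shapes `(x, y)` and `(xhat, y)` are assumptions. The `Annotated` fallback is only there so the sketch still imports when `jaxtyping` is not installed.

```python
import numpy as np

try:
    from jaxtyping import Float
except ImportError:
    # Fallback: keeps the same subscript syntax without the optional dependency.
    from typing import Annotated as Float


def ib_kl(
    py_x: Float[np.ndarray, "x y"],
    qy_xhat: Float[np.ndarray, "xhat y"],
) -> Float[np.ndarray, "x xhat"]:
    """KL divergence D[p(y|x) || q(y|xhat)] in nats for every (x, xhat) pair."""
    p = py_x[:, None, :]  # shape (x, 1, y)
    q = qy_xhat[None, :, :]  # shape (1, xhat, y)
    with np.errstate(divide="ignore", invalid="ignore"):
        # Terms with p == 0 contribute 0 by convention.
        terms = np.where(p > 0, p * (np.log(p) - np.log(q)), 0.0)
    return terms.sum(axis=-1)
```

On their own the annotations are documentation; enforcing them at runtime would need a separate type checker hooked in, which this sketch does not do.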
np.exp(self.ln_qy_xhat),
)

def next_result(self, beta, *args, **kwargs) -> IBResult:
nice!
return Language(
    expressions=tuple(
        [
Don't need the list `[ ... ]` wrapper here, e.g. can delete this line and line 285
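To make the point concrete: `tuple()` accepts any iterable, so a generator expression can be passed directly and the intermediate list is unnecessary. The `Expression` class and `forms` list below are hypothetical stand-ins, not ULTK's types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expression:
    """Hypothetical stand-in for an expression object."""

    form: str


forms = ["red", "green", "blue"]

# With the list wrapper: builds a temporary list, then copies it into a tuple.
with_list = tuple([Expression(f) for f in forms])

# Without it: the generator expression feeds tuple() directly.
with_generator = tuple(Expression(f) for f in forms)
```

Both spellings produce equal tuples; the second just drops a pair of brackets.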
PRECISION = 1e-15


def get_gaussian_noise(shape):
more params here, with defaults?
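One possible reading of this suggestion, as a sketch only: expose the mean, the standard deviation, and a seed as keyword arguments with defaults. The parameter names `loc`, `scale`, and `seed`, and their default values, are assumptions; the diff shows only the `shape` argument.

```python
import numpy as np


def get_gaussian_noise(shape, loc=0.0, scale=1.0, seed=None):
    """Draw Gaussian noise of the given shape (hypothetical extended signature)."""
    # A fresh Generator per call; passing a seed makes the draw reproducible.
    rng = np.random.default_rng(seed)
    return rng.normal(loc=loc, scale=scale, size=shape)
```

Calling `get_gaussian_noise((3, 4), seed=0)` twice returns identical arrays, which can help when re-running an example notebook.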
##############################################################################
# Probability and Information
What do you think about putting this stuff in the probability sub-module?
Yes. I hope one day we move probability out of effcomm too
todo from meeting:
Thanks, Nathaniel! This is great stuff :)
Going to merge now, with a reminder of a couple of things: (i) we should re-run / update the color example early in the new year; (ii) relatedly, should we close Mickey's branch at this point? (iii) let's discuss more IB/effcomm analyses from the meeting later as well.
Sounds good re (i) and (iii). Cheers!
🥳 We now have an IB-BA implementation native to ULTK and a minimal example comparing efficiency analyses using grammatical/LoT complexity to IB.
Here is a summary of the changes I made: