What does "compile" mean for ReverseDiff? #91
Comments
It means the latter.
What's a case for the other interpretation? If you have value-dependent control flow, you'd also have to grab a new tape. I guess in theory you could want to reuse the tape but never JIT compile? But if you know you're going to be reusing the tape, why would you not be compiling it? It seems like a theoretical choice but not one I'd point users to 99% of the time, since you'd have to do something like reuse the same tape 5 times but not more than 6 😅
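To make the point about value-dependent control flow concrete, here is a minimal sketch. The function `branchy` and the inputs are made up for illustration and do not come from the discussion; only `ReverseDiff.GradientTape`, `ReverseDiff.gradient!` and `ReverseDiff.gradient` are real API.

```julia
using ReverseDiff

# Hypothetical function: which branch runs depends on the value of x[1].
branchy(x) = x[1] > 0 ? sum(x .* x) : sum(x)

x_pos = [1.0, 2.0]
x_neg = [-1.0, 2.0]

# Recording at x_pos only captures the operations of the `x[1] > 0` branch.
tape = ReverseDiff.GradientTape(branchy, x_pos)

# Replaying the tape at x_neg reuses the recorded branch, so the result
# should be the gradient of sum(x .* x), not of sum(x).
stale = ReverseDiff.gradient!(similar(x_neg), tape, x_neg)

# Without a prerecorded tape, the function is traced again at x_neg
# and the other branch is taken.
fresh = ReverseDiff.gradient(branchy, x_neg)
```

In this sketch `stale` and `fresh` should disagree, which is why a recorded tape, compiled or not, is only safe to reuse when the control flow does not depend on the input values.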
I agree with Chris here -- specifically that if you've recorded a tape, I really struggle to imagine a realistic situation in which you wouldn't also want to compile it. Consequently, you might as well unify the concept and assert that either:
Thank you for your answers. Even though it seems obvious to you two, it didn't seem so obvious to me at the time. I think the reason I picked a different interpretation in DI was two-fold:
Anyway, I clarified the |
Following discussions with the Turing folks, it seems that there are (at least) two conflicting interpretations of the word "compile" when it comes to ReverseDiff and its tape API. To clarify, I will use the terminology from the ReverseDiff docs and separate two concepts:
- recording a tape (`ReverseDiff.GradientTape`)
- compiling a tape (`ReverseDiff.compile`)
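As a rough sketch of how the two steps differ in ReverseDiff itself (the function `f` and the inputs are placeholders chosen for illustration, not taken from any of the packages mentioned):

```julia
using ReverseDiff

f(x) = sum(x .* x)        # illustrative function
x = [1.0, 2.0, 3.0]

# Concept 1: record the operations performed by f at x onto a tape.
tape = ReverseDiff.GradientTape(f, x)

# Concept 2: compile an already recorded tape for faster replay.
compiled_tape = ReverseDiff.compile(tape)

# Both the recorded and the compiled tape can be replayed on a new
# input of the same shape, without tracing f again.
y = [4.0, 5.0, 6.0]
g_recorded = ReverseDiff.gradient!(similar(y), tape, y)
g_compiled = ReverseDiff.gradient!(similar(y), compiled_tape, y)
```

Compilation is an optional step applied on top of a recorded tape, so "reuse a tape" and "compile a tape" are separate decisions at the ReverseDiff level, even if in practice they usually go together.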
In LogDensityProblems and Turing, if I understand correctly, passing `compile = true` means "record a tape and promise that it is safe to reuse later". The tape is then always compiled for increased speed.

In DifferentiationInterface on the other hand, passing `compile = true` means "when you record a tape during preparation, compile it for increased speed". A tape is always recorded during preparation (which is arguably not ideal for functions with value-dependent control flow).

I think we need to settle on a unified standard, and the right way to do that would be in ADTypes. Here are my propositions:
1. Clarify what the `compile` argument to `AutoReverseDiff` means at the moment: what is the interpretation we agree upon?
2. Switch to `AutoReverseDiff(tape=true/false, compile=true/false)`, with defaults corresponding to the common interpretation in 1.

What do you think?
Related: