Create lightweight package AbstractDualNumbers or ForwardDiffBase or similar? #518

Open
oschulz opened this issue May 1, 2021 · 14 comments


@oschulz

oschulz commented May 1, 2021

My apologies if this has been suggested before:

While ForwardDiff is not the heaviest of packages in the ecosystem, it's also not exactly lightweight (it takes 1.6 seconds to load on my system). A lightweight package AbstractDualNumbers.jl or ForwardDiffBase.jl (or similar) that just defines something like abstract type AbstractDualNumber{Tag} <: Real end and things like function AbstractDualNumbers.value end and function AbstractDualNumbers.partials end could allow packages to define custom push-forwards without depending on ForwardDiff itself.
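To make the proposal concrete, here is a minimal sketch of how such an interface could be used. Only `AbstractDualNumber{Tag}`, `value` and `partials` come from the suggestion above. Everything else is an assumption made for illustration: `withpartials` (a hook to rebuild a dual from a new value and partials), the toy type `ToyDual`, and the example function `mylogistic` do not exist in ForwardDiff or DualNumbers.

```julia
# --- What the lightweight interface package might contain ---
abstract type AbstractDualNumber{Tag} <: Real end
function value end
function partials end
function withpartials end  # hypothetical: rebuild a dual of the same kind

# --- A downstream package: custom push-forward written only against the interface ---
mylogistic(x::AbstractFloat) = 1 / (1 + exp(-x))

function mylogistic(d::AbstractDualNumber)
    y = mylogistic(value(d))
    # derivative of the logistic function is y * (1 - y)
    return withpartials(d, y, y * (1 - y) * partials(d))
end

# --- An AD package supplies the concrete type (toy stand-in, single partial) ---
struct ToyDual{Tag,T<:AbstractFloat} <: AbstractDualNumber{Tag}
    val::T
    der::T
end

value(d::ToyDual) = d.val
partials(d::ToyDual) = d.der
withpartials(::ToyDual{Tag}, val::T, der::T) where {Tag,T} = ToyDual{Tag,T}(val, der)

# Seed the derivative with 1.0 and push it through mylogistic
d = mylogistic(ToyDual{:demo,Float64}(0.0, 1.0))
```

The point of the sketch is that the `mylogistic` method never mentions `ToyDual`, so the downstream package would only need the small interface package as a dependency.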

I know that there are exciting efforts underway in the Julia-AD-ecosystem for new ADs (e.g. Diffractor), but ForwardDiff is certainly not going away any time soon. A really lightweight way to define push-forwards could reduce the frequency of @require ForwardDiff in the ecosystem quite a bit, and also make it possible to move code from packages like DistributionsAD to Distributions, etc.

@dlfivefifty
Contributor

Note we already have DualNumbers.jl. I believe the plan is to move ForwardDiff.Dual to there. (Unless there's a need for un-tagged dual numbers...)

@hyrodium
Contributor

x-ref JuliaDiff/DualNumbers.jl#45

@oschulz
Author

oschulz commented Apr 16, 2022

JuliaDiff/DualNumbers.jl#45 would be nice ...

@dlfivefifty
Contributor

I think #45 is definitely the way to go, but ForwardDiff.Dual needs some work to make it easy to use (beginning with pretty printing).

@longemen3000

longemen3000 commented Apr 20, 2022

I was looking at JuliaDiff/DualNumbers.jl#45, JuliaDiff/DualNumbers.jl#49 and the source code of DualNumbers.jl. For hypothetically embarking on such a migration (let's call the result FD DualNumbers), I have some questions (and observations):

  • Is it necessary for FD DualNumbers to support SpecialFunctions, NaNMath or Calculus? If FD DualNumbers only supports Base and exposes an overload API, an alternative path would be for those packages to define their own derivatives.
  • On the other hand, SpecialFunctions already loads ChainRulesCore. If there is a possibility of any integration between the two packages, that integration could be done at the FD DualNumbers level.
  • On the pretty printing, there was an attempt to improve it ("add prettier printing for duals", #193), but it was not merged for some valid reasons.
  • I suppose that "Transfer DualNumbers implementation from ForwardDiff" (DualNumbers.jl#49) is stale, but the migration guide provided by @jrevels is really useful:

Before you go down this route, be warned that it will probably involve a good bit of work and digging into the implementation details of both packages. I had planned on doing this work myself after v0.6 released to avoid having to continuously update with breakage/depwarn fixes.

We need the change-over to not merely swap out the old implementation for ForwardDiff's, but to ensure that the feature sets of the two implementations are appropriately merged. We'll wish to drop some of the old behaviors, while other behaviors we'll wish to preserve, probably requiring new definitions. For example, there are some primitives defined on DualNumbers.Dual that are not yet defined on ForwardDiff.Dual. There might be more subtle behavioral changes as well.

Things that have to be done (besides just porting over the code):

  • Implement a deprecation layer
  • Implement whatever new functionality we need to appropriately merge the behavior of the two implementations
  • Write new tests for any new definitions
  • Write documentation describing the new interface

I'd also like my name added to the LICENSE (and I believe @mlubin is also within his rights to request this, but I'll let him speak for himself). I believe doing this requires Theo's permission?

Are there any additional things to be done apart from the list above?

@oschulz
Author

oschulz commented May 7, 2022

Most of the load time of ForwardDiff is actually due to StaticArrays, which is, I think, only used for the Hessian, Jacobian, etc. functionality, so a package focused on dual numbers should load very quickly.

@oschulz
Author

oschulz commented May 7, 2022

Is it necessary for FD DualNumbers to support SpecialFunctions, NaNMath or Calculus?

I think if it's lightweight enough there would be a chance to convince SpecialFunctions, NaNMath, etc. to support it, instead of the other way round.

@oschulz
Author

oschulz commented May 7, 2022

On the other hand, SpecialFunctions already loads ChainRulesCore

Supporting ChainRulesCore would open so many doors. StatsFuns, for example, defines a lot of ChainRulesCore.@scalar_rule rules, but they are pretty much unusable at the moment because ForwardDiff doesn't utilize them.
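As a rough illustration of what using such rules could look like, the sketch below forwards a toy dual number through `ChainRulesCore.frule`, which is the forward-mode rule that `@scalar_rule` generates. `RuleDual`, `pushforward` and `softsquare` are hypothetical names chosen for this example. This is not how ForwardDiff works internally, and it needs ChainRulesCore installed.

```julia
using ChainRulesCore

# Toy dual number with a single partial (assumption: not ForwardDiff.Dual)
struct RuleDual{T<:Real}
    value::T
    partial::T
end

# A function with a scalar rule, as a package like StatsFuns might define one
softsquare(x::AbstractFloat) = x^2 + x
@scalar_rule softsquare(x) 2x + 1

# Generic helper: look up the frule and apply it to the dual's value and partial
function pushforward(f, d::RuleDual)
    result = frule((NoTangent(), d.partial), f, d.value)
    result === nothing && throw(ArgumentError("no frule defined for $f"))
    y, dy = result
    return RuleDual(y, dy)
end

# Make softsquare accept the toy dual by delegating to its frule
softsquare(d::RuleDual) = pushforward(softsquare, d)

out = softsquare(RuleDual(3.0, 1.0))
```

Here `out.value` holds `softsquare(3.0)` and `out.partial` holds the derivative `2 * 3.0 + 1` times the seed. If the dual type lived in a package that ChainRulesCore could see, methods like the last one could in principle be generated by the macro instead of written by hand.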

@mcabbott
Member

mcabbott commented May 7, 2022

One possibly crazy idea would be to move the minimal struct definitions to ChainRulesCore, as @scalar_rule could then define methods.

@longemen3000

Looking at the direct dependents of DualNumbers.jl on GitHub, not all of them still depend on it in their latest version:

@oschulz
Author

oschulz commented May 7, 2022

One possibly crazy idea would be to move the minimal struct definitions to ChainRulesCore, as @scalar_rule could then define methods.

Coming from you, that's almost an endorsement @mcabbott :-)

Maybe that's not that crazy at all? We don't want ChainRulesCore to become noticeably heavier, of course, now that it's making real inroads throughout the ecosystem - but maybe the cost wouldn't be high? We're currently at (Julia v1.8.0-beta3)

julia> @time_imports using ChainRulesCore
      3.1 ms  ┌ Compat
     58.1 ms  ChainRulesCore

If it's just 5 ms more or so, maybe that would be Ok? DualNumbers are quite fundamental after all - or at least will be once there's only one version of them around.

@oschulz
Author

oschulz commented May 7, 2022

One possibly crazy idea would be to move the minimal struct definitions to ChainRulesCore [...]
If it's just 5 ms more or so, maybe that would be Ok?

The package load times do suggest a certain graph of package dependencies (run in sequence in a single session):

julia> @time_imports using ChainRulesCore
      3.2 ms  ┌ Compat
     63.2 ms  ChainRulesCore

julia> @time_imports using Calculus
      3.3 ms  Calculus

julia> @time_imports using NaNMath
      1.6 ms  NaNMath

julia> @time_imports using SpecialFunctions
      0.9 ms  ┌ ChangesOfVariables
      0.3 ms  ┌ OpenLibm_jll
      3.0 ms  ┌ DocStringExtensions
      4.1 ms  ┌ IrrationalConstants
      0.6 ms  ┌ CompilerSupportLibraries_jll
      1.4 ms  ┌ LogExpFunctions
     17.4 ms      ┌ Preferences
     18.0 ms    ┌ JLLWrappers
     21.4 ms  ┌ OpenSpecFun_jll
    131.1 ms  SpecialFunctions

julia> @time_imports using DualNumbers
     13.7 ms  DualNumbers

Especially SpecialFunctions should clearly depend on a dual-numbers package and not the other way round. :-) And ChainRulesCore depending on dual-numbers would seem quite natural as well. And the potential benefits would be huge - we would quickly get a lot more dual-numbers/ForwardDiff-support throughout the ecosystem (especially in the statistics sector - DistributionsAD could just go away completely - but also in many other domains).

@devmotion
Member

DistributionsAD could just go away completely

ForwardDiff is not the main blocker, it's Tracker and ReverseDiff. There are only very few definitions for dual numbers remaining: https://github.com/TuringLang/DistributionsAD.jl/blob/master/src/forwarddiff.jl

@oschulz
Author

oschulz commented May 7, 2022

DistributionsAD could just go away completely
ForwardDiff is not the main blocker, it's Tracker and ReverseDiff.

Ah, sorry, you're right of course. (Full) ChainRulesCore-support in Tracker and ReverseDiff would be so nice ...

Speaking of the statistics domain, there's StatsFuns, though, with several @scalar_rule definitions that could make the respective functions ForwardDiff-compatible.


6 participants