Version 0.3.4

@marcvanzee released this 18 May 11:27
eba1dba

Possibly breaking changes:

  • When calling init, the 'intermediates' collection is no longer mutable.
    Therefore, intermediates will no longer be returned from initialization by
    default (see the sketch after this list).
  • Don't update batch statistics during initialization.
  • When not using any non-determinism (e.g., dropout), it is no longer
    necessary to specify the deterministic argument in
    MultiHeadDotProductAttention (also shown in the sketch below).
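
A minimal sketch of both changes, using a hypothetical toy module (the MLP
class, shapes, and values below are illustrative, not part of the release):

```python
import jax
import jax.numpy as jnp
import flax.linen as nn

class MLP(nn.Module):  # hypothetical module, just for illustration
    @nn.compact
    def __call__(self, x):
        h = nn.Dense(4)(x)
        self.sow('intermediates', 'hidden', h)  # stored only while 'intermediates' is mutable
        return nn.Dense(1)(h)

model = MLP()
x = jnp.ones((1, 4))

# 'intermediates' is no longer mutable during init, so the returned
# variables contain only 'params':
variables = model.init(jax.random.PRNGKey(0), x)

# To collect intermediates, request them explicitly from apply():
out, state = model.apply(variables, x, mutable=['intermediates'])

# With the default dropout_rate of 0., there is no longer a need to pass
# deterministic to MultiHeadDotProductAttention:
attn = nn.MultiHeadDotProductAttention(num_heads=2)
attn_out, attn_vars = attn.init_with_output(jax.random.PRNGKey(1), x, x)
```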

Other changes:

  • Rewrote various examples to use Optax instead of Flax optimizers (e.g.,
    ImageNet, SST-2).
  • Added an NLP text classification example (on the SST-2 dataset) to
    examples/sst2 that uses a bidirectional LSTM (BiLSTM) to encode the input
    text.
  • Added flax.training.train_state to simplify using Optax optimizers (see
    the first sketch after this list).
  • The mutable argument is now available on Module.init and
    Module.init_with_output (second sketch after this list).
  • Bug fix: Correctly handle non-default parameters of Linen Modules with nested inheritance.
  • Expose dot_product_attention_weights, allowing access to attention weights
    (third sketch after this list).
  • BatchNorm instances will behave correctly during init when called multiple times.
  • Added a more extensive "how to contribute" guide in contributing.md.
  • Add proper cache behavior for lift.jit, fixing cache misses.
  • Fix bug in Embed layer: make sure it behaves correctly when the embedding
    is a NumPy array.
  • Fix linen.Module for deep inheritance chains.
  • Fix bug in DenseGeneral: correctly expand bias to account for batch and
    non-contracting dimensions.
  • Allow Flax lifted transforms to work on partially applied Modules.
  • Make MultiOptimizer use apply_gradient instead of apply_param_gradient.
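
A minimal sketch of the new flax.training.train_state helper with an Optax
optimizer (the model, data, and learning rate here are hypothetical, chosen
only to exercise the helper):

```python
import jax
import jax.numpy as jnp
import optax
import flax.linen as nn
from flax.training import train_state

# Hypothetical model and data:
model = nn.Dense(1)
x = jnp.ones((4, 3))
y = jnp.zeros((4, 1))

params = model.init(jax.random.PRNGKey(0), x)['params']
state = train_state.TrainState.create(
    apply_fn=model.apply,
    params=params,
    tx=optax.sgd(learning_rate=0.1),
)

def loss_fn(params):
    preds = state.apply_fn({'params': params}, x)
    return jnp.mean((preds - y) ** 2)

grads = jax.grad(loss_fn)(state.params)
state = state.apply_gradients(grads=grads)  # one optimizer step
```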
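
A sketch of the new mutable argument on init, again with a hypothetical
module; passing mutable explicitly makes init return sown intermediates,
recovering the pre-0.3.4 default described under the breaking changes:

```python
import jax
import jax.numpy as jnp
import flax.linen as nn

class Probe(nn.Module):  # hypothetical module that records an intermediate
    @nn.compact
    def __call__(self, x):
        h = nn.Dense(4)(x)
        self.sow('intermediates', 'hidden', h)
        return h

# With mutable passed explicitly, init returns the sown values as well:
variables = Probe().init(jax.random.PRNGKey(0), jnp.ones((1, 4)),
                         mutable=['params', 'intermediates'])
```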
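
A sketch of the newly exposed dot_product_attention_weights function
(the import path and the shapes below are assumptions for illustration):

```python
import jax.numpy as jnp
from flax.linen.attention import dot_product_attention_weights

# query and key shaped [batch, length, num_heads, head_dim] (hypothetical sizes):
query = jnp.ones((1, 5, 2, 4))
key = jnp.ones((1, 5, 2, 4))

# Returns the softmax-normalized attention weights,
# shaped [batch, num_heads, q_length, kv_length]:
weights = dot_product_attention_weights(query, key)
```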