Releases: oneapi-src/oneDNN
v2.5.2
This is a patch release containing the following changes to v2.5.1:
- Fixed performance regression in binary primitive with broadcast (b972174, ff75122)
- Fixed issue with SYCL device properties initialization (cabc5ca, 095f13e)
- Fixed issue in matmul primitive with zero points (3157354); see the sketch after this list
- Fixed segmentation fault in depthwise convolution primitive for shapes with huge spatial size for processors with Intel AVX-512 support (6834764, 1d2addc)
- Fixed issue in forward convolution primitive for processors with Intel AVX2 support (d691137)
- Fixed performance regression on GPUs with SYCL runtime (d8364e5)
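For reference, a minimal sketch of the kind of matmul-with-zero-points configuration the fix above targets, using the oneDNN v2.x attribute API with a runtime source zero point; the shapes and the zero-point value are illustrative assumptions, not taken from the fix itself.

```cpp
#include <oneapi/dnnl/dnnl.hpp>

using namespace dnnl;

int main() {
    engine eng(engine::kind::cpu, 0);
    stream s(eng);

    // Illustrative int8 matmul: [M x K] x [K x N] -> [M x N].
    const memory::dim M = 16, K = 32, N = 64;
    memory::desc src_md({M, K}, memory::data_type::u8, memory::format_tag::ab);
    memory::desc wei_md({K, N}, memory::data_type::s8, memory::format_tag::ab);
    memory::desc dst_md({M, N}, memory::data_type::s32, memory::format_tag::ab);

    // Asymmetric quantization: the source zero point is supplied at run time.
    primitive_attr attr;
    attr.set_zero_points(DNNL_ARG_SRC, /*mask=*/0, {DNNL_RUNTIME_S32_VAL});

    matmul::desc md(src_md, wei_md, dst_md);
    matmul::primitive_desc pd(md, attr, eng);
    matmul prim(pd);

    // Buffers are left uninitialized for brevity.
    memory src_m(src_md, eng), wei_m(wei_md, eng), dst_m(dst_md, eng);
    memory::desc zp_md({1}, memory::data_type::s32, memory::format_tag::x);
    memory zp_m(zp_md, eng);
    *static_cast<int32_t *>(zp_m.get_data_handle()) = 2; // assumed zero point

    prim.execute(s, {{DNNL_ARG_SRC, src_m},
                     {DNNL_ARG_WEIGHTS, wei_m},
                     {DNNL_ARG_DST, dst_m},
                     {DNNL_ARG_ATTR_ZERO_POINTS | DNNL_ARG_SRC, zp_m}});
    s.wait();
    return 0;
}
```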
graph-v0.4
This is a technical preview for oneDNN Graph API based on oneDNN v2.5.
Functionality
- Introduced bf16 inference support.
- Introduced multi-head attention (MHA) fusion supported by oneDNN Graph compiler with optimized code generation (experimental).
- Updated API to comply with oneDNN Graph API specification v0.9.
Known Issues and Limitations
- Some subgraphs might not be recognized as a partition even if they match the general pattern description, due to internal implementation details.
- The weight's opaque layout can be queried only from a compiled partition, which requires tensor shapes to be known at compilation time.
- MHA fusion is not activated on machines without AVX-512 support, as oneDNN Graph compiler generates AVX-512 and newer instructions.
Thanks to the Contributors
This release contains contributions from the project core teams as well as Jiong Gong, Chunyuan Wu, Sanchit Jain, Yiqiang Li, Yunfei Mao, Kiefer Kuah and others.
v2.5.1
This is a patch release containing the following changes to v2.5:
- Improved performance of binary primitive and binary post-op with broadcast over batch and spatial dimensions (6d4b092, c4dc38a, be261ab, 3ec15b6, f1c2f9f); see the sketch after this list
- Fixed undefined behavior for cases when a different number of threads is used at primitive creation and execution (0af92ec, ba2e5a9, 8863e34, 57b1e7a, 72b54de, 9b394dd, 2d4d88a, 4c3e771, 2458105, 6799040, edc40fd)
- Replaced deprecated SYCL APIs with SYCL 2020 alternatives (2c2f4a4, a090db8)
- Fixed documentation formatting issues (812085d, 591a0f4, 7eadf81, 75a2f06, b73c8a7, ca1eb77)
- Updated Microsoft Visual Studio build instructions (add953a, 42b9904)
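For reference, a minimal sketch of a binary primitive with the second input broadcast over the batch and spatial dimensions, which is the configuration the broadcast work above addresses; shapes are illustrative assumptions, and the same broadcast semantics apply to binary post-ops attached via post_ops::append_binary.

```cpp
#include <oneapi/dnnl/dnnl.hpp>

using namespace dnnl;

int main() {
    engine eng(engine::kind::cpu, 0);
    stream s(eng);

    // src0 is a full NCHW tensor; src1 has unit batch and spatial dims,
    // so it is broadcast over N, H, and W.
    memory::desc src0_md({8, 16, 56, 56}, memory::data_type::f32,
            memory::format_tag::nchw);
    memory::desc src1_md({1, 16, 1, 1}, memory::data_type::f32,
            memory::format_tag::nchw);
    memory::desc dst_md({8, 16, 56, 56}, memory::data_type::f32,
            memory::format_tag::nchw);

    // v2.x descriptor API for the standalone binary primitive.
    binary::desc bd(algorithm::binary_add, src0_md, src1_md, dst_md);
    binary::primitive_desc bpd(bd, eng);
    binary prim(bpd);

    // Buffers are left uninitialized for brevity.
    memory src0_m(src0_md, eng), src1_m(src1_md, eng), dst_m(dst_md, eng);
    prim.execute(s, {{DNNL_ARG_SRC_0, src0_m},
                     {DNNL_ARG_SRC_1, src1_m},
                     {DNNL_ARG_DST, dst_m}});
    s.wait();
    return 0;
}
```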
v2.5
Performance Optimizations
- Intel Architecture Processors
- Improved performance for future Intel Xeon Scalable processors (code name Sapphire Rapids). The functionality is now enabled by default and requires Linux kernel 5.16.
- Improved performance of matmul primitive for processors with Intel AVX-512 support.
- Intel Graphics Products
- Introduced initial optimizations for future Xe Architecture graphics (code name Ponte Vecchio).
- Improved pooling and layer normalization primitives performance.
- AArch64-based Processors
- Improved performance of softmax and logsoftmax primitives with Arm Compute Library (ACL).
Functionality
- Introduced support for compiler with SYCL 2020 standard support.
- Introduced support for the ICX/ICPX and DPCPP compiler drivers distributed with Intel oneAPI DPC++ Compiler on Windows.
Usability
- Added compile time option to manage the set of supported instruction set architectures on Intel64/AMD64 processors. See 'DNNL_ENABLE_PRIMITIVE_CPU_ISA' for more details. This feature further reduces the binary footprint. See the build sketch after this list.
- Added environment variables and build options with 'ONEDNN' prefix.
- Introduced support for QNX operating system.
- Introduced support for RISC-V architecture.
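A hedged build sketch for the ISA-subset and 'ONEDNN' prefix items above: 'DNNL_ENABLE_PRIMITIVE_CPU_ISA' is the documented option, while the AVX512 value and the ONEDNN_BUILD_EXAMPLES and ONEDNN_VERBOSE names are assumptions modeled on their DNNL_-prefixed counterparts; consult the build documentation for the accepted values.

```sh
# Limit CPU primitive implementations to a given ISA subset to shrink the
# binary, and use an ONEDNN_-prefixed build option (assumed name).
cmake .. -DDNNL_ENABLE_PRIMITIVE_CPU_ISA=AVX512 -DONEDNN_BUILD_EXAMPLES=OFF
cmake --build . --parallel

# ONEDNN_-prefixed environment variables work alongside the legacy DNNL_ ones,
# e.g. verbose execution tracing (assumed name):
ONEDNN_VERBOSE=1 ./my_app
```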
Breaking Changes
- The Intel MKL-DNN compatibility API is removed. See the 'Transition from Intel MKL-DNN to oneDNN' page for instructions on moving to the new API.
- Updated minimal supported ACL version to 21.11 (was 21.08).
Deprecated Functionality
- Support for Intel Xeon Phi processors is deprecated and will be removed in the next release.
- Support for SYCL 1.2.1 (aka SYCL 2017 standard) is deprecated and will be removed in future releases.
Thanks to the Contributors
This release contains contributions from the project core team as well as Aaron Franke @aaronfranke, Arthur Mitrano @aaraujom, Crefeda Rodrigues @cfRod, Diana Bite @diaena, Joel Dippold @jedippold, Joe Konno @thac0, Jonathan Deakin @jondea, Luke Ireland @LukeIreland1, Mark Ryan @markdryan, Mesut Meterelliyoz @mmeterel, Michel Migdal @Michoumichmich, Nathan John Sircombe @nSircombe, Pablo Romero @pablorcum, Peter Caday @petercad, Sergey Razumovskiy @srazumov, and Tsao Zhong @CaoZhongZ. We would also like to thank everyone who asked questions and reported issues.
v2.5-rc
This is a release candidate for oneDNN v2.5. Please provide feedback and submit defect reports via GitHub issues.
Performance Optimizations
- Intel Architecture Processors
- Improved performance for future Intel Xeon Scalable processors (code name Sapphire Rapids). The functionality is disabled by default and should be enabled via CPU dispatcher control; see the sketch after this list.
- Improved performance of matmul primitive for processors with Intel AVX-512 support.
- Intel Graphics Products
- Introduced initial optimizations for future Xe Architecture graphics (code name Ponte Vecchio).
- Improved pooling and layer normalization primitives performance.
- AArch64-based Processors
- Improved performance of softmax primitive with Arm Compute Library (ACL).
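A minimal sketch of the CPU dispatcher control mentioned above, assuming a build with dispatcher control enabled (the default); the same effect can be achieved with the DNNL_MAX_CPU_ISA / ONEDNN_MAX_CPU_ISA environment variables, and avx512_core_amx is the assumed target ISA for Sapphire Rapids.

```cpp
#include <oneapi/dnnl/dnnl.hpp>
#include <iostream>

int main() {
    // Raise the ISA ceiling used by the CPU dispatcher; this must run before
    // the first primitive is created in the process.
    dnnl::status st = dnnl::set_max_cpu_isa(dnnl::cpu_isa::avx512_core_amx);
    if (st != dnnl::status::success)
        std::cout << "set_max_cpu_isa was not applied\n";

    // Query the ISA the dispatcher actually settled on.
    dnnl::cpu_isa effective = dnnl::get_effective_cpu_isa();
    (void)effective;
    return 0;
}
```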
Functionality
- Introduced support for compiler with SYCL 2020 standard support.
- Introduced support for the ICX/ICPX and DPCPP compiler drivers available in the Intel oneAPI DPC++ Compiler.
Usability
- Added compile time option to manage the set of supported instruction set architectures on Intel64/AMD64 processors. See 'DNNL_ENABLE_PRIMITIVE_CPU_ISA' for more details. This feature further reduces the binary footprint.
- Added environment variables and build options with 'ONEDNN' prefix.
- Introduced support for QNX operating system.
- Introduced support for RISC-V architecture.
Breaking Changes
- The Intel MKL-DNN compatibility API is removed. See the 'Transition from Intel MKL-DNN to oneDNN' page for instructions on moving to the new API.
Deprecated Functionality
- Support for Intel Xeon Phi processors is deprecated and will be removed in the next release.
- Support for SYCL 1.2.1 (aka SYCL 2017 standard) is deprecated and will be removed in future releases.
Thanks to the Contributors
This release contains contributions from the project core team as well as Aaron Franke @aaronfranke, Arthur Mitrano @aaraujom, Crefeda Rodrigues @cfRod, Diana Bite @diaena, Joel Dippold @jedippold, Joe Konno @thac0, Jonathan Deakin @jondea, Luke Ireland @LukeIreland1, Mark Ryan @markdryan, Mesut Meterelliyoz @mmeterel, Michel Migdal @Michoumichmich, Nathan John Sircombe @nSircombe, Pablo Romero @pablorcum, Peter Caday @petercad, Sergey Razumovskiy @srazumov, and Tsao Zhong @CaoZhongZ. We would also like to thank everyone who asked questions and reported issues.
v2.4.4
This is a patch release containing the following changes to v2.4.3:
- Fixed incorrect results for reorder with zero-points on CPUs (ee63629)
- Fixed an issue with reorder with zero-points not respecting rounding mode on processors without Intel DL Boost support (a165c4a)
- Fixed correctness issue in bfloat16 inner product weight gradient on processors with Intel DL Boost support (b782f19)
- Improved bfloat16 inner product weights gradient performance on processors with Intel AMX support (ebf9f81)
- Fixed potential undefined access in convolution, inner product, matmul, and RNN primitives on processors with Intel AMX support (dcd98ad)
graph-v0.3
This is a technical preview for oneDNN Graph API based on oneDNN v2.4.
Functionality
- Introduced int8 inference support.
- Updated API to comply with oneDNN Graph API specification v0.8.
Known Issues and Limitations
- Some subgraphs might not be recognized as a partition even if they match the general pattern description, due to internal implementation details.
- The weight's opaque layout can be queried only from a compiled partition, which requires tensor shapes to be known at compilation time.
Thanks to the Contributors
This release contains contributions from the project core teams as well as Tian Feng, Zhang Guoming, Jiong Gong, Chunyuan Wu, Nishant Patel, Yiqiang Li, Yang Sheng, Yunfei Mao, Kiefer Kuah and others.
v2.4.3
This is a patch release containing the following changes to v2.4.2:
- Fixed an issue with reorder primitive producing NaN results for some cases on future Intel Xeon Scalable processors (code name Sapphire Rapids) (ac20af3)
- Fixed performance regression for inner product primitive for future Intel Xeon Scalable processors (code name Sapphire Rapids) (ac6a24d, 2cf3526, d02dddf, bcdc175)
- Fixed segmentation fault in int8 deconvolution primitive with asymmetric quantization for processors with Intel AVX-512 support (6ba086a)
v2.4.2
This is a patch release containing the following changes to v2.4.1:
- Fixed performance regression in convolution primitive for shapes with 3D spatial dimensions on future Intel Xeon Scalable processors (code name Sapphire Rapids) (aca0af1)
- Fixed segmentation fault in bfloat16 forward and backward inner product primitive for future Intel Xeon Scalable processors (code name Sapphire Rapids) (ae8cf18, 3de9549)
- Fixed reorder primitive with compensation (6ba086a)
- Fixed issue in scratchpad size calculation for BRGEMM-based convolutions (dd9eceb)
v2.3.3
This is a patch release containing the following changes to v2.3.2:
- Reverted check for memory descriptor stride validity for unit dimensions (861c625)
- Fixed build errors on Fuchsia OS (753b531)
- Fixed implicit conversion in GPU GEMM implementation (30dee23)
- Addressed issues detected by clang TSan (888ab52, 7555fd8, 4ffdb3c, 57b8ffd, b52b2c0, 84b200f, 67deb8e)
- Fixed undefined access issues detected by clang UBSan (5bab17c, 3494b1e, 6885360, 8cbe861, b13a215, 859622d, 5813c99)
- Fixed memory leak in CPU GEMM implementation (45e3039, fd6d14c)
- Fixed int8 convolution correctness issues on Intel Integrated Graphics (b7d40a0, 72e4856)
- Fixed access violation issue in GEMM implementation on Windows (aac6b23)