oneDPL 2022.3.0 release
ValentinaKats released this on 22 Nov 12:26
New Features
- Added an experimental feature to dynamically select an execution context, e.g., a SYCL queue.
  The feature provides selection functions such as `select`, `submit` and `submit_and_wait`,
  and several selection policies: `fixed_resource_policy`, `round_robin_policy`,
  `dynamic_load_policy`, and `auto_tune_policy`.
- `unseq` and `par_unseq` policies now enable vectorization also for Intel® oneAPI DPC++/C++ Compiler.
- Added support for passing zip iterators as segment value data in `reduce_by_segment`,
  `exclusive_scan_by_segment`, and `inclusive_scan_by_segment`.
- Improved performance of the `merge`, `sort`, `stable_sort`, `sort_by_key`,
  `reduce`, `min_element`, `max_element`, `minmax_element`, `is_partitioned`, and
  `lexicographical_compare` algorithms with DPC++ execution policies.
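To make the idea behind the dynamic selection feature listed above more concrete, here is a toy model of round-robin selection in plain standard C++. It is only a sketch of the concept: `toy_round_robin` and `toy_submit_and_wait` are hypothetical names invented for this illustration, the resources are plain strings rather than SYCL queues, and nothing here reproduces the actual oneDPL classes or signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a selection policy; NOT the oneDPL API.
// It hands out its resources in rotation, one per call to select().
template <typename Resource>
class toy_round_robin {
public:
    explicit toy_round_robin(std::vector<Resource> resources)
        : resources_(std::move(resources)) {}

    // Return the next resource in the rotation.
    Resource& select() {
        assert(!resources_.empty());  // the rotation needs at least one resource
        Resource& chosen = resources_[next_];
        next_ = (next_ + 1) % resources_.size();
        return chosen;
    }

private:
    std::vector<Resource> resources_;
    std::size_t next_ = 0;
};

// Hypothetical helper: pick a resource, run the user's function on it, and
// return its result. The call is synchronous, so it returns only after the
// work has finished.
template <typename Policy, typename Function>
auto toy_submit_and_wait(Policy& policy, Function&& f) {
    return std::forward<Function>(f)(policy.select());
}
```

In this model, a policy built from two resources alternates between them on successive `toy_submit_and_wait` calls. The other policies named above would differ only in how the next resource is chosen.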
Fixed Issues
- Fixed the `reduce_async` function to not ignore the provided binary operation.
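One way to check that a reduction honours its binary operation is to compare its result with a serial host-side reference. The sketch below uses only the C++ standard library. `reference_reduce` is a hypothetical helper name for this illustration and is not part of oneDPL.

```cpp
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Serial host-side reference (hypothetical helper): fold `values` with `op`,
// starting from `init`, strictly left to right.
template <typename T, typename BinaryOp>
T reference_reduce(const std::vector<T>& values, T init, BinaryOp op) {
    return std::accumulate(values.begin(), values.end(), init, op);
}
```

For the values 1, 2, 3, 4 and an initial value of 1, `std::multiplies` gives 24 while `std::plus` gives 11. A device result of 11 for a call that passed `std::multiplies` would therefore suggest that the operation was ignored.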
New Known Issues and Limitations
- When compiled with the `-fsycl-pstl-offload` option of Intel® oneAPI DPC++/C++ Compiler and with
  `libstdc++` version 8 or `libc++`, `oneapi::dpl::execution::par_unseq` offloads
  standard parallel algorithms to the SYCL device similarly to `std::execution::par_unseq`
  in accordance with the `-fsycl-pstl-offload` option value.
- When using the dpl modulefile to initialize the user's environment and compiling with the
  `-fsycl-pstl-offload` option of Intel® oneAPI DPC++/C++ Compiler, a linking issue or program crash
  may be encountered because the directory containing libpstloffload.so is not included in the search path.
  Use env/vars.sh to configure the working environment to avoid the issue.
- Compilation issues may be encountered when passing zip iterators to
  `exclusive_scan_by_segment` on Windows.
- Incorrect results may be produced by `set_intersection` with a DPC++ execution policy,
  where elements are copied from the second input range rather than the first input range.
- For `transform_exclusive_scan` and `exclusive_scan` to run in-place (that is, with the same data
  used for both input and destination) and with an execution policy of `unseq` or `par_unseq`,
  it is required that the provided input and destination iterators are equality comparable.
  Furthermore, the equality comparison of the input and destination iterator must evaluate to true.
  If these conditions are not met, the result of these algorithm calls is undefined.
- `sort`, `stable_sort`, `sort_by_key`, `partial_sort_copy` algorithms may work incorrectly or cause
  a segmentation fault when used with a DPC++ execution policy for CPU device, and built
  on Linux with Intel® oneAPI DPC++/C++ Compiler and -O0 -g compiler options.
  To avoid the issue, pass the `-fsycl-device-code-split=per_kernel` option to the compiler.
- Incorrect results may be produced by `exclusive_scan`, `inclusive_scan`, `transform_exclusive_scan`,
  `transform_inclusive_scan`, `exclusive_scan_by_segment`, `inclusive_scan_by_segment`, `reduce_by_segment`
  with `unseq` or `par_unseq` policy when compiled by Intel® oneAPI DPC++/C++ Compiler
  with `-fiopenmp`, `-fiopenmp-simd`, `-qopenmp`, `-qopenmp-simd` options on Linux.
  To avoid the issue, pass the `-fopenmp` or `-fopenmp-simd` option instead.
- Incorrect results may be produced by `reduce` and `transform_reduce` with 64-bit types and `std::multiplies`,
  `sycl::multiplies` operations when compiled by Intel® C++ Compiler 2021.3 and newer and executed on GPU devices.
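Two of the items above rest on semantics that the serial standard-library algorithms define precisely, so those algorithms can serve as a reference. The sketch below assumes a C++17 compiler. It shows which range `set_intersection` copies from, and the shape of an in-place `exclusive_scan` call whose input and destination iterators compare equal. The helper names and the `tagged` alias are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <numeric>
#include <utility>
#include <vector>

// A key plus a tag recording which input range the element came from.
using tagged = std::pair<int, char>;

// Serial reference for set_intersection. Elements are compared by key only,
// so the tag of each output element shows which range it was copied from.
// The C++ standard specifies that elements are copied from the first range.
std::vector<tagged> reference_intersection(const std::vector<tagged>& first,
                                           const std::vector<tagged>& second) {
    auto by_key = [](const tagged& a, const tagged& b) { return a.first < b.first; };
    // Both inputs must already be sorted with respect to the comparator.
    assert(std::is_sorted(first.begin(), first.end(), by_key));
    assert(std::is_sorted(second.begin(), second.end(), by_key));
    std::vector<tagged> out;
    std::set_intersection(first.begin(), first.end(), second.begin(), second.end(),
                          std::back_inserter(out), by_key);
    return out;
}

// Serial in-place exclusive scan. The destination iterator is the same
// iterator as the input begin, so the two are equality comparable and
// compare equal, as the in-place note above requires.
void inplace_exclusive_scan(std::vector<int>& data, int init) {
    std::exclusive_scan(data.begin(), data.end(), data.begin(), init);
}
```

Given first-range elements tagged `'a'` and second-range elements tagged `'b'`, every element returned by `reference_intersection` carries the tag `'a'`. A device result carrying `'b'` tags would match the `set_intersection` issue described above.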