xclUnmapBO()
was added to matchxclMapBO()
. This new API should be called when unmapping addresses returned byxclMapB()
. On Linux the API ends up calling POSIXmunmap()
but on Windows the implementation is different.
xclRead()
andxclWrite()
have been marked as deprecated in this release and will be removed in a future release. For direct register access please use replacement APIsxclRegRead()
andxclRegWrite()
which are more secure and multi-process aware.- Edge platforms can now use DFX also known as Partial Reconfiguration.
- Support for U50 board has been added to XRT.
- Support for signing xclbins using xclbinutil and validating xclbin signature in xclbin driver has been added to XRT. Please refer to XRT Security documentation https://xilinx.github.io/XRT/2019.2/html/security.html for more details.
- Edge platforms based on MPSoC now support M2M feature via Zynqmp built-in DMA engine. M2M for both PCIe and edge platforms can be performed using
xclCopyBO()
XRT API orclEnqueueCopyBuffers()
OCL API. Note that the same APIs can also be used to copy buffers between two devices using PCIe peer-to-peer transfer. - For edge platforms XRT now supports ACC (adapter execution model).
- XRT documentation has been reorganized and significantly updated.
- XRT now natively supports fully virtualized environments where management physical function (PF0) is completely hidden in host and only user physical function (PF1) is exported to the guest. End-user applications based on libxrt_core and xbutil command line utility do not need directly interact with xclmgmt driver. Communication between xocl driver and xclmgmt driver is done over hardware mailbox and MPD/MSD framework. For more information refer to MPD/MSD and Mailbox sections in XRT documentation.
- Management Physical Function (PF0) should now be managed using
xbmgmt
utility which is geared towards system adminstrators.xbutil
continues to be end-user facing utility. - Support has been added for device memory only buffer with no backing shadow buffer in host on PCIe platforms. To allocate such buffers use
XCL_BO_FLAGS_DEV_ONLY
in flags field of xclAllocBO() orCL_MEM_HOST_NO_ACCESS
in flags field of OCL API. - XRT now has integrated support for Linux hwmon. Run Linux sensors utility to see all the sensor values exported by Alveo/XRT.
- XRT now has production support for edge platforms. The following non DFX platforms edge platforms are supported: zcu102_base, zcu104_base, zc702, zc706. In addition zcu102_base_dfx platform has DFX support.
- Emulation and HW profiling support has been enabled for all the above mentioned edge platforms. Zynq MPSoC platforms: zcu102_base, zcu104_base and zcu102_base_dfx also has emulation profiling enabled.
- Improved handling of PCIe reset via
xbutil reset
which resolves system crash observed on some servers. - Resource management has been moved out of XMA library.
- Only signed xclbins can be loaded on systems running in UEFI secure boot mode. You can use DKMS key used to sign XRT drivers to sign xclbins as well. As root please use the following command to sign xclbin with DKMS UEFI key--
xclbinutil --private-key /var/lib/shim-signed/mok/MOK.priv --certificate /var/lib/shim-signed/mok/MOK.der --input a.xclbin --output signed.xclbin
- On U280 Platform, downloading XCLBIN is going to reset P2P BAR size back to 256M internally. XRT workaround this issue by reading BAR size register and writing back the same value. This sets the P2P BAR size back to the value before downloading XCLBIN.
- On edge platforms intermittent hang is observed when downloading different xclbins multiple times while CU interrupt is enabled.
- Dynamic clock scaling is not enabled for edge platforms.
- On PPC64LE
xbutil reset
uses PCIe fundamental reset effectively reloading the platform from PROM. Note on x86_64xbutil reset
continues to use PCIe warm reset which just resets the shell and the dynamic region without reloading the platform from PROM.
- Production support for QDMA (Xilinx PCIe Streaming DMA) engine has been added to XRT. Applications can use Xilinx streaming extension APIs defined in cl_ext_xilinx.h to work with streams on QDMA platforms like xilinx_u200_qdma_201910_1. Look for examples on https://github.com/Xilinx/SDAccel_Examples.
- PCIe peer-to-peer functionality is fully supported. Please consult https://xilinx.github.io/XRT/2019.1/html/p2p.html for details on how to setup PCIe peer-to-peer BAR and host system requirements. P2P buffers are created by passing
XCL_MEM_EXT_P2P_BUFFER
flag toclCreateBuffer()
API. Peer PCIe devices like NVMe can directly DMA from/to P2P buffers. P2P transfers between two Alveo™ boards can be triggered through standardclEnqueueCopyBuffers()
API. - Support has been added for AP_CTRL_CHAIN (data-flow) and AP_CTRL_NONE (streaming) execution models. XRT scheduler (including hardware accelerated ERT) have been updated to handle the new execution models. xclbin tools have been updated to annotate xclbin IP_LAYOUT entries with suitable tags to pass the execution model information to XRT.
- Memory to memory (M2M) hardware accelerated transfers from one DDR bank to another within a device can be effected on platforms with M2M IP via standard
clEnqueueCopyBuffer()
- XRT now looks for
xrt.ini
configuration file and if not found looks for legacy sdaccel.ini configuration file. If not found in usual search directories the files are now also searched in working directory. - Embedded platforms based on Zynq MPSoC US+™ are fully supported. For reference designs please explore reVISION™ stack from Xilinx. Embedded platforms now use interrupts for CU completion notification, significantly reducing ARM CPU usage.
- Profiling support has been extended to embedded platforms with timeline trace and profile summary.
- XRT now makes no assumption about CU base addresses on embedded platforms. CU base addresses can be completely floating and are discovered from
IP_LAYOUT
section of xclbin. - XMA (Xilinx Media Accelerator) is now fully integrated into XRT by using the common config reader and messaging framework (also shared by OCL) provided by XRT core.
- XMA uses XRT core framework for scheduling tasks on encoder/decoder/scaler. New XMA APIs provide a method to prepare register write command packet, send the write command to XRT and then wait for completion of one or more command submissions. Please look at https://github.com/Xilinx/xma-samples for recommended way to write XMA plugins and design video IP control interface.
- Multiple process mode is on by default in this release. This means multiple user processes can simultaneously use the same CU on a board. XRT does time division multiplexing. Note there is no support for pre-emption. In multi-process run only the first process gets profiling support.
- OCL can perform automatic binding of cl_mem to DDR bank by using several heuristics like kernel argument index and kernel instance information. The API
clCreateKernel
is enhanced to accept annotated CU name(s) to fetch asymmetrical compute units (If all the CUs of a kernel have exact same port maps or port connections they are symmetrical compute units, otherwise CUs are asymmetrical) and streaming compute units. - XRT will give error if it cannot identify the buffer location (in earlier releases it used to assume a default location). Remedies: a) Check kernel XCLBIN to make sure kernel argument corresponding to the buffer is mapped to device memory properly b) Use
clSetKernelArg
before any enqueue operation on buffer - Host applications directly linking with libxilinxopencl.so must use
-Wl,-rpath-link,$(XILINX_XRT)/lib
in the linker line. Host applications linking with ICD loader, libOpenCL.so do not need to change. xbutil top
now reports live CU usage metric.xclbincat
andxclbinsplit
are deprecated byxclbinutil
. These deprecated tools are currently scheduled to be obsoleted in the next release.- Profiling subsystem has been enhanced to show dataflow, PCIe peer to peer transfers, M2M transfers and kernel to kernel streaming information.
- XRT has switched to new header file
xrt.h
in place ofxclhal2.h
. The latter is still around for backwards compatibility but hash includes xrt.h for all definitions. A new filexrt-next.h
has been added for experimental features.
- xbutil can now generate output in JSON format for easy parsing by other tools. Use
xbutil dump
to generate JSON output on stdout. - Initial support for PCIe peer-to-peer transactions has been added. Please consult https://xilinx.github.io/XRT/2018.3/html/p2p.html for details.
- 64-bit BARs in Alveo shells are natively supported.
- Initial implementation of XRT logging API, xclLogMsg() for use by XRT clients.
- Initial support for Alveo shell KDMA feature in OpenCL.
- Yocto recipes to build XRT for embedded platforms. Please consult https://xilinx.github.io/XRT/2018.3/html/yocto.html for details.
xbutil flash -a
PROM corruption issue with multiple Alveo boards.- XRT scheduling bug with multiple boards on AWS F1 when scheduler was serializing board access.
- xocl kernel driver bugs in handling multiple processes accessing the same device.
- PPC64LE build failure.
- Several core QDMA driver fixes.
- xocl scheduler thread now yields correctly when running in polling mode.
- Several Coverity/Fortify code scan fixes.
- XMA plugin API xma_plg_register_write has been marked for deprecation. It will be removed in a future release.
- XMA plugin API xma_plg_register_read has been marked for deprecation. It will be removed in a future release.