This example shows how the lookup table capability of the AIE can be used to approximate well-known functions such as the exponential function e^x, operating on buffers of 1024 bfloat16 numbers. Each core contains a lookup table approximation of the e^x function.
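To make the lookup table idea concrete, here is a minimal host-side sketch in Python. It is not the AIE runtime's implementation: the table size (256), the decomposition e^x = 2^k * 2^f, and the names to_bf16, build_table and lut_exp are all assumptions chosen for illustration. bfloat16 is emulated by rounding a float32 to its upper 16 bits.

```python
import math
import struct

TABLE_SIZE = 256  # assumed table size, not taken from the AIE runtime


def to_bf16(x: float) -> float:
    """Round x to the nearest bfloat16 value (round-to-nearest-even)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # bfloat16 keeps the upper 16 bits of an IEEE-754 float32.
    bias = 0x7FFF + ((bits >> 16) & 1)
    bits = (((bits + bias) & 0xFFFFFFFF) >> 16) << 16
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def build_table(size: int = TABLE_SIZE) -> list:
    """Precompute 2**(i/size) for i in [0, size)."""
    return [2.0 ** (i / size) for i in range(size)]


TABLE = build_table()


def lut_exp(x: float) -> float:
    """Approximate e**x as 2**k * TABLE[idx], then round to bfloat16."""
    t = x * math.log2(math.e)      # e**x == 2**t
    k = math.floor(t)              # integer part becomes an exponent shift
    f = t - k                      # fractional part in [0, 1]
    idx = min(int(f * TABLE_SIZE), TABLE_SIZE - 1)  # clamp if f rounds to 1.0
    return to_bf16(math.ldexp(TABLE[idx], k))
```

With a 256-entry table, truncating the fractional part costs a relative error of about 0.3%, which is comparable to the rounding error of bfloat16 itself, so a larger table would add little for this data type.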
- aie2.py: A Python script that defines the AIE array structural design using MLIR-AIE operations. This generates MLIR that is then compiled using aiecc.py to produce design binaries (i.e., XCLBIN and inst.txt for the NPU in Ryzen™ AI).
- bf16_exp.cc: A C++ implementation of vectorized table lookup operations for AIE cores. The lookup operation getExpBf16 operates on vectors of size 16, loading the vectorized accumulator registers with the lookup table results. It is then necessary to copy the accumulator register to a regular vector register before storing it back into memory. The source can be found here.
- test.cpp: A C++ testbench for the design example. It loads the compiled XCLBIN file, configures the AIE module, provides input data, and executes the AIE design on the NPU. After execution, the program verifies the results.
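The data flow described for the kernel and the testbench can be sketched on the host as follows. This is an illustrative Python model, not the code in bf16_exp.cc or test.cpp: exp_lookup_vec is a hypothetical stand-in for getExpBf16 that simply calls math.exp, and the verification tolerance of 2^-7 is an assumption rather than the value the testbench uses.

```python
import math
import struct

VEC = 16    # lanes per lookup, as stated for getExpBf16
N = 1024    # buffer length used in this example


def to_bf16(x: float) -> float:
    """Round x to the nearest bfloat16 value (round-to-nearest-even)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bias = 0x7FFF + ((bits >> 16) & 1)
    bits = (((bits + bias) & 0xFFFFFFFF) >> 16) << 16
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def exp_lookup_vec(lanes: list) -> list:
    """Hypothetical stand-in for getExpBf16: 16 inputs in, 16
    wider-precision 'accumulator' lanes out."""
    assert len(lanes) == VEC
    return [math.exp(v) for v in lanes]


def exp_kernel(a_in: list) -> list:
    """Process the buffer 16 elements at a time."""
    c_out = [0.0] * len(a_in)
    for base in range(0, len(a_in), VEC):
        vec = a_in[base:base + VEC]        # load one 16-lane vector
        acc = exp_lookup_vec(vec)          # results land in the accumulator
        out = [to_bf16(v) for v in acc]    # accumulator -> bfloat16 vector
        c_out[base:base + VEC] = out       # store back to memory
    return c_out


def verify(inputs: list, outputs: list, rel_tol: float = 2.0 ** -7) -> int:
    """Count outputs whose relative error against math.exp exceeds rel_tol."""
    errors = 0
    for x, y in zip(inputs, outputs):
        ref = math.exp(x)
        if abs(y - ref) > rel_tol * ref:
            errors += 1
    return errors


inputs = [to_bf16(i / N) for i in range(N)]   # test data in [0, 1)
outputs = exp_kernel(inputs)
mismatches = verify(inputs, outputs)
```

The explicit conversion step mirrors the point made above: the lookup produces accumulator values, which must be converted to a regular bfloat16 vector before they can be stored.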
The design also uses a single file from the AIE runtime to initialize the lookup table contents to approximate the e^x function.
To compile the design:
make
To compile the C++ testbench:
make testExp.exe
To run the design:
make run