Add RISC-V port #503

Open: wants to merge 5 commits into base: master

@tingsi5 tingsi5 commented Oct 31, 2024

This port is based on pattonkan/sse2rvv@ca0e0b4, which currently supports the SSE2 and SSE4.2 intrinsics that embree uses. It was verified with the Ubuntu 24.04.1 RISC-V preinstalled server image from https://cdimage.ubuntu.com/releases/24.04/release/ running under QEMU on an Ubuntu 24.04 host.

Note that this is only an initial RVV port and is limited to VLEN=128. We expect further performance improvements from porting the other extensions.
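
Because of the VLEN=128 restriction, a caller may want to check the vector length before relying on the port. The following is a minimal sketch, not code from this PR: `rvv_vlen_supported` and `rvv_read_vlenb` are hypothetical names, and the CSR read is only compiled when the toolchain targets the vector extension. It relies on the `vlenb` CSR holding VLEN/8, the vector register length in bytes.

```c
#include <stddef.h>

/* Hypothetical helper (not part of this PR): the port as described
 * supports only VLEN=128, i.e. a vector register of 16 bytes. */
static inline int rvv_vlen_supported(size_t vlenb)
{
    return vlenb * 8 == 128;
}

#if defined(__riscv_vector)
/* Read the vlenb CSR, which holds VLEN/8 (register length in bytes).
 * Only compiled when the compiler targets the vector extension. */
static inline size_t rvv_read_vlenb(void)
{
    size_t vlenb;
    __asm__ volatile("csrr %0, vlenb" : "=r"(vlenb));
    return vlenb;
}
#endif
```

On a RISC-V build, `rvv_vlen_supported(rvv_read_vlenb())` would then be expected to hold under the `vlen=128` QEMU configuration used in the test steps below.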

Test steps:

  1. Follow https://wiki.ubuntu.com/RISC-V/QEMU to install the required packages, then run QEMU with the following command, which enables the vector extension:

    # qemu-system-riscv64 \
    -machine virt -nographic -m 4096 -smp 4 \
    -bios /usr/lib/riscv64-linux-gnu/opensbi/generic/fw_jump.bin \
    -kernel /usr/lib/u-boot/qemu-riscv64_smode/uboot.elf \
    -device virtio-net-device,netdev=eth0 -netdev user,id=eth0 \
    -device virtio-rng-pci \
    -drive file=ubuntu-24.04.1-preinstalled-server-riscv64.img,format=raw,if=virtio \
    -cpu rv64,v=true,vlen=128,elen=64,vext_spec=v1.0,zba=true,zbb=true,zbs=true
    
  2. Use clang-18 to build embree in the QEMU guest:
    sudo apt update; sudo apt install cmake libtbb-dev libglfw3-dev clang-18
    cd embree; mkdir build; cd build; cmake -DCMAKE_CXX_COMPILER=clang++-18 -DCMAKE_C_COMPILER=clang-18 ..; make -j4; sudo make install

  3. Extract the archive https://github.com/RenderKit/embree/releases/download/v4.3.3/embree-4.3.3-testing.tar.gz to /usr/local and run the tests:
    cd /usr/local/testing; sudo cmake -B build -DCMAKE_CXX_COMPILER=clang++-18 -DCMAKE_C_COMPILER=clang-18 -DEMBREE_TESTING_INTENSITY=1; sudo cmake --build build --target test
    The results are:

    99% tests passed, 2 tests failed out of 269
    
    Total Test time (real) = 13564.67 sec
    
    The following tests FAILED:
            246 - embree_verify (Timeout)
            258 - bvh_builder (Timeout)
    

    Running embree_bvh_builder on its own passes. Running the full embree_verify hits a segmentation fault in SSE4.2.regression_static_memory_monitor, raised inside libtbb because a register holds an invalid value. Selecting only that test with a regex passes, however. This does not seem to be related to the port and can be listed as a known issue.

    Thread 16 "embree_verify" received signal SIGSEGV, Segmentation fault.
    [Switching to Thread 0x7fffee600140 (LWP 24732)]
    0x00007ffff77d2a50 in tbb::detail::r1::cancel_group_execution(tbb::detail::d1::task_group_context&) () from /lib/riscv64-linux-gnu/libtbb.so.12
    (gdb) bt
    #0  0x00007ffff77d2a50 in tbb::detail::r1::cancel_group_execution(tbb::detail::d1::task_group_context&) () from /lib/riscv64-linux-gnu/libtbb.so.12
    #1  0x00007ffff77d5bee in ?? () from /lib/riscv64-linux-gnu/libtbb.so.12
    #2  0x00007ffff77d666e in ?? () from /lib/riscv64-linux-gnu/libtbb.so.12
    #3  0x00007ffff72f10f4 in start_thread (arg=<optimized out>)
        at ./nptl/pthread_create.c:447
    #4  0x00007ffff7343908 in __thread_start_clone3 ()
        at ../sysdeps/unix/sysv/linux/riscv/clone3.S:71
    (gdb) x/5i $pc-10
       0x7ffff77d2a46 <_ZN3tbb6detail2r122cancel_group_executionERNS0_2d118task_group_contextE+284>:        ld       a5,-16(a4)
       0x7ffff77d2a4a <_ZN3tbb6detail2r122cancel_group_executionERNS0_2d118task_group_contextE+288>:
        beqz        a5,0x7ffff77d2a54 <_ZN3tbb6detail2r122cancel_group_executionERNS0_2d118task_group_contextE+298>
       0x7ffff77d2a4c <_ZN3tbb6detail2r122cancel_group_executionERNS0_2d118task_group_contextE+290>:
        beq s1,a5,0x7ffff77d2ab6 <_ZN3tbb6detail2r122cancel_group_executionERNS0_2d118task_group_contextE+396>
    => 0x7ffff77d2a50 <_ZN3tbb6detail2r122cancel_group_executionERNS0_2d118task_group_contextE+294>:        ld       a5,16(a5)
       0x7ffff77d2a52 <_ZN3tbb6detail2r122cancel_group_executionERNS0_2d118task_group_contextE+296>:
        bnez        a5,0x7ffff77d2a4c <_ZN3tbb6detail2r122cancel_group_executionERNS0_2d118task_group_contextE+290>
    (gdb) i r a5
    a5             0x4445535341505b20       4919429785015311136
    
    ubuntu@ubuntu:~$ embree_verify --no-colors --run ".*regression_static_memory_monitor.*"
                                                    SSE2.regression_static_memory_monitor ............................................................... [PASSED]
                                                  SSE4.2.regression_static_memory_monitor ............................................................... [PASSED]
    
                                                                             Tests passed: 2
                                                                             Tests failed: 0
                                                                 Tests failed and ignored: 0
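
The faulting value in `a5` is not a plausible pointer, and it may be worth reading it as raw bytes. The sketch below uses a hypothetical helper, `reg_to_ascii`, that is not part of embree or tbb. It unpacks a 64-bit register value in little-endian (memory) order and shows printable ASCII. This is only one way to interpret the dump; it does not identify where the bytes came from.

```c
#include <stdint.h>

/* Hypothetical debugging helper: unpack a 64-bit register value into
 * its 8 bytes in little-endian (memory) order. Bytes outside the
 * printable ASCII range are shown as '.'. */
static inline void reg_to_ascii(uint64_t value, char out[9])
{
    for (int i = 0; i < 8; i++) {
        unsigned char byte = (unsigned char)(value >> (8 * i));
        out[i] = (byte >= 0x20 && byte < 0x7f) ? (char)byte : '.';
    }
    out[8] = '\0';
}
```

Applied to `0x4445535341505b20`, the bytes are `20 5b 50 41 53 53 45 44`, which spell ` [PASSED`. That would be consistent with the list node that `cancel_group_execution` was walking having been overwritten with text resembling test output, which fits the assessment that the crash lies outside the port itself.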
    

@tingsi5 tingsi5 mentioned this pull request Oct 31, 2024
rm = 0b01;
}

asm volatile("csrw vxrm,%0" :: "r"(rm));

I think _mm_setcsr is for floating-point rounding modes, so this should be a write to frm, not vxrm.

Author

Updated SSE2RVV to include the proper implementation.
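
For reference, the review point can be sketched as follows. This is an illustration only and not the SSE2RVV implementation: `mxcsr_rc_to_frm` and `set_frm_from_mxcsr` are hypothetical names, and the sketch handles only the rounding-control field of MXCSR (bits 13 and 14), ignoring the exception masks and flush-to-zero bits. The x86 and RISC-V encodings differ, so the two-bit field has to be translated before it is written to `frm`.

```c
#include <stdint.h>

/* MXCSR rounding-control (RC) field occupies bits 13..14. */
#define MXCSR_RC_SHIFT 13
#define MXCSR_RC_MASK  0x3u

/* Hypothetical helper: translate the x86 RC field to a RISC-V frm value.
 *   x86 RC:     0 = nearest-even, 1 = toward -inf, 2 = toward +inf, 3 = toward zero
 *   RISC-V frm: 0 = RNE,          1 = RTZ,         2 = RDN,         3 = RUP */
static inline uint32_t mxcsr_rc_to_frm(uint32_t mxcsr)
{
    static const uint32_t table[4] = {0u, 2u, 3u, 1u};
    return table[(mxcsr >> MXCSR_RC_SHIFT) & MXCSR_RC_MASK];
}

#if defined(__riscv_flen)
/* Write the translated mode to the floating-point rounding-mode CSR.
 * Only compiled when the target has a floating-point extension. */
static inline void set_frm_from_mxcsr(uint32_t mxcsr)
{
    uint32_t frm = mxcsr_rc_to_frm(mxcsr);
    __asm__ volatile("csrw frm, %0" :: "r"(frm));
}
#endif
```

With the default MXCSR value `0x1F80` the RC field is 0, so the sketch selects RNE, which is also the usual default on RISC-V.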
