Add benchmark #41

Merged
merged 15 commits into from
Feb 8, 2024

yasahi-hpc
Collaborator

This PR adds micro-benchmarks based on Google Benchmark. More practical examples still need to be implemented.

@yasahi-hpc added the enhancement (New feature or request) label Jan 31, 2024
@yasahi-hpc self-assigned this Jan 31, 2024
@yasahi-hpc marked this pull request as draft January 31, 2024 12:13
@yasahi-hpc
Collaborator Author

Results on Ice Lake

Kokkos::OpenMP::initialize WARNING: OMP_PROC_BIND environment variable not set
  In general, for best performance with OpenMP 4.0 or better set OMP_PROC_BIND=spread and OMP_PLACES=threads
  For best performance with OpenMP 3.1 set OMP_PROC_BIND=true
  For unit testing set OMP_PROC_BIND=false

2024-01-31T21:58:11+09:00
Running ./fft/perf_test/_PerformanceTest_Benchmark
Run on (112 X 3400.04 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x56)
  L1 Instruction 32 KiB (x56)
  L2 Unified 1024 KiB (x56)
  L3 Unified 39424 KiB (x2)
Load Average: 3.58, 3.96, 3.31
CPU architecture: none
Default Device: N6Kokkos6OpenMPE
GPU architecture: none
KOKKOSFFT_ENABLE_TPL_CUFFT: no
KOKKOSFFT_ENABLE_TPL_FFTW: yes
KOKKOSFFT_ENABLE_TPL_HIPFFT: no
KOKKOSFFT_ENABLE_TPL_ONEMKL: no
KOKKOS_COMPILER_INTEL_LLVM: 20230000
KOKKOS_ENABLE_ASM: yes
KOKKOS_ENABLE_CXX17: yes
KOKKOS_ENABLE_CXX20: no
KOKKOS_ENABLE_CXX23: no
KOKKOS_ENABLE_DEBUG_BOUNDS_CHECK: no
KOKKOS_ENABLE_HBWSPACE: no
KOKKOS_ENABLE_HWLOC: no
KOKKOS_ENABLE_INTEL_MM_ALLOC: no
KOKKOS_ENABLE_LIBDL: yes
KOKKOS_ENABLE_LIBRT: no
KOKKOS_ENABLE_OPENMP: yes
KOKKOS_ENABLE_PRAGMA_IVDEP: no
KOKKOS_ENABLE_PRAGMA_LOOPCOUNT: no
KOKKOS_ENABLE_PRAGMA_UNROLL: no
KOKKOS_ENABLE_PRAGMA_VECTOR: no
Kokkos: OpenMP thread_pool_topology[ 1 x 112 x 1 ]
Kokkos Version: 4.2.0
KokkosFFT Version: 0.0.0
platform: 64bit
------------------------------------------------------------------------------------------------------------------------
Benchmark                                                              Time             CPU   Iterations UserCounters...
------------------------------------------------------------------------------------------------------------------------
FFT_1DView<float, Kokkos::LayoutLeft>/N:4096/manual_time            5330 us         1775 us          129 GB/s=0.0122966/s MB (In)=0.032768 MB (Out)=0.032768
FFT_1DView<float, Kokkos::LayoutLeft>/N:8192/manual_time             882 us          876 us          597 GB/s=0.148661/s MB (In)=0.065536 MB (Out)=0.065536
FFT_1DView<float, Kokkos::LayoutLeft>/N:16384/manual_time            820 us          816 us          860 GB/s=0.31978/s MB (In)=0.131072 MB (Out)=0.131072
FFT_1DView<float, Kokkos::LayoutLeft>/N:32768/manual_time           2225 us         2209 us          343 GB/s=0.235583/s MB (In)=0.262144 MB (Out)=0.262144
FFT_1DView<float, Kokkos::LayoutLeft>/N:65536/manual_time           3881 us         3810 us          216 GB/s=0.270188/s MB (In)=0.524288 MB (Out)=0.524288
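
The user counters in the table appear to be consistent with a simple throughput calculation: MB (In) and MB (Out) each equal N times 8 bytes (which would match one single-precision complex value per element), and GB/s equals the sum of both divided by the measured time. The sketch below reproduces that arithmetic. The helper names are hypothetical and not taken from the PR, and the 8-byte element size is an inference from the table, not something stated in the log.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of KokkosFFT): bytes moved by one transform
// of length n, counting the input view once and the output view once.
inline double bytes_moved(std::size_t n, std::size_t bytes_per_element) {
  return 2.0 * static_cast<double>(n) * static_cast<double>(bytes_per_element);
}

// Throughput in GB/s (1 GB = 1e9 bytes) for a time given in microseconds,
// as reported in the "Time" column of the benchmark table.
inline double throughput_gb_per_s(std::size_t n, std::size_t bytes_per_element,
                                  double time_us) {
  assert(time_us > 0.0);  // a non-positive time would make the rate meaningless
  const double seconds = time_us * 1.0e-6;
  return bytes_moved(n, bytes_per_element) / seconds / 1.0e9;
}
```

For the N:8192 row (882 us), `throughput_gb_per_s(8192, 8, 882.0)` gives roughly 0.1486 GB/s, close to the reported 0.148661. The small gap presumably comes from the time being rounded in the table.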

@yasahi-hpc
Collaborator Author

Build errors on HIP

[ 78%] Building CXX object fft/perf_test/CMakeFiles/_PerformanceTest_Benchmark.dir/PerfTest_FFT1.cpp.o
clang-14: warning: argument unused during compilation: '-fno-gpu-rdc' [-Wunused-command-line-argument]
In file included from /work/fft/perf_test/PerfTest_FFT1.cpp:2:
In file included from /work/fft/perf_test/Benchmark_Context.hpp:9:
In file included from /work/tpls/kokkos/core/src/Kokkos_Core.hpp:43:
In file included from /work/tpls/kokkos/core/src/Kokkos_Core_fwd.hpp:28:
In file included from /work/tpls/kokkos/core/src/Kokkos_Macros.hpp:96:
In file included from /work/build_HIP/tpls/kokkos/KokkosCore_Config_SetupBackend.hpp:22:
In file included from /work/tpls/kokkos/core/src/setup/Kokkos_Setup_HIP.hpp:24:
In file included from /opt/rocm-5.2.0/hip/include/hip/hip_runtime.h:26:
/opt/rocm-5.2.0/hip/include/hip/../../../include/hip/hip_runtime.h:66:2: error: ("Must define exactly one of __HIP_PLATFORM_AMD__ or __HIP_PLATFORM_NVIDIA__");
#error("Must define exactly one of __HIP_PLATFORM_AMD__ or __HIP_PLATFORM_NVIDIA__");
 ^

@yasahi-hpc marked this pull request as ready for review February 3, 2024 04:19
@jbigot requested a review from acalloo February 5, 2024 10:16
@acalloo left a comment

Just two minor comments for my understanding.

Comment on lines +14 to +16
set(KokkosFFT_VERSION_MAJOR 0)
set(KokkosFFT_VERSION_MINOR 0)
set(KokkosFFT_VERSION_PATCH 00)

If you are about to release a beta, you should probably set this to 0.1.0 or something similar.

Collaborator Author

Not sure. Like DDC, I am planning to begin with 0.0.00.

Comment on lines +20 to +23
math(EXPR KOKKOSFFT_VERSION "${KokkosFFT_VERSION_MAJOR} * 10000 + ${KokkosFFT_VERSION_MINOR} * 100 + ${KokkosFFT_VERSION_PATCH}")
math(EXPR KOKKOSFFT_VERSION_MAJOR "${KOKKOSFFT_VERSION} / 10000")
math(EXPR KOKKOSFFT_VERSION_MINOR "${KOKKOSFFT_VERSION} / 100 % 100")
math(EXPR KOKKOSFFT_VERSION_PATCH "${KOKKOSFFT_VERSION} % 100")

I don't understand the use of the KOKKOSFFT_VERSION variable. Could you be a little more explicit please?

Collaborator Author

KOKKOSFFT_VERSION is a temporary variable used to convert the version variables into integers. What we really want as integers are KOKKOSFFT_VERSION_MAJOR, KOKKOSFFT_VERSION_MINOR and KOKKOSFFT_VERSION_PATCH. These replace the placeholders in the C++ code that prints the version info in the benchmark.

Is that OK with you?
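
The packing scheme described above can be sketched in C++ as follows. This only mirrors the arithmetic of the four `math(EXPR ...)` lines under review. The `Version` struct and the function names are hypothetical, and the PR itself performs these steps in CMake. The scheme assumes the minor and patch numbers stay below 100.

```cpp
#include <cassert>

// Hypothetical struct for illustration; the PR keeps these as CMake variables.
struct Version {
  int major_part;
  int minor_part;
  int patch_part;
};

// Pack the three components into one integer: major * 10000 + minor * 100 + patch.
inline int encode_version(const Version& v) {
  // Two decimal digits are reserved for each of minor and patch.
  assert(v.minor_part >= 0 && v.minor_part < 100);
  assert(v.patch_part >= 0 && v.patch_part < 100);
  return v.major_part * 10000 + v.minor_part * 100 + v.patch_part;
}

// Recover the components with integer division and modulo,
// matching "/ 10000", "/ 100 % 100" and "% 100" in the CMake snippet.
inline Version decode_version(int encoded) {
  return Version{encoded / 10000, encoded / 100 % 100, encoded % 100};
}
```

For example, version 1.2.3 would be packed as 10203 and unpacked back into 1, 2 and 3. With the current 0.0.00 setting, every value is simply 0.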

Member

Collaborator Author

As commented, this will be addressed in the fix for #32.
For the moment, I need to focus on the docs and thread capability.

@yasahi-hpc
Collaborator Author

@acalloo First of all, thank you for your reviews.

@yasahi-hpc changed the title from "[WIP] Add benchmark" to "Add benchmark" Feb 8, 2024
@yasahi-hpc
Collaborator Author

@pzehner I found the following error in the HIP build:

Traceback (most recent call last):
  File "/opt/rocm-5.2.0/bin/rocm_agent_enumerator", line 257, in <module>
    main()
  File "/opt/rocm-5.2.0/bin/rocm_agent_enumerator", line 241, in main
    target_list = readFromKFD()
  File "/opt/rocm-5.2.0/bin/rocm_agent_enumerator", line 193, in readFromKFD
    for node in sorted(os.listdir(topology_dir)):
FileNotFoundError: [Errno 2] No such file or directory: '/sys/class/kfd/kfd/topology/nodes/'

For some reason, the build tests still completed.

There seems to be a fix for this in ROCm 5.3.0 and later.
I will try to update the base image for HIP.
ROCm/rocminfo#56

@yasahi-hpc
Collaborator Author

I will merge this one and update CMake in another PR.

@yasahi-hpc merged commit dd796b1 into main Feb 8, 2024
16 checks passed
@yasahi-hpc deleted the add-benchmark branch February 8, 2024 15:23
@yasahi-hpc mentioned this pull request Feb 8, 2024
@yasahi-hpc mentioned this pull request Feb 8, 2024
31 tasks