Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GLUTEN-7746] Support static link libhdfs3 in velox #7697

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

JkSelf
Copy link
Contributor

@JkSelf JkSelf commented Oct 28, 2024

What changes were proposed in this pull request?

Support static link libhdfs3 to fix the performance gap issue.

  • Static linking with libhdfs3.so: 340%
  • Dynamic linking with JVM: 214.75% (requires further investigation)
  • Dynamic linking with libhdfs3: 323.85% (only replacing libhdfs.so with libhdfs3.so)
  • Expected performance for static linking in this PR: 340%

How was this patch tested?

Jenkins tests

Copy link

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/apache/incubator-gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

@FelixYBW
Copy link
Contributor

Just talked with Ke. Here is a summary:

  1. Previously we link to libhdfs3 at compile time, it can be static or dynamic link.
  2. Velox currently loads libhdfs library (can be jvm libhdfs or libhdfs3) at run time, record the pointer to libhdfs functions, then indirectly call the function when accessing hdfd.
  3. libhdfs shows much worse performance than libhdfs3
  4. libhdfs3 has a ~7% regression compared to link way in 1.

The PR is to confirm 4. Also we need to make clear the root cause of 3.

Thank you! Ke.

@zhouyuan
Copy link
Contributor

@GlutenPerfBot benchmark

@JkSelf JkSelf changed the title [WIP] Support libhdfs3 in velox [GLUTEN-7746] Support static link libhdfs3 in velox Oct 31, 2024
Copy link

#7746

@GlutenPerfBot
Copy link
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

query log/native_7697_time.csv log/native_master_10_31_2024_2662646eaf_time.csv difference percentage
q1 53.90 54.78 0.884 101.64%
q2 30.14 30.21 0.065 100.22%
q3 52.50 53.76 1.262 102.40%
q4 43.86 43.43 -0.437 99.00%
q5 105.99 103.39 -2.607 97.54%
q6 12.65 12.79 0.139 101.10%
q7 112.14 114.32 2.177 101.94%
q8 115.27 117.17 1.898 101.65%
q9 170.40 172.95 2.545 101.49%
q10 68.33 66.45 -1.876 97.25%
q11 26.99 26.85 -0.145 99.46%
q12 31.87 32.71 0.832 102.61%
q13 52.76 52.88 0.121 100.23%
q14 22.22 21.84 -0.381 98.28%
q15 51.07 51.29 0.219 100.43%
q16 17.80 16.89 -0.903 94.93%
q17 126.52 123.15 -3.375 97.33%
q18 197.63 198.92 1.290 100.65%
q19 28.57 32.29 3.711 112.99%
q20 42.42 42.21 -0.205 99.52%
q21 339.47 332.75 -6.722 98.02%
q22 15.94 16.30 0.361 102.27%
total 1718.46 1717.31 -1.147 99.93%

Copy link
Contributor

@PHILO-HE PHILO-HE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JkSelf, we need change to use ENABLE_HDFS3 for the below code, then static libhdfs3 from vcpkg's build can be available to use.

if [ "$ENABLE_HDFS" = "ON" ]; then

Also note CMAKE_FIND_LIBRARY_SUFFIXES is set to ".so" when finding libhdfs3, but this patch's purpose is to use static linkage. Is it simply for some test?

@@ -104,6 +110,9 @@ function compile {
if [ $ENABLE_HDFS == "ON" ]; then
COMPILE_OPTION="$COMPILE_OPTION -DVELOX_ENABLE_HDFS=ON"
fi
if [ $ENABLE_HDFS3 == "ON" ]; then
COMPILE_OPTION="$COMPILE_OPTION -DVELOX_ENABLE_HDFS3=ON"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After the pr facebookincubator/velox@10cdf6f , velox have removed VELOX_ENABLE_HDFS3

@GlutenPerfBot
Copy link
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

query log/native_7697_time.csv log/native_master_10_31_2024_2662646eaf_time.csv difference percentage
q1 52.95 54.78 1.834 103.46%
q2 30.61 30.21 -0.401 98.69%
q3 53.38 53.76 0.380 100.71%
q4 41.93 43.43 1.492 103.56%
q5 105.80 103.39 -2.416 97.72%
q6 11.68 12.79 1.102 109.43%
q7 110.08 114.32 4.242 103.85%
q8 118.13 117.17 -0.959 99.19%
q9 173.88 172.95 -0.927 99.47%
q10 66.10 66.45 0.357 100.54%
q11 25.93 26.85 0.915 103.53%
q12 32.66 32.71 0.050 100.15%
q13 51.56 52.88 1.326 102.57%
q14 22.20 21.84 -0.368 98.34%
q15 53.40 51.29 -2.111 96.05%
q16 18.44 16.89 -1.548 91.61%
q17 126.05 123.15 -2.901 97.70%
q18 196.31 198.92 2.615 101.33%
q19 29.27 32.29 3.020 110.32%
q20 43.09 42.21 -0.883 97.95%
q21 339.59 332.75 -6.845 97.98%
q22 16.01 16.30 0.290 101.81%
total 1719.04 1717.31 -1.734 99.90%

@@ -26,6 +27,10 @@ for arg in "$@"; do
ENABLE_HDFS=("${arg#*=}")
shift # Remove argument name from processing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does allow people enable flag both ENABLE_HDFS and ENABLE_HDFS3 ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can't see performance gain from compile time link, we will close the PR.

@JkSelf JkSelf force-pushed the lihbdfs3-support branch 2 times, most recently from 98f7efe to c6af67a Compare November 12, 2024 08:26
@github-actions github-actions bot added CORE works for Gluten Core CLICKHOUSE DATA_LAKE labels Nov 12, 2024
Copy link

Run Gluten Clickhouse CI on x86

Copy link

Run Gluten Clickhouse CI on x86

Copy link

Run Gluten Clickhouse CI on x86

Copy link

Run Gluten Clickhouse CI on x86

Copy link

Run Gluten Clickhouse CI on x86

Copy link

Run Gluten Clickhouse CI on x86

Copy link

Run Gluten Clickhouse CI on x86

Copy link

Run Gluten Clickhouse CI on x86

1 similar comment
Copy link

Run Gluten Clickhouse CI on x86

Copy link

Run Gluten Clickhouse CI on x86

Copy link

Run Gluten Clickhouse CI on x86

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants