refactor(base): add stacktrace to replace backtrace #16643

Merged
merged 100 commits into databendlabs:main from refactor/optimize_backtrace
Nov 11, 2024

zhang2014
Member

@zhang2014 zhang2014 commented Oct 19, 2024

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

refactor(base): add stacktrace to replace backtrace

// Rewrite backtrace capture for Linux ELF binaries using gimli-rs.
//
// Differences from backtrace-rs [https://github.com/rust-lang/backtrace-rs]:
// - Almost lock-free (backtrace-rs requires coarse-grained locks or frequent lock operations)
// - Symbol resolution is lazy: symbols are resolved only when the backtrace is output
// - Caches all frames of a stack, not just a single frame
// - Outputs the physical addresses of the stack instead of virtual addresses, even when symbols are absent (this lets us use backtraces to find the cause when the symbol tables have been split out)
// - Outputs inlined functions and marks them as such
//
// Differences from gimli-addr2line [https://github.com/gimli-rs/addr2line] (why not use gimli-addr2line):
// - Uses aranges (if present) to optimize the lookup of DWARF units
// - gimli-addr2line caches and sorts the symbol tables to speed up symbol lookup, which would introduce locks and caching. In practice, symbol lookup is a low-frequency operation in databend, and rapid reconstruction based on mmap is sufficient.
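
Two of the points above, lazy symbol resolution and range-based unit lookup, can be illustrated with a small sketch. This is not the code from this PR: `UnitRange`, `RangeIndex`, `StackTrace`, and `Resolved` are hypothetical names, and the range table only stands in for what a `.debug_aranges` section would provide. Capturing stores raw addresses; the lookup runs only when the trace is formatted.

```rust
use std::fmt;

/// Hypothetical address range owned by one compilation unit, in the spirit of
/// a `.debug_aranges` entry: the half-open range [start, end) maps to a unit index.
#[derive(Debug, Clone, Copy)]
struct UnitRange {
    start: u64,
    end: u64,
    unit: usize,
}

/// Sorted, non-overlapping ranges allow a binary search instead of a scan over every unit.
struct RangeIndex {
    ranges: Vec<UnitRange>,
}

impl RangeIndex {
    fn new(mut ranges: Vec<UnitRange>) -> Self {
        ranges.sort_by_key(|r| r.start);
        RangeIndex { ranges }
    }

    /// Returns the unit whose range contains `addr`, if any.
    fn find_unit(&self, addr: u64) -> Option<usize> {
        // Index of the first range whose start is greater than `addr`.
        let idx = self.ranges.partition_point(|r| r.start <= addr);
        // The only possible match is the range just before that index.
        let candidate = self.ranges.get(idx.checked_sub(1)?)?;
        (addr < candidate.end).then_some(candidate.unit)
    }
}

/// Capturing stores raw addresses only; nothing is resolved at this point.
struct StackTrace {
    frames: Vec<u64>,
}

/// Resolution is deferred until the trace is formatted.
struct Resolved<'a> {
    trace: &'a StackTrace,
    index: &'a RangeIndex,
    unit_names: &'a [&'a str],
}

impl fmt::Display for Resolved<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for (i, addr) in self.trace.frames.iter().enumerate() {
            let name = self
                .index
                .find_unit(*addr)
                .and_then(|unit| self.unit_names.get(unit));
            match name {
                Some(name) => writeln!(f, "{:>4}: {}@{:x}", i, name, addr)?,
                // The address is still printed when nothing can be resolved.
                None => writeln!(f, "{:>4}: <unknown>@{:x}", i, addr)?,
            }
        }
        Ok(())
    }
}
```

A caller would build a `StackTrace` from captured addresses, keep it around cheaply, and wrap it in `Resolved` only at the point where the trace is written to a log or error message.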

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - ci build pass

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

@github-actions github-actions bot added the pr-refactor this PR changes the code base without new features or bugfix label Oct 19, 2024
@zhang2014
Member Author

zhang2014 commented Oct 19, 2024

binary size:

ls -lsh ./target/release/databend-query*
332M -rwxr-xr-x 2 ubuntu ubuntu 332M Oct 19 11:07 ./target/release/databend-query
1.2G -rw-r--r-- 1 ubuntu ubuntu 1.2G Oct 19 11:16 ./target/release/databend-query.debug

default:

./target/release/databend-query


   0: backtrace::backtrace::libunwind::trace[inlined]
             at /opt/rust/cargo/git/checkouts/backtrace-rs-fb1f822361417489/72265be/src/backtrace/libunwind.rs:116:5
   1: backtrace::backtrace::trace_unsynchronized[inlined]
             at /opt/rust/cargo/git/checkouts/backtrace-rs-fb1f822361417489/72265be/src/backtrace/mod.rs:66:5
   2: databend_common_exception::exception_backtrace::StackTrace::capture_frames[inlined]
             at /workspace/src/common/exception/src/exception_backtrace.rs:150:13
   3: databend_common_exception::exception_backtrace::StackTrace::capture@50cee64
             at /workspace/src/common/exception/src/exception_backtrace.rs:143:9
   4: databend_query::main@9194388
             at /workspace/src/binaries/query/ee_main.rs:42:23
   5: core::ops::function::FnOnce::call_once[inlined]
             at /rustc/cf2df68d1f5e56803c97d91e2b1a9f1c9923c533/library/core/src/ops/function.rs:250:5
   6: std::sys::backtrace::__rust_begin_short_backtrace@9194c24

remove debug file

rm ./target/release/databend-query.debug 
./target/release/databend-query


   0: <unknown>@50cee64
   1: databend_query::main::hb7b11b0ec1f24acb@9194388
   2: <unknown>@9194c24
   3: <unknown>@919d644
   4: <unknown>@a677100
   5: <unknown>@9194844
   6: <unknown>@284c4
   7: __libc_start_main@28598
   8: <unknown>@414d034
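
The addresses after `@` survive the removal of the debug file, which is what makes offline resolution possible. The sketch below shows one way such an address could be derived and printed. It assumes the printed value is the runtime address minus the load base of the ELF image (an assumption about the output, not something confirmed in this PR). `image_relative`, `format_frame`, and `load_base` are hypothetical names; on Linux the base could be obtained through `dl_iterate_phdr`.

```rust
/// Converts a runtime virtual address into an address relative to the start
/// of the loaded image. `load_base` is a hypothetical input here; returns
/// `None` if the address lies below the base.
fn image_relative(runtime_addr: u64, load_base: u64) -> Option<u64> {
    runtime_addr.checked_sub(load_base)
}

/// Formats one frame in the same shape as the output above: the symbol name
/// when one is known, `<unknown>` otherwise, always followed by the address.
fn format_frame(index: usize, symbol: Option<&str>, addr: u64) -> String {
    format!("{:>4}: {}@{:x}", index, symbol.unwrap_or("<unknown>"), addr)
}
```

Because the printed address does not depend on where the image was loaded in a given run, it can be matched against a debug file kept elsewhere.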

use addr2line to resolve the addresses

addr2line -e  ./target/databend-query.debug -a 50cee64 -a 9194388 -a 9194c24 -f -i -C
0x00000000050cee64
databend_common_exception::exception_backtrace::StackTrace::capture
/workspace/src/common/exception/src/exception_backtrace.rs:144
0x0000000009194388
databend_query::main
/workspace/src/binaries/query/ee_main.rs:44
0x0000000009194c24
std::sys::backtrace::__rust_begin_short_backtrace
/rustc/cf2df68d1f5e56803c97d91e2b1a9f1c9923c533/library/std/src/sys/backtrace.rs:161
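
The manual step above can be scripted. The following is a minimal sketch, assuming frame lines keep the `index: symbol@hexaddr` shape shown in the stripped output; `frame_address` and `addr2line_args` are hypothetical helpers and not part of this PR. It extracts each address and builds the same argument list passed to `addr2line` above.

```rust
/// Extracts the hex address after the last '@' in a frame line such as
/// "   1: databend_query::main::hb7b11b0ec1f24acb@9194388".
/// Lines without an address (for example "at <file>:<line>" lines) yield `None`.
fn frame_address(line: &str) -> Option<u64> {
    let (_, addr) = line.trim().rsplit_once('@')?;
    u64::from_str_radix(addr, 16).ok()
}

/// Builds the argument list for `addr2line`, mirroring the invocation above:
/// `-e <debug file>`, one `-a <addr>` per frame, then `-f -i -C`.
fn addr2line_args(debug_file: &str, trace: &str) -> Vec<String> {
    let mut args = vec!["-e".to_string(), debug_file.to_string()];
    for addr in trace.lines().filter_map(frame_address) {
        args.push("-a".to_string());
        args.push(format!("{:x}", addr));
    }
    args.extend(["-f", "-i", "-C"].iter().map(|s| s.to_string()));
    args
}
```

The resulting vector can be handed to `std::process::Command::new("addr2line").args(...)` to run the tool against a saved trace.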

restore debug file

cp target/databend-query.debug target/release/databend-query.debug
./target/release/databend-query


   0: backtrace::backtrace::libunwind::trace[inlined]
             at /opt/rust/cargo/git/checkouts/backtrace-rs-fb1f822361417489/72265be/src/backtrace/libunwind.rs:116:5
   1: backtrace::backtrace::trace_unsynchronized[inlined]
             at /opt/rust/cargo/git/checkouts/backtrace-rs-fb1f822361417489/72265be/src/backtrace/mod.rs:66:5
   2: databend_common_exception::exception_backtrace::StackTrace::capture_frames[inlined]
             at /workspace/src/common/exception/src/exception_backtrace.rs:150:13
   3: databend_common_exception::exception_backtrace::StackTrace::capture@50cee64
             at /workspace/src/common/exception/src/exception_backtrace.rs:143:9
   4: databend_query::main@9194388
             at /workspace/src/binaries/query/ee_main.rs:42:23
   5: core::ops::function::FnOnce::call_once[inlined]
             at /rustc/cf2df68d1f5e56803c97d91e2b1a9f1c9923c533/library/core/src/ops/function.rs:250:5
   6: std::sys::backtrace::__rust_begin_short_backtrace@9194c24

@zhang2014 zhang2014 marked this pull request as ready for review November 7, 2024 16:02
@zhang2014
Member Author

zhang2014 commented Nov 8, 2024

Tests passed on aarch64-unknown-linux-gnu and x86_64-unknown-linux-musl.

@andylokandy
Collaborator

@zhang2014 Good job! Would you consider publishing it as a crate?

@zhang2014
Member Author

@zhang2014 Good job! Would you consider publishing it as a crate?

It's only been rewritten for ELF, so I think it may not be able to handle all scenarios.

@Xuanwo
Member

Xuanwo commented Nov 8, 2024

be able to handle all scenarios.

A crate doesn't need to handle all scenarios. It's quite useful even if it's only written for ELF. I'd encourage publishing it as a crate, which may attract other contributors and make it much easier for us to test and reuse.

@zhang2014 zhang2014 added this pull request to the merge queue Nov 11, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Nov 11, 2024
@zhang2014 zhang2014 added this pull request to the merge queue Nov 11, 2024
Merged via the queue into databendlabs:main with commit 1f712dc Nov 11, 2024
74 checks passed
@zhang2014 zhang2014 deleted the refactor/optimize_backtrace branch November 11, 2024 14:27
5 participants