JVM crash at DatabaseTransactionMgr.getCommittedTxnList #52134

Open
murphyatwork opened this issue Oct 21, 2024 · 2 comments
Labels
type/bug Something isn't working

Comments

@murphyatwork
Contributor

Steps to reproduce the behavior (Required)

  • StarRocks: main 20241021
  • JVM: openjdk 11.0.24
  • OS: Ubuntu 22.04, kernel 5.15.0-46-generic
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fb46a0361e7, pid=103726, tid=104294
#
# JRE version: OpenJDK Runtime Environment (11.0.24+8) (build 11.0.24+8-post-Ubuntu-1ubuntu322.04)
# Java VM: OpenJDK 64-Bit Server VM (11.0.24+8-post-Ubuntu-1ubuntu322.04, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
#
[error occurred during error reporting (printing problematic frame), id 0xb, SIGSEGV (0xb) at pc=0x00007fb79539d636]

# Core dump will be written. Default location: Core dumps may be processed with "/lib/systemd/systemd-coredump %P %u %g %s %t 9223372036854775808 %h" (or dumping to /home/disk6/murphy/starrocks/output/core.103726)
#
# If you would like to submit a bug report, please visit:
#   https://bugs.launchpad.net/ubuntu/+source/openjdk-lts
#

---------------  S U M M A R Y ------------

Command Line: -Dlog4j2.formatMsgNoLookups=true -Xmx20123m -XX:SurvivorRatio=8 -Xlog:gc*:/home/disk6/murphy/starrocks/output/fe/log/fe.gc.log.20240819-104850:time -XX:ErrorFile=/home/disk6/murphy/starrocks/output/fe/log/hs_err_pid%p.log com.starrocks.StarRocksFE

Host: Intel(R) Xeon(R) Platinum 8269CY CPU @ 2.50GHz, 104 cores, 376G, Ubuntu 22.04.4 LTS
Time: Mon Aug 19 11:27:08 2024 CST elapsed time: 2297.408672 seconds (0d 0h 38m 17s)

---------------  T H R E A D  ---------------

Current thread (0x00007fb6fc55a000):  JavaThread "PUBLISH_VERSION" daemon [_thread_in_vm, id=104294, stack(0x00007fb520f00000,0x00007fb521000000)]

Stack: [0x00007fb520f00000,0x00007fb521000000],  sp=0x00007fb520ffd970,  free space=1014k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)

[error occurred during error reporting (printing native stack), id 0xb, SIGSEGV (0xb) at pc=0x00007fb79539d636]

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
v  ~RuntimeStub::_new_instance_Java
J 13101 c2 java.util.stream.SortedOps$RefSortingSink.begin(J)V java.base@11.0.24 (48 bytes) @ 0x00007fb77cf7f444 [0x00007fb77cf7f240+0x0000000000000204]
J 9214 c2 java.util.stream.ReferencePipeline$2$1.begin(J)V java.base@11.0.24 (13 bytes) @ 0x00007fb77cc2a0e0 [0x00007fb77cc29fc0+0x0000000000000120]
J 9837 c2 java.util.stream.AbstractPipeline.wrapAndCopyInto(Ljava/util/stream/Sink;Ljava/util/Spliterator;)Ljava/util/stream/Sink; java.base@11.0.24 (18 bytes) @ 0x00007fb77cd17e38 [0x00007fb77cd17b40+0x00000000000002f8]
J 7374 c2 java.util.stream.ReduceOps$ReduceOp.evaluateSequential(Ljava/util/stream/PipelineHelper;Ljava/util/Spliterator;)Ljava/lang/Object; java.base@11.0.24 (18 bytes) @ 0x00007fb77c9605a8 [0x00007fb77c9604c0+0x00000000000000e8]
J 13132 c2 com.starrocks.transaction.DatabaseTransactionMgr.getCommittedTxnList()Ljava/util/List; (66 bytes) @ 0x00007fb77cfef230 [0x00007fb77cfee300+0x0000000000000f30]
J 17911 c2 com.starrocks.transaction.GlobalTransactionMgr.getReadyToPublishTransactions(Z)Ljava/util/List; (75 bytes) @ 0x00007fb77cb33e50 [0x00007fb77cb33820+0x0000000000000630]
J 27867% c2 com.starrocks.common.util.Daemon.run()V (126 bytes) @ 0x00007fb77d540424 [0x00007fb77d540280+0x00000000000001a4]
v  ~StubRoutines::call_stub

siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x00000800aa607fff
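
For readers following the Java frames above: they suggest a sequential stream pipeline with a sorted stage feeding a terminal reduction such as collect. Below is a minimal sketch of that shape, not the actual StarRocks code. The class `CommittedTxnSketch`, the `Txn` type, its fields, and the filter stage are all hypothetical, chosen only for illustration. In the JDK, the sorted stage buffers elements into a list that it allocates when the pipeline begins, which would be consistent with the allocation stub (`_new_instance_Java`) at the top of the trace.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration only; this is not the StarRocks implementation.
public class CommittedTxnSketch {

    // Hypothetical stand-in for a transaction record.
    public static final class Txn {
        public final long id;
        public final boolean committed;
        public final long commitTime;

        public Txn(long id, boolean committed, long commitTime) {
            this.id = id;
            this.committed = committed;
            this.commitTime = commitTime;
        }
    }

    // Keep committed transactions, order them by commit time, collect to a list.
    // The sorted() stage corresponds to SortedOps$RefSortingSink in the trace;
    // collect() is evaluated through ReduceOps$ReduceOp.evaluateSequential.
    public static List<Txn> committedSortedByCommitTime(List<Txn> all) {
        return all.stream()
                .filter((Txn t) -> t.committed)
                .sorted(Comparator.comparingLong((Txn t) -> t.commitTime))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Txn> all = List.of(
                new Txn(1, true, 200),
                new Txn(2, false, 50),
                new Txn(3, true, 100));
        List<Long> ids = committedSortedByCommitTime(all).stream()
                .map(t -> t.id)
                .collect(Collectors.toList());
        System.out.println(ids); // prints [3, 1]
    }
}
```

Code of this shape is ordinary Java and is not expected to reproduce the SIGSEGV by itself; the sketch only maps the compiled frames back to a readable call pattern.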

Expected behavior (Required)

Real behavior (Required)

StarRocks version (Required)

  • You can get the StarRocks version by executing SQL select current_version()

murphyatwork added the type/bug label Oct 21, 2024

@murphyatwork
Contributor Author

murphyatwork commented Oct 22, 2024

Got a new crash with OpenJDK 17:

  • triggered by libasyncProfiler.so
#1  __pthread_kill_internal (signo=6, threadid=139758955132480) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=139758955132480, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007f1cbfccd476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x00007f1cbfcb37f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x00007f1cbeb565e7 in os::abort(bool, void*, void const*) [clone .cold] () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#6  0x00007f1cbf83c94c in VMError::report_and_die(int, char const*, char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long) () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#7  0x00007f1cbf83d31f in VMError::report_and_die(Thread*, unsigned int, unsigned char*, void*, void*, char const*, ...) () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#8  0x00007f1cbf83d352 in VMError::report_and_die(Thread*, unsigned int, unsigned char*, void*, void*) () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#9  <signal handler called>
#10 0x00007f1cbfdff636 in determine_info (symbolp=0x0, mapp=0x0, info=0x7f1c2adf3440, match=0x7f1810001fa0, addr=139741649656068) at ./elf/dl-addr.c:61
#11 _dl_addr (address=0x7f1823635904 <PerfEvents::signalHandler(int, siginfo*, void*)+84>, info=0x7f1c2adf3440, mapp=0x0, symbolp=0x0) at ./elf/dl-addr.c:137
#12 0x00007f1cbf54297a in os::find(unsigned char*, outputStream*) () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#13 0x00007f1cbf53a20b in os::print_location(outputStream*, long, bool) () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#14 0x00007f1cbf83c083 in VMError::report(outputStream*, bool) () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#15 0x00007f1cbf83c855 in VMError::report_and_die(int, char const*, char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long) () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#16 0x00007f1cbf83d31f in VMError::report_and_die(Thread*, unsigned int, unsigned char*, void*, void*, char const*, ...) () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#17 0x00007f1cbf83d352 in VMError::report_and_die(Thread*, unsigned int, unsigned char*, void*, void*) () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#18 <signal handler called>
#19 0x00007f1cbfdff636 in determine_info (symbolp=0x0, mapp=0x0, info=0x7f1c2adf5900, match=0x7f1810001fa0, addr=139741649479528) at ./elf/dl-addr.c:61
#20 _dl_addr (address=0x7f182360a768 <PerfEvents::readCounter(siginfo*, void*) [clone .isra.667] [clone .part.668]+24>, info=0x7f1c2adf5900, mapp=0x0, symbolp=0x0) at ./elf/dl-addr.c:137
#21 0x00007f1cbf53cf27 in os::address_is_in_vm(unsigned char*) () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#22 0x00007f1cbefd7ed9 in frame::print_C_frame(outputStream*, char*, int, unsigned char*) () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#23 0x00007f1cbf838466 in VMError::print_native_stack(outputStream*, frame, Thread*, char*, int) () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#24 0x00007f1cbf83a0b9 in VMError::report(outputStream*, bool) () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#25 0x00007f1cbf83c855 in VMError::report_and_die(int, char const*, char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long) () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#26 0x00007f1cbf83d31f in VMError::report_and_die(Thread*, unsigned int, unsigned char*, void*, void*, char const*, ...) () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#27 0x00007f1cbf83d352 in VMError::report_and_die(Thread*, unsigned int, unsigned char*, void*, void*) () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#28 <signal handler called>
#29 0x00007f1cbfdff636 in determine_info (symbolp=0x0, mapp=0x0, info=0x7f1c2adf7ec0, match=0x7f1810001fa0, addr=139741649479528) at ./elf/dl-addr.c:61
#30 _dl_addr (address=0x7f182360a768 <PerfEvents::readCounter(siginfo*, void*) [clone .isra.667] [clone .part.668]+24>, info=0x7f1c2adf7ec0, mapp=0x0, symbolp=0x0) at ./elf/dl-addr.c:137
#31 0x00007f1cbf53cf27 in os::address_is_in_vm(unsigned char*) () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#32 0x00007f1cbefd7ed9 in frame::print_C_frame(outputStream*, char*, int, unsigned char*) () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#33 0x00007f1cbf83a832 in VMError::report(outputStream*, bool) () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#34 0x00007f1cbf83c855 in VMError::report_and_die(int, char const*, char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long) () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#35 0x00007f1cbf83d31f in VMError::report_and_die(Thread*, unsigned int, unsigned char*, void*, void*, char const*, ...) () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#36 0x00007f1cbf83d352 in VMError::report_and_die(Thread*, unsigned int, unsigned char*, void*, void*) () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#37 <signal handler called>
#38 0x00007f1cbfdff636 in determine_info (symbolp=0x0, mapp=0x0, info=0x7f1c2adfa330, match=0x7f1810001fa0, addr=139741649479528) at ./elf/dl-addr.c:61
#39 _dl_addr (address=0x7f182360a768 <PerfEvents::readCounter(siginfo*, void*) [clone .isra.667] [clone .part.668]+24>, info=0x7f1c2adfa330, mapp=0x0, symbolp=0x0) at ./elf/dl-addr.c:137
#40 0x00007f1cbf53cf27 in os::address_is_in_vm(unsigned char*) () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#41 0x00007f1cbefd7ed9 in frame::print_C_frame(outputStream*, char*, int, unsigned char*) () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#42 0x00007f1cbf83a832 in VMError::report(outputStream*, bool) () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#43 0x00007f1cbf83c813 in VMError::report_and_die(int, char const*, char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long) () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#44 0x00007f1cbf83d31f in VMError::report_and_die(Thread*, unsigned int, unsigned char*, void*, void*, char const*, ...) () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#45 0x00007f1cbf83d352 in VMError::report_and_die(Thread*, unsigned int, unsigned char*, void*, void*) () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#46 0x00007f1cbf6dae83 in JVM_handle_linux_signal () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#47 <signal handler called>
#48 0x0000000000004856 in ?? ()
#49 0x00007f182360a768 in PerfEvents::readCounter(siginfo*, void*) [clone .isra.667] [clone .part.668] () from /home/disk6/murphy/starrocks/output/fe/bin/build/libasyncProfiler.so
#50 0x00007f1823635904 in PerfEvents::signalHandler(int, siginfo*, void*) () from /home/disk6/murphy/starrocks/output/fe/bin/build/libasyncProfiler.so

@murphyatwork
Contributor Author

Another crash:

Current thread (0x00007fdb20824770):  JavaThread "JournalWriter" daemon [_thread_in_vm, id=801519, stack(0x00007fdb76d00000,0x00007fdb76e00000)]

Stack: [0x00007fdb76d00000,0x00007fdb76e00000],  sp=0x00007fdb76dfd6b0,  free space=1013k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)

[error occurred during error reporting (printing native stack), id 0xb, SIGSEGV (0xb) at pc=0x00007fdc0bfe0636]

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
v  ~RuntimeStub::_new_instance_Java
J 11122 c2 com.sleepycat.je.dbi.CursorImpl.insertRecordInternal([BLcom/sleepycat/je/tree/LN;Lcom/sleepycat/je/dbi/ExpirationInfo;ZLcom/sleepycat/je/DatabaseEntry;Lcom/sleepycat/je/log/ReplicationContext;)Lcom/sleepycat/je/utilint/Pair; (598 bytes) @ 0x00007fdbf5b71e10 [0x00007fdbf5b6fe40+0x0000000000001fd0]
J 11288 c2 com.sleepycat.je.dbi.CursorImpl.insertOrUpdateRecord(Lcom/sleepycat/je/DatabaseEntry;Lcom/sleepycat/je/DatabaseEntry;Lcom/sleepycat/je/tree/LN;Lcom/sleepycat/je/dbi/ExpirationInfo;Lcom/sleepycat/je/dbi/PutMode;Lcom/sleepycat/je/DatabaseEntry;Lcom/sleepycat/je/DatabaseEntry;Lcom/sleepycat/je/log/ReplicationContext;)Lcom/sleepycat/je/OperationResult; (499 bytes) @ 0x00007fdbf5780530 [0x00007fdbf5780320+0x0000000000000210]
J 11287 c2 com.sleepycat.je.Cursor.putNoNotify(Lcom/sleepycat/je/DatabaseEntry;Lcom/sleepycat/je/DatabaseEntry;Lcom/sleepycat/je/tree/LN;Lcom/sleepycat/je/CacheMode;Lcom/sleepycat/je/dbi/ExpirationInfo;Lcom/sleepycat/je/dbi/PutMode;Lcom/sleepycat/je/DatabaseEntry;Lcom/sleepycat/je/DatabaseEntry;Lcom/sleepycat/je/log/ReplicationContext;)Lcom/sleepycat/je/OperationResult; (408 bytes) @ 0x00007fdbf56d5860 [0x00007fdbf56d5740+0x0000000000000120]
J 11286 c2 com.sleepycat.je.Cursor.putNotify(Lcom/sleepycat/je/DatabaseEntry;Lcom/sleepycat/je/DatabaseEntry;Lcom/sleepycat/je/tree/LN;Lcom/sleepycat/je/CacheMode;Lcom/sleepycat/je/dbi/ExpirationInfo;Lcom/sleepycat/je/dbi/PutMode;Lcom/sleepycat/je/log/ReplicationContext;)Lcom/sleepycat/je/OperationResult; (558 bytes) @ 0x00007fdbf55582c0 [0x00007fdbf5558160+0x0000000000000160]
J 11130 c2 com.sleepycat.je.Database.put(Lcom/sleepycat/je/Transaction;Lcom/sleepycat/je/DatabaseEntry;Lcom/sleepycat/je/DatabaseEntry;Lcom/sleepycat/je/Put;Lcom/sleepycat/je/WriteOptions;)Lcom/sleepycat/je/OperationResult; (187 bytes) @ 0x00007fdbf5b7e318 [0x00007fdbf5b7dac0+0x0000000000000858]
J 11143 c2 com.starrocks.journal.bdbje.CloseSafeDatabase.put(Lcom/sleepycat/je/Transaction;Lcom/sleepycat/je/DatabaseEntry;Lcom/sleepycat/je/DatabaseEntry;)Lcom/sleepycat/je/OperationStatus; (62 bytes) @ 0x00007fdbf5b8cb38 [0x00007fdbf5b8c920+0x0000000000000218]
J 11290 c2 com.starrocks.journal.bdbje.BDBJEJournal.batchWriteAppend(JLcom/starrocks/common/io/DataOutputBuffer;)V (296 bytes) @ 0x00007fdbf5716238 [0x00007fdbf5715d80+0x00000000000004b8]
J 11297 c2 com.starrocks.journal.JournalWriter.writeOneBatch()V (447 bytes) @ 0x00007fdbf5c08044 [0x00007fdbf5c078a0+0x00000000000007a4]
J 11296 c2 com.starrocks.journal.JournalWriter$1.runOneCycle()V (33 bytes) @ 0x00007fdbf58d2b84 [0x00007fdbf58d2b40+0x0000000000000044]
J 8011% c2 com.starrocks.common.util.Daemon.run()V (102 bytes) @ 0x00007fdbf5637648 [0x00007fdbf5637540+0x0000000000000108]
v  ~StubRoutines::call_stub
