fe.log file frequently outputs Fail to publish version for tablets #52245

Open
zxb2503 opened this issue Oct 23, 2024 · 0 comments
Labels
type/bug Something isn't working

zxb2503 commented Oct 23, 2024

Steps to reproduce the behavior (Required)

The fe.log file on the leader FE frequently outputs the following error message:

2024-10-23 18:06:24.778+08:00 ERROR (lake-publish-task-198|687) [PublishVersionDaemon.publishPartitionBatch():556] Fail to publish partition 5054238 of txnIds [117796]:
com.starrocks.rpc.RpcException: Fail to publish version for tablets [5055000]: prepare_primary_index: load primary index failed: Corruption: Bad segment file staros://5055000/data/000000000001525e_b0631f8c-61ff-4fcd-b895-b9faaf9fed79.dat: file size 0 < 12
be/src/storage/rowset/segment.cpp:243 Segment::parse_segment_footer(read_file.get(), &footer, footer_length_hint, partial_rowset_footer)
be/src/storage/lake/tablet_manager.cpp:719 segment->open(footer_size_hint, nullptr, lake_io_opts)
be/src/storage/lake/rowset.cpp:322 load_segments(&segments, false)
be/src/storage/persistent_index.cpp:3374 loader->rowset_iterator(pkey_schema, [&](const std::vector& itrs, uint32_t rowset_id) { for (size_t i = 0; i < itrs.size(); i++) { auto itr = itrs[i].get(); if (itr == nullptr) { continue; } while (true) { chunk->reset(); rowids.clear(); auto st = itr->get_next(chunk, &rowids); if (st.is_end_of_file()) { break; } else if (!st.ok()) { return st; } else { Column* pkc = nullptr; if (pk_column != nullptr) { pk_column->reset_column(); do { try { auto catched_setter_L3393 = CurrentThreadCatchSetter(true); { PrimaryKeyEncoder::encode(pkey_schema, chunk, 0, chunk->num_rows(), pk_column.get()); } } catch (std::bad_alloc const&) { MemTracker exceed_tracker = tls_exceed_mem_tracker; tls_exceed_mem_tracker = nullptr; tls_thread_status.set_is_catched(false); if (__builtin_expect(!!(exceed_tracker != nullptr), 1)) { return Status::MemoryLimitExceeded( exceed_tracker->err_msg(fmt::format("try consume:{}", tls_thread_status.try_consume_mem_size()))); } else { return Status::MemoryLimitExceeded("Mem usage has exceed the limit of BE"); } } catch (std::runtime_error const& e) { return Status::RuntimeError(fmt::format("Runtime error: {}", e.what())); } } while (0); pkc = pk_column.get(); } else { pkc = chunk->columns()[0].get(); } uint32_t rssid = rowset_id + i; uint64_t base = ((uint64_t)rssid) << 32; std::vector values; values.reserve(pkc->size()); while (false) static_cast(0), !((__builtin_expect(!(pkc->size() <= rowids.size()), 0))) ? 
(void)0 : google::logging::internal::LogMessageVoidify() & google::LogMessageFatal("be/src/storage/persistent_index.cpp", 3403).stream() << "Check failed: " "pkc->size() <= rowids.size()" " "; for (uint32_t i = 0; i < pkc->size(); i++) { values.emplace_back(base + rowids[i]); } Status st; if (pkc->is_binary()) { st = insert(pkc->size(), reinterpret_cast<const Slice*>(pkc->raw_data()), values.data(), false); } else { std::vector keys; do { try { auto catched_setter_L3412 = CurrentThreadCatchSetter(true); { keys.reserve(pkc->size()); } } catch (std::bad_alloc const&) { MemTracker* exceed_tracker = tls_exceed_mem_tracker; tls_exceed_mem_tracker = nullptr; tls_thread_status.set_is_catched(false); if (__builtin_expect(!!(exceed_tracker != nullptr), 1)) { return Status::MemoryLimitExceeded( exceed_tracker->err_msg(fmt::format("try consume:{}", tls_thread_status.try_consume_mem_size()))); } else { return Status::MemoryLimitExceeded("Mem usage has exceed the limit of BE"); } } catch (std::runtime_error const& e) { return Status::RuntimeError(fmt::format("Runtime error: {}", e.what())); } } while (0); const auto* fkeys = pkc->continuous_data(); for (size_t i = 0; i < pkc->size(); ++i) { keys.emplace_back(fkeys, _key_size); fkeys += _key_size; } st = insert(pkc->size(), reinterpret_cast<const Slice*>(keys.data()), values.data(), false); } if (!st.ok()) { google::LogMessage("be/src/storage/persistent_index.cpp", 3421, google::GLOG_ERROR).stream() << "load index failed: tablet=" << loader->tablet_id() << " rowset:" << rowset_id << " segment:" << i << " reason: " << st.to_string() << " current_size:" << size(); return st; } } } itr->close(); } return Status::OK(); })
be/src/storage/persistent_index.cpp:5273 _insert_rowsets(loader, pkey_schema, std::move(pk_column)), host: xx.xxx.xx.19
at com.starrocks.lake.Utils.publishVersionBatch(Utils.java:153) ~[starrocks-fe.jar:?]
at com.starrocks.transaction.PublishVersionDaemon.publishPartitionBatch(PublishVersionDaemon.java:546) ~[starrocks-fe.jar:?]
at com.starrocks.transaction.PublishVersionDaemon.lambda$publishLakeTransactionBatchAsync$12(PublishVersionDaemon.java:665) ~[starrocks-fe.jar:?]
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
at java.lang.Thread.run(Thread.java:834) ~[?:?]
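
To gauge how often this occurs and which tablets and segment files are involved, the fe.log can be scanned for the messages shown above. The following is a minimal sketch, not part of StarRocks: the function name `summarize_publish_failures` is hypothetical, and the regular expressions are inferred only from the single log entry quoted in this report, so other versions or other failure causes may need different patterns.

```python
import re
from collections import Counter

# Patterns inferred from the one log entry quoted in this issue (assumption:
# the message layout is stable across occurrences).
TABLET_RE = re.compile(r"Fail to publish version for tablets \[([\d, ]+)\]")
SEGMENT_RE = re.compile(r"Bad segment file (\S+): file size (\d+) < (\d+)")


def summarize_publish_failures(lines):
    """Count publish failures per tablet and collect segment files reported as bad.

    Returns (Counter of tablet_id -> occurrences,
             dict of segment path -> (actual_size, minimum_size)).
    """
    tablets = Counter()
    bad_segments = {}
    for line in lines:
        tablet_match = TABLET_RE.search(line)
        if tablet_match:
            for tablet_id in tablet_match.group(1).split(","):
                tablets[int(tablet_id)] += 1
        segment_match = SEGMENT_RE.search(line)
        if segment_match:
            path = segment_match.group(1)
            bad_segments[path] = (
                int(segment_match.group(2)),
                int(segment_match.group(3)),
            )
    return tablets, bad_segments
```

It can be fed an open file handle, for example `summarize_publish_failures(open("fe.log", encoding="utf-8", errors="replace"))`, where the path is whatever location the leader FE writes its log to. If every failure points at the same zero-byte segment file, that suggests a single damaged object rather than a broader storage problem.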

StarRocks version (Required)

StarRocks 3.3.4-56bcf6f, run_mode = shared_data
