Among the many HLT crashes in run-388769 (PbPb collisions in 2024), one looked different from the ones reported in #46783. It was a segmentation violation, and the original stack trace can be found here: old_hlt_run388769_pid4080142.log. It contains:
Thread 15 (Thread 0x7fea91bff700 (LWP 4082750) "cmsRun"):
#0 0x00007feb217890e1 in poll () from /lib64/libc.so.6
#1 0x00007feb0c32e6e7 in edm::service::InitRootHandlers::stacktraceFromThread() () from /opt/offline/el8_amd64_gcc12/cms/cmssw/CMSSW_14_1_5/lib/el8_amd64_gcc12/scram_x86-64-v3/pluginFWCoreServicesPlugins.so
#2 0x00007feb0c32e8e4 in sig_dostack_then_abort () from /opt/offline/el8_amd64_gcc12/cms/cmssw/CMSSW_14_1_5/lib/el8_amd64_gcc12/scram_x86-64-v3/pluginFWCoreServicesPlugins.so
#3 <signal handler called>
#4 0x00007fea2aec8070 in SiPixelDigisClustersFromSoAAlpaka<pixelTopology::HIonPhase1>::produce(edm::StreamID, edm::Event&, edm::EventSetup const&) const () from /opt/offline/el8_amd64_gcc12/cms/cmssw/CMSSW_14_1_5/lib/el8_amd64_gcc12/scram_x86-64-v3/pluginRecoLocalTrackerSiPixelClusterizerPlugins.so
#5 0x00007feb2421cca2 in edm::global::EDProducerBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) () from /opt/offline/el8_amd64_gcc12/cms/cmssw/CMSSW_14_1_5/lib/el8_amd64_gcc12/scram_x86-64-v3/libFWCoreFramework.so
#6 0x00007feb2421613c in edm::WorkerT<edm::global::EDProducerBase>::implDo(edm::EventTransitionInfo const&, edm::ModuleCallingContext const*) () from /opt/offline/el8_amd64_gcc12/cms/cmssw/CMSSW_14_1_5/lib/el8_amd64_gcc12/scram_x86-64-v3/libFWCoreFramework.so
[...]
Current Modules:
Module: SiPixelDigisClustersFromSoAAlpakaHIonPhase1:hltSiPixelClustersPPOnAA (crashed)
Using the input file in question, the crash can be reproduced with the script in [1]. I tested it on lxplus800 with GPU offloading enabled, using CMSSW_15_0_0_pre1; I chose that pre-release only for convenience, and the behavior is the same in 14_1_X as far as I can see. Note that the output of the reproducer is not always the same: sometimes it crashes with output like [2], while other times it ends with the same exception as in #46783.
The problem seems to be related to the pixel local reconstruction. I'm opening a separate issue in case the problem behind this crash is not exactly the same as the problem behind #46783.
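For anyone sorting through the other crash logs from this run, a minimal sketch of how to pick out the module flagged as crashed might look like the following. It assumes only the `Module: <type>:<label> (crashed)` line format seen in the excerpt above; the function name `crashed_modules` is hypothetical and not part of CMSSW.

```python
import re

# Assumed line format, taken from the log excerpt in this issue:
#   Module: SiPixelDigisClustersFromSoAAlpakaHIonPhase1:hltSiPixelClustersPPOnAA (crashed)
CRASHED_RE = re.compile(
    r"^\s*Module:\s*(?P<type>[^:\s]+):(?P<label>\S+)\s+\(crashed\)\s*$"
)


def crashed_modules(log_text):
    """Return (module_type, module_label) pairs marked '(crashed)' in a log.

    Hypothetical helper for triage only; not part of CMSSW.
    """
    found = []
    for line in log_text.splitlines():
        match = CRASHED_RE.match(line)
        if match:
            found.append((match.group("type"), match.group("label")))
    return found


# Excerpt copied from the stack trace quoted in this issue.
excerpt = """Current Modules:
Module: SiPixelDigisClustersFromSoAAlpakaHIonPhase1:hltSiPixelClustersPPOnAA (crashed)
"""
print(crashed_modules(excerpt))
```

Grouping logs by the returned pair would be one way to separate crashes in `hltSiPixelClustersPPOnAA` from those that end with the exception discussed in #46783.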
[1]
[2]