Set MALLOC_CONF=junk:true for PR tests #2228

Open
wants to merge 1 commit into
base: master

smuzaffar
Contributor

as proposed in cms-sw/cmssw#44962

@cmsbuild
Contributor

A new Pull Request was created by @smuzaffar for branch master.

@smuzaffar, @iarspider, @aandvalenzuela, @cmsbuild can you please review it and eventually sign? Thanks.
@antoniovilela, @rappoccio, @sextonkennedy you are the release manager for this.
cms-bot commands are listed here

@cmsbuild
Contributor

cmsbuild commented May 13, 2024

cms-bot internal usage

@smuzaffar
Contributor Author

test parameters:

  • full_cmssw = true

@smuzaffar
Contributor Author

please test

Let's build the full CMSSW to run all unit tests under MALLOC_CONF=junk:true.
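
For readers unfamiliar with the setting: `MALLOC_CONF` is read by jemalloc at process start, and `junk:true` asks it to fill newly allocated memory with `0xa5` and freed memory with `0x5a`, so reads of uninitialized or freed memory tend to fail loudly instead of silently. The following is a minimal sketch of running a command under that setting, assuming the executable is linked against a jemalloc build with fill support. The helper names `junk_env` and `run_with_junk` are hypothetical and are not part of cms-bot.

```python
import os
import subprocess


def junk_env(base=None):
    """Return a copy of the environment with jemalloc junk filling enabled."""
    env = dict(os.environ if base is None else base)
    # Overwrites any existing MALLOC_CONF value; merging options is left out of this sketch.
    env["MALLOC_CONF"] = "junk:true"
    return env


def run_with_junk(cmd):
    """Run cmd (a list of arguments) under MALLOC_CONF=junk:true and return its exit status."""
    return subprocess.run(cmd, env=junk_env(), check=False).returncode
```

For instance, `run_with_junk(["cmsRun", "some_cfg.py"])` (with a placeholder config name) would return 0 on success. On Linux, a child killed by a segmentation fault appears in Python as return code -11, which a shell reports as status 139.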

@cmsbuild
Contributor

-1

Failed Tests: UnitTests RelVals AddOn
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-414ef1/39361/summary.html
COMMIT: fe10837
CMSSW: CMSSW_14_1_X_2024-05-13-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cms-bot/2228/39361/install.sh to create a dev area with all the needed externals and cmssw changes.

Unit Tests

I found 2 errors in the following unit tests:

---> test testSiStripHitResolution had ERRORS
---> test testSiStripHitEfficiency had ERRORS

RelVals

----- Begin Fatal Exception 13-May-2024 19:56:58 CEST-----------------------
An exception of category 'PFEcalEndcapRecHitCreator' occurred while
   [0] Processing  Event run: 369978 lumi: 547 event: 625471048 stream: 0
   [1] Running path 'MC_PFScouting_v2'
   [2] Calling method for module PFRecHitProducer/'hltParticleFlowRecHitECALUnseeded'
Exception Message:
detid 2779096485not found in geometry
----- End Fatal Exception -------------------------------------------------
  • 141.044 A fatal system signal has occurred: segmentation violation
  • 141.046 A fatal system signal has occurred: segmentation violation

AddOn Tests

A fatal system signal has occurred: segmentation violation
A fatal system signal has occurred: segmentation violation
A fatal system signal has occurred: segmentation violation

@mmusich
Contributor

mmusich commented Jun 14, 2024

@cmsbuild please test

@cmsbuild
Contributor

-1

Failed Tests: UnitTests
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-414ef1/39883/summary.html
COMMIT: fe10837
CMSSW: CMSSW_14_1_X_2024-06-13-2300/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cms-bot/2228/39883/install.sh to create a dev area with all the needed externals and cmssw changes.

Unit Tests

I found 2 errors in the following unit tests:

---> test testSiStripHitResolution had ERRORS
---> test testSiStripHitEfficiency had ERRORS

@makortel
Contributor

makortel commented Jul 9, 2024

@cmsbuild, please test

@cmsbuild
Contributor

cmsbuild commented Jul 9, 2024

-1

Failed Tests: UnitTests
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-414ef1/40303/summary.html
COMMIT: fe10837
CMSSW: CMSSW_14_1_X_2024-07-09-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cms-bot/2228/40303/install.sh to create a dev area with all the needed externals and cmssw changes.

Unit Tests

I found 2 errors in the following unit tests:

---> test testSiStripHitResolution had ERRORS
---> test testSiStripHitEfficiency had ERRORS

@makortel
Contributor

The testSiStripHitEfficiency test fails in

New IOV starting from run 325172 event 1 lumiBlock 1 time 1
-----------------

-----------------
Global Info
-----------------
BadComponent 		Modules 	Fibers 	Apvs	Strips
----------------------------------------------------------------
<cut>
----------------------------------------------------------------
		   Detid  	Modules Fibers Apvs
----------------------------------------------------------------
<cut>
TEC- Disk 9 :


A fatal system signal has occurred: segmentation violation
The following is the call stack containing the origin of the signal.

Tue Jul  9 22:08:59 CEST 2024
Thread 2 (Thread 0x14cb4250b700 (LWP 226679) "cmsRun"):
#0  0x000014cb696706a2 in waitpid () from /lib64/libpthread.so.0
#1  0x000014cb65b7dd37 in edm::service::cmssw_stacktrace_fork() () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/lib/el8_amd64_gcc12/pluginFWCoreServicesPlugins.so
#2  0x000014cb65b8043a in edm::service::InitRootHandlers::stacktraceHelperThread() () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/lib/el8_amd64_gcc12/pluginFWCoreServicesPlugins.so
#3  0x000014cb69cd8a73 in std::execute_native_thread_routine (__p=0x14cb449c8240) at ../../../../../libstdc++-v3/src/c++11/thread.cc:82
#4  0x000014cb696661ca in start_thread () from /lib64/libpthread.so.0
#5  0x000014cb692c18d3 in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x14cb6b2f4680 (LWP 226379) "cmsRun"):
#0  0x000014cb693baac1 in poll () from /lib64/libc.so.6
#1  0x000014cb65b80657 in edm::service::InitRootHandlers::stacktraceFromThread() () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/lib/el8_amd64_gcc12/pluginFWCoreServicesPlugins.so
#2  0x000014cb65b80854 in sig_dostack_then_abort () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/lib/el8_amd64_gcc12/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
#4  0x000014cb6b0b82a2 in TList::FindLink(TObject const*, int&) const () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/external/el8_amd64_gcc12/lib/libCore.so
#5  0x000014cb6b0c0654 in TList::Remove(TObject*) () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/external/el8_amd64_gcc12/lib/libCore.so
#6  0x000014cb6b0b6136 in THashTable::Remove(TObject*) () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/external/el8_amd64_gcc12/lib/libCore.so
#7  0x000014cb6b0b545b in THashList::Delete(char const*) () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/external/el8_amd64_gcc12/lib/libCore.so
#8  0x000014cb6b532426 in TDirectoryFile::~TDirectoryFile() () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/external/el8_amd64_gcc12/lib/libRIO.so
#9  0x000014cb6b532489 in TDirectoryFile::~TDirectoryFile() () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/external/el8_amd64_gcc12/lib/libRIO.so
#10 0x000014cb6b0b5568 in THashList::Delete(char const*) () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/external/el8_amd64_gcc12/lib/libCore.so
#11 0x000014cb6b53258a in TDirectoryFile::Close(char const*) () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/external/el8_amd64_gcc12/lib/libRIO.so
#12 0x000014cb6b550116 in TFile::Close(char const*) () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/external/el8_amd64_gcc12/lib/libRIO.so
#13 0x000014cb645e10af in TFileService::~TFileService() () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/lib/el8_amd64_gcc12/libCommonToolsUtilAlgos.so
#14 0x000014cb64607e14 in edm::serviceregistry::ServiceWrapper<TFileService>::~ServiceWrapper() () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/lib/el8_amd64_gcc12/pluginCommonToolsToolsUtilAlgos_plugins.so
#15 0x000014cb6c56a09f in edm::serviceregistry::ServicesManager::~ServicesManager() () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/lib/el8_amd64_gcc12/libFWCoreServiceRegistry.so
#16 0x000014cb6c55ff37 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/lib/el8_amd64_gcc12/libFWCoreServiceRegistry.so
#17 0x000014cb6c55ff37 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/lib/el8_amd64_gcc12/libFWCoreServiceRegistry.so
#18 0x000014cb6c12dd87 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/lib/el8_amd64_gcc12/libFWCoreFramework.so
#19 0x000014cb6c147a55 in edm::EventProcessor::~EventProcessor() () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/lib/el8_amd64_gcc12/libFWCoreFramework.so
#20 0x0000000000405711 in (anonymous namespace)::EventProcessorWithSentry::~EventProcessorWithSentry() ()
#21 0x00000000004050cd in main ()

Current Modules:

Module: none (crashed)

A fatal system signal has occurred: segmentation violation
/pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/src/CalibTracker/SiStripHitEfficiency/test/test_SiStripHitEfficiency.sh: line 16: 226379 Segmentation fault      (core dumped) cmsRun ${SCRAM_TEST_PATH}/testSiStripHitEffFromCalibTree_cfg.py inputFiles=HitEffTree.root runNumber=325172
failed running testSiStripHitEffFromCalibTree_cfg.py: status 139

The testSiStripHitResolution fails in

Testing SiStripHitResolutionFromCalibTree_cfg.py 
<cut>
Severity    # Occurrences   Total Occurrences
--------    -------------   -----------------
Info                 8986                8986
FwkInfo                 1                   1

dropped waiting message count 0


A fatal system signal has occurred: segmentation violation
The following is the call stack containing the origin of the signal.

Thread 1 (Thread 0x154ae3153680 (LWP 222707) "cmsRun"):
#2  0x0000154adddc0854 in sig_dostack_then_abort () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/lib/el8_amd64_gcc12/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
#4  0x0000154ae34b82a2 in TList::FindLink(TObject const*, int&) const () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/external/el8_amd64_gcc12/lib/libCore.so
#5  0x0000154ae34c0654 in TList::Remove(TObject*) () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/external/el8_amd64_gcc12/lib/libCore.so
#6  0x0000154ae34b6136 in THashTable::Remove(TObject*) () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/external/el8_amd64_gcc12/lib/libCore.so
#7  0x0000154ae34b545b in THashList::Delete(char const*) () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/external/el8_amd64_gcc12/lib/libCore.so
#8  0x0000154ae3932426 in TDirectoryFile::~TDirectoryFile() () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/external/el8_amd64_gcc12/lib/libRIO.so
#9  0x0000154ae3932489 in TDirectoryFile::~TDirectoryFile() () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/external/el8_amd64_gcc12/lib/libRIO.so
#10 0x0000154ae34b5568 in THashList::Delete(char const*) () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/external/el8_amd64_gcc12/lib/libCore.so
#11 0x0000154ae393258a in TDirectoryFile::Close(char const*) () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/external/el8_amd64_gcc12/lib/libRIO.so
#12 0x0000154ae3950116 in TFile::Close(char const*) () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/external/el8_amd64_gcc12/lib/libRIO.so
#13 0x0000154adc8260af in TFileService::~TFileService() () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/lib/el8_amd64_gcc12/libCommonToolsUtilAlgos.so
#14 0x0000154adc84ce14 in edm::serviceregistry::ServiceWrapper<TFileService>::~ServiceWrapper() () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/lib/el8_amd64_gcc12/pluginCommonToolsToolsUtilAlgos_plugins.so
#15 0x0000154ae47b509f in edm::serviceregistry::ServicesManager::~ServicesManager() () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/lib/el8_amd64_gcc12/libFWCoreServiceRegistry.so
#16 0x0000154ae47aaf37 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/lib/el8_amd64_gcc12/libFWCoreServiceRegistry.so
#17 0x0000154ae47aaf37 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/lib/el8_amd64_gcc12/libFWCoreServiceRegistry.so
#18 0x0000154ae432dd87 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/lib/el8_amd64_gcc12/libFWCoreFramework.so
#19 0x0000154ae4347a55 in edm::EventProcessor::~EventProcessor() () from /pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/lib/el8_amd64_gcc12/libFWCoreFramework.so
#20 0x0000000000405711 in (anonymous namespace)::EventProcessorWithSentry::~EventProcessorWithSentry() ()
#21 0x00000000004050cd in main ()

Current Modules:

Module: none (crashed)

A fatal system signal has occurred: segmentation violation
/pool/condor/dir_2584642/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-07-09-1100/src/CalibTracker/SiStripHitResolution/test/test_SiStripHitResolution.sh: line 16: 222707 Segmentation fault      (core dumped) cmsRun ${SCRAM_TEST_PATH}/SiStripHitResolutionFromCalibTree_cfg.py
failed running SiStripHitResolutionFromCalibTree_cfg.py: status 139
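
Both traces above show the same frames after the signal handler: `TFileService::~TFileService()` closing the `TFile`, ending in `TList::FindLink`. A minimal sketch for extracting the function names from such a gdb-style backtrace, so that two traces can be compared programmatically, is given below. The regular expressions are assumptions based only on the frame format quoted in this thread, and the helper names are hypothetical.

```python
import re

# Matches frames such as:
#   #4  0x000014cb6b0b82a2 in TList::FindLink(TObject const*, int&) const () from /path/libCore.so
# Frames without an address (e.g. "<signal handler called>") do not match.
FRAME_RE = re.compile(r"^#\d+\s+0x[0-9a-fA-F]+ in (.*)$")


def frame_function(line):
    """Return the function name of one backtrace frame, or None if the line is not a frame."""
    m = FRAME_RE.match(line.strip())
    if m is None:
        return None
    rest = m.group(1)
    # Drop a trailing " from <library>" or " at <file:line>" part, if present.
    rest = re.sub(r" (?:from|at) \S+$", "", rest)
    # Drop the trailing argument list, e.g. " ()" or " (__p=0x...)".
    rest = re.sub(r" \([^()]*\)$", "", rest)
    return rest


def frames_after_signal(trace):
    """Return function names of the frames listed after '<signal handler called>'."""
    names = []
    in_signal = False
    for line in trace.splitlines():
        line = line.strip()
        if line.startswith("Thread "):
            if names:
                break  # the crashing thread has been fully read
            in_signal = False
        elif "<signal handler called>" in line:
            in_signal = True
        elif in_signal:
            name = frame_function(line)
            if name is not None:
                names.append(name)
    return names
```

Calling `frames_after_signal` on each of the two traces and comparing the resulting lists would confirm whether the tests crash along the same call path.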

@mmusich
Contributor

mmusich commented Jul 10, 2024

The testSiStripHitEfficiency test fails in

Just for the record, there has been a follow-up issue for that for some time: cms-sw/cmssw#45084

@makortel
Contributor

The testSiStripHitEfficiency test fails in

Just for the record, there has been a follow-up issue for that for some time: cms-sw/cmssw#45084

Thanks for reminding!

@cmsbuild
Contributor

REMINDER @rappoccio, @mandrenguyen, @antoniovilela, @sextonkennedy: This PR was tested with cms-sw/cmssw#46111, please check if they should be merged together

@smuzaffar
Contributor Author

hold

Do not merge this, this is just for testing

@cmsbuild
Contributor

Pull request has been put on hold by @smuzaffar
They need to issue an unhold command to remove the hold state or L1 can unhold it for all

@cmsbuild cmsbuild added the hold label Sep 24, 2024
@antoniovilela

-orp

@smuzaffar smuzaffar force-pushed the master branch 2 times, most recently from 03f42b1 to cf14ee3 Compare December 10, 2024 18:39