Memory consumption higher than 2.5 GB #1725
Comments
@nathanwbrei Does JANA2 have any way of accounting for memory use by factory and service?
Adding to the previous post: this happens for other datasets too, and the frequency depends on the dataset. DIS NC 18x275 (JOB ID 1837), for example, has 10220 total jobs, and the memory-exceeded alarm shows up in the logs 1571 times, or 1571/10220 ≈ 15% of the time (reruns are ignored here, so this is an overestimate). Electron beamgas (JOB ID 1850), on the other hand, has 2756 total jobs, and the error shows up 1312 times, or 1312/2756 ≈ 48% of the time. We need to bring this under 2-3%, at least for the larger datasets, because requesting more memory might mean more wait time in the queue.
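The fractions quoted above can be recomputed with a small helper. This is only a sketch; the function name `fail_rate` is hypothetical, and the counts are the ones given in the comment.

```shell
# Hypothetical helper: percentage of jobs whose log shows the memory-exceeded alarm.
#   $1 = number of alarms found in the logs
#   $2 = total number of jobs in the dataset
fail_rate() {
  awk -v n="$1" -v total="$2" 'BEGIN { printf "%.1f\n", 100 * n / total }'
}

fail_rate 1571 10220   # DIS NC 18x275 (JOB ID 1837)
fail_rate 1312 2756    # Electron beamgas (JOB ID 1850)
```

Because reruns are not excluded from the alarm counts, the printed values are upper bounds on the true per-job failure rate.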
For jobs that succeed, does prmon show leaky behavior, e.g. rising memory use over the course of a job?
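One rough way to check for this is to compare the first, last, and peak RSS samples in a monitoring time series. The sketch below assumes a whitespace-separated text file whose header row names an `rss` column; that layout, and the function name `rss_trend`, are assumptions rather than something confirmed in this thread.

```shell
# Sketch: summarize RSS growth from a prmon-style time series file.
# Assumption: the first row is a header and one of its fields is "rss".
rss_trend() {
  awk '
    # Locate the rss column by name so the column order does not matter.
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "rss") col = i; next }
    # Track the first sample, the most recent sample, and the maximum.
    col     { if (!n++) first = $col; last = $col; if ($col > peak) peak = $col }
    END     { if (n) printf "first=%d last=%d peak=%d\n", first, last, peak }
  ' "$1"
}
```

A `last` value close to `peak` and well above `first` would hint at steadily growing memory, while a `peak` far above `last` points to a transient spike instead.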
Just as a side note, the number of status==1 particles in the input isn't excessive (<~60).
I haven't looked in detail yet, but setting the memory upper limit to 3 G seems to give a much better failure rate for the SIDIS 10x100 (1.9%) and 5x41 (0.4%) sets, respectively.
Well, yeah, that's not surprising. But it doesn't resolve the problem and is not sustainable. We have to figure out why, and only then decide whether increasing the memory limit is necessary/unavoidable.
Looks like most of the memory for baseline DIS CC is consumed in DD4hep, but nothing comes to mind in terms of changes to the geometry. A bit of memory can be freed by applying #1729; I wonder if that is enough to get us through 25.02.
Environment: (where does this bug occur, have you tried other environments)
Branch (main for latest released): 25.01.1
Revision (HEAD for the most recent on git): HEAD
Steps to reproduce: (give a step by step account of how to trigger the bug)
A significant number of jobs on the pythia6 dataset require more than 2.5 G of memory.
OWNER BATCH_NAME SUBMITTED DONE RUN IDLE HOLD TOTAL JOB_IDS
rahmans1 25.01.1/epic_craterlake/pythia6-eic-1.0.0_5x41_q2_0to1_ep_noradcor- 1/30 13:56 311 2 500 187 16700 25202904.7-999
Out of the 311+187=498 jobs completed or on hold, 247 had peak RSS greater than 2.5 G.
[rahmans1@ap23 osg_25202904_errors]$ for log in osg_25202904_*.log; do awk '/ResidentSetSize/ { if ($1 > max) max=$1 } END { print "Peak RSS for", FILENAME ":", max " KB" }' "$log"; done
Peak RSS for osg_25202904_100.log: 2644292 KB
Peak RSS for osg_25202904_101.log: 2633652 KB
Peak RSS for osg_25202904_102.log: 2625872 KB
Peak RSS for osg_25202904_103.log: 2637752 KB
Peak RSS for osg_25202904_104.log: 2625772 KB
Peak RSS for osg_25202904_107.log: 2631856 KB
Peak RSS for osg_25202904_108.log: 2633024 KB
Peak RSS for osg_25202904_109.log: 2635700 KB
Peak RSS for osg_25202904_10.log: 2619404 KB
Peak RSS for osg_25202904_110.log: 2629508 KB
Peak RSS for osg_25202904_111.log: 2622676 KB
Peak RSS for osg_25202904_112.log: 2646644 KB
Peak RSS for osg_25202904_113.log: 2625180 KB
Peak RSS for osg_25202904_114.log: 2600980 KB
Peak RSS for osg_25202904_115.log: 2622536 KB
Peak RSS for osg_25202904_116.log: 2630968 KB
Peak RSS for osg_25202904_117.log: 2636600 KB
Peak RSS for osg_25202904_118.log: 2609236 KB
Peak RSS for osg_25202904_119.log: 2627756 KB
Peak RSS for osg_25202904_11.log: 2624416 KB
Peak RSS for osg_25202904_120.log: 2637768 KB
Peak RSS for osg_25202904_121.log: 2639696 KB
Peak RSS for osg_25202904_122.log: 2625084 KB
Peak RSS for osg_25202904_124.log: 2630588 KB
Peak RSS for osg_25202904_126.log: 2613000 KB
Peak RSS for osg_25202904_12.log: 2625276 KB
Peak RSS for osg_25202904_139.log: 2641532 KB
Peak RSS for osg_25202904_13.log: 2618640 KB
Peak RSS for osg_25202904_144.log: 2623116 KB
Peak RSS for osg_25202904_145.log: 2615744 KB
Peak RSS for osg_25202904_146.log: 2640736 KB
Peak RSS for osg_25202904_147.log: 2782280 KB
Peak RSS for osg_25202904_148.log: 2628732 KB
Peak RSS for osg_25202904_149.log: 2699336 KB
Peak RSS for osg_25202904_150.log: 2622788 KB
Peak RSS for osg_25202904_151.log: 2619580 KB
Peak RSS for osg_25202904_152.log: 2680948 KB
Peak RSS for osg_25202904_153.log: 2600524 KB
Peak RSS for osg_25202904_154.log: 2623232 KB
Peak RSS for osg_25202904_155.log: 2622708 KB
Peak RSS for osg_25202904_157.log: 2784176 KB
Peak RSS for osg_25202904_158.log: 2790132 KB
Peak RSS for osg_25202904_15.log: 2629224 KB
Peak RSS for osg_25202904_160.log: 2608848 KB
Peak RSS for osg_25202904_161.log: 2610860 KB
Peak RSS for osg_25202904_163.log: 2629564 KB
Peak RSS for osg_25202904_164.log: 2636156 KB
Peak RSS for osg_25202904_165.log: 2622172 KB
Peak RSS for osg_25202904_166.log: 2625320 KB
Peak RSS for osg_25202904_16.log: 2628268 KB
Peak RSS for osg_25202904_17.log: 2630448 KB
Peak RSS for osg_25202904_186.log: 2739176 KB
Peak RSS for osg_25202904_188.log: 2607300 KB
Peak RSS for osg_25202904_189.log: 2638620 KB
Peak RSS for osg_25202904_18.log: 2634628 KB
Peak RSS for osg_25202904_190.log: 2627728 KB
Peak RSS for osg_25202904_191.log: 2633696 KB
Peak RSS for osg_25202904_192.log: 2623460 KB
Peak RSS for osg_25202904_193.log: 2628712 KB
Peak RSS for osg_25202904_195.log: 2607912 KB
Peak RSS for osg_25202904_196.log: 2622680 KB
Peak RSS for osg_25202904_197.log: 2624984 KB
Peak RSS for osg_25202904_198.log: 2620424 KB
Peak RSS for osg_25202904_19.log: 2608828 KB
Peak RSS for osg_25202904_208.log: 2725344 KB
Peak RSS for osg_25202904_209.log: 2644248 KB
Peak RSS for osg_25202904_20.log: 2626708 KB
Peak RSS for osg_25202904_212.log: 2625444 KB
Peak RSS for osg_25202904_213.log: 2622260 KB
Peak RSS for osg_25202904_214.log: 2613604 KB
Peak RSS for osg_25202904_215.log: 2639060 KB
Peak RSS for osg_25202904_216.log: 2626600 KB
Peak RSS for osg_25202904_217.log: 2630284 KB
Peak RSS for osg_25202904_218.log: 2633208 KB
Peak RSS for osg_25202904_21.log: 2421876 KB
Peak RSS for osg_25202904_224.log: 2621932 KB
Peak RSS for osg_25202904_225.log: 2627532 KB
Peak RSS for osg_25202904_226.log: 2625416 KB
Peak RSS for osg_25202904_227.log: 2619584 KB
Peak RSS for osg_25202904_228.log: 2638320 KB
Peak RSS for osg_25202904_229.log: 2626812 KB
Peak RSS for osg_25202904_22.log: 2633048 KB
Peak RSS for osg_25202904_230.log: 2717336 KB
Peak RSS for osg_25202904_232.log: 2636296 KB
Peak RSS for osg_25202904_23.log: 2625616 KB
Peak RSS for osg_25202904_240.log: 2616592 KB
Peak RSS for osg_25202904_241.log: 2617680 KB
Peak RSS for osg_25202904_242.log: 2609648 KB
Peak RSS for osg_25202904_243.log: 2626540 KB
Peak RSS for osg_25202904_244.log: 2634596 KB
Peak RSS for osg_25202904_245.log: 2592764 KB
Peak RSS for osg_25202904_246.log: 2595708 KB
Peak RSS for osg_25202904_247.log: 2630776 KB
Peak RSS for osg_25202904_248.log: 2608412 KB
Peak RSS for osg_25202904_249.log: 2638788 KB
Peak RSS for osg_25202904_250.log: 2712960 KB
Peak RSS for osg_25202904_251.log: 2628172 KB
Peak RSS for osg_25202904_252.log: 2587732 KB
Peak RSS for osg_25202904_253.log: 2640792 KB
Peak RSS for osg_25202904_254.log: 2630324 KB
Peak RSS for osg_25202904_255.log: 2629340 KB
Peak RSS for osg_25202904_256.log: 2630292 KB
Peak RSS for osg_25202904_257.log: 2616168 KB
Peak RSS for osg_25202904_258.log: 2628708 KB
Peak RSS for osg_25202904_259.log: 2627540 KB
Peak RSS for osg_25202904_261.log: 2627336 KB
Peak RSS for osg_25202904_262.log: 2627164 KB
Peak RSS for osg_25202904_265.log: 2637364 KB
Peak RSS for osg_25202904_266.log: 2627484 KB
Peak RSS for osg_25202904_268.log: 2605808 KB
Peak RSS for osg_25202904_284.log: 2702160 KB
Peak RSS for osg_25202904_285.log: 2619128 KB
Peak RSS for osg_25202904_286.log: 2644776 KB
Peak RSS for osg_25202904_287.log: 2602508 KB
Peak RSS for osg_25202904_288.log: 2625956 KB
Peak RSS for osg_25202904_289.log: 2607972 KB
Peak RSS for osg_25202904_28.log: 2627988 KB
Peak RSS for osg_25202904_290.log: 2613232 KB
Peak RSS for osg_25202904_291.log: 2631024 KB
Peak RSS for osg_25202904_292.log: 2650304 KB
Peak RSS for osg_25202904_293.log: 2626076 KB
Peak RSS for osg_25202904_295.log: 2633088 KB
Peak RSS for osg_25202904_296.log: 2613260 KB
Peak RSS for osg_25202904_297.log: 2623460 KB
Peak RSS for osg_25202904_298.log: 2602964 KB
Peak RSS for osg_25202904_299.log: 2623524 KB
Peak RSS for osg_25202904_29.log: 2603584 KB
Peak RSS for osg_25202904_300.log: 2636844 KB
Peak RSS for osg_25202904_302.log: 2630032 KB
Peak RSS for osg_25202904_303.log: 2639240 KB
Peak RSS for osg_25202904_304.log: 2625792 KB
Peak RSS for osg_25202904_305.log: 2611820 KB
Peak RSS for osg_25202904_306.log: 2601408 KB
Peak RSS for osg_25202904_307.log: 2600168 KB
Peak RSS for osg_25202904_30.log: 2627348 KB
Peak RSS for osg_25202904_313.log: 2624020 KB
Peak RSS for osg_25202904_314.log: 2615496 KB
Peak RSS for osg_25202904_315.log: 2616196 KB
Peak RSS for osg_25202904_316.log: 2606972 KB
Peak RSS for osg_25202904_317.log: 2631924 KB
Peak RSS for osg_25202904_318.log: 2620304 KB
Peak RSS for osg_25202904_319.log: 2627840 KB
Peak RSS for osg_25202904_31.log: 2631068 KB
Peak RSS for osg_25202904_320.log: 2632364 KB
Peak RSS for osg_25202904_321.log: 2642780 KB
Peak RSS for osg_25202904_322.log: 2610708 KB
Peak RSS for osg_25202904_323.log: 2628104 KB
Peak RSS for osg_25202904_324.log: 2623936 KB
Peak RSS for osg_25202904_32.log: 2625276 KB
Peak RSS for osg_25202904_335.log: 2631228 KB
Peak RSS for osg_25202904_336.log: 2626920 KB
Peak RSS for osg_25202904_338.log: 2628280 KB
Peak RSS for osg_25202904_339.log: 2615168 KB
Peak RSS for osg_25202904_33.log: 2632248 KB
Peak RSS for osg_25202904_340.log: 2631144 KB
Peak RSS for osg_25202904_341.log: 2615920 KB
Peak RSS for osg_25202904_342.log: 2615192 KB
Peak RSS for osg_25202904_343.log: 2633472 KB
Peak RSS for osg_25202904_344.log: 2619700 KB
Peak RSS for osg_25202904_345.log: 2617624 KB
Peak RSS for osg_25202904_346.log: 2635236 KB
Peak RSS for osg_25202904_349.log: 2622100 KB
Peak RSS for osg_25202904_34.log: 2629084 KB
Peak RSS for osg_25202904_35.log: 2628852 KB
Peak RSS for osg_25202904_36.log: 2623440 KB
Peak RSS for osg_25202904_37.log: 2626420 KB
Peak RSS for osg_25202904_38.log: 2632200 KB
Peak RSS for osg_25202904_391.log: 2625720 KB
Peak RSS for osg_25202904_392.log: 2700840 KB
Peak RSS for osg_25202904_393.log: 2637124 KB
Peak RSS for osg_25202904_395.log: 2622940 KB
Peak RSS for osg_25202904_396.log: 2622904 KB
Peak RSS for osg_25202904_397.log: 2627188 KB
Peak RSS for osg_25202904_398.log: 2608212 KB
Peak RSS for osg_25202904_399.log: 2626264 KB
Peak RSS for osg_25202904_400.log: 2629332 KB
Peak RSS for osg_25202904_401.log: 2644496 KB
Peak RSS for osg_25202904_402.log: 2634736 KB
Peak RSS for osg_25202904_403.log: 2631556 KB
Peak RSS for osg_25202904_404.log: 2625024 KB
Peak RSS for osg_25202904_405.log: 2617392 KB
Peak RSS for osg_25202904_406.log: 2629652 KB
Peak RSS for osg_25202904_407.log: 2625912 KB
Peak RSS for osg_25202904_409.log: 2624172 KB
Peak RSS for osg_25202904_40.log: 2616604 KB
Peak RSS for osg_25202904_413.log: 2622788 KB
Peak RSS for osg_25202904_436.log: 2625876 KB
Peak RSS for osg_25202904_437.log: 2636116 KB
Peak RSS for osg_25202904_438.log: 2626852 KB
Peak RSS for osg_25202904_439.log: 2663172 KB
Peak RSS for osg_25202904_440.log: 2614152 KB
Peak RSS for osg_25202904_441.log: 2635120 KB
Peak RSS for osg_25202904_442.log: 2633364 KB
Peak RSS for osg_25202904_443.log: 2634956 KB
Peak RSS for osg_25202904_444.log: 2621944 KB
Peak RSS for osg_25202904_445.log: 2629872 KB
Peak RSS for osg_25202904_446.log: 2618224 KB
Peak RSS for osg_25202904_447.log: 2613692 KB
Peak RSS for osg_25202904_448.log: 2627920 KB
Peak RSS for osg_25202904_449.log: 2627552 KB
Peak RSS for osg_25202904_450.log: 2637964 KB
Peak RSS for osg_25202904_451.log: 2623312 KB
Peak RSS for osg_25202904_452.log: 2629712 KB
Peak RSS for osg_25202904_453.log: 2622400 KB
Peak RSS for osg_25202904_457.log: 2617028 KB
Peak RSS for osg_25202904_458.log: 2632260 KB
Peak RSS for osg_25202904_459.log: 2621640 KB
Peak RSS for osg_25202904_461.log: 2622868 KB
Peak RSS for osg_25202904_462.log: 2628236 KB
Peak RSS for osg_25202904_463.log: 2641500 KB
Peak RSS for osg_25202904_464.log: 2627864 KB
Peak RSS for osg_25202904_465.log: 2632244 KB
Peak RSS for osg_25202904_466.log: 2631156 KB
Peak RSS for osg_25202904_467.log: 2623392 KB
Peak RSS for osg_25202904_469.log: 2623260 KB
Peak RSS for osg_25202904_474.log: 2416164 KB
Peak RSS for osg_25202904_475.log: 2636284 KB
Peak RSS for osg_25202904_476.log: 2623360 KB
Peak RSS for osg_25202904_477.log: 2589000 KB
Peak RSS for osg_25202904_47.log: 2621804 KB
Peak RSS for osg_25202904_481.log: 2718756 KB
Peak RSS for osg_25202904_484.log: 2611760 KB
Peak RSS for osg_25202904_485.log: 2621916 KB
Peak RSS for osg_25202904_486.log: 2599576 KB
Peak RSS for osg_25202904_488.log: 2648196 KB
Peak RSS for osg_25202904_489.log: 2629284 KB
Peak RSS for osg_25202904_48.log: 2500956 KB
Peak RSS for osg_25202904_490.log: 2624380 KB
Peak RSS for osg_25202904_491.log: 2626716 KB
Peak RSS for osg_25202904_492.log: 2636212 KB
Peak RSS for osg_25202904_493.log: 2620636 KB
Peak RSS for osg_25202904_494.log: 2620460 KB
Peak RSS for osg_25202904_495.log: 2622148 KB
Peak RSS for osg_25202904_496.log: 2633912 KB
Peak RSS for osg_25202904_498.log: 2782436 KB
Peak RSS for osg_25202904_499.log: 2760256 KB
Peak RSS for osg_25202904_61.log: 2504656 KB
Peak RSS for osg_25202904_63.log: 2560184 KB
Peak RSS for osg_25202904_64.log: 2634648 KB
Peak RSS for osg_25202904_67.log: 2628028 KB
Peak RSS for osg_25202904_68.log: 2621624 KB
Peak RSS for osg_25202904_69.log: 2626144 KB
Peak RSS for osg_25202904_7.log: 2627608 KB
Peak RSS for osg_25202904_92.log: 2622344 KB
Peak RSS for osg_25202904_94.log: 2644336 KB
Peak RSS for osg_25202904_96.log: 2631528 KB
Peak RSS for osg_25202904_98.log: 2642816 KB
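To get a count instead of a full listing, the same per-log peak can be compared against a threshold. This is a minimal sketch: the function name `count_over` is hypothetical, and it relies on the same assumption as the one-liner above, namely that the value is the first field on lines containing `ResidentSetSize`.

```shell
# Sketch: count logs whose peak ResidentSetSize exceeds a threshold given in KB.
#   $1      = threshold in KB
#   $2 ...  = log files to scan
count_over() {
  threshold_kb=$1
  shift
  for log in "$@"; do
    # Print one peak value per log file.
    awk '/ResidentSetSize/ { if ($1 + 0 > max) max = $1 + 0 } END { print max + 0 }' "$log"
  done |
    awk -v t="$threshold_kb" '$1 > t { n++ } END { printf "%d of %d logs above %d KB\n", n, NR, t }'
}

# Example call (threshold is 2.5 GB in decimal KB; 2.5 GiB would be 2621440):
# count_over 2500000 osg_25202904_*.log
```

Choosing between the decimal and binary threshold matters here, since many of the peaks listed above fall between 2500000 KB and 2621440 KB.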
Expected Result: (what do you expect when you execute the steps above)
Actual Result: (what do you get when you execute the steps above)