Query on Event Insertion Threshold per Lumi Block in DBS #104

Open
hassan11196 opened this issue Nov 15, 2023 · 3 comments

Comments

@hassan11196
Member

This issue requests information about lumi insertions into DBS. The PnR software Unified has a limitation: lumi blocks with more than 1101 events are inserted into DBS but not announced, because they are considered to contain too many events. Workflows with such lumis are referred to as BigLumi.
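To make the check concrete, here is a minimal sketch of how lumis above the threshold could be flagged. It is not Unified's actual code: the function name `find_big_lumis` is hypothetical, the entries are assumed to look like DBS `file_lumi_list` records (`run_num`, `lumi_section_num`, `event_count`), and "more than 1101" is read as a strict comparison.

```python
# Hypothetical illustration only; not taken from Unified.
BIG_LUMI_THRESHOLD = 1101  # events per lumi, the value quoted in this issue


def find_big_lumis(file_lumi_list, threshold=BIG_LUMI_THRESHOLD):
    """Return (run_num, lumi_section_num, event_count) for every lumi
    whose event count is strictly above the threshold."""
    return [
        (entry["run_num"], entry["lumi_section_num"], entry["event_count"])
        for entry in file_lumi_list
        if entry["event_count"] > threshold
    ]


# Made-up example records, assumed to follow the file_lumi_list layout.
example_lumis = [
    {"run_num": 98, "lumi_section_num": 1, "event_count": 66},
    {"run_num": 98, "lumi_section_num": 2, "event_count": 1101},
    {"run_num": 98, "lumi_section_num": 3, "event_count": 2500},
]

print(find_big_lumis(example_lumis))  # [(98, 3, 2500)]
```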

Is the 1101 event threshold per lumi block still necessary? Understanding the rationale behind this limit would be helpful. As I understand it, this threshold exists for historical reasons, and based on Valentin's comment in a PnR meeting, it might not be relevant anymore.

Can the current DBS system handle lumis with a larger number of events, or is there a limit? If there is no limit, we will remove this threshold from our system; otherwise, we will not allow workflows with a large number of events per lumi to run.

Your insights on adjusting or managing this limitation would be greatly appreciated.

Thank you.

@haozturk

@vkuznet
Contributor

vkuznet commented Nov 15, 2023

@hassan11196 the DBS limitation is not about a concrete number, but rather about the processing time we allocate for the insertion of a specific payload. Since DBS is part of CMSWEB, the CMSWEB frontend imposes a 5 min limit on ANY HTTP request. That said, the DBS injection takes a given JSON payload and inserts it into the DBS database (ORACLE). If such a payload contains a large number of files/lumis, this time increases dramatically, and if the payload injection takes more than 5 minutes, it will be killed and the entire transaction to the DBS DB will be rolled back. Therefore, it is very hard to define a concrete number X of events per lumi, since everything depends on the actual payload size and the processing time on the ORACLE DB.

If I remember correctly, we concluded that if a payload contains 3M-5M file lumi list entries, e.g.

      "file_lumi_list": [
        {
          "lumi_section_num": 0,
          "run_num": 98,
          "event_count": 66
        },

then the injection will come close to the 5 min threshold on CMSWEB. And we imposed a specific constraint that such a payload should not exceed the 5M limit. @belforte may correct me further, but we did benchmark such injections. That said, the 1K event threshold per lumi block may be lifted if the total number of file lumi entries does not exceed 3M. Bottom line: if we lift this threshold, we need to perform a proper benchmark and see how much time the injection takes on the DBS DB/CMSWEB side.
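As a rough sketch of the kind of pre-check this implies, the snippet below counts `file_lumi_list` entries in a payload and compares the total against the 3M and 5M figures mentioned above. It is an illustration, not DBS code: the function names are hypothetical, and the assumption that the payload carries a top-level `"files"` list with one `"file_lumi_list"` per file is mine. Only the entry count is estimated; the real insertion time still has to be benchmarked.

```python
# Hypothetical pre-check; names and payload layout are assumptions.
SOFT_LIMIT = 3_000_000  # entries; lower end of the 3M-5M range quoted above
HARD_LIMIT = 5_000_000  # entries; the 5M limit quoted above


def count_file_lumi_entries(payload):
    """Total number of file_lumi_list entries over all files in the payload."""
    return sum(len(f.get("file_lumi_list", [])) for f in payload.get("files", []))


def classify_payload(payload, soft=SOFT_LIMIT, hard=HARD_LIMIT):
    """Return (entry_count, verdict), where verdict is 'ok', 'risky' or 'too large'."""
    n = count_file_lumi_entries(payload)
    if n > hard:
        return n, "too large"
    if n > soft:
        return n, "risky"
    return n, "ok"


# Made-up payload: two files with three lumi entries in total.
example_payload = {
    "files": [
        {"file_lumi_list": [
            {"lumi_section_num": 1, "run_num": 98, "event_count": 66},
            {"lumi_section_num": 2, "run_num": 98, "event_count": 9999},
        ]},
        {"file_lumi_list": [
            {"lumi_section_num": 3, "run_num": 98, "event_count": 1101},
        ]},
    ]
}

print(classify_payload(example_payload))  # (3, 'ok')
```

Note that `event_count` never enters the calculation: only the number of entries affects the estimate.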

@belforte
Member

I can't imagine why #events/lumi matters. Surely we have limitations on #lumis per block (due to insertion, as Valentin explained) and memory blow-up (in CRAB at least) when doing the dictionary manipulation for splitting by lumi in order to process those blocks. And surely "a lot of lumis" overall is bad, as it makes the lumi table big, hard to search, etc.
The point is that we do not track events in DBS; we store for each file a list of lumis with the number of events in each of those lumis, but I presume that storing 1101 or 9999 is the same.

@amaltaro

On top of what Valentin and Stefano said, having 1, 1000, or 100k events within a lumi makes no difference for DBS.

If Unified has this constraint, it is based on further workflow chaining and the time to (re)process those events. As of now, the luminosity block is still the smallest atomic processing unit, hence it cannot be split between different jobs.

In other words, the concern with "big lumis" is the potential inability to process them in a future workflow. WM and DBS have no problems with that.
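To illustrate why an unsplittable lumi matters downstream, here is a toy greedy splitter that packs whole lumis into jobs up to a target number of events. This is not the WMCore or CRAB splitting algorithm; the function name `split_by_lumi`, the `(lumi_id, event_count)` input and the `events_per_job` parameter are all assumptions made for the example.

```python
# Toy illustration of lumi-atomic job splitting; not the real WM algorithm.
def split_by_lumi(lumis, events_per_job):
    """Pack whole lumis into jobs, starting a new job when adding the next
    lumi would exceed events_per_job. A lumi is never split, so a single
    big lumi yields a job larger than the target.

    lumis: iterable of (lumi_id, event_count) pairs.
    Returns a list of (lumi_ids, total_events) tuples, one per job.
    """
    jobs = []
    current_ids, current_events = [], 0
    for lumi_id, n_events in lumis:
        # Close the current job if this lumi would push it over the target.
        if current_ids and current_events + n_events > events_per_job:
            jobs.append((current_ids, current_events))
            current_ids, current_events = [], 0
        current_ids.append(lumi_id)
        current_events += n_events
    if current_ids:
        jobs.append((current_ids, current_events))
    return jobs


jobs = split_by_lumi([(1, 300), (2, 400), (3, 100_000), (4, 200)], events_per_job=1000)
print(jobs)  # [([1, 2], 700), ([3], 100000), ([4], 200)]
```

In this example the job holding lumi 3 has to process 100,000 events even though the target is 1,000, which is the kind of case where a later workflow may not be able to process the lumi in a reasonable time.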
