MSUnmerged: Manual cleanup of /store/unmerged area at T2_CH_CERN #11904
Comments
And here is the Jira ticket for requesting access to the Once we are done with the manual cleanup, we should come back and cross-check whether these permission errors from this Jira ticket are still present: https://its.cern.ch/jira/browse/CMSVOC-491. As mentioned in the ticket as well, we currently cannot even load the needed objects in memory and run the MSUnmerged service for [1]
|
@todor-ivanov before getting write access to the unmerged area, why don't you temporarily increase the resource requirements of the service, so that it can go through the first run and get back to a more normal load? Please also help me understand why we have accumulated so much unneeded data at CERN. Is it: |
Only the combination of the last 3:
We did have a certificate issue for |
Given the lack of communication in the ticket #491 above, should we re-open it? For the watchdog kill, are you seeing it in kubernetes or in your own environment? |
It is our Watchdog that is killing the process with I do not have the permissions to reopen this ticket, I hope @arooshap does: https://its.cern.ch/jira/browse/CMSVOC-491?focusedId=6244590&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-6244590 |
I don't understand why this is a high priority issue. This seems like it should be an "Operations" issue. We didn't plan for this. It's extra and only being considered because of its operational impact. @todor-ivanov, if this is correct can you please recategorize this as an Operations issue? |
Hi @klannon I did categorize it and labeled it as Operations from the very beginning. The only thing I can do is to completely remove the |
@todor-ivanov That is not correct. You need to set |
Finally CERN is clean: [1] I am closing this issue now. |
Awesome! Should we reinsert T2_CH_CERN into the MSUnmerged configuration once the unmerged consistency monitor dump is at a more manageable size? |
I think we can do it immediately |
Impact of the bug
T2_CH_CERN site
Describe the bug
As a consequence of bug #11893, we have accumulated almost 1 PB of data in the /store/unmerged area at T2_CH_CERN. In combination with that, the prolonged fix of the Gfal recursive errors has led to a lot of empty directories accumulating as well. These two factors make an initial manual intervention necessary to clean them up through the /eos mount point. For that we have to either ask the VOC for delete permissions on the /eos/cms/store/unmerged area, or provide a list of deletable objects and let the VOC or the DM Team do the deletions.
How to reproduce it
N/A
Expected behavior
Follow up on the manual intervention and delete all unneeded objects
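For the list of deletable empty directories mentioned in the bug description, a minimal dry-run sketch could look like the one below. It assumes the area is reachable as an ordinary POSIX mount and uses only the Python standard library. The function name find_empty_dirs is hypothetical; this is not the MSUnmerged logic, only an illustration of collecting directories that hold no files at any depth, without deleting anything.

```python
import os


def find_empty_dirs(root):
    """Return directories below root that contain no files at any depth.

    Hypothetical helper, dry run only: nothing is deleted here.
    The result is ordered deepest first, so children precede parents.
    """
    empty = set()
    # Walk bottom-up so every child is classified before its parent.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        if filenames:
            continue
        # Deletable only if every subdirectory was already found empty.
        # Symlinked directories are never in the set, so they block removal.
        if all(os.path.join(dirpath, d) in empty for d in dirnames):
            empty.add(dirpath)
    # Never propose the top-level area itself.
    empty.discard(root)
    return sorted(empty, key=lambda p: (-p.count(os.sep), p))
```

Called on the mounted unmerged area, the returned list could be handed over as is, or reviewed before anyone with delete permissions removes the entries in the given order.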
Additional context and error message
None