[Dynamic Buffer][Mellanox] Skip PGs in pending deleting set while checking accumulative headroom of a port #2871
Conversation
[Dynamic Buffer][Mellanox] Skip PGs in pending deleting set while checking accumulative headroom of a port
Signed-off-by: Stephen Sun <[email protected]>
… be counted
Signed-off-by: Stephen Sun <[email protected]>
The vs test case: with the fix, the test succeeds; without the fix, the test fails, which is expected.
[Dynamic Buffer][Mellanox] Skip PGs in pending deleting set while checking accumulative headroom of a port (#2871)

**What I did**

Skip the PGs that are about to be removed in the Lua script while checking the accumulative headroom of a port.

**Why I did it**

This avoids the following error message:

```
Jul 26 14:59:04.754566 r-moose-02 ERR swss#buffermgrd: :- guard: RedisReply catches system_error: command: .*, reason: ERR Error running script (call to f_a02f6f856d876d607a7ac81f5bc0890cad68bf71): @user_script:125: user_script:125: attempt to perform arithmetic on local 'current_profile_size' (a nil value): Input/output error
```

This happens because:

1. Too many notifications accumulated in the `m_toSync` queue, belonging to `BUFFER_PROFILE_TABLE` and `BUFFER_PG_TABLE`.
2. Even though the buffer manager removed the buffer PG before the buffer profile, the orchagent handled the buffer profile before the buffer PG, i.e. in reverse order. As a result, the buffer PG notification was still in the `BUFFER_PG_TABLE_DELSET` set and the PG remained in `BUFFER_PG_TABLE` while the buffer profile had already been removed from `BUFFER_PROFILE_TABLE`.
3. When the script checked the accumulative headroom, it fetched all items from the APPL_DB tables and found the to-be-removed buffer PG, but not the buffer profile, which had been removed in step 2.
4. Consequently, it complained that `current_profile_size` was nil.

Fix: do not check buffer PGs that are in `BUFFER_PG_TABLE_DELSET`.

**How I verified it**

Regression and manual test.
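The fix itself lives in a Redis Lua script; the sketch below restates its core idea in Python under assumed data shapes (the table and set names mirror `BUFFER_PG_TABLE`, `BUFFER_PROFILE_TABLE`, and `BUFFER_PG_TABLE_DELSET`, but the function, keys, and field layout are hypothetical, not the actual script):

```python
def accumulative_headroom(pg_table, profile_table, pending_delete_set):
    """Sum the profile sizes of a port's buffer PGs, skipping PGs that
    are queued for deletion (whose profile may already be gone)."""
    total = 0
    for pg_key, profile_name in pg_table.items():
        # The fix: a PG in the pending-deleting set may reference a
        # profile that was already removed, so do not check it.
        if pg_key in pending_delete_set:
            continue
        total += profile_table[profile_name]["size"]
    return total

pg_table = {"Ethernet0:3-4": "pg_lossless_profile",  # queued for deletion
            "Ethernet0:0": "ingress_lossy_profile"}
# The lossless profile was already removed; only the lossy one remains.
profile_table = {"ingress_lossy_profile": {"size": 4096}}
pending = {"Ethernet0:3-4"}

print(accumulative_headroom(pg_table, profile_table, pending))  # 4096
```

Without the `continue`, the lookup for `pg_lossless_profile` would fail, which is the Python analogue of the Lua script's "arithmetic on nil" error.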
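The ordering race in step 2 can be illustrated with a small Python simulation. Everything here is a hypothetical stand-in for the real components (buffermgrd, orchagent, and the APPL_DB tables); it only shows how draining deletions in reverse order leaves a PG pointing at a missing profile:

```python
from collections import deque

# The buffer manager enqueues removals in the intended order:
# the PG first, then the profile it references.
to_sync = deque([("BUFFER_PG_TABLE", "DEL", "Ethernet0:3-4"),
                 ("BUFFER_PROFILE_TABLE", "DEL", "pg_lossless_profile")])

pg_table = {"Ethernet0:3-4": "pg_lossless_profile"}
profile_table = {"pg_lossless_profile": {"size": 19456}}
pending_delete_set = set()

# Suppose the consumer handles BUFFER_PROFILE_TABLE notifications first,
# reversing the intended order (the race described above):
for table, op, key in sorted(to_sync,
                             key=lambda n: n[0] != "BUFFER_PROFILE_TABLE"):
    if table == "BUFFER_PROFILE_TABLE":
        profile_table.pop(key, None)      # profile removed immediately
    else:
        pending_delete_set.add(key)       # PG removal still pending

# A naive headroom check now sees the PG but not its profile:
profile = profile_table.get(pg_table["Ethernet0:3-4"])
print(profile)  # None -- the Lua analogue is nil, hence "arithmetic on nil"
```

This is why the check must consult the pending-delete set rather than assume every PG in the table has a live profile.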