-
Notifications
You must be signed in to change notification settings - Fork 434
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UCP/PROTO/RECONFIG: Fix copy header handling in reconfig. #10452
base: master
Are you sure you want to change the base?
UCP/PROTO/RECONFIG: Fix copy header handling in reconfig. #10452
Conversation
src/ucp/proto/proto_reconfig.c
Outdated
@@ -51,6 +53,15 @@ static ucs_status_t ucp_proto_reconfig_progress(uct_pending_req_t *self) | |||
return UCS_OK; | |||
} | |||
|
|||
if (ucs_unlikely(ucp_proto_config_is_am(req->send.proto_config) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can there be ucp_proto_request_restart()
call (from wireup) on this request before any other AM proto calls, but after ucp_proto_reconfig_progress()
? If yes I suspect this could be an issue because of:
-
Line 1758 in c4e66dc
ucs_assert(!(request->flags & UCP_REQUEST_FLAG_USER_HEADER_COPIED)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I see ucp_proto_request_restart
calls reset then proto selection is done and ucp_request_send
is called again.
If it choose ZCOPY then ucp_am_eager_zcopy_pack_user_header
will be called and will set this flag to 0, but perhaps I'm wrong.
test/gtest/ucp/test_ucp_sockaddr.cc
Outdated
&sb[0], size, ¶m); | ||
if (flags & UCP_AM_SEND_FLAG_COPY_HEADER) { | ||
ucs::fill_random(shdr_cpy); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe fill with specific value, and also free?
|
||
#include <ucp/core/ucp_worker.inl> | ||
#include <ucp/am/ucp_am.inl> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Order
@@ -11,8 +11,10 @@ | |||
#include "proto_debug.h" | |||
#include "proto_select.h" | |||
#include "proto_common.inl" | |||
#include "proto_am.inl" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Order
if (ucp_proto_config_is_am(req->send.proto_config)) { | ||
ucp_am_release_user_header(req); | ||
} | ||
ucp_request_complete_send(req, status); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add blank line
src/ucp/proto/proto_reconfig.c
Outdated
static ucs_status_t ucp_proto_reconfig_progress(uct_pending_req_t *self) | ||
{ | ||
ucp_request_t *req = ucs_container_of(self, ucp_request_t, send.uct); | ||
ucp_ep_h ep = req->send.ep; | ||
ucs_status_t status; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move down
test/gtest/ucp/test_ucp_sockaddr.cc
Outdated
@@ -2526,6 +2526,7 @@ class test_ucp_sockaddr_protocols : public test_ucp_sockaddr { | |||
std::string sb(size, 'x'); | |||
std::string rb(size, 'y'); | |||
std::string shdr(hdr_size, 'x'); | |||
std::string shdr_cpy = shdr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cpy
-> copy
test/gtest/ucp/test_ucp_sockaddr.cc
Outdated
&sb[0], size, ¶m); | ||
if (flags & UCP_AM_SEND_FLAG_COPY_HEADER) { | ||
shdr_cpy[0] = 'X'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add an explanation (comment)
test/gtest/ucp/test_ucp_sockaddr.cc
Outdated
* To check UCP_AM_SEND_FLAG_COPY_HEADER we change AM header | ||
* content while the request is still in pending queue.*/ | ||
if (flags & UCP_AM_SEND_FLAG_COPY_HEADER) { | ||
shdr_copy[0] = 'X'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- the shdr_copy should remain the original value and shdr should be "corrupted"
- i would fill entire "corrupted" header with aaa..
src/ucp/proto/proto_reconfig.c
Outdated
UCP_AM_SEND_FLAG_COPY_HEADER))) { | ||
status = ucp_proto_am_req_copy_header(req); | ||
if (ucs_unlikely(status != UCS_OK)) { | ||
return status; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if copy fails we should abort the operation and return OK from progress. pls see how it's done in other places ucp_proto_am_req_copy_header is used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After checking code again we noticed that also for other protocols in protov2 status != UCS_OK
case is not handled properly. I will fix it and add it to this PR.
src/ucp/proto/proto_reconfig.c
Outdated
return UCS_OK; | ||
} | ||
|
||
if (ucs_unlikely(ucp_proto_config_is_am(req->send.proto_config) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not fast path so ucs_unlikely is not needed here
What?
ucp_proto_reconfig_progress
.status != UCS_OK
correctly.Why?
When using sockaddr to establish connection requests go into pending queue but user header is not being copy even if
UCP_AM_SEND_FLAG_COPY_HEADER
is enabled. It breaks API and can cause data corruption.fix #10424
How?
Copy user header to internal UCX buffer if needed like done in other protocols.