Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

handle EINTR return code from tcdrain to effectively flush #225

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ndusart
Copy link

@ndusart ndusart commented Oct 23, 2024

Hi, this PR adds the handling of EINTR return code from tcdrain.

If returning EINTR, it does not mean an actual error for the flushing operation and this should be retried.

@sirhcel
Copy link
Contributor

sirhcel commented Oct 23, 2024

Thank you for looking into this and opening this PR!

Recently there was a discussion regarding EINTR and returning errors in #215 (comment). We discussed this in the context of std::io::Read::read and for this operation it was clearly stated that read is supposed to return an error with ErrorKind::Interrupted in this case.

The same is stated for std::io::Write::write and from my gut instinct I would have expected it to be the same for flush. But it isn't. So repeating the underlying flush operation is likely the right thing to do in this case like for example the default implementation of Write::write_all documents and does. Let me sleep over it.

Nonetheless, the (previously) returned error kind ErrorKind::Other is not what I would have expected here nor does the documentation for flush mandate such a behavior. Shouldn't the underlying error from tcdrain being "bubbled up" like in is done for read with From<nix::Error> for Error?

@ndusart
Copy link
Author

ndusart commented Oct 23, 2024

@sirhcel if the specific error is forwarded as an ErrorKind::Interrupted that would indeed be a solution too, the main issue here is that this error cannot be differentiated in regards to other errors.

It is probably a better approach to indeed let the client side decide how to handle this error.

Maybe we should instead have a method in std::io::Write trait similar to write_all but for flush. But, definitely not the place to discuss this ^^'

I'd favor your proposition then :)

@sirhcel
Copy link
Contributor

sirhcel commented Oct 23, 2024

@sirhcel if the specific error is forwarded as an ErrorKind::Interrupted that would indeed be a solution too, the main issue here is that this error cannot be differentiated in regards to other errors.

Then we agree on the point that flush should return distinguishable error values. Good.

It is probably a better approach to indeed let the client side decide how to handle this error.

Under which circumstances are you getting EINTR in flush?

I asked the question on how flush is expected to behave in the t-libs Zulip channel and the response made a point for retrying.

In case of Ctrl + C automatically retrying at "library level" might interfere with the user's intention of stopping the program execution (if flushing takes a significant amount of time). Returning ErrorKind::Interrupted lays this decision into the hands of the application. There have been no other issues with errors from flush so far and so I'm assuming that returning ErrorKind::Interrupted in this case is fine.

Maybe we should instead have a method in std::io::Write trait similar to write_all but for flush. But, definitely not the place to discuss this ^^'

I'd favor your proposition then :)

Then let's "bubble up" the right error, document this behavior and we should be sorted here.

@ndusart
Copy link
Author

ndusart commented Oct 24, 2024

Under which circumstances are you getting EINTR in flush?

I have a library using serialport exposing a blocking API to communicate with a specific device and we export a C API to reuse it on multiple projects. We have this behaviour when the library is used in a Flutter application.

Flutter application runs on a asynchronous event loop and I guess application must receive some signals that are sent internally by the flutter engine to drive the loop.

You make a point about signal from Ctrl+C. And basically looping through tcdrain would make a similar issue as the timeout issue for read_exact or write_all (#177) and potentially could deadlock in some situations.

Then let's "bubble up" the right error, document this behavior and we should be sorted here.

Yes indeed. Do you prefer that I rebase the PR with this fix or will you do it yourself and we close this ?

@sirhcel
Copy link
Contributor

sirhcel commented Oct 27, 2024

Thank you for the information on the background of this PR!

You make a point about signal from Ctrl+C. And basically looping through tcdrain would make a similar issue as the timeout issue for read_exact or write_all (#177) and potentially could deadlock in some situations.

Sleeping over it for some nights put my decisiveness a step back. Because - as you're mentioning here - read_exact and write_all are retrying on ErrorKind::Interrupted. So we are already bought into retrying in with these methods.

Let's assume for a moment that #177 is fixed. The execution time of write_all and read_exact would be limited by this timeout in the case of ErrorKind::Interrupted from the underlying write/read. So what about applying the same semantics to flush as well? Retrying, but with timeout as the limit? Wouldn't this behavior follow the documentation of Write::flush more closely than returning ErrorKind::Interrupted for EINTR?

Timeout behavior is not specified by Read and Write. But timeouts are a specialty of our implementation for serial ports and could be seem as something applying to all Read and Write operations.

What's your opinion on this @ndusart?

Yes indeed. Do you prefer that I rebase the PR with this fix or will you do it yourself and we close this ?

Would be great if you could lend a hand.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants