Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retry poll on EINTR/EAGAIN instead of debug printing #215

Closed
wants to merge 3 commits into from

Conversation

ThomasHabets
Copy link

EINTR is a perfectly cromulent return code, and should not trigger any output to stderr. In fact, no error should cause a library to print to stderr, but especially not a syscall abort that's not even an error condition.

@sirhcel
Copy link
Contributor

sirhcel commented Aug 31, 2024

Thank you very much for spotting this and your PR @ThomasHabets! Were you running into this issue regularly? Are you able to reproduce this in a "clean room" environment at best (for deriving a regression test)?

The dbg usage was accidentally checked in with 5043ce7.
The docs for the implemetation of Sub for Instant state that the
operation currently saturates but future versions may reintroduce
panics.
@sirhcel
Copy link
Contributor

sirhcel commented Sep 1, 2024

Just a little nitpick: The documentation of the implementation of Sub::sub for Instant states that future implementations might panic. So let's just use the saturating variant from the start.

@ThomasHabets
Copy link
Author

I found that the tests were not just a matter of running cargo test, so my only testing is while running this in my real application, where before this PR I got this debug line every single time I press Ctrl-C in my example program, since it almost all the time has an outstanding serial port read. And after the PR (with a path=… in my Cargo.toml) my PR was applied.

@ThomasHabets
Copy link
Author

Good point about the Sub. I did not know that.

@sirhcel
Copy link
Contributor

sirhcel commented Sep 5, 2024

After sleeping over it, i would not retry reading on EINTR because its likely the intent to interrupt a (read) operation in this situation. Like in your case when pressing Ctrl + C.

I'm still undecided on EAGAIN. We should not encounter this after a successful poll, but on the other hand ti would be the right thing to do if we ever do. Have you actually seen EAGAIN in the wild or is this a precaution in the wake of EINTR?

I'm with you, that serialport-rs should not generate any output to stdout or stderr. So I've set up #216 to clean my leftovers from a previous commit quickly and independently of our discussion here. May be the latter is already everything need to solve your actual issue for you.

@ThomasHabets
Copy link
Author

You may be right. The complete solution may be to just remove the debug print. Rust seems to document using ErrorKind::Interrupted for trait Read, not retry it.

glibc provides TEMP_FAILURE_RETRY as a pattern for applications, and almost always EINTR should be retried.

IIRC the actual behaviour is actually specific to which Unix you're using. E.g. it used to be the case that if you ran sleep 10, then suspended (^Z) and then resumed (fg), sleep would immediately exit. Whereas on Solaris IIRC the syscall would not return just because signal handling code ran. It was syscall reentrant.

It's been a long time since I looked at this, but the application could also be running with SA_RESTART.

That is to say that you can't rely on EINTR ever happening. AFAIK it's perfectly POSIX OK for the OS to just continue the syscall.

So I guess just make sure ErrorKind is Interrupt, and that should be good enough.

@sirhcel
Copy link
Contributor

sirhcel commented Sep 20, 2024

You may be right. The complete solution may be to just remove the debug print.

This fix #216 is released trough #217 as 4.5.1 just now. Thank you @eldruin for your swift review!

Rust seems to document using ErrorKind::Interrupted for trait Read, not retry it.

And it looks like we are doing exactly that in wait_fd through the conversions

glibc provides TEMP_FAILURE_RETRY as a pattern for applications, and almost always EINTR should be retried.

Thanks for the reference! I did not know this macro. Do you know if there is a similar adapter available in the rust ecosystem?

That is to say that you can't rely on EINTR ever happening. AFAIK it's perfectly POSIX OK for the OS to just continue the syscall.

So I guess just make sure ErrorKind is Interrupt, and that should be good enough.

Looks like we're already there with respect to the error propagation. And with removed debug output released, there is nothing left to here. Thank you for the fruitful discussion! Please feel free to reopen the issue if i missed something @ThomasHabets.

@sirhcel sirhcel closed this Sep 20, 2024
@sirhcel
Copy link
Contributor

sirhcel commented Sep 20, 2024

IIRC the actual behaviour is actually specific to which Unix you're using. E.g. it used to be the case that if you ran sleep 10, then suspended (^Z) and then resumed (fg), sleep would immediately exit. Whereas on Solaris IIRC the syscall would not return just because signal handling code ran. It was syscall reentrant.

It's been a long time since I looked at this, but the application could also be running with SA_RESTART.

Thanks, this is a good pointer too. I will keep this in mind and if users start to run into EINTR frequently, we could add this to the documentiation and and example. But as this does no seem to bother many users right now, I feel like not adrressing this in the first place will help users getting to results without distracting them with things they don't run into on a daily basis.

@ThomasHabets
Copy link
Author

@sirhcel I don't know if there's anything like TEMP_FAILURE_RETRY in Rust.

Yeah, this all seems good to me. Thanks for fixing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants