Fix access violation in resume_after and resume_on_signal #1342
Fixes microsoft#1329
This fixes the issue by using m_state before suspending, rather than after. The issue occurs because the callback could fire before our thread of execution resumes, causing the timespan_awaiter/signal_awaiter to be destroyed inside the coroutine frame before m_state is accessed.
As a drive-by improvement: currently, if await_suspend is called with a non-idle state, the threadpool object is closed (cancelling the timer/wait), the existing coroutine handle is just dropped, and resume on the new handle is fired immediately. This would cause the existing pending coroutine to hang forever. Instead, avoid doing anything and throw an exception when the awaiter is not idle. This is a very unlikely event (the test does some gymnastics with a reused awaiter to reach this state), but better safe than sorry.
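To make the two points in the description concrete, here is a minimal single-threaded sketch. It is not the C++/WinRT code: `single_use_awaiter`, `suspend`, `fire`, and `reuse_is_rejected_demo` are hypothetical names, and a `std::function` stands in for the coroutine handle. It shows an awaiter that refuses to be armed while it is not idle, and a callback that finishes all member access before running the continuation.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for timespan_awaiter/signal_awaiter.
// Single-threaded on purpose; this is not the C++/WinRT source.
class single_use_awaiter
{
public:
    // Plays the role of await_suspend.
    void suspend(std::function<void()> resume)
    {
        if (m_state != state::idle)
        {
            // Leave the pending continuation untouched and report the misuse.
            throw std::logic_error("awaiter is not idle");
        }
        m_state = state::pending;
        m_resume = std::move(resume);
    }

    // Plays the role of the threadpool callback.
    void fire()
    {
        if (m_state != state::pending)
        {
            return;
        }
        // Finish every member access first: in the real coroutine case the
        // awaiter may be destroyed as soon as the continuation runs.
        std::function<void()> resume = std::move(m_resume);
        m_resume = nullptr;
        m_state = state::idle;
        resume();
    }

private:
    enum class state { idle, pending };
    state m_state{ state::idle };
    std::function<void()> m_resume;
};

// A second suspend() is rejected, and the first continuation still runs once.
inline void reuse_is_rejected_demo()
{
    single_use_awaiter awaiter;
    int first = 0;
    int second = 0;
    awaiter.suspend([&] { ++first; });

    bool threw = false;
    try
    {
        awaiter.suspend([&] { ++second; });
    }
    catch (std::logic_error const&)
    {
        threw = true;
    }

    awaiter.fire();
    assert(threw);
    assert(first == 1);
    assert(second == 0);
}
```

The point of throwing instead of replacing the continuation is that the first waiter is never silently dropped, which is the hang the description warns about.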
CI build failing with:
Yep, taking a look
A driver update hosed my system install (I love Windows), will take a bit more time than expected
…b.com/sylveon/cppwinrt into user/sylveon/awaiter-access-violation
I stopped trying to be clever and just used a mutex, that way we can be sure that await_suspend completes before the threadpool callback resumes the coroutine. I also took the opportunity to refactor the code to use CRTP and share implementation between timers and waits, as well as insert some useful short circuits in case of cancellation.
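The mutex approach described in this comment can be sketched roughly as follows. This is an illustration under assumptions, not the code from the PR: `locked_awaiter` and its members are hypothetical, a `std::thread` plays the role of the Windows threadpool, and a `std::function` stands in for the coroutine handle. The callback takes the same lock that `suspend` holds, so it cannot reach the continuation until `suspend` has finished.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in; std::thread plays the role of the threadpool.
class locked_awaiter
{
public:
    ~locked_awaiter() { wait(); }

    // Plays the role of await_suspend.
    void suspend(std::function<void()> resume)
    {
        std::lock_guard<std::mutex> guard(m_lock);
        m_resume = std::move(resume);
        // The callback may start on another thread right away...
        m_worker = std::thread([this] { callback(); });
        // ...but it blocks on m_lock, so this bookkeeping still comes first.
        m_events.push_back("suspend finished");
    }

    // Blocks until the callback has run (if one was started).
    void wait()
    {
        if (m_worker.joinable())
        {
            m_worker.join();
        }
    }

    // Only call after wait(), so the worker is no longer writing.
    std::vector<std::string> const& events() const { return m_events; }

private:
    void callback()
    {
        std::function<void()> resume;
        {
            // Waits here until suspend() has released the lock.
            std::lock_guard<std::mutex> guard(m_lock);
            m_events.push_back("callback running");
            resume = std::move(m_resume);
        }
        // Run the continuation outside the lock, using only the local copy.
        resume();
    }

    std::mutex m_lock;
    std::function<void()> m_resume;
    std::vector<std::string> m_events;
    std::thread m_worker;
};

// suspend() always finishes before the callback touches shared state.
inline void suspend_finishes_first_demo()
{
    locked_awaiter awaiter;
    bool resumed = false;
    awaiter.suspend([&] { resumed = true; });
    awaiter.wait();

    assert(resumed);
    assert(awaiter.events().size() == 2);
    assert(awaiter.events()[0] == "suspend finished");
    assert(awaiter.events()[1] == "callback running");
}
```

In this sketch the awaiter outlives the worker because the destructor joins it; a real awaiter living in a coroutine frame has no such guarantee, which is why the callback copies what it needs under the lock before resuming.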
#if defined(__GNUC__) && !defined(__clang__)
Removed this because the upstream bug has been fixed since GCC 12
Everything should work now :)
@oldnewthing are you able to review this?
@kennykerr Will try to get to it, but kind of busy with other API stuff.
This pull request is stale because it has been open 10 days with no activity. Remove stale label or comment or this will be closed in 5 days.
Bump
Can someone reopen?
Bump
Is there anything preventing this from being merged?
I'm not able to review this code #1329 (comment). My preference is to strip coroutine support in C++/WinRT down to the basics, removing all of the helpers, cancellation, and progress support as it all appears somewhat unreliable and buggy and is too difficult to reason about.
Migrating all of that to WIL seems preferred (I'm not opposed to that), but then it would be a breaking change that requires C++/WinRT 3. Maybe @oldnewthing can review this until then
The PR description says "This fixes the issue by using m_state before suspending, rather than after." But the change is much bigger than this sentence suggests. Can we untangle the essential fix from the drive-by fix?
Making the fix separately in both classes would not lead to a much better PR diff unfortunately, so I merged them together into a single base class to reduce code duplication (and therefore less effort to review).
In short, I don't think untangling the drive-by would make it much easier to read.
Closing this PR as there's been no real activity in nearly 8 months. If a project maintainer is available to review, they can always reopen this PR as needed.
Why hasn't Microsoft got anyone to look after C++/WinRT?