-
Notifications
You must be signed in to change notification settings - Fork 158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Debug/Fix #485 #495
[WIP] Debug/Fix #485 #495
Conversation
- Add Stage::introspection() accessor - Introspection: debug-log solution registration - RemoteSolutionModel: show internal solution id as tooltip in 1st column
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master #495 +/- ##
==========================================
- Coverage 59.00% 58.99% -0.01%
==========================================
Files 90 90
Lines 8485 8497 +12
==========================================
+ Hits 5006 5012 +6
- Misses 3479 3485 +6 ☔ View full report in Codecov by Sentry. |
65537fc
to
00074ef
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was obviously a lot of tedious debugging (which I started to repeat yesterday before noticing this PR again). Thank you for these patches.
Please see my inline comments. I would love to merge this upstream, but am not clear on some details yet.
@@ -487,7 +487,8 @@ struct SolutionCollector | |||
solutions.emplace_back(std::make_pair(trace, prio)); | |||
} else { | |||
for (SolutionBase* successor : next) { | |||
assert(!successor->isFailure()); // We shouldn't have invalid solutions | |||
if (successor->isFailure()) | |||
continue; // skip failures |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to get my head around why this happens in some situations I encountered in the past and failed.
Can you describe a scenario where the assert will be violated with pruning active?
I'm fine with changing it because it is a step necessary if we want to have a flag that disables pruning for a whole task (any unpruned stage with a dead end on the other side of a partial solution will trigger this assert), but I do not yet understand why it can be invalid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about the reasons anymore (it's nearly half a year ago that I handled this).
I think we cannot be sure that only valid solutions enter the start/end interfaces. For example, Connect stages do preserve all InterfaceStates, both successes and failures. We just save failures for later inspection as well.
Thus, we need to explicitly filter them here.
I remember that this was a no-brainer when encountering it. But now I don't remember a concrete example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's nearly half a year ago that I handled this
Fair enough, my fault for letting it lie around for so long.
For example, Connect stages do preserve all InterfaceStates, both successes and failures.
Of course, but they also prune the paths on failure. so unless the solution path forks, the other end should not compute at all (as the interface state is pruned). I came up with an example case below where the assert is violated with forking solution paths:
But I observed a rare stochastic assert in the following situation some days ago where turn
yielded a solution which hit the assert when propagating back the priority to the failed attempt in approach grasp
:
In this situation there are no obvious forking solution paths (unless ComputeIK factors through a single pose in the wrapped generator, but that should not happen?) and the failure in approach grasp
should have pruned the only connected interface start state in turn
, so the compute call that generated the turn
solution should not have happened.
I tried to write a test case for this, but failed to do so in reasonable time.
So TLDR: I guess dropping the assert is correct, but also obfuscates bugs in the pruning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strange that you cannot reproduce the error in a simpler unit test.
When adding pending state pairs for a new incoming state to Connect, we have to re-enable opposite states from ARMED state. This changes the order of states in the interface. If we do this while iterating over the states, we might add pairs multiple times, because iteration continues with same state at an earlier position.
5674df2
to
e163f57
Compare
Attempt to fix #485 (comment):
SolutionCollector
:https://github.com/ros-planning/moveit_task_constructor/blob/564804bd7bb9e320b304b8c4880d96c6ce63b915/core/src/container.cpp#L491
compute()
calls)