count_total panic with advanced trace #526
Comments
Almost certainly not intended. Will investigate today; thanks very much for the report! |
diff --git a/src/operators/count.rs b/src/operators/count.rs
index e44a59a8..23572301 100644
--- a/src/operators/count.rs
+++ b/src/operators/count.rs
@@ -69,7 +69,8 @@ where
self.stream.unary_frontier(Pipeline, "CountTotal", move |_,_| {
- // tracks the upper limit of known-complete timestamps.
+ // tracks the lower and upper limits of known-complete timestamps.
+ let mut lower_limit = timely::progress::frontier::Antichain::from_elem(<G::Timestamp as timely::progress::Timestamp>::minimum());
let mut upper_limit = timely::progress::frontier::Antichain::from_elem(<G::Timestamp as timely::progress::Timestamp>::minimum());
move |input, output| {
@@ -80,7 +81,8 @@ where
let mut session = output.session(&capability);
for batch in buffer.drain(..) {
let mut batch_cursor = batch.cursor();
- let (mut trace_cursor, trace_storage) = trace.cursor_through(batch.lower().borrow()).unwrap();
+ trace.advance_upper(&mut lower_limit);
+ let (mut trace_cursor, trace_storage) = trace.cursor_through(lower_limit.borrow()).unwrap();
upper_limit.clone_from(batch.upper());
while let Some(key) = batch_cursor.get_key(&batch) {
diff --git a/src/operators/threshold.rs b/src/operators/threshold.rs
index 3d8a405a..027193fb 100644
--- a/src/operators/threshold.rs
+++ b/src/operators/threshold.rs
@@ -117,7 +117,8 @@ where
self.stream.unary_frontier(Pipeline, "ThresholdTotal", move |_,_| {
- // tracks the upper limit of known-complete timestamps.
+ // tracks the lower and upper limits of known-complete timestamps.
+ let mut lower_limit = timely::progress::frontier::Antichain::from_elem(<G::Timestamp as timely::progress::Timestamp>::minimum());
let mut upper_limit = timely::progress::frontier::Antichain::from_elem(<G::Timestamp as timely::progress::Timestamp>::minimum());
move |input, output| {
@@ -126,9 +127,9 @@ where
batches.swap(&mut buffer);
let mut session = output.session(&capability);
for batch in buffer.drain(..) {
-
+ trace.advance_upper(&mut lower_limit);
let mut batch_cursor = batch.cursor();
- let (mut trace_cursor, trace_storage) = trace.cursor_through(batch.lower().borrow()).unwrap();
+ let (mut trace_cursor, trace_storage) = trace.cursor_through(lower_limit.borrow()).unwrap();
upper_limit.clone_from(batch.upper()); |
Fixes a bug where we'd obtain a trace's cursor based on a batch's lower frontier. The trace can be compacted past the batch's lower frontier, which violates `cursor_through`'s assumptions. Instead, we use the trace's upper to obtain the cursor. Fixes TimelyDataflow#526. Signed-off-by: Moritz Hoffmann <[email protected]>
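To make the commit message above concrete, here is a minimal toy sketch of why a batch's lower frontier can stop being a usable cut. It is not differential dataflow code: `ToyBatch`, `ToyTrace`, and `merge_all` are hypothetical names, times are plain `u32` values rather than antichains, and the "clean cut" rule is an assumption modeled on the documented expectation that `cursor_through` is called with a frontier that is still a batch boundary in the trace.

```rust
/// Toy batch, described only by its `[lower, upper)` time bounds.
#[derive(Debug, Clone, PartialEq)]
struct ToyBatch {
    lower: u32,
    upper: u32,
}

/// Toy trace: an ordered list of contiguous batches (hypothetical model).
struct ToyTrace {
    batches: Vec<ToyBatch>,
}

impl ToyTrace {
    /// Returns how many batches lie wholly before `through`, but only when
    /// `through` is exactly a batch boundary (a "clean cut").
    fn cursor_through(&self, through: u32) -> Option<usize> {
        if self.batches.first().map_or(true, |b| b.lower == through) {
            return Some(0);
        }
        self.batches
            .iter()
            .position(|b| b.upper == through)
            .map(|index| index + 1)
    }

    /// Merges every batch into one, as physical compaction is allowed to do.
    fn merge_all(&mut self) {
        let bounds = match (self.batches.first(), self.batches.last()) {
            (Some(first), Some(last)) => Some((first.lower, last.upper)),
            _ => None,
        };
        if let Some((lower, upper)) = bounds {
            self.batches = vec![ToyBatch { lower, upper }];
        }
    }

    /// The upper bound of the last batch; always a boundary of the trace.
    fn upper(&self) -> u32 {
        self.batches.last().map_or(0, |b| b.upper)
    }
}
```

In this model, a trace with batches `[0,1)`, `[1,2)`, `[2,3)` answers `cursor_through(1)` with `Some(1)`. After `merge_all()` the same call returns `None`, which is the case where an `unwrap()` would panic, while `cursor_through(trace.upper())` still succeeds.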
@antiguru After the patch, the demo program no longer panics, but it produces invalid data.
You can see the historical count was never extracted; the correct output should be
I think the main problem is that Another way is we can handle all the data in the trace in the first place, then we do the computation batch by batch, the |
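As a rough illustration of why the historical count matters, the sketch below models one `count_total`-style step over a batch. It is a toy under stated assumptions, not the operator's implementation: there are no timestamps, `history` stands in for whatever the trace has already accumulated, and `count_batch` is a hypothetical name.

```rust
use std::collections::BTreeMap;

/// Processes one batch of `(key, diff)` updates against `history`, the counts
/// accumulated from earlier data. Emits `((key, count), diff)` updates that
/// retract the old count and assert the new one, and updates `history`.
fn count_batch(
    history: &mut BTreeMap<u32, i64>,
    batch: &[(u32, i64)],
) -> Vec<((u32, i64), i64)> {
    // Accumulate the batch's net change per key.
    let mut deltas: BTreeMap<u32, i64> = BTreeMap::new();
    for &(key, diff) in batch {
        *deltas.entry(key).or_insert(0) += diff;
    }

    let mut output = Vec::new();
    for (key, delta) in deltas {
        if delta == 0 {
            continue;
        }
        let old = history.get(&key).copied().unwrap_or(0);
        let new = old + delta;
        // Retract the previously reported count, if there was one.
        if old != 0 {
            output.push(((key, old), -1));
        }
        // Report the new count, if the key is still present.
        if new != 0 {
            output.push(((key, new), 1));
            history.insert(key, new);
        } else {
            history.remove(&key);
        }
    }
    output
}
```

If `history` already holds a count of 2 for key 10, one more insertion yields a retraction of `(10, 2)` and an assertion of `(10, 3)`. Starting from an empty `history` instead yields only `(10, 1)`, which mirrors the kind of undercounting reported above.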
I'm going to peek at this now, but one thing that .. probably isn't clear even to me .. is that importing an advanced trace is some amount of undefined. The operators will react to the data they are presented, but there is no guarantee that a trace that has been allowed to compact will have done so. I recommend using The doccomments for
/// The current behavior is that the introduced collection accumulates updates to some times less or equal
/// to `self.get_logical_compaction()`. There is *not* currently a guarantee that the updates are accumulated *to*
/// the frontier, and the resulting collection history may be weirdly partial until this point. In particular,
/// the historical collection may move through configurations that did not actually occur, even if eventually
/// arriving at the correct collection. This is probably a bug; although we get to the right place in the end,
/// the intermediate computation could do something that the original computation did not, like diverge.
///
/// I would expect the semantics to improve to "updates are advanced to `self.get_logical_compaction()`", which
/// means the computation will run as if starting from exactly this frontier. It is not currently clear whose
/// responsibility this should be (the trace/batch should only reveal these times, or an operator should know
/// to advance times before using them).
The panic occurs even if using
It may be as simple as what @antiguru has, though I need to page in this logic. T.T |
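The "updates are advanced to the logical compaction frontier" semantics described in the doc comment can be sketched for totally ordered times as follows. This is an illustrative toy, not the library's code: `advance_and_consolidate` is a hypothetical helper, updates are `(data, time, diff)` triples over `u32` times, and the frontier is a single `u32` rather than an antichain.

```rust
/// Moves every update's time up to at least `frontier`, then merges updates
/// that now share the same `(data, time)` and drops those that cancel out.
fn advance_and_consolidate(
    mut updates: Vec<(u32, u32, i64)>,
    frontier: u32,
) -> Vec<(u32, u32, i64)> {
    // For a totally ordered time, advancing is just taking the maximum.
    for update in updates.iter_mut() {
        update.1 = update.1.max(frontier);
    }
    updates.sort_by_key(|&(data, time, _)| (data, time));

    let mut consolidated: Vec<(u32, u32, i64)> = Vec::new();
    for (data, time, diff) in updates {
        if let Some(last) = consolidated.last_mut() {
            if last.0 == data && last.1 == time {
                last.2 += diff;
                continue;
            }
        }
        consolidated.push((data, time, diff));
    }
    // Remove updates whose diffs summed to zero.
    consolidated.retain(|update| update.2 != 0);
    consolidated
}
```

With a frontier of 3, four insertions of the value 10 at times 1, 2, 3, and 5 become one update of weight 3 at time 3 plus the untouched update at time 5, so a downstream operator sees the collection as if it had started at the frontier.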
It seems that the @antiguru fix is not exactly right, in that it produces
which hasn't counted anything (ideally we'd see something other than |
Quick diagnosis: Little bit of a rewrite to fix this up, but .. makes sense. |
I've got a fix put together, but probably worth getting eyes on it when @antiguru has free cycles (e.g. not the weekend). Sorry about the crash! These methods are definitely less well attended to than others, but .. they shouldn't be crashing! :D |
One caveat is that |
Thanks for pointing out the existence of
I found that #530 has a small problem:
// Imports this excerpt relies on (assumed; not shown in the original comment).
use differential_dataflow::input::Input;
use differential_dataflow::operators::arrange::ArrangeBySelf;
use differential_dataflow::operators::CountTotal;
use differential_dataflow::trace::TraceReader;
use timely::dataflow::operators::Probe;
use timely::dataflow::ProbeHandle;
use timely::progress::frontier::AntichainRef;

timely::execute_directly(move |worker| {
    let mut probe = ProbeHandle::new();
    let (mut input, mut trace) = worker.dataflow::<u32, _, _>(|scope| {
        let (handle, input) = scope.new_collection();
        let arrange = input.arrange_by_self();
        arrange.stream.probe_with(&mut probe);
        (handle, arrange.trace)
    });
    // introduce empty batch
    input.advance_to(input.time() + 1);
    input.flush();
    worker.step_while(|| probe.less_than(input.time()));
    // ingest some batches
    for _ in 0..10 {
        input.insert(10);
        input.advance_to(input.time() + 1);
        input.flush();
        worker.step_while(|| probe.less_than(input.time()));
    }
    // advance the trace
    trace.set_physical_compaction(AntichainRef::new(&[2]));
    trace.set_logical_compaction(AntichainRef::new(&[2]));
    worker.dataflow::<u32, _, _>(|scope| {
        let arrange = trace.import(scope);
        arrange
            .count_total()
            .inspect(|x| println!("{:?}", x));
    });
});
If I introduce some empty batches before ingesting any data, the program still panics. |
Ah, right, that line shouldn't be there. The PR is updated to reflect this. It seems like that same bug didn't make its way into |
The above code panics:
It seems count_total only works for traces with the default compaction frontier. Is it intentional? Full code is here