WIP feat: View Metadata Builder #908

Open
wants to merge 3 commits into
base: main

c-thiel
Collaborator

@c-thiel c-thiel commented Jan 23, 2025

This PR is not ready to merge yet, because I believe the current view version expiration mechanism is flawed.
I opened a PR against the Java implementation to demonstrate the problem and to serve as a basis for discussion:
apache/iceberg#12051

Feedback from anyone is welcome. I am not sure what the best solution looks like.

Comment on lines +253 to +261
// ToDo Discuss: Similar to TableMetadata sort-order, Java does not add changes
// in this case. I prefer to add changes as the state of the builder is
// potentially mutated (`last_added_version_id`), thus we should record the change.
if self.last_added_version_id != Some(version_id) {
    self.changes.push(ViewUpdate::AddViewVersion {
        view_version: view_version.with_version_id(version_id),
    });
    self.last_added_version_id = Some(version_id);
}
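
The behaviour proposed in this snippet can be illustrated in isolation. The following is a minimal sketch, not the PR's actual code: `Change`, `SketchBuilder`, and `reuse_existing_version` are hypothetical, simplified stand-ins for `ViewUpdate`, the view metadata builder, and the code path that reuses an existing version. It only shows that a change is recorded whenever `last_added_version_id` is actually mutated, and not otherwise.

```rust
/// Hypothetical, simplified stand-in for the PR's `ViewUpdate` enum.
#[derive(Debug, Clone, PartialEq)]
pub enum Change {
    AddViewVersion { version_id: i32 },
}

/// Hypothetical builder holding only the fields relevant to this discussion.
#[derive(Debug, Default)]
pub struct SketchBuilder {
    pub last_added_version_id: Option<i32>,
    pub changes: Vec<Change>,
}

impl SketchBuilder {
    /// Called when `version_id` refers to a version that already exists.
    /// A change is pushed only if the builder state is mutated by the call.
    pub fn reuse_existing_version(&mut self, version_id: i32) -> i32 {
        if self.last_added_version_id != Some(version_id) {
            self.changes.push(Change::AddViewVersion { version_id });
            self.last_added_version_id = Some(version_id);
        }
        version_id
    }
}
```

Under these assumptions, calling `reuse_existing_version(2)` twice in a row records a single change, while a subsequent call with a different id records a second one.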

Comment on lines +293 to +309
// ToDo Discuss: This check is not present in Java.
// The `TableMetadataBuilder` uses these checks in multiple places - also in Java.
// If we think delayed requests are a problem, I think we should also add it here.
if let Some(last) = self.metadata.version_log.last() {
    // commits can happen concurrently from different machines.
    // A tolerance helps us avoid failure for small clock skew
    if view_version.timestamp_ms() - last.timestamp_ms() < -ONE_MINUTE_MS {
        return Err(Error::new(
            ErrorKind::DataInvalid,
            format!(
                "Invalid snapshot timestamp {}: before last snapshot timestamp {}",
                view_version.timestamp_ms(),
                last.timestamp_ms()
            ),
        ));
    }
}
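
To make the tolerance semantics concrete, here is a self-contained sketch of the same comparison. It is an illustration only: `check_timestamp` is a hypothetical free function, the error type is a plain `String` rather than the crate's `Error`, and the value of `ONE_MINUTE_MS` is assumed to be 60,000 ms.

```rust
/// Tolerance for clock skew between committers, in milliseconds.
/// The name mirrors the constant used in the snippet; the value is an assumption.
pub const ONE_MINUTE_MS: i64 = 60_000;

/// Hypothetical helper: rejects a new timestamp that lies more than
/// `ONE_MINUTE_MS` before the last recorded one. `last_ts_ms` is `None`
/// when the version log is empty, in which case nothing is checked.
pub fn check_timestamp(new_ts_ms: i64, last_ts_ms: Option<i64>) -> Result<(), String> {
    if let Some(last) = last_ts_ms {
        // Small negative differences are accepted to tolerate clock skew.
        if new_ts_ms - last < -ONE_MINUTE_MS {
            return Err(format!(
                "Invalid timestamp {}: before last timestamp {}",
                new_ts_ms, last
            ));
        }
    }
    Ok(())
}
```

Note that the comparison is strict: a timestamp exactly one minute older than the last entry is still accepted, and only anything older than that is rejected.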

Comment on lines +344 to +348
// ToDo Discuss: In Java this uses `new_view_version.version_id()` instead.
// I believe 0 is more appropriate here, as the first version should always be 0.
// The TableMetadataBuilder also uses 0 for partition specs (and other `reuse_or_create_*` functions).
// Consistent behaviour between the two builders is desirable.
.unwrap_or(INITIAL_VIEW_VERSION_ID)
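
The diff only shows the final `.unwrap_or(...)` call, so the surrounding logic is not visible here. As a rough sketch of how a `reuse_or_create_*` style function might assign ids, assuming versions are compared by content and that `INITIAL_VIEW_VERSION_ID` is 0 as the comment states: the function name, the `HashMap<i32, String>` representation, and the comparison by SQL text below are all hypothetical simplifications.

```rust
use std::collections::HashMap;

/// Assumed to be 0, following the comment in the diff.
pub const INITIAL_VIEW_VERSION_ID: i32 = 0;

/// Hypothetical sketch: `existing` maps version ids to a simplified
/// representation of a version (here just its SQL text).
pub fn reuse_or_create_version_id(existing: &HashMap<i32, String>, new_sql: &str) -> i32 {
    // Reuse the id of an existing version with identical content, if any.
    if let Some((id, _)) = existing.iter().find(|(_, sql)| sql.as_str() == new_sql) {
        return *id;
    }
    // Otherwise take the highest known id plus one, or start at the
    // initial id when no version exists yet.
    existing
        .keys()
        .max()
        .map(|max_id| max_id + 1)
        .unwrap_or(INITIAL_VIEW_VERSION_ID)
}
```

With this shape, the id carried by the incoming version never influences the result for an empty view, which is the difference from the Java behaviour that the comment raises.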

Comment on lines +367 to +376
// ToDo Discuss: Java does not add changes in this case. I prefer to add changes
// as the state of the builder is potentially mutated (`last_added_schema_id`),
// thus we should record the change.
if self.last_added_schema_id != Some(schema_id) {
    self.changes.push(ViewUpdate::AddSchema {
        schema: schema.clone().with_schema_id(schema_id),
        last_column_id: None,
    });
    self.last_added_schema_id = Some(schema_id);
}

Same as above - CC @Fokko , @nastra

metadata.versions.keys().cloned().collect::<HashSet<_>>(),
HashSet::from_iter(vec![2, 3, 4])
);
// assert_eq!(metadata.version_log.len(), 1); // ToDo: Enable after expiration discussion

ToDo: Resolve after expiration discussion.

let builder = builder_without_changes();
let v1 = new_view_version(0, 1, "select * from ns.tbl");

// ToDo Java: I think we should remove lines 688-693 in Java's TestViewMetadata.java as they do nothing.

CC @nastra
