Please describe why this is necessary.
These changes are part of V2 checkpoint read support.
For a scan, we need to build the list of add and remove actions required to make up the table’s state. These changes are required to read the necessary actions in sidecar files referenced by V2 checkpoints.
Describe the functionality you are proposing.
To create the actions iterator, we chain together:
- An iterator of actions from commit files
- An iterator of actions from a checkpoint file

For every batch of EngineData from a checkpoint file:
- Visit the rows of the batch with the new SidecarVisitor. This visitor collects all sidecar file paths found in sidecar actions within the batch.
- If sidecar file paths exist:
  - Read the corresponding sidecar files, generating an iterator over the batches of actions they contain.
  - Replace the originating checkpoint batch with the sidecar batches, which contain the add actions that make up the table’s state.
- If no sidecar file paths exist:
  - Leave the checkpoint batch as-is in the checkpoint batches iterator, as it already contains the add actions that make up the table’s state.

Note: A batch may contain no add actions, only other actions (such as txn, metadata, protocol). This is safe because non-file actions are ignored.
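The flow described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the kernel's actual API: `Action`, `Batch`, `read_sidecar`, `expand_checkpoint_batch`, and `action_batches` are hypothetical names, and the `SidecarVisitor` here is a simplified stand-in that works on a plain `Vec` rather than on EngineData.

```rust
/// Hypothetical, simplified stand-ins for the action types (assumption).
#[derive(Debug, Clone, PartialEq)]
enum Action {
    Add(String),     // path of a data file added to the table
    Remove(String),  // path of a data file removed from the table
    Sidecar(String), // path of a sidecar file referenced by a V2 checkpoint
    Metadata,        // example of a non-file action
}

/// One batch of actions; stands in for a batch of EngineData (assumption).
type Batch = Vec<Action>;

/// Collects the sidecar file paths found in the batches it visits.
#[derive(Default)]
struct SidecarVisitor {
    sidecar_paths: Vec<String>,
}

impl SidecarVisitor {
    fn visit(&mut self, batch: &Batch) {
        for action in batch {
            if let Action::Sidecar(path) = action {
                self.sidecar_paths.push(path.clone());
            }
        }
    }
}

/// Replaces a checkpoint batch with the batches read from its sidecar files,
/// or keeps the batch as-is when it references no sidecar files.
fn expand_checkpoint_batch(
    batch: Batch,
    read_sidecar: &impl Fn(&str) -> Vec<Batch>,
) -> Vec<Batch> {
    let mut visitor = SidecarVisitor::default();
    visitor.visit(&batch);
    if visitor.sidecar_paths.is_empty() {
        // No sidecars: the batch already holds the file actions.
        vec![batch]
    } else {
        // Sidecars found: the originating batch is dropped and replaced.
        visitor
            .sidecar_paths
            .iter()
            .flat_map(|path| read_sidecar(path.as_str()))
            .collect()
    }
}

/// Chains commit batches with the (possibly expanded) checkpoint batches.
fn action_batches(
    commit_batches: Vec<Batch>,
    checkpoint_batches: Vec<Batch>,
    read_sidecar: impl Fn(&str) -> Vec<Batch>,
) -> impl Iterator<Item = Batch> {
    commit_batches.into_iter().chain(
        checkpoint_batches
            .into_iter()
            .flat_map(move |batch| expand_checkpoint_batch(batch, &read_sidecar)),
    )
}
```

In this sketch `read_sidecar` is passed in as a closure so that file reading stays outside the iterator logic. A real implementation would presumably go through the engine's file-reading handler and would yield batches lazily rather than collecting them into a `Vec`.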
Additional context
No response