-
Notifications
You must be signed in to change notification settings - Fork 188
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Support metadata table "Metadata Log Entries" #846
base: main
Are you sure you want to change the base?
Conversation
bdb6372
to
067ada8
Compare
crates/iceberg/src/metadata_scan.rs
Outdated
fn snapshot_id_as_of_time( | ||
table_metadata: &TableMetadataRef, | ||
timestamp_ms_inclusive: i64, | ||
) -> Option<&SnapshotRef> { | ||
let mut snapshot_id = None; | ||
// The table metadata snapshot log is chronological | ||
for log_entry in table_metadata.history() { | ||
if log_entry.timestamp_ms <= timestamp_ms_inclusive { | ||
snapshot_id = Some(log_entry.snapshot_id); | ||
} | ||
} | ||
snapshot_id.and_then(|id| table_metadata.snapshot_by_id(id)) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This assumes that history()
is chronologically ordered because that gets asserted TableMetadata#try_normalize
.
067ada8
to
8d76183
Compare
crates/iceberg/src/metadata_scan.rs
Outdated
// Include the current metadata locaction and modification time in the table. This matches | ||
// the Java implementation. Unlike the Java implementation, a current metadata location is | ||
// optional here. In that case, we omit current metadata from the metadata log table. | ||
if let Some(current_metadata_location) = &scan.metadata_location { | ||
append_metadata_log_entry( | ||
scan.metadata_ref.last_updated_ms(), | ||
current_metadata_location, | ||
); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.metadata_location(table_metadata1_location.as_os_str().to_str().unwrap()) | ||
.metadata_location(template_json_location) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before this change, the location in the metadata log and the current location would both be ../metadata/v1.json
.
I wanted to have a distinction so we can assert that metadata_log_entries
includes the current metadata location, even if not in the metadata log.
#822 has been merged, let's move on! |
8d76183
to
395ff1f
Compare
Thank you, @Xuanwo. This is rebased and ready for review. |
76c2bf2
to
439c529
Compare
crates/iceberg/src/metadata_table.rs
Outdated
-- validity: | ||
-- validity: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was failing for me locally.
pub mod metadata_scan; | ||
pub mod metadata_table; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Follow-up to this comment: #822 (comment)
Don't need to do in this PR. Up to you.
pub fn metadata_table(self) -> MetadataTable { | ||
pub fn metadata_table(&self) -> MetadataTable<'_> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Follow-up to this comment: #822 (comment)
Also not necessary to do in this PR.
Let's get #863 merge first. |
+1. |
Re #823. This adds support for the Metadata Log Entries metadata table.
This is building on @xxchan's unmerged #822. I'll update and rebase this PR when #822 merges.✅Metadata Log Entries is the metadata log with the latest snapshot per metadata file.
Reference implementations: