dannycjones commented on code in PR #2589:
URL: https://github.com/apache/iceberg-rust/pull/2589#discussion_r3362065015
##########
crates/iceberg/src/spec/snapshot_summary.rs:
##########
@@ -483,27 +483,37 @@ fn update_totals(
added_property: &str,
removed_property: &str,
) {
- let previous_total = previous_summary.map_or(0, |previous_summary| {
- previous_summary
- .additional_properties
- .get(total_property)
- .map_or(0, |value| value.parse::<u64>().unwrap())
- });
+ let previous_total = match previous_summary {
+ Some(previous_summary) => {
+ match previous_summary.additional_properties.get(total_property) {
+ Some(value) => match value.parse::<u64>() {
+ Ok(v) => v,
+ Err(e) => {
+ tracing::warn!(
+ "Failed to parse previous summary property
'{total_property}' value '{value}': {e}. \
+ Skipping total computation.",
+ );
+ return;
+ }
+ },
+ None => return,
Review Comment:
I don't think so. There exists a previous summary however the total is
omitted, which is fine since its optional. The way I'm interpreting the spec is
that any one of these metrics could be missing.
https://iceberg.apache.org/spec/#metrics
When the previous snapshot summary exists but there's no previous total, I
don't think we can come up with a correct total for the new snapshot. The
previous snapshot could have had data files (as an example).
I believe my change aligns with the Java implementation, which specifies no
total if the previous one is missing or it fails to parse the integer:
https://github.com/apache/iceberg/blob/c00669fde813cac7e9d474c1a2c38fa8e4f75a95/core/src/main/java/org/apache/iceberg/SnapshotProducer.java#L926-L956
##########
crates/iceberg/src/spec/snapshot_summary.rs:
##########
@@ -483,27 +483,37 @@ fn update_totals(
added_property: &str,
removed_property: &str,
) {
- let previous_total = previous_summary.map_or(0, |previous_summary| {
- previous_summary
- .additional_properties
- .get(total_property)
- .map_or(0, |value| value.parse::<u64>().unwrap())
- });
+ let previous_total = match previous_summary {
+ Some(previous_summary) => {
+ match previous_summary.additional_properties.get(total_property) {
+ Some(value) => match value.parse::<u64>() {
+ Ok(v) => v,
+ Err(e) => {
+ tracing::warn!(
+ "Failed to parse previous summary property
'{total_property}' value '{value}': {e}. \
+ Skipping total computation.",
+ );
+ return;
+ }
+ },
+ None => return,
+ }
+ }
+ None => 0,
+ };
let mut new_total = previous_total;
- if let Some(value) = summary
- .additional_properties
- .get(added_property)
- .map(|value| value.parse::<u64>().unwrap())
- {
- new_total += value;
+ if let Some(value) = summary.additional_properties.get(added_property) {
+ match value.parse::<u64>() {
+ Ok(v) => new_total += v,
+ Err(_) => return,
Review Comment:
I agree that we should not swallow the error.
Would it make sense to use `expect` here? We just set this value, so we
should expect it to be valid.
If not `expect`, I would propose to return an Iceberg error. I think a
warning is the wrong choice here - if we cannot read our own properties, we're
doing something wrong and should bail out of the commit.
##########
crates/iceberg/src/spec/snapshot_summary.rs:
##########
@@ -483,27 +484,40 @@ fn update_totals(
added_property: &str,
removed_property: &str,
) {
- let previous_total = previous_summary.map_or(0, |previous_summary| {
- previous_summary
- .additional_properties
- .get(total_property)
- .map_or(0, |value| value.parse::<u64>().unwrap())
- });
+ let previous_total = match previous_summary {
+ Some(previous_summary) => {
+ match previous_summary.additional_properties.get(total_property) {
+ Some(value) => match value.parse::<u64>() {
+ Ok(v) => v,
+ Err(e) => {
+ tracing::warn!(
+ "Failed to parse previous summary property '{}'
value '{}': {}. \
+ Skipping total computation.",
+ property_name,
+ property_value,
+ parse_error,
+ );
Review Comment:
I'm not sure I understand the question.
I am asking if it's a good thing to add a warning here. I am doubtful, only
because we are not using `tracing` very much in the codebase right now.
I want to warn here since we are choosing to skip the totals where the
previous snapshot has them set. In this case, totals would disappear in
snapshots created by iceberg-rust. We are skipping because we are _unable to
parse the previous total and thus unable calculate the total_.
##########
crates/iceberg/src/spec/snapshot_summary.rs:
##########
@@ -483,27 +484,40 @@ fn update_totals(
added_property: &str,
removed_property: &str,
) {
- let previous_total = previous_summary.map_or(0, |previous_summary| {
- previous_summary
- .additional_properties
- .get(total_property)
- .map_or(0, |value| value.parse::<u64>().unwrap())
- });
+ let previous_total = match previous_summary {
+ Some(previous_summary) => {
+ match previous_summary.additional_properties.get(total_property) {
+ Some(value) => match value.parse::<u64>() {
+ Ok(v) => v,
+ Err(e) => {
+ tracing::warn!(
+ "Failed to parse previous summary property '{}'
value '{}': {}. \
+ Skipping total computation.",
+ property_name,
+ property_value,
+ parse_error,
+ );
Review Comment:
For reference, Java does not warn here either.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]