Re: [PR] Document stats `ndv` value representation [iceberg]

2024-08-01 Thread via GitHub
findepi commented on PR #10793: URL: https://github.com/apache/iceberg/pull/10793#issuecomment-2264001934 > I am actually more in favor of making it a double to keep consistent with the algorithm @szehon-ho i am totally fine with that approach too. We would need to define the string

Re: [PR] Document stats `ndv` value representation [iceberg]

2024-07-30 Thread via GitHub
emkornfield commented on code in PR #10793: URL: https://github.com/apache/iceberg/pull/10793#discussion_r1697247451 ## format/puffin-spec.md: ## @@ -121,7 +121,9 @@ distinct values converted to bytes using Iceberg's single-value serialization. The blob metadata for this blo

Re: [PR] Document stats `ndv` value representation [iceberg]

2024-07-30 Thread via GitHub
emkornfield commented on code in PR #10793: URL: https://github.com/apache/iceberg/pull/10793#discussion_r1697230444 ## format/puffin-spec.md: ## @@ -121,7 +121,9 @@ distinct values converted to bytes using Iceberg's single-value serialization. The blob metadata for this blo

Re: [PR] Document stats `ndv` value representation [iceberg]

2024-07-30 Thread via GitHub
szehon-ho commented on PR #10793: URL: https://github.com/apache/iceberg/pull/10793#issuecomment-2257706966 Thanks @amogh-jahagirdar. I guess I need to give the context. In https://github.com/apache/iceberg/pull/10288#discussion_r1691077522 we realize that in fact ndv as defined by theta

Re: [PR] Document stats `ndv` value representation [iceberg]

2024-07-29 Thread via GitHub
szehon-ho commented on code in PR #10793: URL: https://github.com/apache/iceberg/pull/10793#discussion_r1696058526 ## format/puffin-spec.md: ## @@ -121,7 +121,9 @@ distinct values converted to bytes using Iceberg's single-value serialization. The blob metadata for this blob

Re: [PR] Document stats `ndv` value representation [iceberg]

2024-07-29 Thread via GitHub
szehon-ho commented on PR #10793: URL: https://github.com/apache/iceberg/pull/10793#issuecomment-2257148519 Yes this pr as is should not require a spec change. > The wording used for apache-datasketches-theta-v1 should have been better and clearly define both: allowed values and their

Re: [PR] Document stats `ndv` value representation [iceberg]

2024-07-29 Thread via GitHub
szehon-ho commented on code in PR #10793: URL: https://github.com/apache/iceberg/pull/10793#discussion_r1696054978 ## format/puffin-spec.md: ## @@ -121,7 +121,9 @@ distinct values converted to bytes using Iceberg's single-value serialization. The blob metadata for this blob

Re: [PR] Document stats `ndv` value representation [iceberg]

2024-07-29 Thread via GitHub
findepi commented on code in PR #10793: URL: https://github.com/apache/iceberg/pull/10793#discussion_r1695941907 ## format/puffin-spec.md: ## @@ -121,7 +121,9 @@ distinct values converted to bytes using Iceberg's single-value serialization. The blob metadata for this blob ma

Re: [PR] Document stats `ndv` value representation [iceberg]

2024-07-29 Thread via GitHub
findepi commented on PR #10793: URL: https://github.com/apache/iceberg/pull/10793#issuecomment-2256968863 > misinterpreted this pr to support double as per [#10288 (comment)](https://github.com/apache/iceberg/pull/10288#discussion_r1691077522) . sorry for the confusion! in that PR

Re: [PR] Document stats `ndv` value representation [iceberg]

2024-07-29 Thread via GitHub
szehon-ho commented on code in PR #10793: URL: https://github.com/apache/iceberg/pull/10793#discussion_r1695779192 ## format/puffin-spec.md: ## @@ -121,7 +121,9 @@ distinct values converted to bytes using Iceberg's single-value serialization. The blob metadata for this blob

Re: [PR] Document stats `ndv` value representation [iceberg]

2024-07-29 Thread via GitHub
szehon-ho commented on PR #10793: URL: https://github.com/apache/iceberg/pull/10793#issuecomment-2256768734 @findepi Got it, sorry, I misinterpreted this pr to support double as per https://github.com/apache/iceberg/pull/10288#discussion_r1691077522 . This pr makes more sense then.

Re: [PR] Document stats `ndv` value representation [iceberg]

2024-07-29 Thread via GitHub
findepi commented on PR #10793: URL: https://github.com/apache/iceberg/pull/10793#issuecomment-2256564863 > I am not sure, but does this mean an integer value like 2 now becomes 2.0? (if using java toString) no, this should be "2" > And in any case, as its not entirely backward
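The question above refers to how Java's built-in `toString` renders a whole number held in a `long` versus a `double`. A minimal sketch of that behavior follows; the class and helper names are hypothetical and are not taken from the PR or from Iceberg, and the snippet says nothing about which representation the spec ultimately adopts.

```java
public class NdvStringExample {
    // Hypothetical helper: renders an ndv value held as a long.
    static String fromLong(long ndv) {
        return Long.toString(ndv);
    }

    // Hypothetical helper: renders an ndv value held as a double.
    static String fromDouble(double ndv) {
        return Double.toString(ndv);
    }

    public static void main(String[] args) {
        System.out.println(fromLong(2L));    // prints 2
        System.out.println(fromDouble(2.0)); // prints 2.0
    }
}
```

A reader that parses the property as an integer would accept the first form but reject the second, which is presumably why the exact string form matters for compatibility.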

Re: [PR] Document stats `ndv` value representation [iceberg]

2024-07-29 Thread via GitHub
findepi commented on code in PR #10793: URL: https://github.com/apache/iceberg/pull/10793#discussion_r1695640265 ## format/puffin-spec.md: ## @@ -121,7 +121,9 @@ distinct values converted to bytes using Iceberg's single-value serialization. The blob metadata for this blob ma

Re: [PR] Document stats `ndv` value representation [iceberg]

2024-07-29 Thread via GitHub
szehon-ho commented on code in PR #10793: URL: https://github.com/apache/iceberg/pull/10793#discussion_r1695559265 ## format/puffin-spec.md: ## @@ -121,7 +121,9 @@ distinct values converted to bytes using Iceberg's single-value serialization. The blob metadata for this blob

Re: [PR] Document stats `ndv` value representation [iceberg]

2024-07-27 Thread via GitHub
findepi commented on PR #10793: URL: https://github.com/apache/iceberg/pull/10793#issuecomment-2254249862 cc @karuppayya