scovich commented on code in PR #9610:
URL: https://github.com/apache/arrow-rs/pull/9610#discussion_r3016737854
##########
parquet-variant-compute/src/unshred_variant.rs:
##########
@@ -325,7 +330,9 @@ impl<'a> ValueOnlyUnshredVariantBuilder<'a> {
if self.value.is_null(index) {
builder.append_null();
} else {
- let variant = Variant::new_with_metadata(metadata.clone(),
self.value.value(index));
+ let value_bytes = binary_array_value(self.value.as_ref(), index)
+ .expect("value field must be a binary-like array");
Review Comment:
This will panic if the expectation fails. Do we have a guarantee that the
expectation always holds?
Also: this method returns `Result`, why not take advantage of `?`?
##########
parquet-variant-compute/src/variant_array.rs:
##########
@@ -375,7 +397,11 @@ impl VariantArray {
}
// Otherwise fall back to value, if available
(_, Some(value)) if value.is_valid(index) => {
- Ok(Variant::new(self.metadata.value(index),
value.value(index)))
+ let metadata = binary_array_value(self.metadata.as_ref(),
index)
+ .expect("metadata field must be a binary-like array");
+ let value = binary_array_value(value.as_ref(), index)
+ .expect("value field must be a binary-like array");
Review Comment:
Again: Why risk panic in a method that returns Result?
##########
parquet-variant-compute/src/variant_array.rs:
##########
@@ -41,6 +41,40 @@ use parquet_variant::{
use std::borrow::Cow;
use std::sync::Arc;
+/// Returns the raw bytes at the given index from a binary-like array.
+///
+/// # Panics
+/// Panics if the array is not `Binary`, `LargeBinary`, or `BinaryView`,
+/// or if `index` is out of bounds.
+pub(crate) fn binary_array_value(array: &dyn Array, index: usize) -> &[u8] {
+ match array.data_type() {
+ DataType::Binary => array.as_binary::<i32>().value(index),
+ DataType::LargeBinary => array.as_binary::<i64>().value(index),
+ DataType::BinaryView => array.as_binary_view().value(index),
+ other => panic!("Expected Binary, LargeBinary, or BinaryView array,
got {other}"),
Review Comment:
`.expect` is still a panic if the expectation fails... do we have some
guarantee that that each call site can never fail the expectation?
##########
parquet-variant-compute/src/unshred_variant.rs:
##########
@@ -73,7 +76,9 @@ pub fn unshred_variant(array: &VariantArray) ->
Result<VariantArray> {
if array.is_null(i) {
value_builder.append_null();
} else {
- let metadata = VariantMetadata::new(metadata.value(i));
+ let metadata_bytes = binary_array_value(metadata.as_ref(), i)
+ .expect("metadata field must be a binary-like array");
Review Comment:
why panic instead of `?`?
##########
parquet-variant-compute/src/unshred_variant.rs:
##########
@@ -347,7 +354,11 @@ trait AppendToVariantBuilder: Array {
macro_rules! handle_unshredded_case {
($self:expr, $builder:expr, $metadata:expr, $index:expr,
$partial_shredding:expr) => {{
let value = $self.value.as_ref().filter(|v| v.is_valid($index));
- let value = value.map(|v|
Variant::new_with_metadata($metadata.clone(), v.value($index)));
+ let value = value.map(|v| {
+ let bytes = binary_array_value(v.as_ref(), $index)
+ .expect("value field must be a binary-like array");
Review Comment:
`handle_unshredded_case!` is expected to `return Err(...)` in case of a
problem. See L374 below, for example.
##########
parquet-variant-compute/src/variant_array.rs:
##########
@@ -41,6 +41,36 @@ use parquet_variant::{
use std::borrow::Cow;
use std::sync::Arc;
+/// Returns the raw bytes at the given index from a binary-like array, return
`None` if the array isn't binary-like.
+pub(crate) fn binary_array_value(array: &dyn Array, index: usize) ->
Option<&[u8]> {
+ match array.data_type() {
+ DataType::Binary => Some(array.as_binary::<i32>().value(index)),
+ DataType::LargeBinary => Some(array.as_binary::<i64>().value(index)),
+ DataType::BinaryView => Some(array.as_binary_view().value(index)),
+ _ => None,
+ }
+}
+
+/// Returns `true` if the data type is `Binary`, `LargeBinary`, or
`BinaryView`.
+fn is_binary_like(dt: &DataType) -> bool {
+ matches!(
+ dt,
+ DataType::Binary | DataType::LargeBinary | DataType::BinaryView
+ )
+}
+
+/// Validates that an array has a binary-like data type.
+fn validate_binary_array(array: &dyn Array, field_name: &str) -> Result<()> {
+ if is_binary_like(array.data_type()) {
Review Comment:
Single callsite helper function? Why not just:
```rust
fn validate_binary_array(...) -> Result<()> {
match array.data_type() {
DataType::Binary | DataType::LargeBinary | DataType::BinaryView =>
Ok(()),
_ => Err(...),
}
}
```
##########
parquet-variant-compute/src/variant_array.rs:
##########
@@ -957,9 +983,9 @@ fn typed_value_to_variant<'a>(
let value = array.value(index);
Ok(Uuid::from_slice(value).unwrap().into()) // unwrap is safe:
slice is always 16 bytes
}
- DataType::BinaryView => {
- let array = typed_value.as_binary_view();
- let value = array.value(index);
+ DataType::Binary | DataType::LargeBinary | DataType::BinaryView => {
+ let value = binary_array_value(typed_value.as_ref(), index)
+ .expect("match arm guarantees the array is binary-like");
Review Comment:
another panic where Err would suffice
Tho in this case it _is_ true (albeit somewhat brittle) that the panic
should be unreachable.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]