rustyconover opened a new issue, #401:
URL: https://github.com/apache/arrow-js/issues/401
`RecordBatch.numRows` returns 0 for zero-column batches deserialized from
IPC, even when the IPC message header specifies a non-zero length. Zero-column
RecordBatches are valid Arrow payloads and should preserve their row count
through serialization round-trips.
## Current behavior
1. An IPC stream contains a zero-column RecordBatch with `length: 100`
2. The IPC reader correctly passes `header.length` to `makeData`, and
`visitStruct` preserves it
3. However, `new RecordBatch(schema, data)` calls `ensureSameLengthData`, which
recomputes the length as `chunks.reduce((max, col) => Math.max(max, col.length),
0)`; with zero children, the reduce falls through to its initial value of 0
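The reduce in step 3 can be demonstrated in isolation (a minimal sketch; `Chunk` and `computeLength` are stand-ins, not names from the Arrow codebase):

```typescript
// Stand-in for Arrow's Data objects: only `length` matters here.
type Chunk = { length: number };

// Mirrors ensureSameLengthData's default: max child length, seeded with 0.
const computeLength = (chunks: Chunk[]): number =>
    chunks.reduce((max, col) => Math.max(max, col.length), 0);

console.log(computeLength([{ length: 100 }, { length: 100 }])); // 100
console.log(computeLength([])); // 0 — row count is lost with zero columns
```

With no children to iterate over, `reduce` simply returns its seed value, so the batch's stated length never enters the computation.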
## Expected behavior
`RecordBatch.numRows` should be 100 after deserializing a zero-column batch
with `length: 100`.
## Reproducer
```typescript
import { makeData, RecordBatch, Schema, Struct } from 'apache-arrow';
const schema = new Schema([]);
const data = makeData({ type: new Struct([]), length: 100, nullCount: 0,
children: [] });
const batch = new RecordBatch(schema, data);
console.log(batch.numRows); // 0 — expected 100
```
## Root cause
In `src/recordbatch.ts` line 84, the 2-arg constructor path calls:
```typescript
[this.schema, this.data] = ensureSameLengthData<T>(this.schema,
this.data.children as Data<T[keyof T]>[]);
```
`ensureSameLengthData` (line 323) defaults `maxLength` to
`chunks.reduce(...)`, which returns 0 for an empty array.
The 1-arg constructor path (line 102) already passes `length` explicitly and
does not have this bug.
## Fix
Pass `this.data.length` as the third argument to `ensureSameLengthData`,
which already accepts an optional `maxLength` parameter.
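To illustrate why forwarding the length fixes this, here is a simplified model of the helper (a sketch with hypothetical names, not the actual `src/recordbatch.ts` code): the optional `maxLength` parameter, when supplied, bypasses the reduce entirely.

```typescript
type Chunk = { length: number };

// Simplified model of ensureSameLengthData: maxLength defaults to the
// maximum child length, which is 0 when there are no children.
function ensureSameLength(
    chunks: Chunk[],
    maxLength: number = chunks.reduce((max, col) => Math.max(max, col.length), 0)
): number {
    return maxLength;
}

// Current 2-arg constructor path: length omitted, so zero columns -> 0.
console.log(ensureSameLength([]));      // 0
// Proposed fix: forward the Struct data's own length explicitly.
console.log(ensureSameLength([], 100)); // 100
```

This mirrors what the 1-arg constructor path already does, so the fix makes both paths consistent.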
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]