alexandre-normand opened a new issue, #727:
URL: https://github.com/apache/arrow-go/issues/727
### Describe the bug, including details regarding any error messages, version, and platform.
github.com/apache/arrow-go/v18: v18.5.1
I'm using arrow-go to write Arrow-based Parquet files, calling `WriteBuffered` every 10 seconds to flush record batches buffered in memory to disk. This generally works fine, but sometimes `WriteBuffered` fails with an error like this:
```
unknown error type: interface conversion: interface is nil, not encoding.Encoder[int64]
```
or
```
unknown error type: interface conversion: interface is nil, not encoding.Encoder[github.com/apache/arrow-go/v18/parquet.ByteArray]
```
I tried dumping the record batch on which `WriteBuffered` fails to disk as JSON, and the rows all look complete and correct.
While I don't have an easy reproducible test, because I can't find the trigger for the issue, the parquet writer is created like this:
```go
fileWriter, err := pqarrow.NewFileWriter(
	schema.arrowSchema,
	file,
	parquet.NewWriterProperties(parquet.WithAllocator(memory.DefaultAllocator)),
	pqarrow.DefaultWriterProps(),
)
...
```
And then we call something like this every 10 seconds:
```go
...
builder := array.NewRecordBuilder(memory.DefaultAllocator, schema.arrowSchema)
defer builder.Release()

arrowBuilder := newArrowBuilder(schema.arrowSchema, builder)
for _, op := range toWrite {
	aErr = op.appendArrowRecord(arrowBuilder)
	if aErr != nil {
		aErr = ingestion.NewIrrecoverableError(fmt.Errorf("failed to append arrow record for write operation %s: %w", op.key.String(), aErr), "arrow_transformation_failure")
		return aErr
	}
}

// Create a new record batch from the buffer we just filled
batch := builder.NewRecordBatch()
defer batch.Release()

aErr = activeBuffer.fileWriter.WriteBuffered(batch)
if aErr != nil {
	// Dump a json representation of the batch in blob storage for inspection/troubleshooting
	jsonFileLocation, err := dumpBatchJSON(activeBuffer.fs, activeBuffer.locationProvider, batch)
	if err != nil {
		jsonFileLocation = "unavailable_failure_to_dump_batch"
	}
	aErr = fmt.Errorf("failed to write batch to parquet file, see json dump at '%s' for troubleshooting: %w", jsonFileLocation, aErr)
	return aErr
}
...
```
That code obviously doesn't compile, but it's a decent representation of how we use the APIs. I initially suspected we were misusing arrow-go/pqarrow, but the fact that the failure isn't deterministic given the same data makes me think there might be a bug.
I will also add that a failure on `WriteBuffered` seems to leave the file writer in an inconsistent state: I also see occasional failures on `Close()` like the one below, and they seem to correlate with `WriteBuffered` failures that happened earlier for the same file:
```
row mismatch for buffered row group: 0, column: 60, count expected: 55000, actual: 54397
```
Note that I could run with a modified version of arrow-go if there are hypotheses to test. I can usually reproduce this error within 30 minutes of startup.
### Component(s)
Parquet
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]