alexandre-normand opened a new issue, #727:
URL: https://github.com/apache/arrow-go/issues/727

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   github.com/apache/arrow-go/v18: v18.5.1
   
   I'm using arrow-go to write Arrow data to Parquet files, calling `WriteBuffered` every 10 seconds to flush record batches buffered in memory to disk. This generally works fine, but `WriteBuffered` occasionally fails with an error like this:
   ```
   unknown error type: interface conversion: interface is nil, not encoding.Encoder[int64]
   ```
   
   or 
   
   ```
   unknown error type: interface conversion: interface is nil, not encoding.Encoder[github.com/apache/arrow-go/v18/parquet.ByteArray]
   ```
   
   I tried dumping the record batch on which `WriteBuffered` fails to disk as JSON, and the rows all look complete and correct.
   
   I don't have an easy reproduction because I haven't found the trigger for the issue, but the Parquet writer is created like this:
   ```go
   fileWriter, err := pqarrow.NewFileWriter(schema.arrowSchema, file,
       parquet.NewWriterProperties(parquet.WithAllocator(memory.DefaultAllocator)),
       pqarrow.DefaultWriterProps())
   ...
   ```
   
   And then we call something like this every 10 seconds:
   ```go
   ...
   builder := array.NewRecordBuilder(memory.DefaultAllocator, schema.arrowSchema)
   defer builder.Release()

   arrowBuilder := newArrowBuilder(schema.arrowSchema, builder)
   for _, op := range toWrite {
       aErr = op.appendArrowRecord(arrowBuilder)
       if aErr != nil {
           aErr = ingestion.NewIrrecoverableError(fmt.Errorf("failed to append arrow record for write operation %s: %w", op.key.String(), aErr), "arrow_transformation_failure")
           return aErr
       }
   }

   // Create a new record batch from the buffer we just filled
   batch := builder.NewRecordBatch()
   defer batch.Release()

   aErr = activeBuffer.fileWriter.WriteBuffered(batch)
   if aErr != nil {
       // Dump a JSON representation of the batch to blob storage for
       // inspection/troubleshooting
       jsonFileLocation, err := dumpBatchJSON(activeBuffer.fs, activeBuffer.locationProvider, batch)
       if err != nil {
           jsonFileLocation = "unavailable_failure_to_dump_batch"
       }
       aErr = fmt.Errorf("failed to write batch to parquet file, see json dump at '%s' for troubleshooting: %w", jsonFileLocation, aErr)
       return aErr
   }
   ...
   ```
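   For context, `dumpBatchJSON` in the snippet is our own helper. A simplified, hypothetical stand-in that writes to a local file instead of blob storage would look like this, using `array.RecordToJSON` for the serialization:

   ```go
   import (
       "os"

       "github.com/apache/arrow-go/v18/arrow"
       "github.com/apache/arrow-go/v18/arrow/array"
   )

   // dumpBatchJSONToFile is a simplified, hypothetical stand-in for our
   // dumpBatchJSON helper: it serializes the record batch as JSON to a local
   // file so the rows can be inspected after a failure (the real helper
   // writes to blob storage and returns the blob location).
   func dumpBatchJSONToFile(path string, batch arrow.Record) error {
       f, err := os.Create(path)
       if err != nil {
           return err
       }
       defer f.Close()
       return array.RecordToJSON(batch, f)
   }
   ```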
   
   The flush code above obviously doesn't compile as-is, but it's a faithful representation of how we use the APIs. I initially suspected we were misusing arrow-go/pqarrow, but the fact that the failure isn't deterministic for the same data makes me think there might be a bug.
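   For what it's worth, here is a minimal, compilable distillation of the pattern, with a hypothetical two-field schema standing in for `schema.arrowSchema` and a local file in place of our real sink:

   ```go
   package main

   import (
       "os"

       "github.com/apache/arrow-go/v18/arrow"
       "github.com/apache/arrow-go/v18/arrow/array"
       "github.com/apache/arrow-go/v18/arrow/memory"
       "github.com/apache/arrow-go/v18/parquet"
       "github.com/apache/arrow-go/v18/parquet/pqarrow"
   )

   func main() {
       // Hypothetical schema standing in for schema.arrowSchema.
       schema := arrow.NewSchema([]arrow.Field{
           {Name: "id", Type: arrow.PrimitiveTypes.Int64},
           {Name: "payload", Type: arrow.BinaryTypes.String},
       }, nil)

       file, err := os.Create("out.parquet")
       if err != nil {
           panic(err)
       }
       defer file.Close()

       fileWriter, err := pqarrow.NewFileWriter(schema, file,
           parquet.NewWriterProperties(parquet.WithAllocator(memory.DefaultAllocator)),
           pqarrow.DefaultWriterProps())
       if err != nil {
           panic(err)
       }

       // One flush iteration; the real service does this every 10 seconds.
       builder := array.NewRecordBuilder(memory.DefaultAllocator, schema)
       defer builder.Release()

       builder.Field(0).(*array.Int64Builder).Append(1)
       builder.Field(1).(*array.StringBuilder).Append("hello")

       batch := builder.NewRecordBatch()
       defer batch.Release()

       if err := fileWriter.WriteBuffered(batch); err != nil {
           panic(err)
       }

       if err := fileWriter.Close(); err != nil {
           panic(err)
       }
   }
   ```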
   
   I'll also add that a failure in `WriteBuffered` seems to leave the file writer in an inconsistent state: I also see occasional failures on `Close()` like the one below, and they correlate with `WriteBuffered` failures that happened earlier on the same file:

   ```
   row mismatch for buffered row group: 0, column: 60, count expected: 55000, actual: 54397
   ```
   
   Note that I could run with a modified version of arrow-go if there are hypotheses to test; I can usually reproduce the error within 30 minutes of startup.
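   Since the failure is non-deterministic, one thing I could also try is hammering the write path in isolation with a hypothetical stress loop like this (assuming the same imports, schema, and `fileWriter` as the sketch above), possibly under `go run -race`, to surface the error faster than waiting for the service to hit it:

   ```go
   // Hypothetical stress harness: build and buffer many small batches in a
   // tight loop to try to reproduce the failure quickly. Running it under the
   // race detector would flag any unsynchronized access on this path.
   func stressWriter(fileWriter *pqarrow.FileWriter, schema *arrow.Schema) error {
       for i := 0; i < 100_000; i++ {
           builder := array.NewRecordBuilder(memory.DefaultAllocator, schema)
           builder.Field(0).(*array.Int64Builder).Append(int64(i))
           builder.Field(1).(*array.StringBuilder).Append("row")
           batch := builder.NewRecordBatch()
           err := fileWriter.WriteBuffered(batch)
           batch.Release()
           builder.Release()
           if err != nil {
               return err
           }
       }
       return fileWriter.Close()
   }
   ```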
   
   ### Component(s)
   
   Parquet

