laskoviymishka opened a new issue, #337:
URL: https://github.com/apache/arrow-go/issues/337

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   I'm implementing a streaming sink (https://github.com/transferia/iceberg/pull/3) from Kafka-like sources into Iceberg, and I use Iceberg to create Parquet files. I experience a weird panic in the pqarrow `FileWriter.Write` method:
   
   ```
   panic: runtime error: invalid memory address or nil pointer dereference
   [signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x3e2f7b4]
   
   goroutine 415 [running]:
   github.com/apache/arrow-go/v18/arrow/memory.(*Buffer).Bytes(...)
       /home/runner/go/pkg/mod/github.com/apache/arrow-go/v18@v18.2.0/arrow/memory/buffer.go:106
   github.com/apache/arrow-go/v18/parquet/file.(*page).Data(...)
       /home/runner/go/pkg/mod/github.com/apache/arrow-go/v18@v18.2.0/parquet/file/page_reader.go:90
   github.com/apache/arrow-go/v18/parquet/file.(*columnWriter).TotalBytesWritten(...)
       /home/runner/go/pkg/mod/github.com/apache/arrow-go/v18@v18.2.0/parquet/file/column_writer.go:203
   github.com/apache/arrow-go/v18/parquet/file.(*rowGroupWriter).Close(0xc0015267e0)
       /home/runner/go/pkg/mod/github.com/apache/arrow-go/v18@v18.2.0/parquet/file/row_group_writer.go:237 +0x8b
   github.com/apache/arrow-go/v18/parquet/pqarrow.(*FileWriter).Close(0xc000612690)
       /home/runner/go/pkg/mod/github.com/apache/arrow-go/v18@v18.2.0/parquet/pqarrow/file_writer.go:303 +0x38
   github.com/apache/arrow-go/v18/parquet/pqarrow.(*FileWriter).Write(0xc000612690, {0x699b5c0, 0xc001928360})
       /home/runner/go/pkg/mod/github.com/apache/arrow-go/v18@v18.2.0/parquet/pqarrow/file_writer.go:243 +0x2a5
   github.com/transferia/iceberg.writeFile({0xc00170bab0?, 0x61?}, 0xc00060c2a0, {0xc001553b00, 0x1, 0x1})
       /home/runner/work/iceberg/iceberg/s3_writer.go:59 +0x43b
   github.com/transferia/iceberg.(*SinkStreaming).writeBatch(0xc00184e1b0, 0xc00060c2a0, {0xc001553b00, 0x1, 0x1})
       /home/runner/work/iceberg/iceberg/sink_streaming.go:173 +0xde
   github.com/transferia/iceberg.(*SinkStreaming).writeDataToTable(...)
       /home/runner/work/iceberg/iceberg/sink_streaming.go:162
   github.com/transferia/iceberg.(*SinkStreaming).processTable(0xc00184e1b0, {0xc001553b00, 0x1, 0x1})
       /home/runner/work/iceberg/iceberg/sink_streaming.go:105 +0x195
   github.com/transferia/iceberg.(*SinkStreaming).Push(0xc00184e1b0, {0xc0015539e0, 0x1, 0x4aeb8e?})
       /home/runner/work/iceberg/iceberg/sink_streaming.go:80 +0x327
   github.com/transferia/transferia/pkg/middlewares.(*errorTracker).Push(0xc0010837d0, {0xc0015539e0?, 0xc000af16f0?, 0xc0015539e0?})
       /home/runner/go/pkg/mod/github.com/transferia/transferia@v0.0.2/pkg/middlewares/error_tracker.go:35 +0x29
   github.com/transferia/transferia/pkg/middlewares.(*outputDataMetering).Push(0xc000567810?, {0xc0015539e0, 0x1, 0x1})
       /home/runner/go/pkg/mod/github.com/transferia/transferia@v0.0.2/pkg/middlewares/metering.go:65 +0x2a
   github.com/transferia/transferia/pkg/middlewares.(*statistician).Push(0xc001ce63c0, {0xc0015539e0, 0x1, 0x1})
       /home/runner/go/pkg/mod/github.com/transferia/transferia@v0.0.2/pkg/middlewares/statistician.go:58 +0x92
   github.com/transferia/transferia/pkg/middlewares.(*filter).Push(0xc001852810, {0xc0015539e0, 0x1, 0x1})
       /home/runner/go/pkg/mod/github.com/transferia/transferia@v0.0.2/pkg/middlewares/filter.go:76 +0xa3
   github.com/transferia/transferia/pkg/middlewares.(*nonRowSeparator).Push(0xc000583e30, {0xc0015539e0, 0x1, 0x1})
       /home/runner/go/pkg/mod/github.com/transferia/transferia@v0.0.2/pkg/middlewares/nonrow_separator.go:50 +0x37f
   github.com/transferia/transferia/pkg/middlewares.(*inputDataMetering).Push(0xc000af18f0?, {0xc0015539e0, 0x1, 0x1})
       /home/runner/go/pkg/mod/github.com/transferia/transferia@v0.0.2/pkg/middlewares/metering.go:43 +0x2a
   github.com/transferia/transferia/pkg/middlewares/async.(*synchronizer).AsyncPush(0xc001852840, {0xc0015539e0?, 0x1, 0xc0001bc060?})
       /home/runner/go/pkg/mod/github.com/transferia/transferia@v0.0.2/pkg/middlewares/async/synchronizer.go:61 +0xe5
   github.com/transferia/transferia/pkg/middlewares/async.(*measurer).AsyncPush(0xc000c8b820, {0xc0015539e0, 0x1, 0x1})
       /home/runner/go/pkg/mod/github.com/transferia/transferia@v0.0.2/pkg/middlewares/async/measurer.go:59 +0x146
   github.com/transferia/transferia/pkg/parsequeue.(*ParseQueue[...]).pushLoop(0x69c73e0)
       /home/runner/go/pkg/mod/github.com/transferia/transferia@v0.0.2/pkg/parsequeue/parsequeue.go:88 +0x13a
   created by github.com/transferia/transferia/pkg/parsequeue.New[...] in goroutine 405
       /home/runner/go/pkg/mod/github.com/transferia/transferia@v0.0.2/pkg/parsequeue/parsequeue.go:161 +0x1f7
   
   ```
   
   Here is some diagnostic output that I collected:
   
   ```
   Converting 1 items to Arrow Record with schema: schema:
     fields: 8
       - id: type=int32, nullable
       - level: type=utf8, nullable
       - caller: type=utf8, nullable
       - msg: type=utf8, nullable
       - _timestamp: type=timestamp[us, tz=UTC]
       - _partition: type=binary
       - _offset: type=int64
       - _idx: type=int64
   Processing field 0: id (type: int32)
   Item 0, Field id: Value type is int32
   Processing field 1: level (type: utf8)
   Item 0, Field level: Value type is string
   Processing field 2: caller (type: utf8)
   Item 0, Field caller: Value type is string
   Processing field 3: msg (type: utf8)
   Item 0, Field msg: Value type is string
   Processing field 4: _timestamp (type: timestamp[us, tz=UTC])
   Item 0, Field _timestamp: Value type is time.Time
   Processing field 5: _partition (type: binary)
   Item 0, Field _partition: Value type is string
   Processing field 6: _offset (type: int64)
   Item 0, Field _offset: Value type is uint64
   Processing field 7: _idx (type: int64)
   Item 0, Field _idx: Value type is uint32
   Writing record with 1 rows and 8 columns to s3://warehouse/streaming/topic1/data/00000-0-7711175e-7cbe-48a0-a534-4142d4bacede-0-00003.parquet
   Recovered from panic in Write: runtime error: invalid memory address or nil pointer dereference
   Record details:
     NumRows: 1
     NumCols: 8
     Column 0: id (type: int32)
     Column 1: level (type: utf8)
     Column 2: caller (type: utf8)
     Column 3: msg (type: utf8)
     Column 4: _timestamp (type: timestamp[us, tz=UTC])
     Column 5: _partition (type: binary)
     Column 6: _offset (type: int64)
     Column 7: _idx (type: int64)
   ```
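   
   To make the failing call easier to reproduce outside my pipeline, here is a minimal self-contained sketch of the same pqarrow write path. This is not my actual code (that lives in the PR linked above): the schema is trimmed to three of the eight columns, the values are hardcoded, and the writer targets an in-memory buffer instead of S3. Note that, per the diagnostics, the incoming `_offset`/`_idx` values arrive as unsigned (`uint64`/`uint32`) while the schema declares them `int64`, so the sketch converts explicitly before appending:
   
   ```go
   package main
   
   import (
   	"bytes"
   	"fmt"
   
   	"github.com/apache/arrow-go/v18/arrow"
   	"github.com/apache/arrow-go/v18/arrow/array"
   	"github.com/apache/arrow-go/v18/arrow/memory"
   	"github.com/apache/arrow-go/v18/parquet"
   	"github.com/apache/arrow-go/v18/parquet/pqarrow"
   )
   
   func main() {
   	mem := memory.NewGoAllocator()
   
   	// Same shape as the failing record, trimmed for brevity.
   	schema := arrow.NewSchema([]arrow.Field{
   		{Name: "id", Type: arrow.PrimitiveTypes.Int32, Nullable: true},
   		{Name: "_partition", Type: arrow.BinaryTypes.Binary},
   		{Name: "_offset", Type: arrow.PrimitiveTypes.Int64},
   	}, nil)
   
   	bldr := array.NewRecordBuilder(mem, schema)
   	defer bldr.Release()
   	bldr.Field(0).(*array.Int32Builder).Append(1)
   	bldr.Field(1).(*array.BinaryBuilder).Append([]byte("topic1:0"))
   	// Incoming value is uint64 (see diagnostics); convert to int64 so
   	// every column receives exactly one appended value.
   	var offset uint64 = 42
   	bldr.Field(2).(*array.Int64Builder).Append(int64(offset))
   
   	rec := bldr.NewRecord()
   	defer rec.Release()
   
   	var buf bytes.Buffer
   	fw, err := pqarrow.NewFileWriter(schema, &buf,
   		parquet.NewWriterProperties(), pqarrow.DefaultWriterProps())
   	if err != nil {
   		panic(err)
   	}
   	if err := fw.Write(rec); err != nil { // the panic in my setup fires here
   		panic(err)
   	}
   	if err := fw.Close(); err != nil {
   		panic(err)
   	}
   	fmt.Println("wrote", buf.Len(), "bytes")
   }
   ```
   
   When each builder column gets the same number of appends, this sequence completes for me; I suspect the panic is related to how my conversion layer populates the record, but a nil-buffer dereference inside `Close` still looks like something `Write` should surface as an error rather than a SIGSEGV.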
   
   ### Component(s)
   
   Parquet


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
