tschaub opened a new issue, #37968:
URL: https://github.com/apache/arrow/issues/37968

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   I'm running into an issue using a record reader to read Parquet data from https://github.com/OvertureMaps/data.
   
   Here is a test that demonstrates the panic:
   ```go
   func TestOvertureRead(t *testing.T) {
        reader, err := os.Open("testdata/overture.parquet")
        require.NoError(t, err)
   
        fileReader, err := file.NewParquetReader(reader)
        require.NoError(t, err)
   
        arrowReader, err := pqarrow.NewFileReader(fileReader, pqarrow.ArrowReadProperties{BatchSize: 1024}, memory.DefaultAllocator)
        require.NoError(t, err)
   
        recordReader, err := arrowReader.GetRecordReader(context.Background(), nil, nil)
        require.NoError(t, err)
   
        rowsRead := int64(0)
        for {
                rec, err := recordReader.Read()
                if err == io.EOF {
                        break
                }
                require.NoError(t, err)
                rowsRead += rec.NumRows()
        }
   
        assert.Equal(t, fileReader.NumRows(), rowsRead)
   }
   ```
   
   The `testdata/overture.parquet` file is from https://storage.googleapis.com/open-geodata/ch/20230725_211237_00132_5p54t_3b7d7eb3-dd9c-442a-a9b9-404dc936c5d9
   
   Here is the output:
   ```bash
   # go test -timeout 30s -run ^TestOvertureRead$ github.com/apache/arrow/go/v14/parquet/pqarrow
   
   panic: runtime error: slice bounds out of range [:160] with capacity 0
   
   goroutine 99 [running]:
   github.com/apache/arrow/go/v14/parquet/internal/encoding.(*PlainByteArrayDecoder).DecodeSpaced(0x0?, {0x0?, 0x140005ffce8?, 0x105304140?}, 0x105f2c2d8?, {0x14000402dc0?, 0x5ffc01?, 0x401?}, 0x7000100000400?)
        /Users/tim/projects/arrow/go/parquet/internal/encoding/byte_array_decoder.go:83 +0x130
   github.com/apache/arrow/go/v14/parquet/file.(*byteArrayRecordReader).ReadValuesSpaced(0x140005b4900, 0x0, 0x800?)
        /Users/tim/projects/arrow/go/parquet/file/record_reader.go:841 +0x134
   github.com/apache/arrow/go/v14/parquet/file.(*recordReader).ReadRecordData(0x140005c55c0, 0x400)
        /Users/tim/projects/arrow/go/parquet/file/record_reader.go:548 +0x288
   github.com/apache/arrow/go/v14/parquet/file.(*recordReader).ReadRecords(0x140005c55c0, 0x400)
        /Users/tim/projects/arrow/go/parquet/file/record_reader.go:632 +0x32c
   github.com/apache/arrow/go/v14/parquet/pqarrow.(*leafReader).LoadBatch(0x140005c5620, 0x400)
        /Users/tim/projects/arrow/go/parquet/pqarrow/column_readers.go:104 +0xd8
   github.com/apache/arrow/go/v14/parquet/pqarrow.(*structReader).LoadBatch.func1()
        /Users/tim/projects/arrow/go/parquet/pqarrow/column_readers.go:242 +0x30
   golang.org/x/sync/errgroup.(*Group).Go.func1()
        /Users/tim/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75 +0x58
   created by golang.org/x/sync/errgroup.(*Group).Go in goroutine 97
        /Users/tim/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:72 +0x98
   FAIL github.com/apache/arrow/go/v14/parquet/pqarrow  0.546s
   FAIL
   ```
   
   This is using the latest commit from this repo (a381c05d596cddd341437de6b277520345f9bb8e). It appears that the issue is due to the encoding of the `geometry` column (a `BYTE_ARRAY`). I'll try to dig more to narrow down the issue.
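
   For anyone reading the panic message, here is a minimal standalone sketch of the kind of operation that produces `slice bounds out of range [:160] with capacity 0`: reslicing a zero-capacity buffer to 160 elements. This is not the code in `byte_array_decoder.go`; the `sliceTo` helper is hypothetical and only illustrates what the runtime is reporting, under the assumption that the decoder ended up with an empty (or never-assigned) data buffer.

   ```go
   package main

   import "fmt"

   // sliceTo is a hypothetical helper (not part of the Arrow Go module).
   // It reslices buf to buf[:n] and turns the runtime panic raised when
   // n exceeds cap(buf) into an ordinary error.
   func sliceTo(buf []byte, n int) (out []byte, err error) {
   	defer func() {
   		if r := recover(); r != nil {
   			err = fmt.Errorf("recovered: %v", r)
   		}
   	}()
   	return buf[:n], nil
   }

   func main() {
   	// A nil slice has length 0 and capacity 0, like the buffer in the trace.
   	var empty []byte
   	if _, err := sliceTo(empty, 160); err != nil {
   		fmt.Println(err)
   	}

   	// With enough capacity the same reslice succeeds.
   	ok, err := sliceTo(make([]byte, 0, 256), 160)
   	fmt.Println(len(ok), err)
   }
   ```

   If this reading is right, the question to narrow down is why the decoder holds no data at the point where it tries to consume 160 bytes for the `geometry` column.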
   
   ### Component(s)
   
   Go, Parquet

