This is an automated email from the ASF dual-hosted git repository.
zeroshade pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-go.git
The following commit(s) were added to refs/heads/main by this push:
new f6891a2 fix(parquet/pqarrow): suppress io.EOF in RecordReader.Err()
(#452)
f6891a2 is described below
commit f6891a23c906aef83a701907c7c5aa60ce8469a9
Author: Ryan Schneider <[email protected]>
AuthorDate: Tue Aug 5 09:29:53 2025 -0700
fix(parquet/pqarrow): suppress io.EOF in RecordReader.Err() (#452)
Fixes #451.
### Rationale for this change
As mentioned in #451, the `RecordReader` returned from
`pqarrow.NewFileReader(...).GetRecordReader(...)` sets
`RecordReader.Err()` to `io.EOF` once all the records are read. This in
turn causes `flight.StreamChunksFromReader(...)` to propagate the
`io.EOF` err in a `StreamChunk{err: io.EOF}` at the end of the stream,
leading to spurious errors.
### What changes are included in this PR?
At first I thought the fix was to change the implementation of
`recordReader.next(...)`, but doing so led to other issues. I think the
cleanest change is simply to suppress `io.EOF` in `recordReader.Err()`,
as this PR does.
### Are these changes tested?
Yes, I tested these changes with the server mentioned in #451 and by
running `PARQUET_TEST_DATA="$PWD/parquet-testing/data" go test ./...`
with a recursive clone of the repo.
### Are there any user-facing changes?
This could be considered user-facing, since the behavior of the
returned `RecordReader.Err()` is changing: consumers may be relying on
it to return `io.EOF` once all records have been read.
---
parquet/pqarrow/file_reader.go | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/parquet/pqarrow/file_reader.go b/parquet/pqarrow/file_reader.go
index 8fb114c..d19177d 100644
--- a/parquet/pqarrow/file_reader.go
+++ b/parquet/pqarrow/file_reader.go
@@ -881,7 +881,12 @@ func (r *recordReader) Next() bool {
func (r *recordReader) Record() arrow.Record { return r.cur }
-func (r *recordReader) Err() error { return r.err }
+func (r *recordReader) Err() error {
+	if r.err == io.EOF {
+		return nil
+	}
+	return r.err
+}
func (r *recordReader) Read() (arrow.Record, error) {
if r.cur != nil {