kaka11chen opened a new pull request, #309:
URL: https://github.com/apache/doris-thirdparty/pull/309

   When an ORC file is written with an older version of pyorc (e.g., 
pyorc-0.3.0) and the data contains null values, a present stream is generated 
for the top-level struct column.
   Newer versions of pyorc (e.g., pyorc-0.10.0) and tools such as Hive or Spark 
do not generate this stream.
   During late materialization, the present stream written by the older version 
causes the ORC file to be read twice, and the second read fails with the error 
'bad read in next buffer'. The current solution is to skip the present stream 
when it belongs to the top-level struct column.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


