gnehil opened a new pull request, #273:
URL: https://github.com/apache/doris-spark-connector/pull/273

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem Summary:
   
   When reading data without specifying the columns to read, the following 
exception is thrown:
   ```java
   java.lang.ArrayIndexOutOfBoundsException: 0
        at org.apache.doris.spark.read.DorisPartitionReader.$anonfun$get$1(DorisPartitionReader.scala:60)
        at org.apache.doris.spark.read.DorisPartitionReader.$anonfun$get$1$adapted(DorisPartitionReader.scala:56)
        at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
        at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
        at org.apache.doris.spark.read.DorisPartitionReader.get(DorisPartitionReader.scala:56)
        at org.apache.doris.spark.read.DorisPartitionReader.get(DorisPartitionReader.scala:31)
        at org.apache.spark.sql.execution.datasources.v2.PartitionIterator.next(DataSourceRDD.scala:107)
        at org.apache.spark.sql.execution.datasources.v2.MetricsRowIterator.next(DataSourceRDD.scala:142)
        at org.apache.spark.sql.execution.datasources.v2.MetricsRowIterator.next(DataSourceRDD.scala:139)
        at org.apache.spark.InterruptibleIterator.next(InterruptibleIterator.scala:40)
        at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)
   ```
   
   `ScanBuilder` implements `SupportsPushDownRequiredColumns`, so Spark passes it the columns to keep after pruning. When no columns are specified for reading, that array is empty, and because this case was handled incorrectly, the resulting table schema is also empty.
   This change therefore checks whether the array of columns to prune is empty, and verifies that each pruned column exists in the table; if a column does not exist, an exception is thrown.
   This change also adds support for the `doris.read.fields` parameter, which is applied to the table schema before the optimizer prunes columns.
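
The checks described above can be illustrated with a small, dependency-free sketch. This is not the connector's code: the class `ColumnPruningSketch` and the helpers `applyReadFields` and `pruneColumns` are hypothetical names, the schema is modeled as a plain list of column names, and two behaviors are assumptions rather than confirmed details of this PR: falling back to the full schema when the required-column array is empty, and rejecting unknown names in `doris.read.fields`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ColumnPruningSketch {

    // Hypothetical helper: narrows the table columns to those named in a
    // comma-separated doris.read.fields value. A null or blank value keeps
    // every column. Rejecting unknown names is an assumption of this sketch.
    static List<String> applyReadFields(List<String> tableColumns, String readFields) {
        if (readFields == null || readFields.trim().isEmpty()) {
            return new ArrayList<>(tableColumns);
        }
        List<String> selected = new ArrayList<>();
        for (String field : readFields.split(",")) {
            String name = field.trim();
            if (!tableColumns.contains(name)) {
                throw new IllegalArgumentException(
                        "Unknown column in doris.read.fields: " + name);
            }
            selected.add(name);
        }
        return selected;
    }

    // Hypothetical helper: applies the optimizer's required columns.
    // An empty request keeps the current schema (assumed fallback) instead of
    // producing an empty schema; a column missing from the schema is an error.
    static List<String> pruneColumns(List<String> schemaColumns, List<String> requiredColumns) {
        if (requiredColumns.isEmpty()) {
            return new ArrayList<>(schemaColumns);
        }
        for (String name : requiredColumns) {
            if (!schemaColumns.contains(name)) {
                throw new IllegalArgumentException(
                        "Pruned column not found in table schema: " + name);
            }
        }
        return new ArrayList<>(requiredColumns);
    }

    public static void main(String[] args) {
        List<String> table = Arrays.asList("id", "name", "age");
        // doris.read.fields is applied first, then the optimizer's pruning.
        List<String> schema = applyReadFields(table, "id, name");
        System.out.println(pruneColumns(schema, Collections.<String>emptyList()));
        System.out.println(pruneColumns(schema, Arrays.asList("name")));
    }
}
```

Applying `doris.read.fields` before pruning means the optimizer's required columns are validated against the already narrowed schema, which mirrors the ordering described in the summary.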
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (Yes/No/I Don't know)
   2. Has unit tests been added: (Yes/No/No Need)
   3. Has document been added or modified: (Yes/No/No Need)
   4. Does it need to update dependencies: (Yes/No)
   5. Are there any changes that cannot be rolled back: (Yes/No)
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org