wypoon commented on PR #11661:
URL: https://github.com/apache/iceberg/pull/11661#issuecomment-2828525455

   @pvary I ran an existing benchmark, `VectorizedReadDictionaryEncodedFlatParquetDataBenchmark`, which exercises the `RLE` case (but not the `PACKED` case) of the refactored code. It does exercise both arms of the if-else in
   ```
               if (valuesReader instanceof ValuesAsBytesReader) {
                 nextRleBatch(...);
               } else if (valuesReader instanceof VectorizedDictionaryEncodedParquetValuesReader) {
                 nextRleDictEncodedBatch(...);
               }
   ```
   so the `instanceof` check is being exercised.
   I ran the benchmark on main (without this change) and on this branch after rebasing on main.
   The results are:
   main:
   ```
   Benchmark                                                                                   Mode  Cnt   Score   Error  Units
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readBigDecimalsIcebergVectorized5k    ss    5  15.490 ± 1.897   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readBigDecimalsSparkVectorized5k      ss    5  15.988 ± 1.314   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDatesIcebergVectorized5k          ss    5   5.979 ± 0.286   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDatesSparkVectorized5k            ss    5   5.057 ± 0.501   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDecimalsIcebergVectorized5k       ss    5   9.116 ± 1.352   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDecimalsSparkVectorized5k         ss    5   8.738 ± 0.375   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDoublesIcebergVectorized5k        ss    5   7.617 ± 0.522   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDoublesSparkVectorized5k          ss    5   8.292 ± 1.026   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readFloatsIcebergVectorized5k         ss    5   4.818 ± 0.283   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readFloatsSparkVectorized5k           ss    5   4.069 ± 0.630   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readIntegersIcebergVectorized5k       ss    5   5.510 ± 0.249   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readIntegersSparkVectorized5k         ss    5   5.604 ± 0.933   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readLongsIcebergVectorized5k          ss    5   4.565 ± 0.253   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readLongsSparkVectorized5k            ss    5   4.604 ± 0.769   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readStringsIcebergVectorized5k        ss    5   6.674 ± 0.337   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readStringsSparkVectorized5k          ss    5   7.390 ± 1.092   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readTimestampsIcebergVectorized5k     ss    5   5.373 ± 0.351   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readTimestampsSparkVectorized5k       ss    5   4.855 ± 0.594   s/op
   ```
   this branch:
   ```
   Benchmark                                                                                   Mode  Cnt   Score   Error  Units
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readBigDecimalsIcebergVectorized5k    ss    5  14.120 ± 0.898   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readBigDecimalsSparkVectorized5k      ss    5  14.878 ± 0.543   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDatesIcebergVectorized5k          ss    5   4.006 ± 0.311   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDatesSparkVectorized5k            ss    5   4.965 ± 1.272   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDecimalsIcebergVectorized5k       ss    5   4.976 ± 0.847   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDecimalsSparkVectorized5k         ss    5   5.509 ± 0.935   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDoublesIcebergVectorized5k        ss    5   5.200 ± 0.201   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDoublesSparkVectorized5k          ss    5   5.049 ± 0.617   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readFloatsIcebergVectorized5k         ss    5   4.910 ± 0.282   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readFloatsSparkVectorized5k           ss    5   4.272 ± 1.881   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readIntegersIcebergVectorized5k       ss    5   5.431 ± 0.137   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readIntegersSparkVectorized5k         ss    5   4.450 ± 1.899   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readLongsIcebergVectorized5k          ss    5   4.161 ± 0.219   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readLongsSparkVectorized5k            ss    5   4.633 ± 0.874   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readStringsIcebergVectorized5k        ss    5   6.038 ± 0.269   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readStringsSparkVectorized5k          ss    5   7.911 ± 0.378   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readTimestampsIcebergVectorized5k     ss    5   5.517 ± 0.400   s/op
   VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readTimestampsSparkVectorized5k       ss    5   5.087 ± 0.811   s/op
   ```
   The refactor does not appear to make the performance worse.
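   For anyone who wants to check individual rows, here is a minimal sketch of one way to compare two runs: compute the relative change in score, and test whether the `Score ± Error` intervals of the two runs overlap. `BenchmarkCompare` and its methods are hypothetical helpers written for this illustration (they are not part of Iceberg or JMH), and interval overlap is only a rough heuristic with 5 single-shot iterations per benchmark. The numbers are copied from the two tables above.

   ```java
   // Hypothetical helper for illustration only; not part of Iceberg or JMH.
   final class BenchmarkCompare {
     private BenchmarkCompare() {}

     /** Relative change of candidate vs. baseline; negative means faster when the unit is s/op. */
     static double relativeChange(double baseline, double candidate) {
       return (candidate - baseline) / baseline;
     }

     /** True if [s1 - e1, s1 + e1] and [s2 - e2, s2 + e2] share at least one point. */
     static boolean intervalsOverlap(double s1, double e1, double s2, double e2) {
       return s1 - e1 <= s2 + e2 && s2 - e2 <= s1 + e1;
     }

     public static void main(String[] args) {
       // readDatesIcebergVectorized5k: main 5.979 ± 0.286, this branch 4.006 ± 0.311
       System.out.printf(
           "readDatesIcebergVectorized5k: change=%+.1f%%, overlap=%b%n",
           100 * relativeChange(5.979, 4.006),
           intervalsOverlap(5.979, 0.286, 4.006, 0.311));

       // readFloatsIcebergVectorized5k: main 4.818 ± 0.283, this branch 4.910 ± 0.282
       System.out.printf(
           "readFloatsIcebergVectorized5k: change=%+.1f%%, overlap=%b%n",
           100 * relativeChange(4.818, 4.910),
           intervalsOverlap(4.818, 0.283, 4.910, 0.282));
     }
   }
   ```

   Under this heuristic, a row counts as a likely regression only if the branch score is higher and the two intervals do not overlap.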
     


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org