richardstartin commented on pull request #7820:
URL: https://github.com/apache/pinot/pull/7820#issuecomment-984548475


   @Jackie-Jiang to demonstrate an open mind, I've implemented your suggestion 
here: 
https://github.com/apache/pinot/pull/7820/commits/f75135887001394abb4097a9ddae486e5cea1c48
   
   I'm not convinced this is necessarily cleaner (`ProjectionBlock` is though) 
and I think it was better to keep the "tunnel" for evaluators separate from the 
regular path with caching. Writing the array copies from the enormous array 
maintained by `DataBlockCache` (because it doesn't have enough information to 
size the result properly) to the correctly sized arrays in 
`JsonExtractScalarFunction` made me want to cry ever so slightly. 
   
   Please pay attention to the changes to the cache key in `DataBlockCache` - 
JSONPath evaluations need to be discriminated by both the JSON Path itself and 
the default value, on top of the column name and type of the result. This is 
probably better than having large JSON documents lingering in the cache, but 
I'm struggling to see this cache as an unequivocal benefit for every possible 
transformation. 
   
   Could we avoid caching JSON path evaluations at the block level and revert 
the commit? If we find that we spend a lot of time on duplicate JSON path 
evaluations, could we think about avoiding the planning of duplicate 
evaluations instead?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to