zhangfengcdt commented on PR #2831:
URL: https://github.com/apache/sedona/pull/2831#issuecomment-4261431753

   > Cool!
   > 
   > In s2geography there's quite a bit of code to avoid the shape index build 
at all for ST_Distance() and each individual predicate, which may bring your 
results closer to on par. Notably, there's the possible intersection check 
which should help the "point is not in polygon" case.
   
   Thanks for pointing to the s2geography optimizations. I implemented the 
following two optimizations for now:
   
   1. Point container: return false directly (early stop)
   2. pointTarget for distance: avoids building a ShapeIndex for the point side
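
   As a rough illustration of item 2 (not the PR's actual code), the sketch 
below shows the idea in plain Java. `Geog`, `ensureIndex`, and `distance` are 
hypothetical stand-ins rather than Sedona or S2 types, and a brute-force 
vertex-to-vertex haversine loop replaces the real closest-edge query. The only 
point is that a single-point argument is used directly as the query target, so 
no index is ever built for it.

```java
class PointTargetSketch {
    static final double EARTH_RADIUS_M = 6_371_008.8;

    /** Hypothetical stand-in for a geography; not a Sedona or S2 type. */
    static final class Geog {
        final double[][] vertices;   // each entry is {latDeg, lngDeg}
        int indexBuilds = 0;         // counts simulated index builds
        private boolean indexed = false;

        Geog(double[]... vertices) { this.vertices = vertices; }

        boolean isSinglePoint() { return vertices.length == 1; }

        /** Simulates an expensive index build that is cached after the first call. */
        void ensureIndex() {
            if (!indexed) {
                indexed = true;
                indexBuilds++;
            }
        }
    }

    /** Great-circle distance in meters between two {latDeg, lngDeg} pairs. */
    static double haversineMeters(double[] a, double[] b) {
        double lat1 = Math.toRadians(a[0]);
        double lat2 = Math.toRadians(b[0]);
        double dLat = lat2 - lat1;
        double dLng = Math.toRadians(b[1] - a[1]);
        double h = Math.pow(Math.sin(dLat / 2), 2)
                + Math.cos(lat1) * Math.cos(lat2) * Math.pow(Math.sin(dLng / 2), 2);
        return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(h));
    }

    /**
     * Distance between two geographies. The left side is always indexed;
     * the right side is indexed only when it is not a single point.
     */
    static double distance(Geog indexedSide, Geog other) {
        indexedSide.ensureIndex();
        if (!other.isSinglePoint()) {
            other.ensureIndex();     // general path: both sides need an index
        }
        // Simplification: vertex-to-vertex minimum instead of a real edge query.
        double best = Double.POSITIVE_INFINITY;
        for (double[] a : indexedSide.vertices) {
            for (double[] b : other.vertices) {
                best = Math.min(best, haversineMeters(a, b));
            }
        }
        return best;
    }
}
```

   Calling `distance(polygon, point)` repeatedly leaves `point.indexBuilds` at 
0 and `polygon.indexBuilds` at 1, which is the behavior a join with a cached 
index on one side would rely on.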
   
   I also tried to implement MayIntersect, which might help with the contains 
false case, but it looks like the MayIntersects path (~700ms) is more expensive 
than the cached ShapeIndex path (~220ns) for a single call. 
   
   In this PR, I was planning to optimize the hot path, which may benefit more 
from a cached ShapeIndex (good for joins). For the code path optimizations, 
adding mayIntersects pre-filters may be better, potentially along with 
covering-intersection pre-filtering. I'd prefer to add them in follow-up PRs 
behind configurable flags so that the hot path does not get slower. 
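
   A flag-gated pre-filter of this kind could look roughly like the sketch 
below. Everything in it is an assumption for illustration: `Bounds` is a 
simple lat/lng rectangle (no antimeridian handling) standing in for a real 
covering, `prefilterEnabled` stands in for whatever config option would be 
added, and `exact` stands in for the cached ShapeIndex predicate.

```java
import java.util.function.BooleanSupplier;

class PrefilterSketch {
    /** Hypothetical lat/lng bounding rectangle in degrees; ignores antimeridian wrap. */
    static final class Bounds {
        final double minLat, maxLat, minLng, maxLng;

        Bounds(double minLat, double maxLat, double minLng, double maxLng) {
            this.minLat = minLat;
            this.maxLat = maxLat;
            this.minLng = minLng;
            this.maxLng = maxLng;
        }

        /** Cheap conservative test: false means the geographies cannot intersect. */
        boolean mayIntersect(Bounds o) {
            return minLat <= o.maxLat && o.minLat <= maxLat
                    && minLng <= o.maxLng && o.minLng <= maxLng;
        }
    }

    /**
     * Runs the exact predicate, optionally guarded by the bounds pre-filter.
     * With the flag off, the hot path pays nothing for the extra check.
     */
    static boolean intersects(Bounds a, Bounds b, boolean prefilterEnabled,
                              BooleanSupplier exact) {
        if (prefilterEnabled && !a.mayIntersect(b)) {
            return false;            // rejected without touching the index
        }
        return exact.getAsBoolean(); // exact (indexed) predicate
    }
}
```

   With the flag defaulting to off, existing callers keep the current cost 
profile, and users with many disjoint pairs can opt in.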
   
   There seem to be tradeoffs here: we need to choose defaults and also provide 
custom Sedona configs so that users can tune these optimizations, since the 
geography functions have much more overhead than the geometry functions. 
   
   What do you think?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
