zhangfengcdt commented on PR #2831: URL: https://github.com/apache/sedona/pull/2831#issuecomment-4261431753
> Cool!
>
> In s2geography there's quite a bit of code to avoid the shape index build at all for ST_Distance() and each individual predicate, which may bring your results closer to on par. Notably, there's the possible intersection check which should help the "point is not in polygon" case.

Thanks for pointing to the s2geography optimizations. I have implemented the following two optimizations for now:

1. Point container: return false directly (early stop).
2. pointTarget for distance: avoids building a ShapeIndex for the point side.

I also tried to implement MayIntersect, which might help with the contains-false case, but it looks like the MayIntersect path (~700ms) is more expensive than the cached ShapeIndex path (~220ns) for a single call.

In this PR, I was aiming to optimize the hot path, which may benefit more from a cached ShapeIndex (good for joins). For the code path optimizations, adding MayIntersect pre-filters may be better, potentially along with covering-intersection pre-filtering. I'd prefer to add them in follow-up PRs behind configurable flags, so that the hot path does not get slower.

There seem to be tradeoffs here: we need to pick defaults and also expose custom Sedona configs so users can tune these optimizations, since the geography functions have much more overhead than the geometry functions.

What do you think?
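To make the flag idea concrete, here is a minimal sketch of a pre-filter that is off by default, so the cached hot path is untouched unless a user opts in. Everything in it is hypothetical: `PrefilterSketch`, `Bound`, `prefilterEnabled`, and `contains` are illustrative names, not Sedona or S2 APIs, and the simple lat/lng box only stands in for a real covering or MayIntersect check.

```java
import java.util.function.BooleanSupplier;

/** Illustrative only: none of these names exist in Sedona or S2. */
final class PrefilterSketch {

    /** Simple lat/lng box standing in for a cheap covering (ignores antimeridian wrap). */
    static final class Bound {
        final double minLat, minLng, maxLat, maxLng;

        Bound(double minLat, double minLng, double maxLat, double maxLng) {
            this.minLat = minLat;
            this.minLng = minLng;
            this.maxLat = maxLat;
            this.maxLng = maxLng;
        }

        /** Cheap test: false means the point is definitely outside. */
        boolean mayContain(double lat, double lng) {
            return lat >= minLat && lat <= maxLat && lng >= minLng && lng <= maxLng;
        }
    }

    /** Hypothetical config switch; off by default so the hot path is unchanged. */
    static boolean prefilterEnabled = false;

    /** Counts how often the expensive exact test ran (for demonstration only). */
    static int exactCalls = 0;

    /**
     * Runs the optional pre-filter, then falls back to the exact test,
     * which stands in for the cached ShapeIndex lookup.
     */
    static boolean contains(Bound bound, double lat, double lng, BooleanSupplier exactTest) {
        if (prefilterEnabled && !bound.mayContain(lat, lng)) {
            return false; // rejected cheaply; exact test skipped
        }
        exactCalls++;
        return exactTest.getAsBoolean();
    }
}
```

With the flag off, every call goes straight to the exact test, as it does today. With it on, points outside the bound are rejected before that test runs. Whether this is a net win depends on how expensive the pre-filter is relative to the cached lookup, which is the tradeoff described above.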
