pwrliang commented on PR #722:
URL: https://github.com/apache/sedona-db/pull/722#issuecomment-4101910064

   > Thank you for working on this!
   > 
   > My main concern with this PR is that the GPU details are very entangled 
with the non-GPU details. How about:
   > 
   > * `sedona-spatial-join-common` (defines the traits you nicely separated in 
the last PR and a `SedonaSpatialJoinExtension` trait)
   > * `sedona-spatial-join-gpu` (implements the traits in 
sedona-spatial-join-common)
   > * `sedona-spatial-join` (uses an `Arc<dyn SedonaSpatialJoinExtension>` 
object to decide if an extension join should be used in place of the normal 
join). The hook to register a join extension can be a global function for now 
and you can use a mock join extension built from the default implementation 
index/refiner to test it.
   > * `sedona` (when built with the gpu feature, registers the Gpu join with 
sedona-spatial-join)
   > 
   > There are a few things I think this will help with:
   > 
   > * We may want to add other join accelerators for other hardware (e.g., 
Apple GPU) or other data types (Geography x H3 cell join), or possibly 
Raster/Vector join.
   > * It will make it easier for you to have clear ownership over a 
subdirectory so you can iterate faster.
   > * It makes the `sedona` crate the single place where features are 
assembled (to avoid two levels of feature flags, the consequences of which 
you've started to identify here with the "all except GPU" feature).
   > 
   > I do think it's important to get this right early while we are all here 
and have time to dedicate to this!
   
   Thanks for the suggestions. If I understand correctly, this design implies 
that `sedona-spatial-join-gpu` will execute end-to-end spatial joins. This 
requires several components to work together, such as the optimizer, planner, 
and `SpatialJoinStream`. We would either need to export all these components to 
the GPU module or copy/paste the code. My concern is that this design could 
introduce a lot of redundant code. Since `libgpuspatial` functions as a 
refinement backend (similar to GEOS, GEO, or TG), I'm not sure why adding one 
more backend requiring such significant architectural changes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to