jiayuasu opened a new issue, #2861:
URL: https://github.com/apache/sedona/issues/2861

   ## Description
   
   `Catalog.scala` currently registers ~340 functions in a single flat 
`Seq[FunctionDescription]`, with comments delimiting groups (`// Expression for 
vectors`, `// Expression for rasters`, `// geom <-> geog conversion 
functions`). The flat structure has two practical drawbacks:
   
   1. **Categorization drifts.** The comment-based grouping is loose, and over 
time predicates, accessors, and operations end up interleaved in the vector 
section. New contributors don't have a clear hint about where to add a new 
function.
   2. **Hard to reuse the grouping.** Downstream consumers that want 
category-level information (for telemetry buckets, docs generation, or registry 
partitioning) have to maintain a parallel mapping that drifts from the 
canonical list.
   
   ## Proposal
   
   Split the flat `expressions` list into named category sequences and 
concatenate them, so the file's structure encodes the categorization explicitly:
   
   ```scala
   val stConstructorExprs: Seq[FunctionDescription] = Seq(...)
   val stPredicateExprs: Seq[FunctionDescription] = Seq(...)
   val stAccessorExprs: Seq[FunctionDescription] = Seq(...)
   val stOperationExprs: Seq[FunctionDescription] = Seq(...)
   val stSerializationExprs: Seq[FunctionDescription] = Seq(...)
   val stIndexingExprs: Seq[FunctionDescription] = Seq(...)
   val stJoinExprs: Seq[FunctionDescription] = Seq(...)
   val stGeographyExprs: Seq[FunctionDescription] = Seq(...)
   val otherExprs: Seq[FunctionDescription] = Seq(...)
   
   val rsConstructorExprs: Seq[FunctionDescription] = Seq(...)
   val rsAccessorExprs: Seq[FunctionDescription] = Seq(...)
   val rsOperationExprs: Seq[FunctionDescription] = Seq(...)
   val rsOutputExprs: Seq[FunctionDescription] = Seq(...)
   
   override val expressions: Seq[FunctionDescription] =
     stConstructorExprs ++ stGeographyExprs ++ stPredicateExprs ++
       stAccessorExprs ++ stOperationExprs ++ stSerializationExprs ++
       stIndexingExprs ++ stJoinExprs ++ otherExprs ++
       rsConstructorExprs ++ rsAccessorExprs ++ rsOperationExprs ++
       rsOutputExprs ++ geoStatsFunctions()
   ```
   
   ### Benefits
   
   - **Explicit categorization** in the code itself, not just in comments. 
Adding a new function requires picking a category sequence, which is a much 
clearer hint than "add it somewhere in this ~340-entry list".
   - **Reusable for downstream needs.** Anyone wanting category-level 
information (e.g., for usage telemetry buckets, docs generation, or selective 
registration) can map over the named sequences directly.
   - **Preserved registration order.** Concatenating in the same order as today 
keeps registration semantics identical, so there is no behavior change.
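
   To make the "reusable for downstream needs" point concrete, here is a minimal 
   sketch of how a consumer could derive category-level information from the named 
   sequences. It is only an illustration: the `FunctionDescription` case class is a 
   simplified stand-in for the real type in `Catalog.scala`, the `CatalogSketch` 
   object and the category labels (`st_constructor`, etc.) are hypothetical, and 
   each sequence holds just a couple of entries taken from the table in this issue.

   ```scala
   // Hypothetical stand-in for Sedona's FunctionDescription; only the name is modeled.
   final case class FunctionDescription(name: String)

   object CatalogSketch {
     // Tiny illustrative subsets; the real sequences would hold the full lists.
     val stConstructorExprs: Seq[FunctionDescription] =
       Seq(FunctionDescription("ST_Point"), FunctionDescription("ST_GeomFromText"))
     val stPredicateExprs: Seq[FunctionDescription] =
       Seq(FunctionDescription("ST_Intersects"), FunctionDescription("ST_Contains"))
     val rsOutputExprs: Seq[FunctionDescription] =
       Seq(FunctionDescription("RS_AsGeoTiff"))

     // Category label (hypothetical naming) paired with its sequence,
     // listed in the intended registration order.
     val categories: Seq[(String, Seq[FunctionDescription])] = Seq(
       "st_constructor" -> stConstructorExprs,
       "st_predicate" -> stPredicateExprs,
       "rs_output" -> rsOutputExprs)

     // The flat list is derived from the categories, so the two cannot drift apart.
     val expressions: Seq[FunctionDescription] = categories.flatMap(_._2)

     // Reverse lookup: function name -> category label (e.g. for telemetry buckets).
     val categoryOf: Map[String, String] =
       categories.flatMap { case (label, exprs) => exprs.map(e => e.name -> label) }.toMap

     // Per-category counts (e.g. for docs generation).
     val countsByCategory: Seq[(String, Int)] =
       categories.map { case (label, exprs) => label -> exprs.size }
   }
   ```

   With these definitions, `CatalogSketch.categoryOf("ST_Contains")` yields 
   `"st_predicate"`, and no parallel mapping has to be maintained by hand.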
   
   ### Non-goals
   
   - No new functions, no removals, no signature changes. Pure code 
organization.
   - The category names are not part of any public API and can be tuned during 
review.
   
   ## Categories (proposed)
   
   | Category | Examples |
   |----------|----------|
   | `stConstructorExprs` | ST_Point, ST_GeomFromText, ST_MakeLine |
   | `stGeographyExprs` | ST_GeogFromText, ST_GeogFromWKB, ST_GeomToGeography |
   | `stPredicateExprs` | ST_Intersects, ST_Contains, ST_Within, ST_DWithin |
   | `stAccessorExprs` | ST_Area, ST_Length, ST_Envelope, ST_X, ST_Y |
   | `stOperationExprs` | ST_Buffer, ST_Union, ST_Transform, ST_Simplify |
   | `stSerializationExprs` | ST_AsText, ST_AsGeoJSON, ST_GeoHash |
   | `stIndexingExprs` | ST_H3CellIDs, ST_S2CellIDs, ST_BingTile |
   | `stJoinExprs` | ST_KNN |
   | `otherExprs` | ExpandAddress, ParseAddress, Barrier |
   | `rsConstructorExprs` | RS_FromGeoTiff, RS_MakeRaster, RS_AsRaster |
   | `rsAccessorExprs` | RS_Envelope, RS_Metadata, RS_Value |
   | `rsOperationExprs` | RS_MapAlgebra, RS_Add, RS_Clip, RS_Tile |
   | `rsOutputExprs` | RS_AsGeoTiff, RS_AsPNG, RS_AsBase64 |
   
   ## Backward compatibility
   
   No impact. `expressions` is still a `Seq[FunctionDescription]` of the 
same size and order; `registerAll` is unchanged.
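
   One way to back up the "same size and order" claim during review is a small 
   check that compares the function names before and after the split. The sketch 
   below assumes the names can be extracted from both versions as plain strings; 
   `RegistrationOrderCheck` and `firstDifference` are hypothetical helpers, not 
   existing Sedona code.

   ```scala
   object RegistrationOrderCheck {
     // Returns a description of the first mismatch between the two name lists,
     // or None when they have the same size and the same order.
     def firstDifference(before: Seq[String], after: Seq[String]): Option[String] =
       if (before.length != after.length)
         Some(s"size changed: ${before.length} -> ${after.length}")
       else
         before.zip(after).zipWithIndex.collectFirst {
           case ((b, a), i) if b != a => s"index $i: expected $b, found $a"
         }
   }
   ```

   A test could capture the names from the current flat list, apply the refactor, 
   and assert that `firstDifference` returns `None`.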
   
   I'd like to send a PR for this if there is interest. Happy to take feedback 
on the category names and granularity before coding it up.

