Kontinuation commented on code in PR #697:
URL: https://github.com/apache/sedona-db/pull/697#discussion_r3068158187


##########
c/sedona-gdal/src/gdal.rs:
##########
@@ -0,0 +1,237 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! High-level convenience wrapper around [`GdalApi`].
+//!
+//! [`Gdal`] bundles a `&'static GdalApi` reference and exposes ergonomic
+//! methods that delegate to the lower-level constructors and free functions
+//! scattered across the crate, eliminating the need to pass `api` explicitly
+//! at every call site.
+
+use crate::config;
+use crate::dataset::Dataset;
+use crate::driver::{Driver, DriverManager};
+use crate::errors::Result;
+use crate::gdal_api::GdalApi;
+use crate::gdal_dyn_bindgen::{GDALOpenFlags, OGRFieldType};
+use crate::mem::create_mem_dataset;
+use crate::raster::polygonize::{polygonize, PolygonizeOptions};
+use crate::raster::rasterband::RasterBand;
+use crate::raster::rasterize::{rasterize, RasterizeOptions};
+use crate::raster::rasterize_affine::rasterize_affine;
+use crate::raster::types::DatasetOptions;
+use crate::raster::types::GdalDataType;
+use crate::spatial_ref::SpatialRef;
+use crate::vector::feature::FieldDefn;
+use crate::vector::geometry::Geometry;
+use crate::vector::layer::Layer;
+use crate::vrt::VrtDataset;
+use crate::vsi;
+
+/// High-level convenience wrapper around [`GdalApi`].
+///
+/// Stores a `&'static GdalApi` reference and provides ergonomic methods that
+/// delegate to the various constructors and free functions in the crate.
+pub struct Gdal {
+    api: &'static GdalApi,
+}
+
+impl Gdal {
+    /// Create a `Gdal` wrapper for the given API reference.
+    pub(crate) fn new(api: &'static GdalApi) -> Self {
+        Self { api }
+    }
+
+    // -- Info ----------------------------------------------------------------
+
+    /// Return the name of the loaded GDAL library.
+    pub fn name(&self) -> &str {
+        self.api.name()
+    }
+
+    /// Fetch GDAL version information for a standard request key.
+    pub fn version_info(&self, request: &str) -> String {

Review Comment:
   Added "See also" style pointers to more comprehensive docs.



##########
c/sedona-gdal/src/gdal.rs:
##########
@@ -0,0 +1,237 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! High-level convenience wrapper around [`GdalApi`].
+//!
+//! [`Gdal`] bundles a `&'static GdalApi` reference and exposes ergonomic
+//! methods that delegate to the lower-level constructors and free functions
+//! scattered across the crate, eliminating the need to pass `api` explicitly
+//! at every call site.
+
+use crate::config;
+use crate::dataset::Dataset;
+use crate::driver::{Driver, DriverManager};
+use crate::errors::Result;
+use crate::gdal_api::GdalApi;
+use crate::gdal_dyn_bindgen::{GDALOpenFlags, OGRFieldType};
+use crate::mem::create_mem_dataset;
+use crate::raster::polygonize::{polygonize, PolygonizeOptions};
+use crate::raster::rasterband::RasterBand;
+use crate::raster::rasterize::{rasterize, RasterizeOptions};
+use crate::raster::rasterize_affine::rasterize_affine;
+use crate::raster::types::DatasetOptions;
+use crate::raster::types::GdalDataType;
+use crate::spatial_ref::SpatialRef;
+use crate::vector::feature::FieldDefn;
+use crate::vector::geometry::Geometry;
+use crate::vector::layer::Layer;
+use crate::vrt::VrtDataset;
+use crate::vsi;
+
+/// High-level convenience wrapper around [`GdalApi`].
+///
+/// Stores a `&'static GdalApi` reference and provides ergonomic methods that
+/// delegate to the various constructors and free functions in the crate.
+pub struct Gdal {
+    api: &'static GdalApi,
+}
+
+impl Gdal {

Review Comment:
   This `Gdal` facade is to avoid passing the `api: GdalApi` value every time 
we call a GDAL safe API. I have also found that exposing `with_global_gdal` 
instead of `with_global_gdal_api` makes the users of the gdal crate much easier 
to discover GDAL APIs, since all almost all GDAL APIs were grouped in the same 
place instead of scattering in different modules.



##########
c/sedona-gdal/src/mem.rs:
##########
@@ -0,0 +1,441 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! High-level builder for creating in-memory (MEM) GDAL datasets.
+//!
+//! [`MemDatasetBuilder`] provides a fluent, type-safe API for constructing 
GDAL MEM
+//! datasets with zero-copy band attachment, optional geo-transform, 
projection, and
+//! per-band nodata values.
+
+use crate::dataset::Dataset;
+use crate::errors::Result;
+use crate::gdal::Gdal;
+use crate::gdal_api::{call_gdal_api, GdalApi};
+use crate::gdal_dyn_bindgen::CE_Failure;
+use crate::raster::types::GdalDataType;
+
+/// Nodata value for a raster band.
+///
+/// GDAL has three separate APIs for setting nodata depending on the band data 
type:
+/// - [`f64`] for most types (UInt8 through Float64, excluding Int64/UInt64)
+/// - [`i64`] for Int64 bands
+/// - [`u64`] for UInt64 bands
+///
+/// This enum encapsulates the three nodata value representations exposed by 
GDAL.
+#[derive(Debug, Clone, Copy, PartialEq)]
+pub enum Nodata {
+    F64(f64),
+    I64(i64),
+    U64(u64),
+}
+
+/// A band specification for [`MemDatasetBuilder`].
+struct MemBand {
+    data_type: GdalDataType,
+    data_ptr: *mut u8,
+    pixel_offset: Option<i64>,
+    line_offset: Option<i64>,
+    nodata: Option<Nodata>,
+}
+
+/// A builder for constructing in-memory (MEM) GDAL datasets.
+///
+/// This creates datasets using `MEMDataset::Create` (bypassing GDAL's 
open-dataset-list
+/// mutex for better concurrency) and attaches bands via `GDALAddBand` with 
`DATAPOINTER`
+/// options for zero-copy operation.
+///
+/// # Safety
+///
+/// All `add_band*` methods are `unsafe` because the caller must ensure that 
the
+/// provided data pointers remain valid for the lifetime of the built 
[`Dataset`],
+/// satisfy the alignment requirements of the band data type, and refer to 
writable
+/// memory if GDAL may write through the attached `DATAPOINTER` band.
+pub struct MemDatasetBuilder {
+    width: usize,
+    height: usize,
+    n_owned_bands: usize,
+    owned_bands_data_type: Option<GdalDataType>,
+    bands: Vec<MemBand>,
+    geo_transform: Option<[f64; 6]>,
+    projection: Option<String>,
+}
+
+impl MemDatasetBuilder {
+    /// Create a builder for a MEM dataset with the given dimensions.
+    pub fn new(width: usize, height: usize) -> Self {
+        Self {
+            width,
+            height,
+            n_owned_bands: 0,
+            owned_bands_data_type: None,
+            bands: Vec::new(),
+            geo_transform: None,
+            projection: None,
+        }
+    }
+
+    /// Create a builder for a MEM dataset with GDAL-owned bands.
+    pub fn new_with_owned_bands(
+        width: usize,
+        height: usize,
+        n_owned_bands: usize,
+        owned_bands_data_type: GdalDataType,
+    ) -> Self {
+        Self {
+            width,
+            height,
+            n_owned_bands,
+            owned_bands_data_type: Some(owned_bands_data_type),
+            bands: Vec::new(),
+            geo_transform: None,
+            projection: None,
+        }
+    }
+
+    /// Create a MEM dataset with GDAL-owned bands.
+    /// This is a safe shortcut for `new_with_owned_bands(...).build(gdal)`.
+    pub fn create(
+        gdal: &Gdal,
+        width: usize,
+        height: usize,
+        n_owned_bands: usize,
+        owned_bands_data_type: GdalDataType,
+    ) -> Result<Dataset> {
+        // SAFETY: `new_with_owned_bands` creates a builder with zero external 
bands,
+        // so no data pointers need to outlive the dataset.
+        unsafe {
+            Self::new_with_owned_bands(width, height, n_owned_bands, 
owned_bands_data_type)
+                .build(gdal)
+        }
+    }
+
+    /// Add a zero-copy band from a raw data pointer.
+    /// Use default contiguous row-major pixel and line offsets.
+    ///
+    /// # Safety
+    ///
+    /// The caller must ensure `data_ptr` points to a valid buffer of at least
+    /// `height * width * data_type.byte_size()` bytes, is properly aligned for
+    /// `data_type`, and outlives the built [`Dataset`].
+    pub unsafe fn add_band(self, data_type: GdalDataType, data_ptr: *mut u8) 
-> Self {
+        self.add_band_with_options(data_type, data_ptr, None, None, None)
+    }
+
+    /// Add a zero-copy band with custom offsets and optional nodata.
+    ///
+    /// # Safety
+    ///
+    /// The caller must ensure `data_ptr` points to a valid buffer of 
sufficient size
+    /// for the given dimensions and offsets, is properly aligned for 
`data_type`, and
+    /// outlives the built [`Dataset`].
+    pub unsafe fn add_band_with_options(
+        mut self,
+        data_type: GdalDataType,
+        data_ptr: *mut u8,
+        pixel_offset: Option<i64>,
+        line_offset: Option<i64>,
+        nodata: Option<Nodata>,
+    ) -> Self {
+        self.bands.push(MemBand {
+            data_type,
+            data_ptr,
+            pixel_offset,
+            line_offset,
+            nodata,
+        });
+        self
+    }
+
+    /// Set the dataset geotransform coefficients.
+    pub fn geo_transform(mut self, gt: [f64; 6]) -> Self {
+        self.geo_transform = Some(gt);
+        self
+    }
+
+    /// Set the dataset projection definition string.
+    pub fn projection(mut self, wkt: impl Into<String>) -> Self {
+        self.projection = Some(wkt.into());

Review Comment:
   It does not have to be a WKT. I have renamed the parameter to `projection` 
to avoid confusion. It calls `GDALSetProjection` to set the projection, which 
handles X/Y axis order automatically. The unique semantics of 
`GDALSetProjection` makes it a bit different from `GDALSetSpatialRef`, and I 
believe that `GDALSetProjection` is a more suitable API for sedona-db's axis 
order convention.



##########
c/sedona-gdal/src/dataset.rs:
##########
@@ -300,11 +300,11 @@ impl Dataset {
     ///
     /// # Safety
     ///
-    /// `data_ptr` must point to valid band data that outlives this dataset.
+    /// `data_ptr` must point to valid mutable band data that outlives this 
dataset.
     pub unsafe fn add_band_with_data(
         &self,
         data_type: RustGdalDataType,
-        data_ptr: *const u8,
+        data_ptr: *mut u8,

Review Comment:
   `*mut u8` is required because this crate exposes safe APIs like 
`RasterBand::write`, `rasterize`, and `rasterize_affine` that can write through 
the attached MEM band into the caller-provided backing buffer. If the builder 
accepted `*const u8`, safe Rust could later trigger writes into immutable 
memory.
   
   Actually one of the most important use case of the MEM dataset is for 
holding the result of rasterize, or the intermediate mask raster as a result of 
rasterize when running zonal stats. The underlying buffer will be mutated in 
such use cases.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to