jiayuasu commented on code in PR #2802:
URL: https://github.com/apache/sedona/pull/2802#discussion_r3006864664


##########
docs/tutorial/raster.md:
##########
@@ -503,47 +565,93 @@ Please refer to [Raster visualizer 
docs](../api/sql/Raster-Functions.md#raster-o
 
 ## Save to permanent storage
 
-Sedona has APIs that can save an entire raster column to files in a specified 
location. Before saving, the raster type column needs to be converted to a 
binary format. Sedona provides several functions to convert a raster column 
into a binary column suitable for file storage. Once in binary format, the 
raster data can then be written to files on disk using the Sedona file storage 
APIs.
-
-```sparksql
-rasterDf.write.format("raster").option("rasterField", 
"raster").option("fileExtension", 
".tiff").mode(SaveMode.Overwrite).save(dirPath)
-```
+Saving raster data is a two-step process: (1) convert the Raster column to 
binary format using an `RS_AsXXX` function, and (2) write the binary DataFrame 
to files using Sedona's `raster` data source writer.
 
-Sedona has a few writer functions that create the binary DataFrame necessary 
for saving the raster images.
+### Step 1: Convert to binary format
 
-### As Arc Grid
+Choose one of the following output format functions:
 
-Use [RS_AsArcGrid](../api/sql/Raster-writer.md#rs_asarcgrid) to get the binary 
Dataframe of the raster in Arc Grid format.
+| Function | Format | Description |
+| :--- | :--- | :--- |
+| [RS_AsGeoTiff](../api/sql/Raster-Output/RS_AsGeoTiff.md) | GeoTiff | 
General-purpose raster format with optional compression |
+| [RS_AsCOG](../api/sql/Raster-Output/RS_AsCOG.md) | Cloud Optimized GeoTiff | 
Ideal for cloud storage with efficient range-read access |
+| [RS_AsArcGrid](../api/sql/Raster-Output/RS_AsArcGrid.md) | Arc Grid | 
ASCII-based format, single band only |
+| [RS_AsPNG](../api/sql/Raster-Output/RS_AsPNG.md) | PNG | Image format, 
unsigned integer pixel types only |
 
 ```sql
-SELECT RS_AsArcGrid(raster)
+SELECT RS_AsGeoTiff(rast) AS raster_binary FROM rasterDf
 ```
 
-### As GeoTiff
+### Step 2: Write to files
 
-Use [RS_AsGeoTiff](../api/sql/Raster-writer.md#rs_asgeotiff) to get the binary 
Dataframe of the raster in GeoTiff format.
+Use Sedona's built-in `raster` data source to write the binary DataFrame:
 
-```sql
-SELECT RS_AsGeoTiff(raster)
-```
+=== "Scala"
+    ```scala
+    rasterDf.withColumn("raster_binary", expr("RS_AsGeoTiff(rast)"))
+      .write.format("raster").mode("overwrite").save("my_raster_file")
+    ```
 
-### As Cloud Optimized GeoTiff
+=== "Python"
+    ```python
+    rasterDf.withColumn("raster_binary", 
expr("RS_AsGeoTiff(rast)")).write.format(
+        "raster"
+    ).mode("overwrite").save("my_raster_file")
+    ```
 
-Use [RS_AsCOG](../api/sql/Raster-writer.md#rs_ascog) to get the binary 
Dataframe of the raster in [Cloud Optimized GeoTiff](https://www.cogeo.org/) 
(COG) format. COG is ideal for cloud-hosted raster data because it supports 
efficient range-read access over HTTP.
+The writer data source options are:
 
-```sql
-SELECT RS_AsCOG(raster)
-```
+| Option | Default | Description |
+| :--- | :--- | :--- |
+| `rasterField` | The `binary` type column | The name of the binary column to 
write. Required if the DataFrame has multiple binary columns. |
+| `fileExtension` | `.tiff` | File extension for output files (e.g., `.png`, 
`.asc`). |
+| `pathField` | None | Column name containing the output file paths. If not 
set, each file gets a random UUID name. |

Review Comment:
   Fixed in 5b46a2b. Verified against source code 
(`RasterFileFormat.scala:130,146-162`): `.getName()` strips directory 
components, and the extension is stripped at `lastIndexOf('.')` before 
appending `fileExtension`.
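
A rough sketch of the naming behavior described above, in plain Python rather than the Scala source. The helper name `output_file_name` is hypothetical, and how the real code treats a name with no `.` is an assumption here (the name is kept whole):

```python
import posixpath


def output_file_name(path_value: str, file_extension: str = ".tiff") -> str:
    """Approximate how a `pathField` value becomes an output file name."""
    # Keep only the last path component, similar to what getName() returns.
    name = posixpath.basename(path_value)
    # Strip the existing extension at the last '.', if any (assumed behavior
    # when no '.' is present: leave the name unchanged).
    dot = name.rfind(".")
    if dot != -1:
        name = name[:dot]
    # Append the configured fileExtension.
    return name + file_extension
```

Under these assumptions, a `pathField` value such as `s3://bucket/dir/scene.v2.tif` written with `fileExtension` set to `.png` would yield `scene.v2.png`: the directory part is dropped and only the final extension is replaced.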



##########
docs/tutorial/raster.md:
##########
@@ -503,47 +565,93 @@ Please refer to [Raster visualizer 
docs](../api/sql/Raster-Functions.md#raster-o
 
 ## Save to permanent storage
 
-Sedona has APIs that can save an entire raster column to files in a specified 
location. Before saving, the raster type column needs to be converted to a 
binary format. Sedona provides several functions to convert a raster column 
into a binary column suitable for file storage. Once in binary format, the 
raster data can then be written to files on disk using the Sedona file storage 
APIs.
-
-```sparksql
-rasterDf.write.format("raster").option("rasterField", 
"raster").option("fileExtension", 
".tiff").mode(SaveMode.Overwrite).save(dirPath)
-```
+Saving raster data is a two-step process: (1) convert the Raster column to 
binary format using an `RS_AsXXX` function, and (2) write the binary DataFrame 
to files using Sedona's `raster` data source writer.
 
-Sedona has a few writer functions that create the binary DataFrame necessary 
for saving the raster images.
+### Step 1: Convert to binary format
 
-### As Arc Grid
+Choose one of the following output format functions:
 
-Use [RS_AsArcGrid](../api/sql/Raster-writer.md#rs_asarcgrid) to get the binary 
Dataframe of the raster in Arc Grid format.
+| Function | Format | Description |
+| :--- | :--- | :--- |
+| [RS_AsGeoTiff](../api/sql/Raster-Output/RS_AsGeoTiff.md) | GeoTiff | 
General-purpose raster format with optional compression |
+| [RS_AsCOG](../api/sql/Raster-Output/RS_AsCOG.md) | Cloud Optimized GeoTiff | 
Ideal for cloud storage with efficient range-read access |
+| [RS_AsArcGrid](../api/sql/Raster-Output/RS_AsArcGrid.md) | Arc Grid | 
ASCII-based format, single band only |
+| [RS_AsPNG](../api/sql/Raster-Output/RS_AsPNG.md) | PNG | Image format, 
unsigned integer pixel types only |
 
 ```sql
-SELECT RS_AsArcGrid(raster)
+SELECT RS_AsGeoTiff(rast) AS raster_binary FROM rasterDf
 ```
 
-### As GeoTiff
+### Step 2: Write to files
 
-Use [RS_AsGeoTiff](../api/sql/Raster-writer.md#rs_asgeotiff) to get the binary 
Dataframe of the raster in GeoTiff format.
+Use Sedona's built-in `raster` data source to write the binary DataFrame:
 
-```sql
-SELECT RS_AsGeoTiff(raster)
-```
+=== "Scala"
+    ```scala
+    rasterDf.withColumn("raster_binary", expr("RS_AsGeoTiff(rast)"))
+      .write.format("raster").mode("overwrite").save("my_raster_file")
+    ```
 
-### As Cloud Optimized GeoTiff
+=== "Python"
+    ```python
+    rasterDf.withColumn("raster_binary", 
expr("RS_AsGeoTiff(rast)")).write.format(
+        "raster"
+    ).mode("overwrite").save("my_raster_file")
+    ```
 
-Use [RS_AsCOG](../api/sql/Raster-writer.md#rs_ascog) to get the binary 
Dataframe of the raster in [Cloud Optimized GeoTiff](https://www.cogeo.org/) 
(COG) format. COG is ideal for cloud-hosted raster data because it supports 
efficient range-read access over HTTP.
+The writer data source options are:
 
-```sql
-SELECT RS_AsCOG(raster)
-```
+| Option | Default | Description |
+| :--- | :--- | :--- |
+| `rasterField` | The `binary` type column | The name of the binary column to 
write. Required if the DataFrame has multiple binary columns. |

Review Comment:
   Fixed in 5b46a2b. Verified against source code 
(`RasterFileFormat.scala:111-120`): the loop iterates all schema fields and 
overwrites on each binary match, so it picks the last binary column.
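
To illustrate why the last binary column wins, here is a minimal Python sketch of the loop as described in this comment. It is not the Scala implementation; the function name `pick_raster_field` and the `(name, type)` tuple representation of the schema are assumptions made for the example:

```python
from typing import Optional, Sequence, Tuple


def pick_raster_field(schema: Sequence[Tuple[str, str]]) -> Optional[str]:
    """Return the default raster column when `rasterField` is not set."""
    chosen = None
    for name, type_name in schema:
        if type_name == "binary":
            # Overwritten on every binary match, so the last match is kept.
            chosen = name
    return chosen
```

With a schema of `path: string`, `thumb: binary`, `raster_binary: binary`, this sketch returns `raster_binary`, which is why setting `rasterField` explicitly is the safer choice when a DataFrame carries more than one binary column.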



##########
docs/tutorial/raster.md:
##########
@@ -503,47 +565,93 @@ Please refer to [Raster visualizer 
docs](../api/sql/Raster-Functions.md#raster-o
 
 ## Save to permanent storage
 
-Sedona has APIs that can save an entire raster column to files in a specified 
location. Before saving, the raster type column needs to be converted to a 
binary format. Sedona provides several functions to convert a raster column 
into a binary column suitable for file storage. Once in binary format, the 
raster data can then be written to files on disk using the Sedona file storage 
APIs.
-
-```sparksql
-rasterDf.write.format("raster").option("rasterField", 
"raster").option("fileExtension", 
".tiff").mode(SaveMode.Overwrite).save(dirPath)
-```
+Saving raster data is a two-step process: (1) convert the Raster column to 
binary format using an `RS_AsXXX` function, and (2) write the binary DataFrame 
to files using Sedona's `raster` data source writer.
 
-Sedona has a few writer functions that create the binary DataFrame necessary 
for saving the raster images.
+### Step 1: Convert to binary format
 
-### As Arc Grid
+Choose one of the following output format functions:
 
-Use [RS_AsArcGrid](../api/sql/Raster-writer.md#rs_asarcgrid) to get the binary 
Dataframe of the raster in Arc Grid format.
+| Function | Format | Description |
+| :--- | :--- | :--- |
+| [RS_AsGeoTiff](../api/sql/Raster-Output/RS_AsGeoTiff.md) | GeoTiff | 
General-purpose raster format with optional compression |
+| [RS_AsCOG](../api/sql/Raster-Output/RS_AsCOG.md) | Cloud Optimized GeoTiff | 
Ideal for cloud storage with efficient range-read access |
+| [RS_AsArcGrid](../api/sql/Raster-Output/RS_AsArcGrid.md) | Arc Grid | 
ASCII-based format, single band only |
+| [RS_AsPNG](../api/sql/Raster-Output/RS_AsPNG.md) | PNG | Image format, 
unsigned integer pixel types only |
 
 ```sql
-SELECT RS_AsArcGrid(raster)
+SELECT RS_AsGeoTiff(rast) AS raster_binary FROM rasterDf
 ```
 
-### As GeoTiff
+### Step 2: Write to files
 
-Use [RS_AsGeoTiff](../api/sql/Raster-writer.md#rs_asgeotiff) to get the binary 
Dataframe of the raster in GeoTiff format.
+Use Sedona's built-in `raster` data source to write the binary DataFrame:
 
-```sql
-SELECT RS_AsGeoTiff(raster)
-```
+=== "Scala"
+    ```scala
+    rasterDf.withColumn("raster_binary", expr("RS_AsGeoTiff(rast)"))
+      .write.format("raster").mode("overwrite").save("my_raster_file")
+    ```
 
-### As Cloud Optimized GeoTiff
+=== "Python"
+    ```python
+    rasterDf.withColumn("raster_binary", 
expr("RS_AsGeoTiff(rast)")).write.format(
+        "raster"
+    ).mode("overwrite").save("my_raster_file")
+    ```
 
-Use [RS_AsCOG](../api/sql/Raster-writer.md#rs_ascog) to get the binary 
Dataframe of the raster in [Cloud Optimized GeoTiff](https://www.cogeo.org/) 
(COG) format. COG is ideal for cloud-hosted raster data because it supports 
efficient range-read access over HTTP.
+The writer data source options are:
 
-```sql
-SELECT RS_AsCOG(raster)
-```
+| Option | Default | Description |
+| :--- | :--- | :--- |
+| `rasterField` | The `binary` type column | The name of the binary column to 
write. Required if the DataFrame has multiple binary columns. |
+| `fileExtension` | `.tiff` | File extension for output files (e.g., `.png`, 
`.asc`). |
+| `pathField` | None | Column name containing the output file paths. If not 
set, each file gets a random UUID name. |
+| `useDirectCommitter` | `true` | If `true`, files are written directly to the 
target location. If `false`, files are written to a temp location first. 
Writing with `false` is slower, especially on object stores like S3. |
 
-### As PNG
+Example with all options:
 
-Use [RS_AsPNG](../api/sql/Raster-writer.md#rs_aspng) to get the binary 
Dataframe of the raster in PNG format.
+=== "Scala"
+    ```scala
+    rasterDf.withColumn("raster_binary", expr("RS_AsGeoTiff(rast)"))
+      .write.format("raster")
+      .option("rasterField", "raster_binary")
+      .option("pathField", "path")
+      .option("fileExtension", ".tiff")
+      .mode("overwrite")
+      .save("my_raster_file")

Review Comment:
   Fixed in 5b46a2b. Changed `pathField` to `"name"`, which is the column produced by the 
`raster` data source.



##########
docs/api/sql/Raster-Output/RS_AsGeoTiff.md:
##########
@@ -0,0 +1,65 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
+# RS_AsGeoTiff
+
+Introduction: Returns a binary DataFrame from a Raster DataFrame. Each raster 
object in the resulting DataFrame is a GeoTiff image in binary format.

Review Comment:
   Fixed in 5b46a2b. Reworded to describe per-row return value.



##########
docs/api/sql/Raster-Output/RS_AsArcGrid.md:
##########
@@ -0,0 +1,63 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
+# RS_AsArcGrid
+
+Introduction: Returns a binary DataFrame from a Raster DataFrame. Each raster 
object in the resulting DataFrame is an ArcGrid image in binary format. ArcGrid 
only takes 1 source band. If your raster has multiple bands, you need to 
specify which band you want to use as the source.

Review Comment:
   Fixed in 5b46a2b.



##########
docs/api/sql/Raster-Output/RS_AsPNG.md:
##########
@@ -0,0 +1,62 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
+# RS_AsPNG
+
+Introduction: Returns a PNG byte array, that can be written to raster files as 
PNGs using the Sedona raster data source writer. This function can only accept 
pixel data type of unsigned integer. PNG can accept 1 or 3 bands of data from 
the raster, refer to [RS_Band](../Raster-Band-Accessors/RS_Band.md) for more 
details.

Review Comment:
   RS_AsPNG intro was already correctly worded ("Returns a PNG byte array"). No 
change needed.



##########
docs/api/sql/Raster-Functions.md:
##########
@@ -197,3 +198,7 @@ These functions convert raster data to various output 
formats for visualization.
 | [RS_AsBase64](Raster-Output/RS_AsBase64.md) | String | Returns a base64 
encoded string of the given raster. If the datatype is integral then this 
function internally takes the first 4 bands as RGBA, and converts them to the 
PNG format, finally produces... | v1.5.0 |
 | [RS_AsImage](Raster-Output/RS_AsImage.md) | String | Returns a HTML that 
when rendered using an HTML viewer or via a Jupyter Notebook, displays the 
raster as a square image of side length `imageWidth`. Optionally, an imageWidth 
parameter can be passe... | v1.5.0 |
 | [RS_AsMatrix](Raster-Output/RS_AsMatrix.md) | String | Returns a string, 
that when printed, outputs the raster band as a pretty printed 2D matrix. All 
the values of the raster are cast to double for the string. RS_AsMatrix allows 
specifying the number ... |  |
+| [RS_AsArcGrid](Raster-Output/RS_AsArcGrid.md) | Binary | Returns a binary 
DataFrame from a Raster DataFrame. Each raster object is an ArcGrid image in 
binary format. | v1.4.1 |
+| [RS_AsGeoTiff](Raster-Output/RS_AsGeoTiff.md) | Binary | Returns a binary 
DataFrame from a Raster DataFrame. Each raster object is a GeoTiff image in 
binary format. | v1.4.1 |

Review Comment:
   Fixed in 5b46a2b. Updated all descriptions in the index table.



##########
docs/tutorial/raster.md:
##########
@@ -560,11 +668,7 @@ The raster objects are represented as `SedonaRaster` 
objects in Python, which ca
     ```
 
 ```python
-df_raster = (
-    sedona.read.format("binaryFile")
-    .load("/path/to/raster.tif")
-    .selectExpr("RS_FromGeoTiff(content) as rast")
-)
+df_raster = sedona.read.format("raster").load("/path/to/raster.tif")
 rows = df_raster.collect()
 raster = rows[0].rast
 raster  # <sedona.raster.sedona_raster.InDbSedonaRaster at 0x1618fb1f0>

Review Comment:
   Fixed in 5b46a2b. Added `.option("retile", "false")` to the collect example.



##########
docs/api/sql/Raster-Output/RS_AsCOG.md:
##########
@@ -0,0 +1,85 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
+# RS_AsCOG
+
+Introduction: Returns a binary DataFrame from a Raster DataFrame. Each raster 
object in the resulting DataFrame is a [Cloud Optimized 
GeoTIFF](https://www.cogeo.org/) (COG) image in binary format. COG is a GeoTIFF 
that is internally organized to enable efficient range-read access over HTTP, 
making it ideal for cloud-hosted raster data.

Review Comment:
   Fixed in 5b46a2b.



##########
docs/api/sql/Raster-Operators/RS_AsRaster.md:
##########
@@ -0,0 +1,116 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
+# RS_AsRaster
+
+Introduction: `RS_AsRaster` converts a vector geometry into a raster dataset 
by assigning a specified value to all pixels covered by the geometry. Unlike 
`RS_Clip`, which extracts a subset of an existing raster while preserving its 
original values, `RS_AsRaster` generates a new raster where the geometry is 
rasterized onto a raster grid. The function supports all geometry types and 
takes the following parameters:
+
+* `geom`: The geometry to be rasterized.
+* `raster`: The reference raster to be used for overlaying the `geom` on.
+* `pixelType`: Defines data type of the output raster. This can be one of the 
following, D (double), F (float), I (integer), S (short), US (unsigned short) 
or B (byte).
+* `allTouched` (Since: `v1.7.1`): Decides the pixel selection criteria. If set 
to `true`, the function selects all pixels touched by the geometry, else, 
selects only pixels whose centroids intersect the geometry. Defaults to `false`.
+* `value`: The value to be used for assigning pixels covered by the geometry. 
Defaults to using `1.0` if not provided.
+* `noDataValue`: Used for assigning the no data value of the resultant raster. 
Defaults to `null` if not provided.
+* `useGeometryExtent`: Defines the extent of the resultant raster. When set to 
`true`, it corresponds to the extent of `geom`, and when set to false, it 
corresponds to the extent of `raster`. Default value is `true` if not set.
+
+Format:
+
+```
+RS_AsRaster(geom: Geometry, raster: Raster, pixelType: String, allTouched: 
Boolean, value: Double, noDataValue: Double, useGeometryExtent: Boolean)
+```
+
+```
+RS_AsRaster(geom: Geometry, raster: Raster, pixelType: String, allTouched: 
Boolean, value: Double, noDataValue: Double)
+```
+
+```
+RS_AsRaster(geom: Geometry, raster: Raster, pixelType: String, allTouched: 
Boolean, value: Double)
+```
+
+```
+RS_AsRaster(geom: Geometry, raster: Raster, pixelType: String, allTouched: 
Boolean)
+```
+
+```
+RS_AsRaster(geom: Geometry, raster: Raster, pixelType: String)
+```
+
+Return type: `Raster`
+
+Since: `v1.5.0`
+
+!!!note
+    The function doesn't support rasters that have any one of the following 
properties:
+    ```
+    ScaleX < 0
+    ScaleY > 0
+    SkewX != 0
+    SkewY != 0
+    ```
+    If a raster is provided with anyone of these properties then 
IllegalArgumentException is thrown.

Review Comment:
   Fixed in 5b46a2b.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
