This is an automated email from the ASF dual-hosted git repository.
dongjoon-hyun pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/spark-connect-swift.git
The following commit(s) were added to refs/heads/main by this push:
new ffe3de2 [SPARK-57090] Make `Documentation` up-to-date
ffe3de2 is described below
commit ffe3de220f97cb363d27a1cd091597b96e6b7169
Author: Dongjoon Hyun <[email protected]>
AuthorDate: Tue May 26 15:48:31 2026 -0700
[SPARK-57090] Make `Documentation` up-to-date
### What changes were proposed in this pull request?
This PR updates `Sources/SparkConnect/Documentation.docc` to surface the
public APIs added during SPARK-57044 ~ SPARK-57087.
- Add `Catalog.md`, `DataFrameReader.md`, `DataFrameWriter.md`
topic-curation pages.
- Expand the top-level `## Topics` in `SparkConnect.md` to cover all major
public types.
- Add missing members (`version`, `newSession()`, `table(_:)`,
`readStream`, `addArtifact`, `executeCommand`, `time`, `streams`, etc.) to
`SparkSession.md`.
- Add a "Catalog Operations" example to `GettingStarted.md`.
### Why are the changes needed?
The DocC bundle had not been updated for the recently added `Catalog`
methods and `DataFrameReader` overloads, leaving them either unlisted or
auto-rendered without category structure.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
- `swift build` passes.
- Every DocC symbol reference in the new/edited `.md` files was
cross-checked against the actual public signatures in the source.
### Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Code (claude-opus-4-7)
Closes #396 from dongjoon-hyun/SPARK-57090.
Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
---
Sources/SparkConnect/Documentation.docc/Catalog.md | 86 ++++++++++++++++++++++
.../Documentation.docc/DataFrameReader.md | 62 ++++++++++++++++
.../Documentation.docc/DataFrameWriter.md | 67 +++++++++++++++++
.../Documentation.docc/GettingStarted.md | 19 +++++
.../Documentation.docc/SparkConnect.md | 27 ++++++-
.../Documentation.docc/SparkSession.md | 22 ++++++
6 files changed, 282 insertions(+), 1 deletion(-)
diff --git a/Sources/SparkConnect/Documentation.docc/Catalog.md b/Sources/SparkConnect/Documentation.docc/Catalog.md
new file mode 100644
index 0000000..fb8b338
--- /dev/null
+++ b/Sources/SparkConnect/Documentation.docc/Catalog.md
@@ -0,0 +1,86 @@
+# ``SparkConnect/Catalog``
+
+Interface for managing catalogs, databases, tables, views, functions, partitions, and caching.
+
+## Overview
+
+`Catalog` is accessible via ``SparkSession/catalog`` and provides a programmatic way to query and manipulate the metadata layer of a Spark cluster — including catalog selection, database lifecycle, table/view discovery, function lookup, partition recovery, and table-level caching and analysis.
+
+```swift
+let spark = try await SparkSession.builder.getOrCreate()
+
+// Discover and switch
+let dbs = try await spark.catalog.listDatabases()
+try await spark.catalog.setCurrentDatabase("analytics")
+
+// Create / drop
+try await spark.catalog.createDatabase("demo", ifNotExists: true)
+try await spark.catalog.dropDatabase("demo", ifExists: true, cascade: true)
+
+// Inspect a function
+let fn = try await spark.catalog.getFunction("to_date")
+```
+
+## Topics
+
+### Catalog Management
+
+- ``currentCatalog()``
+- ``setCurrentCatalog(_:)``
+- ``listCatalogs(pattern:)``
+
+### Database Operations
+
+- ``currentDatabase()``
+- ``setCurrentDatabase(_:)``
+- ``createDatabase(_:ifNotExists:properties:)``
+- ``dropDatabase(_:ifExists:cascade:)``
+- ``listDatabases(pattern:)``
+- ``getDatabase(_:)``
+- ``databaseExists(_:)``
+
+### Table Operations
+
+- ``listTables(dbName:pattern:)``
+- ``getTable(_:)``
+- ``getTableProperties(_:)``
+- ``getCreateTableString(_:asSerde:)``
+- ``tableExists(_:)``
+- ``tableExists(_:_:)``
+- ``createTable(_:_:source:description:options:)``
+- ``dropTable(_:ifExists:purge:)``
+- ``truncateTable(_:)``
+
+### View Operations
+
+- ``listViews(dbName:pattern:)``
+- ``dropView(_:ifExists:)``
+- ``dropTempView(_:)``
+- ``dropGlobalTempView(_:)``
+
+### Function Operations
+
+- ``listFunctions(dbName:pattern:)``
+- ``getFunction(_:)``
+- ``getFunction(_:_:)``
+- ``functionExists(_:)``
+- ``functionExists(_:_:)``
+
+### Column & Partition Operations
+
+- ``listColumns(_:)``
+- ``listPartitions(_:)``
+- ``recoverPartitions(_:)``
+
+### Caching
+
+- ``cacheTable(_:_:)``
+- ``isCached(_:)``
+- ``uncacheTable(_:)``
+- ``clearCache()``
+- ``refreshTable(_:)``
+- ``refreshByPath(_:)``
+
+### Table Analysis
+
+- ``analyzeTable(_:noScan:)``
diff --git a/Sources/SparkConnect/Documentation.docc/DataFrameReader.md b/Sources/SparkConnect/Documentation.docc/DataFrameReader.md
new file mode 100644
index 0000000..8d32426
--- /dev/null
+++ b/Sources/SparkConnect/Documentation.docc/DataFrameReader.md
@@ -0,0 +1,62 @@
+# ``SparkConnect/DataFrameReader``
+
+Interface for loading a ``DataFrame`` from external storage systems.
+
+## Overview
+
+`DataFrameReader` is obtained via ``SparkSession/read``. Configure it with ``format(_:)``, ``option(_:_:)``, and ``schema(_:)``, then call a format-specific loader (e.g., ``csv(_:)``, ``orc(_:)``) or the generic ``load()`` / ``load(_:)``.
+
+```swift
+// Format-specific reader
+let csvDf = spark.read
+ .option("header", "true")
+ .option("inferSchema", "true")
+ .csv("path/to/data.csv")
+
+// Read from another DataFrame (CSV strings per row)
+let lines: DataFrame = ...
+let parsed = await spark.read.option("header", "true").csv(lines)
+
+// Generic reader
+let df = spark.read
+ .format("orc")
+ .load("path/to/data")
+```
+
+## Topics
+
+### Configuration
+
+- ``format(_:)``
+- ``option(_:_:)``
+- ``schema(_:)``
+
+### Generic Loading
+
+- ``load()``
+- ``load(_:)``
+- ``table(_:)``
+
+### CSV
+
+- ``csv(_:)``
+
+### JSON
+
+- ``json(_:)``
+
+### XML
+
+- ``xml(_:)``
+
+### Parquet
+
+- ``parquet(_:)``
+
+### ORC
+
+- ``orc(_:)``
+
+### JDBC
+
+- ``jdbc(_:_:_:)``
diff --git a/Sources/SparkConnect/Documentation.docc/DataFrameWriter.md b/Sources/SparkConnect/Documentation.docc/DataFrameWriter.md
new file mode 100644
index 0000000..0c47d38
--- /dev/null
+++ b/Sources/SparkConnect/Documentation.docc/DataFrameWriter.md
@@ -0,0 +1,67 @@
+# ``SparkConnect/DataFrameWriter``
+
+Interface for writing a ``DataFrame`` to external storage systems.
+
+## Overview
+
+`DataFrameWriter` is obtained via ``DataFrame/write``. Configure it with ``format(_:)``, ``mode(_:)``, ``option(_:_:)``, and partitioning helpers, then call a format-specific writer (e.g., ``orc(_:)``, ``csv(_:)``), ``save()``, ``saveAsTable(_:)``, or ``insertInto(_:)``.
+
+```swift
+// Format-specific writer
+try await df.write
+ .mode("overwrite")
+ .partitionBy("year", "month")
+ .orc("path/to/output")
+
+// Save as a managed table
+try await df.write
+ .mode("append")
+ .saveAsTable("events")
+```
+
+## Topics
+
+### Configuration
+
+- ``format(_:)``
+- ``mode(_:)``
+- ``option(_:_:)``
+- ``partitionBy(_:)``
+- ``bucketBy(numBuckets:_:)``
+- ``sortBy(_:)``
+- ``clusterBy(_:)``
+
+### Saving Data
+
+- ``save()``
+- ``save(_:)``
+- ``saveAsTable(_:)``
+- ``insertInto(_:)``
+
+### CSV
+
+- ``csv(_:)``
+
+### JSON
+
+- ``json(_:)``
+
+### XML
+
+- ``xml(_:)``
+
+### ORC
+
+- ``orc(_:)``
+
+### Parquet
+
+- ``parquet(_:)``
+
+### Text
+
+- ``text(_:)``
+
+### JDBC
+
+- ``jdbc(_:_:_:)``
diff --git a/Sources/SparkConnect/Documentation.docc/GettingStarted.md b/Sources/SparkConnect/Documentation.docc/GettingStarted.md
index 7397690..78e3300 100644
--- a/Sources/SparkConnect/Documentation.docc/GettingStarted.md
+++ b/Sources/SparkConnect/Documentation.docc/GettingStarted.md
@@ -99,3 +99,22 @@ csvDf.write
.mode("overwrite")
.orc("path/to/output")
```
+
+### 5. Catalog Operations
+
+```swift
+// Create / drop databases
+try await spark.catalog.createDatabase("demo", ifNotExists: true)
+
+// Discover tables, views, and functions
+let tables = try await spark.catalog.listTables(pattern: "*")
+let views = try await spark.catalog.listViews()
+let funcs = try await spark.catalog.listFunctions(pattern: "to_*")
+
+// Inspect a specific function
+let fn = try await spark.catalog.getFunction("to_date")
+
+// Partition maintenance and table statistics
+try await spark.catalog.recoverPartitions("my_partitioned_table")
+try await spark.catalog.analyzeTable("my_table", noScan: true)
+```
diff --git a/Sources/SparkConnect/Documentation.docc/SparkConnect.md b/Sources/SparkConnect/Documentation.docc/SparkConnect.md
index 6c1f49d..2aff7da 100644
--- a/Sources/SparkConnect/Documentation.docc/SparkConnect.md
+++ b/Sources/SparkConnect/Documentation.docc/SparkConnect.md
@@ -18,8 +18,33 @@ SparkConnect is a modern Swift library that provides a native interface to Apach
### Getting Started
- <doc:GettingStarted>
+- <doc:Examples>
+
+### Sessions
+
- ``SparkSession``
-### DataFrame Operations
+### DataFrames
- ``DataFrame``
+- ``GroupedData``
+- ``Row``
+- ``StorageLevel``
+
+### Data I/O
+
+- ``DataFrameReader``
+- ``DataFrameWriter``
+- ``MergeIntoWriter``
+
+### Catalog & Configuration
+
+- ``Catalog``
+- ``RuntimeConf``
+
+### Streaming
+
+- ``DataStreamReader``
+- ``DataStreamWriter``
+- ``StreamingQuery``
+- ``StreamingQueryManager``
diff --git a/Sources/SparkConnect/Documentation.docc/SparkSession.md b/Sources/SparkConnect/Documentation.docc/SparkSession.md
index 7c2482c..914a686 100644
--- a/Sources/SparkConnect/Documentation.docc/SparkSession.md
+++ b/Sources/SparkConnect/Documentation.docc/SparkSession.md
@@ -34,17 +34,25 @@ let csvDf = spark.read.csv("path/to/file.csv")
### Creating Sessions
- ``builder``
+- ``newSession()``
- ``stop()``
+### Session Information
+
+- ``version``
+
### DataFrame Operations
- ``emptyDataFrame``
+- ``range(_:)``
- ``range(_:_:_:)``
- ``sql(_:)``
+- ``table(_:)``
### Data I/O
- ``read``
+- ``readStream``
### Configuration
@@ -63,3 +71,17 @@ let csvDf = spark.read.csv("path/to/file.csv")
- ``interruptAll()``
- ``interruptTag(_:)``
- ``interruptOperation(_:)``
+
+### Artifacts & External Commands
+
+- ``addArtifact(_:)``
+- ``addArtifacts(_:)``
+- ``executeCommand(_:_:_:)``
+
+### Streaming
+
+- ``streams``
+
+### Utilities
+
+- ``time(_:)``