bitsondatadev commented on code in PR #9836:
URL: https://github.com/apache/iceberg/pull/9836#discussion_r1508089551


##########
docs/docs/daft.md:
##########
@@ -0,0 +1,148 @@
+---
+title: "Daft"
+---
+<!--
+ - Licensed to the Apache Software Foundation (ASF) under one or more
+ - contributor license agreements.  See the NOTICE file distributed with
+ - this work for additional information regarding copyright ownership.
+ - The ASF licenses this file to You under the Apache License, Version 2.0
+ - (the "License"); you may not use this file except in compliance with
+ - the License.  You may obtain a copy of the License at
+ -
+ -   http://www.apache.org/licenses/LICENSE-2.0
+ -
+ - Unless required by applicable law or agreed to in writing, software
+ - distributed under the License is distributed on an "AS IS" BASIS,
+ - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ - See the License for the specific language governing permissions and
+ - limitations under the License.
+ -->
+
+# Daft
+
+[Daft](www.getdaft.io) is a Python/Rust-based distributed query engine with a 
Python DataFrame API.

Review Comment:
   Rewording for some context and links to the API.
   
   ```suggestion
   [Daft](https://www.getdaft.io) is a distributed query engine written in Python and 
Rust, two fast-growing ecosystems in the data engineering and machine learning 
industry.
   It exposes its own flavor of the widely adopted [DataFrame 
API](https://www.getdaft.io/projects/docs/en/latest/api_docs/dataframe.html), 
akin to many existing Python libraries.
   ```



##########
docs/docs/daft.md:
##########
@@ -0,0 +1,148 @@
+---
+title: "Daft"
+---
+<!--
+ - Licensed to the Apache Software Foundation (ASF) under one or more
+ - contributor license agreements.  See the NOTICE file distributed with
+ - this work for additional information regarding copyright ownership.
+ - The ASF licenses this file to You under the Apache License, Version 2.0
+ - (the "License"); you may not use this file except in compliance with
+ - the License.  You may obtain a copy of the License at
+ -
+ -   http://www.apache.org/licenses/LICENSE-2.0
+ -
+ - Unless required by applicable law or agreed to in writing, software
+ - distributed under the License is distributed on an "AS IS" BASIS,
+ - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ - See the License for the specific language governing permissions and
+ - limitations under the License.
+ -->
+
+# Daft
+
+[Daft](www.getdaft.io) is a Python/Rust-based distributed query engine with a 
Python DataFrame API.
+
+Iceberg supports reading of Iceberg tables into Daft DataFrames by using the 
Python client library [PyIceberg](https://py.iceberg.apache.org/).
+
+For Python users, Daft is complementary to PyIceberg as a query engine layer:
+
+* **PyIceberg:** catalog/table management tasks (e.g. creation of tables, 
modifying table schemas)
+* **Daft:** querying tables (e.g. previewing tables, data ETL and analysis)

Review Comment:
   This section didn't say much of anything new for Iceberg users, and really 
missed the value that the Daft integration offers them. Feel free to counter my 
suggestions with another one. Also, try to dig more into Daft features that 
connect Iceberg from Data Engineers to Data Scientists/ML engineers.
   
   ```suggestion
   [PyIceberg](https://py.iceberg.apache.org/) supports reading of Iceberg 
tables into Daft DataFrames, which simplifies running transformation and 
machine learning workloads in the Python ecosystem. This offers a novel 
experience for data consumers to migrate their models in-place using Iceberg's 
catalog and table management, while utilizing Daft's compute engine 
capabilities for use cases ranging from traditional analysis to advanced 
feature training.
   ```



##########
docs/docs/daft.md:
##########
@@ -0,0 +1,148 @@
+---
+title: "Daft"
+---
+<!--
+ - Licensed to the Apache Software Foundation (ASF) under one or more
+ - contributor license agreements.  See the NOTICE file distributed with
+ - this work for additional information regarding copyright ownership.
+ - The ASF licenses this file to You under the Apache License, Version 2.0
+ - (the "License"); you may not use this file except in compliance with
+ - the License.  You may obtain a copy of the License at
+ -
+ -   http://www.apache.org/licenses/LICENSE-2.0
+ -
+ - Unless required by applicable law or agreed to in writing, software
+ - distributed under the License is distributed on an "AS IS" BASIS,
+ - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ - See the License for the specific language governing permissions and
+ - limitations under the License.
+ -->
+
+# Daft
+
+[Daft](www.getdaft.io) is a Python/Rust-based distributed query engine with a 
Python DataFrame API.
+
+Iceberg supports reading of Iceberg tables into Daft DataFrames by using the 
Python client library [PyIceberg](https://py.iceberg.apache.org/).
+
+For Python users, Daft is complementary to PyIceberg as a query engine layer:
+
+* **PyIceberg:** catalog/table management tasks (e.g. creation of tables, 
modifying table schemas)
+* **Daft:** querying tables (e.g. previewing tables, data ETL and analysis)
+
+In database terms, PyIceberg is the Data Description Language (DDL) for 
database administration and Daft is the Data Manipulation Language (DML) for 
querying data.
+
+## Enabling Iceberg support in Daft
+
+To use Iceberg with Daft, simply ensure that the 
[PyIceberg](https://py.iceberg.apache.org/) library is also installed in your 
current Python environment.
+
+```
+pip install getdaft pyiceberg
+```
+
+## Querying Iceberg using Daft
+
+### Reading PyIceberg tables
+
+Daft interacts natively with [PyIceberg](https://py.iceberg.apache.org/) to 
read from Iceberg.
+
+Simply load a PyIceberg table and pass it into Daft as follows:

Review Comment:
   Avoid stacking headings with nothing between them; add at least a single 
sentence to visually break things up. Avoid saying "read from Iceberg", as 
Iceberg is mainly a table spec and some opinionated libraries, not a running 
system. Remove "Simply".
   
   ```suggestion
   ## Querying Iceberg using Daft
   
   Daft interacts natively with [PyIceberg](https://py.iceberg.apache.org/) to 
read Iceberg tables.
   
   ### Reading Iceberg tables
   
   Create an Iceberg table following [the spark-quickstart 
tutorial](https://iceberg.apache.org/spark-quickstart/). 
   
   Load the Iceberg table `demo.nyc.taxis` into Daft, limiting to the first 
three columns.
   ```



##########
docs/docs/daft.md:
##########
@@ -0,0 +1,148 @@
+---
+title: "Daft"
+---
+<!--
+ - Licensed to the Apache Software Foundation (ASF) under one or more
+ - contributor license agreements.  See the NOTICE file distributed with
+ - this work for additional information regarding copyright ownership.
+ - The ASF licenses this file to You under the Apache License, Version 2.0
+ - (the "License"); you may not use this file except in compliance with
+ - the License.  You may obtain a copy of the License at
+ -
+ -   http://www.apache.org/licenses/LICENSE-2.0
+ -
+ - Unless required by applicable law or agreed to in writing, software
+ - distributed under the License is distributed on an "AS IS" BASIS,
+ - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ - See the License for the specific language governing permissions and
+ - limitations under the License.
+ -->
+
+# Daft
+
+[Daft](www.getdaft.io) is a Python/Rust-based distributed query engine with a 
Python DataFrame API.
+
+Iceberg supports reading of Iceberg tables into Daft DataFrames by using the 
Python client library [PyIceberg](https://py.iceberg.apache.org/).
+
+For Python users, Daft is complementary to PyIceberg as a query engine layer:
+
+* **PyIceberg:** catalog/table management tasks (e.g. creation of tables, 
modifying table schemas)
+* **Daft:** querying tables (e.g. previewing tables, data ETL and analysis)
+
+In database terms, PyIceberg is the Data Description Language (DDL) for 
database administration and Daft is the Data Manipulation Language (DML) for 
querying data.
+
+## Enabling Iceberg support in Daft
+
+To use Iceberg with Daft, simply ensure that the 
[PyIceberg](https://py.iceberg.apache.org/) library is also installed in your 
current Python environment.
+
+```
+pip install getdaft pyiceberg
+```
+
+## Querying Iceberg using Daft
+
+### Reading PyIceberg tables
+
+Daft interacts natively with [PyIceberg](https://py.iceberg.apache.org/) to 
read from Iceberg.
+
+Simply load a PyIceberg table and pass it into Daft as follows:
+
+``` py
+import daft
+from pyiceberg.catalog import load_catalog
+
+table = load_catalog("my_catalog").load_table("my_tpch_namespace.lineitem")
+df = daft.read_iceberg(table)
+df = df.select("L_SHIPDATE", "L_ORDERKEY", "L_COMMENT")
+df.show()

Review Comment:
   We should use the taxi cab data as it's consistent with much of the other 
documentation. 
   
   https://iceberg.apache.org/spark-quickstart/?h=catalog#creating-a-table
   
   ```suggestion
   table = load_catalog("demo").load_table("nyc.taxis")
   df = daft.read_iceberg(table)
   df = df.select("vendor_id", "trip_id", "trip_distance")
   df.show()
   ```



##########
docs/docs/daft.md:
##########
@@ -0,0 +1,148 @@
+---
+title: "Daft"
+---
+<!--
+ - Licensed to the Apache Software Foundation (ASF) under one or more
+ - contributor license agreements.  See the NOTICE file distributed with
+ - this work for additional information regarding copyright ownership.
+ - The ASF licenses this file to You under the Apache License, Version 2.0
+ - (the "License"); you may not use this file except in compliance with
+ - the License.  You may obtain a copy of the License at
+ -
+ -   http://www.apache.org/licenses/LICENSE-2.0
+ -
+ - Unless required by applicable law or agreed to in writing, software
+ - distributed under the License is distributed on an "AS IS" BASIS,
+ - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ - See the License for the specific language governing permissions and
+ - limitations under the License.
+ -->
+
+# Daft
+
+[Daft](www.getdaft.io) is a Python/Rust-based distributed query engine with a 
Python DataFrame API.
+
+Iceberg supports reading of Iceberg tables into Daft DataFrames by using the 
Python client library [PyIceberg](https://py.iceberg.apache.org/).
+
+For Python users, Daft is complementary to PyIceberg as a query engine layer:
+
+* **PyIceberg:** catalog/table management tasks (e.g. creation of tables, 
modifying table schemas)
+* **Daft:** querying tables (e.g. previewing tables, data ETL and analysis)
+
+In database terms, PyIceberg is the Data Description Language (DDL) for 
database administration and Daft is the Data Manipulation Language (DML) for 
querying data.
+
+## Enabling Iceberg support in Daft
+
+To use Iceberg with Daft, simply ensure that the 
[PyIceberg](https://py.iceberg.apache.org/) library is also installed in your 
current Python environment.

Review Comment:
   We'll eventually be adding the Microsoft style guide, but try to avoid 
"simple"/"simply" in general unless the context calls for it. It can come off 
as condescending, especially to a beginner.
   
   
https://learn.microsoft.com/en-us/style-guide/a-z-word-list-term-collections/s/simply
   
   ```suggestion
   To use Iceberg with Daft, ensure that the 
[PyIceberg](https://py.iceberg.apache.org/) library is also installed in your 
current Python environment.
   ```



##########
docs/docs/daft.md:
##########
@@ -0,0 +1,148 @@
+---
+title: "Daft"
+---
+<!--
+ - Licensed to the Apache Software Foundation (ASF) under one or more
+ - contributor license agreements.  See the NOTICE file distributed with
+ - this work for additional information regarding copyright ownership.
+ - The ASF licenses this file to You under the Apache License, Version 2.0
+ - (the "License"); you may not use this file except in compliance with
+ - the License.  You may obtain a copy of the License at
+ -
+ -   http://www.apache.org/licenses/LICENSE-2.0
+ -
+ - Unless required by applicable law or agreed to in writing, software
+ - distributed under the License is distributed on an "AS IS" BASIS,
+ - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ - See the License for the specific language governing permissions and
+ - limitations under the License.
+ -->
+
+# Daft
+
+[Daft](www.getdaft.io) is a Python/Rust-based distributed query engine with a 
Python DataFrame API.
+
+Iceberg supports reading of Iceberg tables into Daft DataFrames by using the 
Python client library [PyIceberg](https://py.iceberg.apache.org/).
+
+For Python users, Daft is complementary to PyIceberg as a query engine layer:
+
+* **PyIceberg:** catalog/table management tasks (e.g. creation of tables, 
modifying table schemas)
+* **Daft:** querying tables (e.g. previewing tables, data ETL and analysis)
+
+In database terms, PyIceberg is the Data Description Language (DDL) for 
database administration and Daft is the Data Manipulation Language (DML) for 
querying data.
+
+## Enabling Iceberg support in Daft
+
+To use Iceberg with Daft, simply ensure that the 
[PyIceberg](https://py.iceberg.apache.org/) library is also installed in your 
current Python environment.
+
+```
+pip install getdaft pyiceberg
+```
+
+## Querying Iceberg using Daft
+
+### Reading PyIceberg tables
+
+Daft interacts natively with [PyIceberg](https://py.iceberg.apache.org/) to 
read from Iceberg.
+
+Simply load a PyIceberg table and pass it into Daft as follows:
+
+``` py
+import daft
+from pyiceberg.catalog import load_catalog
+
+table = load_catalog("my_catalog").load_table("my_tpch_namespace.lineitem")
+df = daft.read_iceberg(table)
+df = df.select("L_SHIPDATE", "L_ORDERKEY", "L_COMMENT")
+df.show()
+```
+
+```
+╭────────────┬────────────┬────────────────────────────────╮
+│ L_SHIPDATE ┆ L_ORDERKEY ┆ L_COMMENT                      │
+│ ---        ┆ ---        ┆ ---                            │
+│ Date       ┆ Int64      ┆ Utf8                           │
+╞════════════╪════════════╪════════════════════════════════╡
+│ 1992-01-02 ┆ 2186280097 ┆ ions sleep about the si        │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1992-01-02 ┆ 175366628  ┆ gular accoun                   │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1992-01-02 ┆ 2186602151 ┆ blithely even                  │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1992-01-02 ┆ 3937663654 ┆ ake boldly among the ideas. s… │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1992-01-02 ┆ 2186781220 ┆ thely. slyly pending ideas ar… │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1992-01-02 ┆ 3937999493 ┆  haggle at the regular, pen    │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1992-01-02 ┆ 2186933061 ┆ ickly. slyly                   │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1992-01-02 ┆ 3938167204 ┆ carefully silent instructions… │
+╰────────────┴────────────┴────────────────────────────────╯
+
+(Showing first 8 rows)
+```
+
+Any subsequent filter operations on the Daft `df` DataFrame object will be 
correctly optimized to take advantage of Iceberg features such as hidden 
partitioning and file-level statistics for efficient reads.

Review Comment:
   ```suggestion
   Any filter operation on the Daft DataFrame `df` will [push down the 
filters](https://iceberg.apache.org/docs/latest/performance/#data-filtering), 
take advantage of [hidden 
partitioning](https://iceberg.apache.org/docs/latest/partitioning/), and 
use [table statistics to inform query 
planning](https://iceberg.apache.org/docs/latest/performance/#scan-planning) 
for efficient reads.
   ```
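   
As a rough illustration of the statistics part: a reader that knows each data 
file's lower and upper bound for a column can skip files that cannot contain a 
match. The sketch below is a toy model in plain Python, not Daft's or Iceberg's 
actual planner; `DataFile`, `files_for_greater_than`, the file names, and the 
min/max values are all made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    """Hypothetical stand-in for a data file entry with column bounds."""
    path: str
    lower: float  # smallest trip_distance stored in the file
    upper: float  # largest trip_distance stored in the file


def files_for_greater_than(files, threshold):
    """Keep only files that could contain a value greater than `threshold`."""
    # A file whose upper bound is <= threshold cannot match, so it is skipped
    # without ever being opened.
    return [f for f in files if f.upper > threshold]


files = [
    DataFile("a.parquet", 0.1, 2.0),
    DataFile("b.parquet", 1.5, 9.8),
    DataFile("c.parquet", 10.2, 30.0),
]

kept = files_for_greater_than(files, 5.0)
print([f.path for f in kept])  # ['b.parquet', 'c.parquet']
```

Only `a.parquet` is skipped here; `b.parquet` still has to be read and filtered 
row by row because its range straddles the threshold.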



##########
docs/docs/daft.md:
##########
@@ -0,0 +1,148 @@
+---
+title: "Daft"
+---
+<!--
+ - Licensed to the Apache Software Foundation (ASF) under one or more
+ - contributor license agreements.  See the NOTICE file distributed with
+ - this work for additional information regarding copyright ownership.
+ - The ASF licenses this file to You under the Apache License, Version 2.0
+ - (the "License"); you may not use this file except in compliance with
+ - the License.  You may obtain a copy of the License at
+ -
+ -   http://www.apache.org/licenses/LICENSE-2.0
+ -
+ - Unless required by applicable law or agreed to in writing, software
+ - distributed under the License is distributed on an "AS IS" BASIS,
+ - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ - See the License for the specific language governing permissions and
+ - limitations under the License.
+ -->
+
+# Daft
+
+[Daft](www.getdaft.io) is a Python/Rust-based distributed query engine with a 
Python DataFrame API.
+
+Iceberg supports reading of Iceberg tables into Daft DataFrames by using the 
Python client library [PyIceberg](https://py.iceberg.apache.org/).
+
+For Python users, Daft is complementary to PyIceberg as a query engine layer:
+
+* **PyIceberg:** catalog/table management tasks (e.g. creation of tables, 
modifying table schemas)
+* **Daft:** querying tables (e.g. previewing tables, data ETL and analysis)
+
+In database terms, PyIceberg is the Data Description Language (DDL) for 
database administration and Daft is the Data Manipulation Language (DML) for 
querying data.

Review Comment:
   Hard disagree with this statement. Iceberg is a spec that supports SQL in 
all categories, PyIceberg has a minimal implementation that supports DDL/DML, 
and theoretically engines will use the API to standardize and not have to 
reinvent the wheel for all categories of SQL statements. 
   
   What were you actually trying to communicate here? Let's think about how 
to word it. I'll make an educated guess, and please feel free to rephrase.
   
   ```suggestion
   Daft's strength is that it provides a DataFrame representation and 
transformation suite, while integrating between [Iceberg 
tables](https://www.getdaft.io/projects/docs/en/latest/api_docs/doc_gen/io_functions/daft.read_iceberg.html)
 and other specialized systems such as 
[Ray](https://www.getdaft.io/projects/docs/en/latest/api_docs/doc_gen/io_functions/daft.from_ray_dataset.html),
 
[Dask](https://www.getdaft.io/projects/docs/en/latest/api_docs/doc_gen/io_functions/daft.from_dask_dataframe.html),
 
[Arrow](https://www.getdaft.io/projects/docs/en/latest/api_docs/doc_gen/io_functions/daft.from_arrow.html),
 
[Parquet](https://www.getdaft.io/projects/docs/en/latest/api_docs/doc_gen/io_functions/daft.read_parquet.html),
 and a few other formats. Combined with Iceberg's ability to manage table-level 
concerns like schema and partition evolution, this makes an ultimate pairing of 
storage and compute technologies to facilitate heavy processing and machine 
learning workloads.
   ```



##########
docs/docs/daft.md:
##########
@@ -0,0 +1,148 @@
+---
+title: "Daft"
+---
+<!--
+ - Licensed to the Apache Software Foundation (ASF) under one or more
+ - contributor license agreements.  See the NOTICE file distributed with
+ - this work for additional information regarding copyright ownership.
+ - The ASF licenses this file to You under the Apache License, Version 2.0
+ - (the "License"); you may not use this file except in compliance with
+ - the License.  You may obtain a copy of the License at
+ -
+ -   http://www.apache.org/licenses/LICENSE-2.0
+ -
+ - Unless required by applicable law or agreed to in writing, software
+ - distributed under the License is distributed on an "AS IS" BASIS,
+ - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ - See the License for the specific language governing permissions and
+ - limitations under the License.
+ -->
+
+# Daft
+
+[Daft](www.getdaft.io) is a Python/Rust-based distributed query engine with a 
Python DataFrame API.
+
+Iceberg supports reading of Iceberg tables into Daft DataFrames by using the 
Python client library [PyIceberg](https://py.iceberg.apache.org/).
+
+For Python users, Daft is complementary to PyIceberg as a query engine layer:
+
+* **PyIceberg:** catalog/table management tasks (e.g. creation of tables, 
modifying table schemas)
+* **Daft:** querying tables (e.g. previewing tables, data ETL and analysis)
+
+In database terms, PyIceberg is the Data Description Language (DDL) for 
database administration and Daft is the Data Manipulation Language (DML) for 
querying data.
+
+## Enabling Iceberg support in Daft
+
+To use Iceberg with Daft, simply ensure that the 
[PyIceberg](https://py.iceberg.apache.org/) library is also installed in your 
current Python environment.
+
+```
+pip install getdaft pyiceberg
+```
+
+## Querying Iceberg using Daft
+
+### Reading PyIceberg tables
+
+Daft interacts natively with [PyIceberg](https://py.iceberg.apache.org/) to 
read from Iceberg.
+
+Simply load a PyIceberg table and pass it into Daft as follows:
+
+``` py
+import daft
+from pyiceberg.catalog import load_catalog
+
+table = load_catalog("my_catalog").load_table("my_tpch_namespace.lineitem")
+df = daft.read_iceberg(table)
+df = df.select("L_SHIPDATE", "L_ORDERKEY", "L_COMMENT")
+df.show()
+```
+
+```
+╭────────────┬────────────┬────────────────────────────────╮
+│ L_SHIPDATE ┆ L_ORDERKEY ┆ L_COMMENT                      │
+│ ---        ┆ ---        ┆ ---                            │
+│ Date       ┆ Int64      ┆ Utf8                           │
+╞════════════╪════════════╪════════════════════════════════╡
+│ 1992-01-02 ┆ 2186280097 ┆ ions sleep about the si        │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1992-01-02 ┆ 175366628  ┆ gular accoun                   │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1992-01-02 ┆ 2186602151 ┆ blithely even                  │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1992-01-02 ┆ 3937663654 ┆ ake boldly among the ideas. s… │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1992-01-02 ┆ 2186781220 ┆ thely. slyly pending ideas ar… │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1992-01-02 ┆ 3937999493 ┆  haggle at the regular, pen    │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1992-01-02 ┆ 2186933061 ┆ ickly. slyly                   │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1992-01-02 ┆ 3938167204 ┆ carefully silent instructions… │
+╰────────────┴────────────┴────────────────────────────────╯
+
+(Showing first 8 rows)
+```

Review Comment:
   Regenerate this using the Spark demo: 
https://iceberg.apache.org/spark-quickstart/



##########
docs/docs/daft.md:
##########
@@ -0,0 +1,148 @@
+---
+title: "Daft"
+---
+<!--
+ - Licensed to the Apache Software Foundation (ASF) under one or more
+ - contributor license agreements.  See the NOTICE file distributed with
+ - this work for additional information regarding copyright ownership.
+ - The ASF licenses this file to You under the Apache License, Version 2.0
+ - (the "License"); you may not use this file except in compliance with
+ - the License.  You may obtain a copy of the License at
+ -
+ -   http://www.apache.org/licenses/LICENSE-2.0
+ -
+ - Unless required by applicable law or agreed to in writing, software
+ - distributed under the License is distributed on an "AS IS" BASIS,
+ - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ - See the License for the specific language governing permissions and
+ - limitations under the License.
+ -->
+
+# Daft
+
+[Daft](www.getdaft.io) is a Python/Rust-based distributed query engine with a 
Python DataFrame API.
+
+Iceberg supports reading of Iceberg tables into Daft DataFrames by using the 
Python client library [PyIceberg](https://py.iceberg.apache.org/).
+
+For Python users, Daft is complementary to PyIceberg as a query engine layer:
+
+* **PyIceberg:** catalog/table management tasks (e.g. creation of tables, 
modifying table schemas)
+* **Daft:** querying tables (e.g. previewing tables, data ETL and analysis)
+
+In database terms, PyIceberg is the Data Description Language (DDL) for 
database administration and Daft is the Data Manipulation Language (DML) for 
querying data.
+
+## Enabling Iceberg support in Daft
+
+To use Iceberg with Daft, simply ensure that the 
[PyIceberg](https://py.iceberg.apache.org/) library is also installed in your 
current Python environment.
+
+```
+pip install getdaft pyiceberg
+```
+
+## Querying Iceberg using Daft
+
+### Reading PyIceberg tables
+
+Daft interacts natively with [PyIceberg](https://py.iceberg.apache.org/) to 
read from Iceberg.
+
+Simply load a PyIceberg table and pass it into Daft as follows:
+
+``` py
+import daft
+from pyiceberg.catalog import load_catalog
+
+table = load_catalog("my_catalog").load_table("my_tpch_namespace.lineitem")
+df = daft.read_iceberg(table)
+df = df.select("L_SHIPDATE", "L_ORDERKEY", "L_COMMENT")
+df.show()
+```
+
+```
+╭────────────┬────────────┬────────────────────────────────╮
+│ L_SHIPDATE ┆ L_ORDERKEY ┆ L_COMMENT                      │
+│ ---        ┆ ---        ┆ ---                            │
+│ Date       ┆ Int64      ┆ Utf8                           │
+╞════════════╪════════════╪════════════════════════════════╡
+│ 1992-01-02 ┆ 2186280097 ┆ ions sleep about the si        │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1992-01-02 ┆ 175366628  ┆ gular accoun                   │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1992-01-02 ┆ 2186602151 ┆ blithely even                  │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1992-01-02 ┆ 3937663654 ┆ ake boldly among the ideas. s… │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1992-01-02 ┆ 2186781220 ┆ thely. slyly pending ideas ar… │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1992-01-02 ┆ 3937999493 ┆  haggle at the regular, pen    │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1992-01-02 ┆ 2186933061 ┆ ickly. slyly                   │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1992-01-02 ┆ 3938167204 ┆ carefully silent instructions… │
+╰────────────┴────────────┴────────────────────────────────╯
+
+(Showing first 8 rows)
+```
+
+Any subsequent filter operations on the Daft `df` DataFrame object will be 
correctly optimized to take advantage of Iceberg features such as hidden 
partitioning and file-level statistics for efficient reads.
+
+``` py
+import datetime
+
+# Filter which takes advantage of partition pruning capabilities of Iceberg
+df = df.where(df["L_SHIPDATE"] > datetime.date(1993, 1, 1))
+df.show()
+```
+
+```
+╭────────────┬────────────┬────────────────────────────────╮
+│ L_SHIPDATE ┆ L_ORDERKEY ┆ L_COMMENT                      │
+│ ---        ┆ ---        ┆ ---                            │
+│ Date       ┆ Int64      ┆ Utf8                           │
+╞════════════╪════════════╪════════════════════════════════╡
+│ 1993-01-02 ┆ 5695313125 ┆  slyly special p               │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1993-01-02 ┆ 2701326853 ┆ ironic instru                  │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1993-01-02 ┆ 5695313766 ┆ ly according                   │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1993-01-02 ┆ 2701330720 ┆ y alongside of the blithely    │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1993-01-02 ┆ 5695315200 ┆ ckly final foxes haggle car    │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1993-01-02 ┆ 2701331524 ┆ ns doze slyly pending instruc… │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1993-01-02 ┆ 5695317377 ┆ re about the ironic, silen     │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1993-01-02 ┆ 2701342819 ┆ fully even pinto beans wa      │
+╰────────────┴────────────┴────────────────────────────────╯
+
+(Showing first 8 rows)
+```
+
+### Type compatibility
+
+Daft and Iceberg have compatible type systems. Here are how types are 
converted across the two systems.
+
+When reading from an Iceberg source into Daft:

Review Comment:
   Would this be different in the other direction? Maybe just delete this line?



##########
docs/docs/daft.md:
##########
@@ -0,0 +1,148 @@
+---
+title: "Daft"
+---
+<!--
+ - Licensed to the Apache Software Foundation (ASF) under one or more
+ - contributor license agreements.  See the NOTICE file distributed with
+ - this work for additional information regarding copyright ownership.
+ - The ASF licenses this file to You under the Apache License, Version 2.0
+ - (the "License"); you may not use this file except in compliance with
+ - the License.  You may obtain a copy of the License at
+ -
+ -   http://www.apache.org/licenses/LICENSE-2.0
+ -
+ - Unless required by applicable law or agreed to in writing, software
+ - distributed under the License is distributed on an "AS IS" BASIS,
+ - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ - See the License for the specific language governing permissions and
+ - limitations under the License.
+ -->
+
+# Daft
+
+[Daft](www.getdaft.io) is a Python/Rust-based distributed query engine with a 
Python DataFrame API.
+
+Iceberg supports reading of Iceberg tables into Daft DataFrames by using the 
Python client library [PyIceberg](https://py.iceberg.apache.org/).
+
+For Python users, Daft is complementary to PyIceberg as a query engine layer:
+
+* **PyIceberg:** catalog/table management tasks (e.g. creation of tables, 
modifying table schemas)
+* **Daft:** querying tables (e.g. previewing tables, data ETL and analysis)
+
+In database terms, PyIceberg is the Data Description Language (DDL) for 
database administration and Daft is the Data Manipulation Language (DML) for 
querying data.
+
+## Enabling Iceberg support in Daft
+
+To use Iceberg with Daft, simply ensure that the 
[PyIceberg](https://py.iceberg.apache.org/) library is also installed in your 
current Python environment.
+
+```
+pip install getdaft pyiceberg
+```
+
+## Querying Iceberg using Daft
+
+### Reading PyIceberg tables
+
+Daft interacts natively with [PyIceberg](https://py.iceberg.apache.org/) to 
read from Iceberg.
+
+Simply load a PyIceberg table and pass it into Daft as follows:
+
+``` py
+import daft
+from pyiceberg.catalog import load_catalog
+
+table = load_catalog("my_catalog").load_table("my_tpch_namespace.lineitem")
+df = daft.read_iceberg(table)
+df = df.select("L_SHIPDATE", "L_ORDERKEY", "L_COMMENT")
+df.show()
+```
+
+```
+╭────────────┬────────────┬────────────────────────────────╮
+│ L_SHIPDATE ┆ L_ORDERKEY ┆ L_COMMENT                      │
+│ ---        ┆ ---        ┆ ---                            │
+│ Date       ┆ Int64      ┆ Utf8                           │
+╞════════════╪════════════╪════════════════════════════════╡
+│ 1992-01-02 ┆ 2186280097 ┆ ions sleep about the si        │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1992-01-02 ┆ 175366628  ┆ gular accoun                   │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1992-01-02 ┆ 2186602151 ┆ blithely even                  │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1992-01-02 ┆ 3937663654 ┆ ake boldly among the ideas. s… │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1992-01-02 ┆ 2186781220 ┆ thely. slyly pending ideas ar… │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1992-01-02 ┆ 3937999493 ┆  haggle at the regular, pen    │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1992-01-02 ┆ 2186933061 ┆ ickly. slyly                   │
+├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+│ 1992-01-02 ┆ 3938167204 ┆ carefully silent instructions… │
+╰────────────┴────────────┴────────────────────────────────╯
+
+(Showing first 8 rows)
+```
+
+Any subsequent filter operations on the Daft `df` DataFrame object will be 
correctly optimized to take advantage of Iceberg features such as hidden 
partitioning and file-level statistics for efficient reads.
+
+``` py
+import datetime
+
+# Filter which takes advantage of partition pruning capabilities of Iceberg
+df = df.where(df["L_SHIPDATE"] > datetime.date(1993, 1, 1))
+df.show()

Review Comment:
   Try to update this with a date field. I was trying to materialize the 
snapshot table and had just found the `from_pylist` method, but I'm not sure 
it will work with a custom object since it takes a list of dictionaries. 
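   
If it helps when picking a date field, here is a sketch of why a date filter 
lines up with hidden partitioning. It assumes a hypothetical table partitioned 
with Iceberg's `day` transform, which maps a date to the number of days since 
1970-01-01. This is plain Python for illustration only: `day_transform` and 
`partitions_after` are made-up helpers, not Daft or PyIceberg APIs, and the 
partition dates are invented.

```python
import datetime

EPOCH = datetime.date(1970, 1, 1)


def day_transform(d: datetime.date) -> int:
    """Days since 1970-01-01, mirroring the idea of a `day` partition value."""
    return (d - EPOCH).days


def partitions_after(partition_days, cutoff):
    """Return the day partitions that can hold rows with date > cutoff."""
    cutoff_day = day_transform(cutoff)
    # For a date column, rows strictly after the cutoff can only live in
    # partitions whose day value is strictly greater than the cutoff's.
    return [p for p in partition_days if p > cutoff_day]


# Hypothetical partitions present in the table.
partition_dates = [
    datetime.date(1992, 12, 31),
    datetime.date(1993, 1, 1),
    datetime.date(1993, 1, 2),
]
partition_days = [day_transform(d) for d in partition_dates]

kept = partitions_after(partition_days, datetime.date(1993, 1, 1))
print(len(kept))  # 1
```

The user only writes the predicate on the date column; the mapping to partition 
values happens behind the scenes, which is the point of hidden partitioning.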



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

