Fokko opened a new issue, #1711:
URL: https://github.com/apache/iceberg-python/issues/1711
### Apache Iceberg version
None
### Please describe the bug 🐞
Joining a PyArrow table on a key column fails when a non-key column has a list type. See:
```
➜ iceberg-python git:(fd-align-codestyle) ipython
Python 3.10.14 (main, Mar 19 2024, 21:46:16) [Clang 15.0.0 (clang-1500.3.9.4)]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.31.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import pyarrow as pa
   ...:
   ...: arrow_schema = pa.schema(
   ...:     [
   ...:         pa.field("city", pa.string(), nullable=False),
   ...:         pa.field("tags", pa.list_(pa.string()), nullable=False),
   ...:     ]
   ...: )
   ...:
   ...: # Write some data
   ...: df = pa.Table.from_pylist(
   ...:     [
   ...:         {"city": "Amsterdam", "tags": ["Europe", "Capital"]},
   ...:         {"city": "San Francisco", "tags": ["Amsterdam", "Golden Gate"]},
   ...:     ],
   ...:     schema=arrow_schema,
   ...: )
   ...: joined = df.join(df, "city", join_type="inner")
---------------------------------------------------------------------------
ArrowInvalid                              Traceback (most recent call last)
Cell In[1], line 18
     10 # Write some data
     11 df = pa.Table.from_pylist(
     12     [
     13         {"city": "Amsterdam", "tags": ["Europe", "Capital"]},
   (...)
     16     schema=arrow_schema,
     17 )
---> 18 joined = df.join(df, "city", join_type="inner")
File /opt/homebrew/lib/python3.10/site-packages/pyarrow/table.pxi:5704, in pyarrow.lib.Table.join()
File /opt/homebrew/lib/python3.10/site-packages/pyarrow/acero.py:249, in _perform_join(join_type, left_operand, left_keys, right_operand, right_keys, left_suffix, right_suffix, use_threads, coalesce_keys, output_type)
    244 projection = Declaration(
    245     "project", ProjectNodeOptions(projections, projected_col_names)
    246 )
    247 decl = Declaration.from_sequence([decl, projection])
--> 249 result_table = decl.to_table(use_threads=use_threads)
    251 if output_type == Table:
    252     return result_table
File /opt/homebrew/lib/python3.10/site-packages/pyarrow/_acero.pyx:590, in pyarrow._acero.Declaration.to_table()
File /opt/homebrew/lib/python3.10/site-packages/pyarrow/error.pxi:155, in pyarrow.lib.pyarrow_internal_check_status()
File /opt/homebrew/lib/python3.10/site-packages/pyarrow/error.pxi:92, in pyarrow.lib.check_status()
ArrowInvalid: Data type list<item: string> is not supported in join non-key field tags
```
### Willingness to contribute
- [ ] I can contribute a fix for this bug independently
- [ ] I would be willing to contribute a fix for this bug with guidance from the Iceberg community
- [ ] I cannot contribute a fix for this bug at this time
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]