andrewthad opened a new issue, #47118:
URL: https://github.com/apache/arrow/issues/47118
### Describe the usage question you have. Please include as many useful details as possible.
I'm trying to use pyarrow to query arrow files that have a column named
`tags` of type `list<string>`. The most common query is "does the list contain
this string?" but I'm not able to figure out how to write this query. As an
example:
```
>>> import pyarrow as p
>>> import pyarrow.compute as pc
>>> table = p.Table.from_arrays(
...     [p.array([["foo"], [], ["foo", "bar"], ["bar"], ["bar"]], p.list_(p.string()))],
...     names=['tags'])
>>> table
pyarrow.Table
tags: list<item: string>
child 0, item: string
----
tags: [[["foo"],[],["foo","bar"],["bar"],["bar"]]]
```
From here, I would expect to be able to do something like
`table.filter(pc.list_has_element(pc.field('tags'), 'foo'))`, but there is no
function `list_has_element`, and there doesn't seem to be anything like it. I'm
wondering if there is something that I'm missing, some part of the API that is
used to lift other compute functions over lists.
Related Issues:
I see in https://github.com/apache/arrow/issues/45167 that it's not
currently possible to check that lists are equal, so maybe lists are a less
developed part of the library.
### Component(s)
Python