cfrancois7 opened a new issue, #953: URL: https://github.com/apache/iceberg-python/issues/953
### Question I'm looking for a tutorial to make a query on one subfield of a struct field. I scrolled all internet but failed to find a way to do it simply with pyiceberg. To make it concret, for instance how to get the row with `"employment.status = 'Employed'"` : ```python [{'id': 1, 'name': 'Alice', 'age': 28, 'address': {'street': '123 Maple St', 'city': 'Springfield', 'postal_code': '12345'}, 'contact': {'email': 'al...@example.com', 'phone': '555-1234'}, 'employment': {'status': 'Employed', 'position': 'Software Engineer', 'company': {'name': 'Tech Corp', 'location': 'Silicon Valley'}}, 'preferences': {'newsletter': True, 'notifications': {'email': True, 'sms': False}}}, {'id': 2, 'name': 'Bob', 'age': 35, 'address': {'street': '456 Oak St', 'city': 'Metropolis', 'postal_code': '67890'}, 'contact': {'email': 'b...@example.com', 'phone': '555-5678'}, 'employment': {'status': 'Self-employed', 'position': 'Consultant', 'company': {'name': 'Freelance', 'location': 'Remote'}}, 'preferences': {'newsletter': False, 'notifications': {'email': True, 'sms': True}}}] ``` With the following schema: ``` import pyarrow as pa schema = pa.schema([ ('id', pa.int32()), ('name', pa.string()), ('age', pa.int32()), ('address', pa.struct([ ('street', pa.string()), ('city', pa.string()), ('postal_code', pa.string()) ])), ('contact', pa.struct([ ('email', pa.string()), ('phone', pa.string()) ])), ('employment', pa.struct([ pa.field('status', pa.string(), nullable=True), pa.field('position', pa.string(), nullable=True), pa.field('company', pa.struct([ ('name', pa.string()), ('location', pa.string()) ]), nullable=True) ])), ('preferences', pa.struct([ ('newsletter', pa.bool_()), ('notifications', pa.struct([ ('email', pa.bool_()), ('sms', pa.bool_()) ])) ])) ]) ``` I tried this kind of query, but without success: ```python row_filter = "employment.status = 'Employed'" table.scan( row_filter=row_filter, selected_fields=["age", "employment", 'contact.email'] ).to_pandas() ``` The command raises the error: ```Python ValueError: Could not find field with name status, case_sensitive=True ``` The backend is supported by SQLite. versions: ```bash $ pip list | grep 'iceberg\|arrow\|sqlite' arrow 1.3.0 pyarrow 15.0.2 pyiceberg 0.6.1 ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org