[I] [Python] Allow parsing more general JSON formats [arrow]

via GitHub Mon, 01 Dec 2025 06:37:32 -0800


asfimport opened a new issue, #22011:
URL: https://github.com/apache/arrow/issues/22011


   I have JSON data where the columnar (line-delimited) part is in a `data` 
subkey:
   ```json
   {
     "metadata": {"name": "block1"},
     "data" : [
       {"a": 1, "b": 2.0, "c": "foo", "d": false},
       {"a": 4, "b": -5.5, "c": null, "d": true}
     ]
   }
   ```
    
   
    
   
   It would be good if the arrow JSON parser could allow specifying where the 
columnar data is stored.
   
   Since the `metadata` is also important to me, it would be even better if the 
rest of the JSON could be returned as a Python dict, with only the specified 
keys parsed as Arrow tables, e.g.
   
    
   ```python
   >>> block1 = json.read_json(fn, tables=['data'])
   >>> block1['data']
   pyarrow.Table
   a: int64
   b: double
   c: string
   d: bool
   >>> block1['metadata']
   {'name': 'block1'}
   >>> block1
   {
     "metadata": {"name": "block1"},
     "data" : pyarrow.Table
   }
   ```
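
Until something like this exists in the Arrow reader, the requested behaviour can be approximated in pure Python. The following is only a sketch under stated assumptions: `read_json_with_tables` and its `to_table` hook are hypothetical names and not part of pyarrow, and the default conversion relies on `pyarrow.Table.from_pylist` (available in pyarrow 7.0 and later).

```python
import json


def read_json_with_tables(path, tables, to_table=None):
    """Load a JSON document and convert selected top-level keys to tables.

    Hypothetical helper, not a pyarrow API. `to_table` is any callable that
    takes a list of row dicts; it defaults to pyarrow.Table.from_pylist.
    """
    if to_table is None:
        import pyarrow as pa  # imported lazily so a custom hook can replace it

        to_table = pa.Table.from_pylist

    with open(path, encoding="utf-8") as f:
        doc = json.load(f)

    # Shallow copy: keys not listed in `tables` (e.g. "metadata") stay plain
    # Python objects, while the listed keys are replaced by tables.
    result = dict(doc)
    for key in tables:
        result[key] = to_table(doc[key])
    return result
```

With pyarrow installed, `read_json_with_tables(fn, tables=["data"])` would return a dict whose `"data"` entry is a `pyarrow.Table` and whose `"metadata"` entry is the original dict. Note that this parses the whole document with the standard `json` module first, so it does not benefit from Arrow's native JSON reader.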
    
   
    
   
   **Reporter**: [Dave 
Hirschfeld](https://issues.apache.org/jira/browse/ARROW-5568) / @dhirschfeld
   
   <sub>**Note**: *This issue was originally created as 
[ARROW-5568](https://issues.apache.org/jira/browse/ARROW-5568). Please see the 
[migration documentation](https://github.com/apache/arrow/issues/14542) for 
further details.*</sub>


