0x26res opened a new issue, #44949:
URL: https://github.com/apache/arrow/issues/44949

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   I mistakenly put a duplicate field in my schema when calling 
`pyarrow.json.read_json` with `ParseOptions.explicit_schema`.
   
   I was getting either segfaults or an unclear error message. 
   
   Given the nature of JSON, the schema shouldn't contain duplicate fields. 
`pyarrow.json.ParseOptions` should raise an error if there's any duplicate field in the 
`explicit_schema`.
   
   I can send an MR. Is there anywhere else where we do 
similar checks?
   
   ```python
   import io
   import pyarrow as pa
   import pyarrow.json
   
   # The field "foo" is declared twice by mistake.
   SCHEMA = pa.schema(
       [
           pa.field("foo", pa.bool_()),
           pa.field("foo", pa.bool_()),
       ]
   )
   
   
   with io.BytesIO(b'{"foo": true,"other": "bar"}') as buffer:
       buffer.seek(0)
       pyarrow.json.read_json(
           buffer,
           parse_options=pyarrow.json.ParseOptions(explicit_schema=SCHEMA),
       )
   
   ```
   
   ```
       pyarrow.json.read_json(
     File "pyarrow/_json.pyx", line 308, in pyarrow._json.read_json
     File "pyarrow/error.pxi", line 155, in pyarrow.lib.pyarrow_internal_check_status
     File "pyarrow/error.pxi", line 92, in pyarrow.lib.check_status
   pyarrow.lib.ArrowInvalid: Failed to convert JSON to bool from dictionary<values=string, indices=int32, ordered=0>
   
   ```
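
Until a check like this exists in `ParseOptions`, a caller-side guard might look like the minimal sketch below. The helper names `find_duplicate_names` and `check_unique_field_names` are hypothetical and not part of pyarrow. The sketch works on a plain list of names so that it runs without pyarrow installed; it assumes the names would come from `SCHEMA.names`.

```python
from collections import Counter


def find_duplicate_names(names):
    """Return the names that occur more than once, in first-seen order."""
    counts = Counter(names)
    return [name for name in counts if counts[name] > 1]


def check_unique_field_names(names):
    """Raise ValueError if any field name is repeated (hypothetical helper)."""
    duplicates = find_duplicate_names(names)
    if duplicates:
        raise ValueError(
            f"Duplicate field names in explicit_schema: {duplicates}"
        )


# With the schema from the reproduction above, this would be called as
# check_unique_field_names(SCHEMA.names) before building ParseOptions.
check_unique_field_names(["foo", "other"])  # unique names pass silently
```

Calling `check_unique_field_names(["foo", "foo"])` raises `ValueError`, which is the kind of clear, early failure I would expect instead of the conversion error above.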
   
   ### Component(s)
   
   Python

