cottrell opened a new issue, #47897:
URL: https://github.com/apache/arrow/issues/47897

   ### Problem
   
   When converting CSV data today, pyarrow users must either list every column name in `column_types` or construct a full schema up front to force all fields to a given type. This makes simple workflows such as "read everything as string" clumsy when the schema is not known ahead of time.
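   For context, this is roughly what the workaround looks like today: peek at the header row to discover the column names, then enumerate them all in `column_types`. The file name `data.csv` is just a placeholder.
   
   ```python
   import csv
   
   import pyarrow as pa
   import pyarrow.csv as pacsv
   
   path = "data.csv"  # placeholder input file
   
   # Read just the header row to discover the column names up front.
   with open(path, newline="") as f:
       header = next(csv.reader(f))
   
   # Force every discovered column to string by listing it explicitly.
   convert_options = pacsv.ConvertOptions(
       column_types={name: pa.string() for name in header}
   )
   table = pacsv.read_csv(path, convert_options=convert_options)
   ```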
   
   ### Proposed change
   
   Expose a single default column type on `arrow::csv::ConvertOptions` and plumb it through the bindings so callers can write `ConvertOptions(column_type=pa.string())`. The option should apply to any column not listed explicitly in `column_types`, including columns added via `include_missing_columns`; see the sketch below.
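   A rough sketch of the intended Python usage. Note that `column_type` does not exist in released pyarrow; its name and semantics are only what this issue proposes, and the column names below are made up for illustration.
   
   ```python
   import pyarrow as pa
   import pyarrow.csv as pacsv
   
   # Proposed: any column not listed in column_types falls back to the
   # default column_type instead of being type-inferred.
   convert_options = pacsv.ConvertOptions(
       column_type=pa.string(),                  # proposed default for unlisted columns
       column_types={"id": pa.int64()},          # explicit entries still take precedence
       include_columns=["id", "name", "notes"],  # made-up column names
       include_missing_columns=True,             # missing columns would also become string
   )
   table = pacsv.read_csv("data.csv", convert_options=convert_options)
   ```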
   
   ### Implementation status
   
   A local branch adds `ConvertOptions::column_type`, wires it through the C++ 
reader, exposes it in `pyarrow.csv`, updates the docs, and adds unit tests 
covering the new behavior.
   
   ### Next steps
   
   Raise a PR with the implementation and tests once this ticket is accepted.
   

