malduarte opened a new issue, #17548: URL: https://github.com/apache/datafusion/issues/17548
### Is your feature request related to a problem or challenge?

In Apache Spark it is possible to define the expected timestamp format. See `timestampFormat` and `timestampNTZFormat` in the [CSV options](https://spark.apache.org/docs/latest/sql-data-sources-csv.html). This works for both user-supplied and inferred schemas.

In DataFusion, similar functionality is only available when parsing CSV via [SQL DDL options](https://datafusion.apache.org/user-guide/sql/format_options.html). Outside of SQL DDL it is currently not possible, even for user-supplied schemas. Parsing a CSV that has timestamps in a non-standard format results in an error.

Example: the CSV contains a column `created_at` with the value `2025-06-23-05.07.34.214000`.

Schema definition for the timestamp column:

```
Field::new("created_at", DataType::Timestamp(TimeUnit::Microsecond, None), true),
```

Attempting to parse the CSV file fails with:

```
Error: ArrowError(ParseError("Error parsing column 11 at line 1: Parser error: Error parsing timestamp from '2025-06-23-05.07.34.214000': invalid timestamp separator"), None)
```

### Describe the solution you'd like

Users should be able to supply a [custom timestamp format](https://docs.rs/chrono/0.4.42/chrono/format/strftime/index.html#specifiers).

### Describe alternatives you've considered

Extend `CsvReadOptions` and allow users to supply a custom timestamp format. This would be similar to the Apache Spark approach. The format would be used to parse timestamps both during schema inference and with user-supplied schemas.

Optionally, also extend `DataType::Timestamp` to include a user-defined timestamp format. This would be more flexible, as it would allow per-column timestamp formats.

### Additional context

- Currently the only workaround is to declare non-standard timestamp columns as strings and convert them afterwards with extra code.
- It is possible to supply a timestamp format already when parsing with [SQL DDL](https://datafusion.apache.org/user-guide/sql/format_options.html)
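
To illustrate the "extra code" the string workaround needs, here is a minimal, standard-library-only sketch that converts a value such as `2025-06-23-05.07.34.214000` into microseconds since the Unix epoch, which is the representation behind `DataType::Timestamp(TimeUnit::Microsecond, None)`. The function names are hypothetical and not part of DataFusion or Arrow. The sketch assumes a fixed 26-character layout with exactly six fractional digits and treats the value as having no time zone. In a real project a date-time crate such as chrono would probably be a better fit than hand-rolled parsing.

```rust
/// Days since 1970-01-01 for a proleptic Gregorian date
/// (Howard Hinnant's `days_from_civil` algorithm).
fn days_from_civil(year: i64, month: i64, day: i64) -> i64 {
    // Treat January and February as the end of the previous year.
    let y = if month <= 2 { year - 1 } else { year };
    let era = y.div_euclid(400);
    let yoe = y.rem_euclid(400); // year of era, 0..=399
    let mp = (month + 9) % 12; // month index with March = 0
    let doy = (153 * mp + 2) / 5 + day - 1; // day of (shifted) year
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; // day of era
    era * 146_097 + doe - 719_468
}

/// Hypothetical helper: parses `YYYY-MM-DD-HH.MM.SS.ffffff` into
/// microseconds since the Unix epoch. Returns `None` on any mismatch.
/// Assumption: the day is only range-checked (1..=31), not validated
/// against the month length.
pub fn parse_dashed_timestamp_micros(s: &str) -> Option<i64> {
    let b = s.as_bytes();
    if b.len() != 26 || !s.is_ascii() {
        return None;
    }
    // Separator positions in the fixed-width layout.
    let seps: [(usize, u8); 6] = [
        (4, b'-'), (7, b'-'), (10, b'-'),
        (13, b'.'), (16, b'.'), (19, b'.'),
    ];
    if seps.iter().any(|&(i, c)| b[i] != c) {
        return None;
    }
    // Parse a digits-only slice; rejects signs that `str::parse` would accept.
    let num = |range: std::ops::Range<usize>| -> Option<i64> {
        let part = &s[range];
        if part.bytes().all(|c| c.is_ascii_digit()) {
            part.parse().ok()
        } else {
            None
        }
    };
    let (year, month, day) = (num(0..4)?, num(5..7)?, num(8..10)?);
    let (hour, minute, second) = (num(11..13)?, num(14..16)?, num(17..19)?);
    let micros = num(20..26)?;
    if !(1..=12).contains(&month)
        || !(1..=31).contains(&day)
        || hour > 23
        || minute > 59
        || second > 59
    {
        return None;
    }
    let secs = days_from_civil(year, month, day) * 86_400
        + hour * 3_600
        + minute * 60
        + second;
    Some(secs * 1_000_000 + micros)
}
```

Calling `parse_dashed_timestamp_micros("2025-06-23-05.07.34.214000")` yields `Some(1_750_655_254_214_000)`. Every non-standard timestamp column would need a conversion like this applied after the CSV is read, which is the overhead a configurable timestamp format would remove.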
