rtyler opened a new issue, #9657: URL: https://github.com/apache/arrow-rs/issues/9657

**Describe the bug**

arrow-csv writes `\` characters from `Utf8` columns as a bare `\` in its output, which lousier CSV parsers, like those written in C/C++, interpret as a string escape sequence, corrupting the output stream.

**To Reproduce**

**Expected behavior**

Arguably those bad CSV parsers should be less bad, but IMHO it's a safe operation to convert `\` to `\\` in the output stream out of an abundance of caution.

**Additional context**

```patch
From 2a7615200965a68c4808efe021b0414e6e155135 Mon Sep 17 00:00:00 2001
From: "R. Tyler Croy" <[email protected]>
Date: Thu, 2 Apr 2026 18:24:19 +0000
Subject: [PATCH] chore: properly escape forward slashes in CSV output of strings

Signed-off-by: R. Tyler Croy <[email protected]>
---
 arrow-csv/src/writer.rs | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/arrow-csv/src/writer.rs b/arrow-csv/src/writer.rs
index c38d1cdec33..8c7f50b3ca8 100644
--- a/arrow-csv/src/writer.rs
+++ b/arrow-csv/src/writer.rs
@@ -293,6 +293,13 @@ impl<W: Write> Writer<W> {
                 ))
             })?;
 
+            let data_type = batch.schema().field(col_idx).data_type().clone();
+
+            if data_type == DataType::Utf8 || data_type == DataType::LargeUtf8 {
+                // This is fine
+                buffer = str::replace(&buffer, "\\", "\\\\");
+            }
+
             let field_bytes =
                 self.get_trimmed_field_bytes(&buffer, batch.column(col_idx).data_type());
             byte_record.push_field(field_bytes);
@@ -1358,4 +1365,28 @@ sed do eiusmod tempor,-556132.25,1,,2019-04-18T02:45:55.555,23:46:03,foo
             write_quote_style_with_null(&batch, QuoteStyle::Always, "NULL")
         );
     }
+
+    #[test]
+    fn test_write_with_forward_slashes() {
+        let schema = Schema::new(vec![
+            Field::new("text", DataType::Utf8, true),
+            Field::new("number", DataType::Int32, true),
+        ]);
+
+        let text = StringArray::from(vec![Some(r"\"), None, Some("world")]);
+        let number = Int32Array::from(vec![Some(1), Some(2), None]);
+
+        let batch =
+            RecordBatch::try_new(Arc::new(schema), vec![Arc::new(text), Arc::new(number)]).unwrap();
+
+        // Test with QuoteStyle::Always
+        assert_eq!(
+            r#""text","number"
+"\\","1"
+"","2"
+"world",""
+"#,
+            write_quote_style(&batch, QuoteStyle::Always)
+        );
+    }
 }
-- 
2.43.0
```

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
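
For readers who want to try the transformation outside of arrow-csv, the conversion of `\` to `\\` described under "Expected behavior" can be sketched with the standard library alone. The helper name `escape_backslashes` is hypothetical and is not part of arrow-csv; this is a minimal illustration of the string replacement, not the writer's actual code path.

```rust
/// Hypothetical helper (not an arrow-csv API): doubles every backslash
/// in a field so that escape-interpreting parsers read it as a literal `\`.
fn escape_backslashes(field: &str) -> String {
    field.replace('\\', "\\\\")
}

fn main() {
    // A lone backslash, a path-like value, and a value with nothing to escape.
    for raw in [r"\", r"C:\temp\new", "world"] {
        println!("{} -> {}", raw, escape_backslashes(raw));
    }
}
```

Values without a backslash pass through unchanged. One trade-off to keep in mind: RFC 4180 defines no backslash escape, so a reader that follows it strictly would see the doubled backslashes as two literal characters.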
