kosiew commented on code in PR #20268:
URL: https://github.com/apache/datafusion/pull/20268#discussion_r3153507212


##########
datafusion/spark/src/function/string/format_string.rs:
##########
@@ -2332,6 +2381,24 @@ impl FloatBits for f64 {
     }
 }
 
+/// Inserts thousands separators (`,`) into the integer part of a numeric 
string.
+/// For example, `"1234567.89"` becomes `"1,234,567.89"`.
+fn insert_thousands_separator(number: &str) -> String {

Review Comment:
   `insert_thousands_separator` is a really nice addition. It feels like 
something we could reuse more broadly. I noticed `format_unsigned` still has 
its own inline grouping logic for decimal integers. Maybe that call site could 
switch over to this helper as well? It would reduce duplication and keep 
grouping behavior consistent across numeric formatters.



##########
datafusion/spark/src/function/string/format_string.rs:
##########
@@ -2385,4 +2454,225 @@ mod tests {
 
         Ok(())
     }
+
+    #[test]
+    fn test_insert_thousands_separator() {
+        assert_eq!(insert_thousands_separator("1234567.89"), "1,234,567.89");
+        assert_eq!(insert_thousands_separator("123.45"), "123.45");
+        assert_eq!(insert_thousands_separator("1234"), "1,234");
+        assert_eq!(insert_thousands_separator("12"), "12");
+        assert_eq!(insert_thousands_separator("0.5"), "0.5");
+        assert_eq!(
+            insert_thousands_separator("1234567890.1234"),
+            "1,234,567,890.1234"
+        );
+        assert_eq!(insert_thousands_separator("1000"), "1,000");
+        assert_eq!(insert_thousands_separator("100"), "100");
+    }
+
+    #[test]
+    fn test_grouping_separator_float() -> Result<()> {

Review Comment:
   Would it be worth adding a couple more coverage cases around width and sign 
interactions with the new flag? Something like `%0,10.2f` and `%(,10.2f`, plus 
the decimal equivalents. The main grouping and scientific cases are well 
covered, but padding and sign ordering are often where formatter behavior can 
get a bit tricky.



##########
datafusion/spark/src/function/string/format_string.rs:
##########
@@ -1690,6 +1690,23 @@ impl ConversionSpecifier {
     }
 
     fn format_float(&self, writer: &mut String, value: f64) -> Result<()> {
+        // Java/Spark reject the grouping separator flag with scientific 
notation

Review Comment:
   This new ',' + scientific notation validation looks good, but I noticed the 
same logic now exists in both `format_float` and `format_decimal`. Would it 
make sense to pull this into a small helper on `ConversionSpecifier`? That way 
the Spark or Java compatibility rule lives in one place and it reduces the risk 
of the float and decimal paths drifting over time.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to