gortiz commented on code in PR #14337:
URL: https://github.com/apache/pinot/pull/14337#discussion_r1832331941


##########
pinot-core/src/main/java/org/apache/pinot/core/operator/transform/function/JsonExtractScalarTransformFunction.java:
##########
@@ -184,8 +191,7 @@ public long[] transformToLongValuesSV(ValueBlock valueBlock) {
       if (result instanceof Number) {
         _longValuesSV[i] = ((Number) result).longValue();
       } else {
-        // Handle scientific notation
-        _longValuesSV[i] = (long) Double.parseDouble(result.toString());
+          _longValuesSV[i] = Long.parseLong(result.toString());

Review Comment:
   > JSON standard does allow that for numbers but not for integers.
   
   That is not precise. In JSON there is only one type of numeric literal: 
_number_. A _number_ is defined as `integer fraction exponent`. Therefore you 
can write a number as `9007199254740993e0` and it will denote the _number_ 
`9007199254740993`. We have to interpret that as `9007199254740993`, but 
`(long) Double.parseDouble("9007199254740993e0")` returns `9007199254740992`, 
because 2^53 + 1 is not exactly representable as a double.
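
The rounding can be checked directly. Below is a minimal sketch; the class and method names are hypothetical and not part of the PR. It compares the `double`-based conversion from the removed line with an exact conversion through `BigDecimal`.

```java
import java.math.BigDecimal;

final class DoublePrecisionDemo {
  // Same conversion as the removed line: parse as double, then truncate to long.
  static long viaDouble(String s) {
    return (long) Double.parseDouble(s);
  }

  // Exact conversion: throws ArithmeticException if the value has a non-zero
  // fractional part or does not fit in a long.
  static long viaBigDecimal(String s) {
    return new BigDecimal(s).longValueExact();
  }

  public static void main(String[] args) {
    String literal = "9007199254740993e0"; // 2^53 + 1
    System.out.println(viaDouble(literal));     // prints 9007199254740992
    System.out.println(viaBigDecimal(literal)); // prints 9007199254740993
  }
}
```

The literal sits exactly halfway between two adjacent doubles, so `Double.parseDouble` rounds it to the even neighbour, which is 2^53.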
   
   Using `BigInteger`/`BigDecimal` should be correct, but IMHO pretty 
expensive. I think it shouldn't be that hard to write our own parser. Something 
like:
   
   ```java
   int e = indexOfExponent(str); // hypothetical helper: index of 'e'/'E', or -1
   if (e < 0) {
     return Long.parseLong(str);
   }
   long base = Long.parseLong(str, 0, e, 10);
   int exp = Integer.parseInt(str, e + 1, str.length(), 10);
   // powerOfTen is a hypothetical lookup helper, as suggested in
   // https://stackoverflow.com/questions/46983772/fastest-way-to-obtain-a-power-of-10
   long powerOfTen = powerOfTen(exp);
   // Not sure if checking if the value changed the sign is good enough
   long result = verifyOverflow(base * powerOfTen);
   ```
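
To make the idea above concrete, here is a self-contained sketch with the undefined helpers filled in. Everything beyond the outline in the snippet is an assumption: the class name, the lookup table for powers of ten, the decision to reject negative exponents, and the use of `Math.multiplyExact` instead of a sign check. It requires Java 9+ for the `parseLong(CharSequence, int, int, int)` overload.

```java
final class ExactLongParser {
  // Powers of ten that fit in a long: 10^0 .. 10^18.
  private static final long[] POWERS_OF_TEN = new long[19];

  static {
    POWERS_OF_TEN[0] = 1L;
    for (int i = 1; i < POWERS_OF_TEN.length; i++) {
      POWERS_OF_TEN[i] = POWERS_OF_TEN[i - 1] * 10L;
    }
  }

  // Returns the index of the exponent marker ('e' or 'E'), or -1 if absent.
  static int indexOfExponent(String str) {
    int i = str.indexOf('e');
    return i >= 0 ? i : str.indexOf('E');
  }

  // Parses an integer literal with an optional non-negative exponent.
  // Throws NumberFormatException for fractions, negative exponents and overflow.
  static long parse(String str) {
    int e = indexOfExponent(str);
    if (e < 0) {
      return Long.parseLong(str);
    }
    long base = Long.parseLong(str, 0, e, 10);
    int exp = Integer.parseInt(str, e + 1, str.length(), 10);
    if (base == 0) {
      return 0L; // zero times any power of ten is zero
    }
    if (exp < 0 || exp >= POWERS_OF_TEN.length) {
      // Simplification: negative exponents are rejected even when the result
      // would be an integer (for example 1200e-2).
      throw new NumberFormatException("Unsupported exponent in: " + str);
    }
    try {
      return Math.multiplyExact(base, POWERS_OF_TEN[exp]);
    } catch (ArithmeticException ex) {
      throw new NumberFormatException("Value out of long range: " + str);
    }
  }
}
```

On the open question in the snippet: a sign check alone would not be enough, because a wrapped product can keep the sign of the operands (20 × 10^18 wraps to a positive value, for example). `Math.multiplyExact` detects every overflow.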
   
   Alternatively, if we are not sure that algorithm is correct, we can do:
   ```java
   int e = indexOfExponent(str);
   if (e < 0) {
     return Long.parseLong(str);
   }
   if (str.indexOf('.') >= 0) {
     // Exception type is a placeholder
     throw new NumberFormatException("Not an integer literal: " + str);
   }
   // longValueExact() fails if the value is not representable as a long.
   return new BigDecimal(str).longValueExact();
   ```
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

