Re: Removing FLOAT

Mihai Budiu Tue, 31 Oct 2023 14:35:04 -0700

I have submitted https://github.com/apache/calcite/pull/3492 to fix this.
I hope that the various adapter maintainers will review it carefully.


Mihai
________________________________
From: Julian Hyde <[email protected]>
Sent: Tuesday, October 31, 2023 2:31 PM
To: [email protected] <[email protected]>
Subject: Re: Removing FLOAT

I don’t know. In the native semantics of each adapter, does their float (or 
FLOAT) type match Java or SQL?

In ‘file' adapter, the ‘float’ type matches Java float, and should correspond 
to SQL REAL. Therefore this code is right:

            case "float":
              fieldType = toNullableRelDataType(typeFactory, SqlTypeName.REAL);
              break;

And this code is wrong (should be ‘case REAL’):

      case FLOAT:
        if (string.length() == 0) {
          return null;
        }
        return Float.parseFloat(string);

Julian


> On Oct 30, 2023, at 5:25 PM, Mihai Budiu <[email protected]> wrote:
>
> So just to make sure:
>
> FLOAT: 64 bit
> DOUBLE: 64 bit
> REAL: 32 bit
>
> Piglet, Druid, the File adaptor all got this wrong. They all have FLOAT types 
> represented as Calcite FLOAT, but it should be REAL instead if this is right.
> So I want to make sure I am fixing it in the right way.
>
> Mihai
> ________________________________
> From: Julian Hyde <[email protected]>
> Sent: Thursday, October 26, 2023 3:30 PM
> To: [email protected] <[email protected]>
> Subject: Re: Removing FLOAT
>
> Treating FLOAT as sugar for REAL would work. I don’t think we can go as far 
> as making SqlTypeName.FLOAT deprecated, but we can add checks that it is not 
> used.
>
> I’ll note that Java-based SQL engines (Spark, Hive) got the TIMESTAMP type 
> wrong because they assumed that it was equivalent to java.sql.Timestamp, and 
> they probably made the same mistake with FLOAT, assuming that it was similar 
> to Java float. Sadly, the origins of the SQL type are more likely COBOL or 
> PL/1.
>
> On the subject of DECIMAL and NUMERIC. Someday, I would love to support 
> decimal floating point numbers (our current floating point numbers are 
> binary, so cannot represent 0.1 exactly). Arbitrary precision would be nice 
> (e.g. DECIMAL(192, 0)).
>
> Julian
>
>> On Oct 26, 2023, at 3:13 PM, Mihai Budiu <[email protected]> wrote:
>>
>> I am only proposing to remove it from the IR.
>> The type would still be accepted by the parser, just represented as DOUBLE.
>> No SQL programs would need to change. Just treat it the same way DECIMAL and 
>> NUMERIC are handled; there is only one representation for both.
>>
>> Julian's solution is the first I thought about, and it certainly can be done.
>> There are two reasons that I proposed a radical approach:
>>
>> *   if the type isn't there you cannot make mistakes anymore treating it 
>> wrong
>> *   even more importantly, if third party users of Calcite (e.g., let's say 
>> a system like Spark) assumed that FLOAT is something else (in Spark it's the 
>> same as REAL as far as I can tell), they will now find out that their 
>> assumption is wrong, and can take corrective actions.
>>
>> Both solution would entail a non-trivial amount of work, so I am trying to 
>> avoid working on the wrong one.
>>
>> Mihai
>>
>> ________________________________
>> From: Julian Hyde <[email protected]>
>> Sent: Thursday, October 26, 2023 3:04 PM
>> To: [email protected] <[email protected]>
>> Subject: Re: Removing FLOAT
>>
>> Removing a type that is in the SQL standard is not practical.
>>
>> The best way to deal with the confusion is to state the rules again. If 
>> there are a few inconsistencies in the code, fix the inconsistencies.
>>
>> REAL is a 32 bit float (similar to Java float);
>> DOUBLE is a 64 bit float (similar to Java double);
>> Use of FLOAT is not recommended, but it is currently equivalent to DOUBLE.
>>
>>> On Oct 26, 2023, at 2:51 PM, Paul Jackson 
>>> <[email protected]> wrote:
>>>
>>> Doesn't having two types help distinguish which type is in the source 
>>> database for platforms that support both? A call to 
>>> org.apache.calcite.sql.type.SqlTypeName#getNameForJdbcType would lose that 
>>> distinction. I don't know offhand whether there are databases that require 
>>> a CAST when converting from double to float. Would that be important to 
>>> know before deciding?
>>> -Paul
>>>  On Thursday, October 26, 2023 at 02:02:50 PM EDT, Mihai Budiu 
>>> <[email protected]> wrote:
>>>
>>> Hello,
>>>
>>> It turns out that there is a lot of confusion in Calcite about what FLOAT 
>>> is.
>>> I have filed an issue about it: 
>>> https://issues.apache.org/jira/browse/CALCITE-6074
>>>
>>> It seems that the intent is for FLOAT to be an exact alias of DOUBLE.
>>> If that's the case, we can remove a lot of code and prevent further 
>>> confusions by just removing FLOAT as a SqlTypeName and using DOUBLE 
>>> everywhere starting in the parser. The situation is very similar with 
>>> DECIMAL and NUMERIC.
>>>
>>> This will also prevent a lot of bugs in third party code that uses Calcite, 
>>> since they cannot misinterpret what FLOAT is.
>>>
>>> If you agree that this is the right thing, I will submit a PR which marks 
>>> FLOAT as @Deprecated and removes all its uses in the codebase.
>>>
>>> Mihai
>>>
>>
>

Re: Removing FLOAT

Reply via email to