TL;DR: BigInteger/BigDecimal is the "right" thing to do; otherwise cap at
the client/server floor.
I have a few thoughts here:
1) I don't like losing precision in any case, so a cap makes sense (maybe).
2) If you do cap, would you not want to cap to the lowest of the client or
server? I.e., if the client is a 32 bit system and the server is a 64 bit
system, you'd cap at 32 bits.
3) There may be cases where someone needs higher precision numbers. I can't
think of any offhand, but I can guarantee that they'll happen, so adding
BigInteger and BigDecimal is probably a good idea.
4) For any fact that is retrieved that has multiple formats, I would like
to see a standard hash with one entry per unit so that it is easier to work
with. Sure, right now, I can do variable mangling or post-retrieval math,
but it's so very untidy.
disk_size => {
  '/dev/sda' => {
    'B'  => 10737418240,
    'kB' => 10485760,
    'MB' => 10240,
    'GB' => 10,
  }
}
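
A rough sketch of how such a hash could be built from a single byte count, just to show the shape I have in mind. The names SIZE_UNITS and size_hash are made up for illustration and are not an existing Facter API; the 1024 multiplier is an assumption taken from the numbers in the example.

```ruby
# Hypothetical helper, not part of Facter or Puppet.
# Units are binary multiples (1 kB = 1024 B), matching the example hash.
SIZE_UNITS = %w[B kB MB GB].freeze

# Build a { unit => value } hash from a byte count.
# Rational keeps the value exact; whole numbers are returned as Integers.
def size_hash(bytes)
  SIZE_UNITS.each_with_index.map { |unit, power|
    value = Rational(bytes, 1024**power)
    [unit, value.denominator == 1 ? value.numerator : value]
  }.to_h
end

disk_size = { '/dev/sda' => size_hash(10_737_418_240) }
```

Values that do not divide evenly stay as exact Rationals rather than being rounded, which keeps with point 1 above.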
But then, how far do you take this? TB, PB, EB...?
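
One way to answer that is to let the cap decide. Below is a minimal sketch, assuming signed integers and binary units, that takes the smaller of the client and server word sizes (point 2) and lists the units for which at least one whole unit still fits. ALL_UNITS, max_signed, floor_cap and units_within are names I invented for this example.

```ruby
# Hypothetical sketch; none of these names exist in Puppet or Facter.
ALL_UNITS = %w[B kB MB GB TB PB EB ZB YB].freeze

# Largest value of a signed integer with the given number of bits.
def max_signed(bits)
  2**(bits - 1) - 1
end

# Cap at the smaller of the client and server word sizes.
def floor_cap(client_bits, server_bits)
  max_signed([client_bits, server_bits].min)
end

# Units (binary multiples) where one whole unit, in bytes, fits under the cap.
def units_within(cap)
  ALL_UNITS.each_with_index
           .select { |_unit, power| 1024**power <= cap }
           .map(&:first)
end

units_within(floor_cap(32, 64)) # 32 bit client, 64 bit server
units_within(floor_cap(64, 64)) # 64 bit on both ends
```

With a 32 bit floor this stops at GB; with 64 bit signed on both ends it stops at EB, which lines up with the exabyte range mentioned in the proposal below.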
Thanks,
Trevor
On Mon, Sep 1, 2014 at 4:54 AM, Henrik Lindberg <
[email protected]> wrote:
> Hi,
> Recently I have been looking into serialization of various kinds, and the
> issue of how we represent and serialize/deserialize numbers has come up.
>
> TL;DR - I want to specify the max values of integers and floats in the
> puppet language for a number of reasons. Skip the background part
> to get to "Questions and Proposal" if you are already familiar with
> serialization formats, and issues regarding numeric representation.
>
> Background
> ---
> As you may know, Ruby has fluent handling of numbers - if a number would
> overflow its current byte-size a larger representation will be used - i.e.
> from 32 to 64 to (ruby) BigInteger (unlimited). Floating point numbers
> undergo the same transition from 32 to 64 to BigDecimal (unlimited).
>
> This is very flexible and helpful most of the time, but it creates problems
> when serializing / deserializing. Most serialization formats simply cannot
> deal with values larger than 64 bits as regular numbers. They may do
> horrible things like truncation, or use the max/min value if a value is
> too big, or for floating point drastically lose precision.
>
> YAML
> - specifies integers to have arbitrary size, but recommends that an
> implementation uses its native integer size. The specification says:
> "In some languages (such as C), an integer may overflow the native type's
> storage capability. A YAML processor may reject such a value as an error,
> truncate it with a warning, or find some other manner to round-trip it. In
> general, integers representable using 32 binary digits should safely
> round-trip through most systems.". http://www.yaml.org/spec/1.2/spec.html
>
> For floating point values, only IEEE 32 bit are safe.
>
> In other words, it is unspecified... which means a YAML implementation may
> silently truncate numbers to the 32 bit max int (2,147,483,647) when
> running on a 32 bit machine (some implementations are noted as "gotchas"
> in blog posts; google for it).
>
> JSON
> - is similar to YAML in that it specifies a number to be an arbitrary
> number of digits and it is thus up to an implementation to bind this to a
> representation. It has the same problems as YAML. Notably, if used with
> JavaScript which only has Number for both Integer and Real, the largest
> integer number is 2^53 (after which it starts to lose precision).
>
> MsgPack
> - handles 8-16-32-64 bit integers (signed and unsigned) as well as 32 and
> 64 bit floating point. Does not have built in BigInteger, BigDecimal types.
>
> The Puppet Language Specification
> ---
> In the Puppet Language Specification the size and precision of numbers is
> currently specified as Ruby numbers (simply because this was easiest). This
> is sloppy and leaves edge cases for serialization and storage of data.
>
> Proposal
> ========
> I would like to cap a Puppet Integer to be a 64 bit signed value when used
> as a resource attribute, or anywhere in external formats. This means a value
> range of -2^63 to 2^63-1, which is in Exabyte range (1 exabyte = 2^60 bytes).
>
> I would like to cap a Puppet Float to be a 64 bit (IEEE 754 binary64) when
> used as a resource attribute or anywhere in external formats.
>
> With respect to intermediate results, I propose that we specify that
> values are of arbitrary size and that it is an error to store a value that
> is too big for the typed representation Integer (64 bit signed). For Float
> (64 bit) representation there is no error, but it loses precision. When
> specifying an attribute to have Number type, automatic conversion to Float
> (with loss of precision) takes place if an internal integer number is too
> big for the Integer representation.
>
> (Note, by default, attributes are typed as Any, which means that they by
> default would store a Float if the integer value representation overflows).
>
> Questions
> =========
> * Is it important that Javascript can be used to (accurately) read JSON
> generated by Puppet? (If so, the limit needs to be 2^53 or values lose
> precision).
>
> * Is it important in Puppet Manifests to handle values larger than 2^63-1
> (smaller than -2^63), and if not so, why isn't it sufficient to use a
> floating point value (with reduced precision).
>
> * If you think Puppet needs to handle very large values (yottabyte sized
> disks?), should the language have convenient ways of expressing
> such values e.g. 42yb ?
>
> * Is it ok to automatically do transformation to floating point if values
> overflow, and the type of an attribute is Number? (as discussed above). I
> can imagine this making it difficult to efficiently represent an attribute
> in a database and support may vary between different database engines.
>
> * Do you think it is worth the trouble to add the types BigInteger and
> BigDecimal to the type system to allow the representation to be more
> precise? (Note that this makes it difficult to use standard number
> representation in serialization formats). This means that Number is not
> allowed as an attribute/storage type (user must choose Integer, Float, or
> one of the Big... types).
>
> * Do you think it should work as in Ruby? If so, are you ok with
> serialization that is non standard?
>
> - henrik
> --
>
> Visit my Blog "Puppet on the Edge"
> http://puppet-on-the-edge.blogspot.se/
>
> --
> You received this message because you are subscribed to the Google Groups
> "Puppet Developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit https://groups.google.com/d/
> msgid/puppet-dev/lu1c8m%24a2n%241%40ger.gmane.org.
> For more options, visit https://groups.google.com/d/optout.
>
--
Trevor Vaughan
Vice President, Onyx Point, Inc
(410) 541-6699
[email protected]
-- This account not approved for unencrypted proprietary information --