Re: [PATCH v3 0/9] Implement layouts from mdspan.

Tomasz Kaminski Mon, 26 May 2025 05:27:13 -0700

On Mon, May 26, 2025 at 2:20 PM Luc Grosheintz <luc.groshei...@gmail.com>
wrote:


>
>
> On 5/26/25 13:53, Tomasz Kaminski wrote:
> > On Mon, May 26, 2025 at 1:32 PM Luc Grosheintz <luc.groshei...@gmail.com
> >
> > wrote:
> >
> >>
> >>
> >> On 5/26/25 11:43, Tomasz Kaminski wrote:
> >>> On Mon, May 26, 2025 at 11:35 AM Luc Grosheintz <
> >> luc.groshei...@gmail.com>
> >>> wrote:
> >>>
> >>>>
> >>>>
> >>>> On 5/22/25 15:21, Tomasz Kaminski wrote:
> >>>>>
> >>>>> For the stride and product computation, we should perform them in
> >>>>> Extent::size_type, not index_type.
> >>>>> The latter may be signed, and we may hit UB in multiplying non-zero
> >>>>> extents, before reaching the zero.
> >>>>>
> >>>>
> >>>> Then I observe the following issues:
> >>>>
> >>>> 1. When computing products, the integer promotion rules can interfere.
> >>>> For simplicity let's assume that int is a 32 bit integer. Then the
> >>>> relevant case is `uint16_t` (or unsigned short). Which is unsigned;
> and
> >>>> therefore overflow shouldn't be UB. I observe that the expression
> >>>>
> >>>>      prod *= n;
> >>>>
> >>>> will overflow as `int` (for large enough `n`). I believe that during
> the
> >>>> computation of `prod * n` both sides are promoted to int (because the
> >>>> range of uint16_t is contained in the range of `int`) and then
> >>>> overflows, e.g. for n = 2**16-1.
> >>>>
> >>>> Note that many other small, both signed and unsigned, integers
> >>>> semantically also overflow, but it's neither UB that's detected by
> >>>> -fsanitize=undefined, nor a compiler error. Likely because the
> >>>> "overflow" happens during conversion, which (in C++23) is uniquely
> >>>> defined in [conv.integral], i.e. not UB.
> >>>>
> >>>> draft: https://eel.is/c++draft/conv.integral
> >>>> N4950: 7.3.9 on p. 101
> >>>>
> >>>> The solution I've come up is to not use `size_type` but
> >>>>      make_unsigned_t<decltype(index_type{} * index_type{})>
> >>>>
> >>>> Please let me know if there's a better solution to forcing unsigned
> >>>> math.
> >>>>
> >>> I think at this point we should perform stride computation in
> >> std::size_t.
> >>> Because accessors are defined to accept size_t, the
> required_span_size()
> >>> cannot be greater
> >>> than maximum of size_t, and that limits our product of extents.
> >>>
> >>
> >> I looked into this in the context of computing the product of
> >> static extents. The stumbling block was that I couldn't find
> >> a clear statement that sizeof(int) <= sizeof(size_t), or that
> >> size_t is exempted from the integer conversion rules.
> >>
> >> Therefore, the concern was that the overflow issue would come
> >> back on systems with 16-bit size_t and 32-bit int.
> >>
> > We could cast elements of __dyn_exts to size_t before multiplying in
> > __ext_prod.
> > Even use size_t in for loop: for (size_t x ; __dyn_ext()).
> >
>
> This is clear. However, what I'm worried about is that due to
> the integer conversion rules:
> https://eel.is/c++draft/conv.integral#1
>
> On a system in which size_t is 16 bits and int is 32 bits, the
> conversion rank of size_t will be less than that of int:
>
I highly doubt that GCC/libstdc++ targets any architecture where this would
be the case.

> https://eel.is/c++draft/conv.rank#1.2
> https://eel.is/c++draft/conv.rank#1.4
>
> Therefore, during binary operations, both operands undergo integer
> promotion due to the arithmetic conversion:
> https://eel.is/c++draft/expr.arith.conv#1.5
> https://eel.is/c++draft/conv.prom#2
>
> Hence, IIUC, both operands are converted to int and we're straight
> back to the issue of `uint16_t` on a system with 32-bit int, i.e.
>
>    uint16_t(n) * uint16_t(n)
>
> is equivalent to
>
>    int(n) * int(n)
>
> and causes UB due to signed overflow, if `n` is sufficiently large,
> e.g. 2**16-1. Note, it doesn't help to use `*=`.
>
> The case for uint16_t is setup here:
> Godbolt: https://godbolt.org/z/bcY1GnMPr
>
> >>
> >> I'm slightly unhappy that (on common systems) we need to use
> >> 64-bit integers for 32-bit (or less) operations; but as you
> >> point out, this only affects code that shouldn't be performance
> >> sensitive.
> >>
> >>>>
> >>>> Godbolt: https://godbolt.org/z/PnvaYT7vd
> >>>>
> >>>> 2. Let's assume we compute `__extents_prod` safely, e.g. by doing all
> >>>> math as unsigned integers. There's several places we need to be
> careful:
> >>>>
> >>>>      2.1. layout_{right,left}::stride, these still compute products,
> that
> >>>>      overflow and might not be multiplied by `0` to make the answer
> >>>>      unambiguous. For an empty extent, any number is a valid stride.
> >> Hence,
> >>>>      this only requires that we don't run into UB.
> >>>>
> >>>>      2.2. The default ctor of layout_stride computes the layout_right
> >>>>      strides on the fly. We can use __unsigned_prod to keep computing
> the
> >>>>      extents in linear time. The only requirement I'm aware of is that
> >> the
> >>>>      strides are the same as those for layout_right (but the actual
> value
> >>>>      in not defined directly).
> >>>>
> >>>>      2.3 layout_stride::required_span_size, the current implementation
> >>>>      first scans for zeros; and only if there are none does it proceed
> >> with
> >>>>      computing the required span size in index_type. This is safe,
> >> because
> >>>>      the all terms in the sum are non-negative and the mandate states
> >> that
> >>>>      the total is a representable number. Hence, all the involved
> terms
> >> are
> >>>>      representable too.
> >>>>
> >>>> 3. For those interested in what the other two implementions do: both
> >>>> fail in some subset of the corner cases.
> >>>>
> >>>> Godbolt: https://godbolt.org/z/vEYxEvMWs
> >>>>
> >>>>
> >>>
> >>
> >>
> >
>
>

Re: [PATCH v3 0/9] Implement layouts from mdspan.

Reply via email to