Thank you for all the work that you have done by doing the two
implementations and
extensive test cases. I wanted to respond to a few points that I think we
may want
to consider to be bugs in specification, and report them as bugs in
standard.
(Would you be interested in doing so?)
I do not understand why the following is the case:
- `layout_left::mapping<Extents> == layout_right::mapping<OExtents>`
is valid, if and only iff `Extents != OExtents`.
For the approach with bases, I was considering something less excessive
where we only
extract Rank0 and Rank1 bases, and not separation at each possible levels.
Regarding, which commits to publish, given that:
* your test showed no difference in the optimized binary
* various inconsistencies in the current specification (we may want to
address them)
* your preference towards separate flat implementation
I think you should move forward with each layout being totally separate.
Regards,
Tomasz
On Mon, May 12, 2025 at 5:17 PM Luc Grosheintz <[email protected]>
wrote:
>
>
> On 5/9/25 8:16 AM, Tomasz Kaminski wrote:
> > The test I would perform would be :
> > std::layout_left::mapping<std::extents<int>> l0;
> > std::layout_right:mapping<std::extents<int>> r0;
> > // stride
> > bool all_unique()
> > {
> > return l0.is_unique();
> > return r0.is_unique();
> > }
> > And we should have only one is_unique symbol.
> >
> > but with a lot more duplication. Then compile it with:
> >> gcc -O2 -c many_symbols.cc
>
> I finished both implementation and wrote the type of test file we
> agreed on, it calls all or almost all ctors, operator(), is_exhaustive,
> extents() and stride. I compiled the test file against both
> implementations (which both pass all the tests).
>
> The generated code in `.text` on `-g -O2` is exceedingly similar (often
> identical) whether we use the flat implementation or the one using base
> classes. Using `nm` I find no symbols (other than those for the global
> variables and the functions to exercise the layout code), everything has
> been inlined.
>
> When looking at code compiled with `-O0` I see more symbols with the
> implementation that uses base classes to reduce the number of symbols. I
> looked at the symbols and can confirm that the method `extents` exists
> exactly once for each specialization of `std::extents` (and not once per
> specialization per layout); same for `operator()`.
>
> However, when looking at ctors I find that (certain) inherited ctors are
> given a symbol. As an example let's look at:
>
> ```
> #include <mdspan>
>
> constexpr size_t dyn = std::dynamic_extent;
>
> using M1 = std::layout_left::mapping<std::extents<int, 5>>;
> using M2 = std::layout_left::mapping<std::extents<char, 5>>;
>
> M1 ctor_cycle_1d(M2 m2) { return M1(m2); }
> ```
>
> This results in the following three (demangled) symbols:
>
> std::layout_left::mapping<std::extents<int, 5ul> >
> ::_Rank1_mapping_base<std::extents<char, 5ul> >(
> std::__mdspan::_Rank1_mapping_base<std::extents<char, 5ul> > const&)
>
> std::__mdspan::_Mapping_base<std::extents<int, 5ul> >
> ::_Mapping_base(std::extents<int, 5ul> const&)
>
> std::__mdspan::_Rank1_mapping_base<std::extents<int, 5ul> >
> ::_Rank1_mapping_base<std::extents<char, 5ul> >(
> std::__mdspan::_Rank1_mapping_base<std::extents<char, 5ul> > const&)
>
> (Naturally, there's more, e.g. for _ExtentsStorage and extents, etc.
> Using `objdump -C -d` I can confirm that these "genuine" and there's a
> little bit of code associated with each symbol.)
>
> With the current implementations I don't see much difference in object
> file size, and almost one when using -O2.
>
> Total object file sizes:
> naive | bases
> -O2 9832 10240
> -g -O2 250768 261920
> -g -O0 600512 564400
>
> Number of symbols (nm -C OBJ | wc -l):
>
> naive | bases
> -O2 46 46
> -g -O2 46 46
> -g -O0 816 907
>
> I hope this explains what I'm seeing. Please let me know if anything is
> unclear or if you suspect I'm doing something wrong.
>
> --------------------------------------------------------------------
>
> I would like to once more make the case for the flat implementation,
> i.e. one class per layout kind, by summarizing some of the twists I've
> encountered.
>
> Generic oddities:
> - `layout_left::mapping<Extents> == layout_right::mapping<OExtents>`
> is valid, if and only iff `Extents != OExtents`.
>
- `layout_right(layout_stride)` is noexcept but
> `layout_left(layout_stride)` isn't.
>
This inconsistency seems to be a bug in specification. I doubt this is
intended.
>
> Traps:
> - The ctor `mapping(extents_type)` can't be inherited, because if we
> do, the template parameter `_Extents` isn't inferred.
> - We must work around a compiler issue that prevents inheriting the
> condition for explicitness of ctors.
> - Reusing `_Rank0_mapping_base` for layout_stride requires hiding the
> ctor `mapping(extents_type)`.
>
- For rank == 0, layout_stride is convertible from other
> layout_stride, if and only if the extent_type's are convertible.
> Whereas layout_stride is convertible to layout_{left,right}
> unconditionally (at rank 0).
>
Here, I also believe that either we should consistently ignore or take into
consideration
the convertibility of index_types for the rank(). Currently it seems like
we are in between
these two.
>
> I believe I've forgotten one; and I'm definitely very worried I've not
> uncovered all of them yet.
>
> Inconveniences:
> - The ctor `mapping(layout_stride)` can't be shared between
> layout_left and layout_right, if we want to respect the missing
> `noexcept` for layout_left.
>
This seems like a bug in a standard.
> - While we can extract a `_MappingBase` which deals with the `extents`
> object we can't write the correct compatiblity check for
> `layout_stride` and the two padded layouts at that level. (Meaning
> we'll need an intermediate layer of ctors for checking these
> conditions).
> - The kind of the layout (left, right, stride) and the rank of the
> layout appear as if they're orthogonal concepts. However, the two
> concepts are intertwined. Making it hard to implement the notion of
> Rank0 or Rank1 in a manner that's independent of the layout kind.
> - The trait `std::is_convertible_v` doesn't actually check that
> the code used to perform the conversion is valid; meaning one must
> also make sure to instantiate the ctor (in all possible variations).
> - I'm currently at over 1500 lines of code for testing "everything" in
> all it's variations.
>
> The architecture at scale looks like this:
>
> - _Mapping_base: because we would like to extract the method
> `extents`.
> - _Rank0_mapping_base: for code shared between all three layouts: e.g.
> ctors and required_span_size.
> - _Rank0_left_right_base: for code not shared with stride: ctor from
> layout_stride (because of differing explicitness conditions).
> - _Rank1_mapping_base: for code shared between left and right: e.g.
> layout changing ctors, trivial implementation of operator().
> - _RankN_mapping_base: for code shared between left and right: e.g.
> ctors that check their preconditions, required_span_size.
> - _RankN_left_base: code for rank N and left only, e.g. non-layout
> changing ctors, operator() and stride.
> - _RankN_right_base: see _RankN_left_base.
> - _Rank0_stride_base: because we need `stride()`.
> - _Stride: which is a base which contains `_M_stride`. This we want
> because it only depends on IndexType and `sizeof...(Indices)`.
> - _RankN_stride_base: code for rank N and stride only, e.g. generic
> implementation of operator() and required_span_size().
> - _IsAlwaysStrided: for `is_always_strided` and `is_strided`, because
> they don't depend on the extents.
> - _IsAlwaysUnique: see above.
> - _IsExhaustive<Always>: for `is_always_exhaustive` and (if Always)
> `is_exhaustive`.
>
> Plus the three not entirely empty classes for `layout_left`,
> `layout_right` and `layout_stride`.
>
> I'm happy to submit either version for review; they're both ready and
> just need to be structured into individual commits.
>
>