Thank you for all the work that you have done by doing the two implementations and extensive test cases. I wanted to respond to a few points that I think we may want to consider to be bugs in specification, and report them as bugs in standard. (Would you be interested in doing so?)
I do not understand why the following is the case: - `layout_left::mapping<Extents> == layout_right::mapping<OExtents>` is valid, if and only iff `Extents != OExtents`. For the approach with bases, I was considering something less excessive where we only extract Rank0 and Rank1 bases, and not separation at each possible levels. Regarding, which commits to publish, given that: * your test showed no difference in the optimized binary * various inconsistencies in the current specification (we may want to address them) * your preference towards separate flat implementation I think you should move forward with each layout being totally separate. Regards, Tomasz On Mon, May 12, 2025 at 5:17 PM Luc Grosheintz <luc.groshei...@gmail.com> wrote: > > > On 5/9/25 8:16 AM, Tomasz Kaminski wrote: > > The test I would perform would be : > > std::layout_left::mapping<std::extents<int>> l0; > > std::layout_right:mapping<std::extents<int>> r0; > > // stride > > bool all_unique() > > { > > return l0.is_unique(); > > return r0.is_unique(); > > } > > And we should have only one is_unique symbol. > > > > but with a lot more duplication. Then compile it with: > >> gcc -O2 -c many_symbols.cc > > I finished both implementation and wrote the type of test file we > agreed on, it calls all or almost all ctors, operator(), is_exhaustive, > extents() and stride. I compiled the test file against both > implementations (which both pass all the tests). > > The generated code in `.text` on `-g -O2` is exceedingly similar (often > identical) whether we use the flat implementation or the one using base > classes. Using `nm` I find no symbols (other than those for the global > variables and the functions to exercise the layout code), everything has > been inlined. > > When looking at code compiled with `-O0` I see more symbols with the > implementation that uses base classes to reduce the number of symbols. I > looked at the symbols and can confirm that the method `extents` exists > exactly once for each specialization of `std::extents` (and not once per > specialization per layout); same for `operator()`. > > However, when looking at ctors I find that (certain) inherited ctors are > given a symbol. As an example let's look at: > > ``` > #include <mdspan> > > constexpr size_t dyn = std::dynamic_extent; > > using M1 = std::layout_left::mapping<std::extents<int, 5>>; > using M2 = std::layout_left::mapping<std::extents<char, 5>>; > > M1 ctor_cycle_1d(M2 m2) { return M1(m2); } > ``` > > This results in the following three (demangled) symbols: > > std::layout_left::mapping<std::extents<int, 5ul> > > ::_Rank1_mapping_base<std::extents<char, 5ul> >( > std::__mdspan::_Rank1_mapping_base<std::extents<char, 5ul> > const&) > > std::__mdspan::_Mapping_base<std::extents<int, 5ul> > > ::_Mapping_base(std::extents<int, 5ul> const&) > > std::__mdspan::_Rank1_mapping_base<std::extents<int, 5ul> > > ::_Rank1_mapping_base<std::extents<char, 5ul> >( > std::__mdspan::_Rank1_mapping_base<std::extents<char, 5ul> > const&) > > (Naturally, there's more, e.g. for _ExtentsStorage and extents, etc. > Using `objdump -C -d` I can confirm that these "genuine" and there's a > little bit of code associated with each symbol.) > > With the current implementations I don't see much difference in object > file size, and almost one when using -O2. > > Total object file sizes: > naive | bases > -O2 9832 10240 > -g -O2 250768 261920 > -g -O0 600512 564400 > > Number of symbols (nm -C OBJ | wc -l): > > naive | bases > -O2 46 46 > -g -O2 46 46 > -g -O0 816 907 > > I hope this explains what I'm seeing. Please let me know if anything is > unclear or if you suspect I'm doing something wrong. > > -------------------------------------------------------------------- > > I would like to once more make the case for the flat implementation, > i.e. one class per layout kind, by summarizing some of the twists I've > encountered. > > Generic oddities: > - `layout_left::mapping<Extents> == layout_right::mapping<OExtents>` > is valid, if and only iff `Extents != OExtents`. > - `layout_right(layout_stride)` is noexcept but > `layout_left(layout_stride)` isn't. > This inconsistency seems to be a bug in specification. I doubt this is intended. > > Traps: > - The ctor `mapping(extents_type)` can't be inherited, because if we > do, the template parameter `_Extents` isn't inferred. > - We must work around a compiler issue that prevents inheriting the > condition for explicitness of ctors. > - Reusing `_Rank0_mapping_base` for layout_stride requires hiding the > ctor `mapping(extents_type)`. > - For rank == 0, layout_stride is convertible from other > layout_stride, if and only if the extent_type's are convertible. > Whereas layout_stride is convertible to layout_{left,right} > unconditionally (at rank 0). > Here, I also believe that either we should consistently ignore or take into consideration the convertibility of index_types for the rank(). Currently it seems like we are in between these two. > > I believe I've forgotten one; and I'm definitely very worried I've not > uncovered all of them yet. > > Inconveniences: > - The ctor `mapping(layout_stride)` can't be shared between > layout_left and layout_right, if we want to respect the missing > `noexcept` for layout_left. > This seems like a bug in a standard. > - While we can extract a `_MappingBase` which deals with the `extents` > object we can't write the correct compatiblity check for > `layout_stride` and the two padded layouts at that level. (Meaning > we'll need an intermediate layer of ctors for checking these > conditions). > - The kind of the layout (left, right, stride) and the rank of the > layout appear as if they're orthogonal concepts. However, the two > concepts are intertwined. Making it hard to implement the notion of > Rank0 or Rank1 in a manner that's independent of the layout kind. > - The trait `std::is_convertible_v` doesn't actually check that > the code used to perform the conversion is valid; meaning one must > also make sure to instantiate the ctor (in all possible variations). > - I'm currently at over 1500 lines of code for testing "everything" in > all it's variations. > > The architecture at scale looks like this: > > - _Mapping_base: because we would like to extract the method > `extents`. > - _Rank0_mapping_base: for code shared between all three layouts: e.g. > ctors and required_span_size. > - _Rank0_left_right_base: for code not shared with stride: ctor from > layout_stride (because of differing explicitness conditions). > - _Rank1_mapping_base: for code shared between left and right: e.g. > layout changing ctors, trivial implementation of operator(). > - _RankN_mapping_base: for code shared between left and right: e.g. > ctors that check their preconditions, required_span_size. > - _RankN_left_base: code for rank N and left only, e.g. non-layout > changing ctors, operator() and stride. > - _RankN_right_base: see _RankN_left_base. > - _Rank0_stride_base: because we need `stride()`. > - _Stride: which is a base which contains `_M_stride`. This we want > because it only depends on IndexType and `sizeof...(Indices)`. > - _RankN_stride_base: code for rank N and stride only, e.g. generic > implementation of operator() and required_span_size(). > - _IsAlwaysStrided: for `is_always_strided` and `is_strided`, because > they don't depend on the extents. > - _IsAlwaysUnique: see above. > - _IsExhaustive<Always>: for `is_always_exhaustive` and (if Always) > `is_exhaustive`. > > Plus the three not entirely empty classes for `layout_left`, > `layout_right` and `layout_stride`. > > I'm happy to submit either version for review; they're both ready and > just need to be structured into individual commits. > >