On Wed, Jul 17, 2024 at 3:17 PM Richard Sandiford <richard.sandif...@arm.com> wrote:
>
> Richard Biener <richard.guent...@gmail.com> writes:
> > On Wed, Jul 17, 2024 at 1:53 PM Tejas Belagod <tejas.bela...@arm.com> wrote:
> >>
> >> On 7/17/24 4:36 PM, Richard Biener wrote:
> >> > On Wed, Jul 17, 2024 at 10:17 AM Tejas Belagod <tejas.bela...@arm.com> wrote:
> >> >>
> >> >> On 7/15/24 6:05 PM, Richard Biener wrote:
> >> >>> On Mon, Jul 15, 2024 at 1:22 PM Tejas Belagod <tejas.bela...@arm.com> wrote:
> >> >>>>
> >> >>>> On 7/15/24 12:16 PM, Tejas Belagod wrote:
> >> >>>>> On 7/12/24 6:40 PM, Richard Biener wrote:
> >> >>>>>> On Fri, Jul 12, 2024 at 3:05 PM Jakub Jelinek <ja...@redhat.com> wrote:
> >> >>>>>>>
> >> >>>>>>> On Fri, Jul 12, 2024 at 02:56:53PM +0200, Richard Biener wrote:
> >> >>>>>>>> Padding is only an issue for very small vectors - the obvious choice is
> >> >>>>>>>> to disallow vector types that would require any padding. I can hardly
> >> >>>>>>>> see where those are faster than using a vector of up to 4 char elements.
> >> >>>>>>>> Problematic are 1-bit elements with 4, 2 or one element vectors,
> >> >>>>>>>> 2-bit elements with 2 or one element vectors and 4-bit elements
> >> >>>>>>>> with 1 element vectors.
> >> >>>>>>>
> >> >>>>>>> I'd really like to avoid having to support something like
> >> >>>>>>> _BitInt(16372) __attribute__((vector_size (sizeof (_BitInt(16372)) * 16)))
> >> >>>>>>> _BitInt(2) to say size of long long could be acceptable.
> >> >>>>>>
> >> >>>>>> I'd disallow _BitInt(n) with n >= 8, it should be just the syntactic
> >> >>>>>> way to say the element should have n (< 8) bits.
> >> >>>>>>
> >> >>>>>>>> I have no idea what the stance of supporting _BitInt in C++ are,
> >> >>>>>>>> but most certainly diverging support (or even semantics) of the
> >> >>>>>>>> vector extension in C vs. C++ is undesirable.
> >> >>>>>>>
> >> >>>>>>> I believe Clang supports it in C++ next to C, GCC doesn't and Jason didn't
> >> >>>>>>> look favorably to _BitInt support in C++, so at least until something like
> >> >>>>>>> that is standardized in C++ the answer is probably no.
> >> >>>>>>
> >> >>>>>> OK, I think that rules out _BitInt use here so while bool is then natural
> >> >>>>>> for 1-bit elements for 2-bit and 4-bit elements we'd have to specify the
> >> >>>>>> number of bits explicitly. There is signed_bool_precision but like
> >> >>>>>> vector_mask its use is restricted to the GIMPLE frontend because
> >> >>>>>> interaction with the rest of the language isn't defined.
> >> >>>>>>
> >> >>>>>
> >> >>>>> Thanks for all the suggestions - really insightful (to me) discussions.
> >> >>>>>
> >> >>>>> Yeah, BitInt seemed like it was best placed for this, but not having C++
> >> >>>>> support is definitely a blocker. But as you say, in the absence of
> >> >>>>> BitInt, bool becomes the natural choice for bit sizes 1, 2 and 4. One
> >> >>>>> way to specify non-1-bit widths could be overloading vector_size.
> >> >>>>>
> >> >>>>> Also, I think overloading GIMPLE's vector_mask takes us into the
> >> >>>>> earlier-discussed territory of what it should actually mean - it meaning
> >> >>>>> the target truth type in GIMPLE and a generic vector extension in the FE
> >> >>>>> will probably confuse gcc developers more than users.
> >> >>>>>
> >> >>>>>> That said - we're mixing two things here. The desire to have "proper"
> >> >>>>>> svbool (fix: declare in the backend) and the desire to have "packed"
> >> >>>>>> bit-precision vectors (for whatever actual reason) as part of the
> >> >>>>>> GCC vector extension.
> >> >>>>>>
> >> >>>>>
> >> >>>>> If we leave lane-disambiguation of svbool to the backend, the values I
> >> >>>>> see in supporting 1, 2 and 4 bitsizes are 1) first step towards
> >> >>>>> supporting BitInt(N) vectors possibly in the future 2) having a way for
> >> >>>>> targets to define their intrinsics' bool vector types using GNU
> >> >>>>> extensions 3) feature parity with Clang's ext_vector_type?
> >> >>>>>
> >> >>>>> I believe the primary motivation for Clang to support ext_vector_type
> >> >>>>> was to have a way to define target intrinsics' vector bool type using
> >> >>>>> vector extensions.
> >> >>>>>
> >> >>>>
> >> >>>>
> >> >>>> Interestingly, Clang seems to support
> >> >>>>
> >> >>>> typedef struct {
> >> >>>>   _Bool i:1;
> >> >>>> } STR;
> >> >>>>
> >> >>>> typedef struct { _Bool i: 1; } __attribute__((vector_size (sizeof (STR) * 4))) vec;
> >> >>>>
> >> >>>>
> >> >>>> int foo (vec b) {
> >> >>>>   return sizeof b;
> >> >>>> }
> >> >>>>
> >> >>>> I can't find documentation about how it is implemented, but I suspect
> >> >>>> the vector is constructed as an array STR[] i.e. possibly each
> >> >>>> bit-element padded to byte boundary etc. Also, I can't seem to apply
> >> >>>> many operations other than sizeof.
> >> >>>>
> >> >>>> I don't know if we've tried to support such cases in GNU in the past?
> >> >>>
> >> >>> Why should we do that? It doesn't make much sense.
> >> >>>
> >> >>> single-bit vectors is what _BitInt was invented for.
> >> >>
> >> >> Forgive me if I'm misunderstanding - I'm trying to figure out how
> >> >> _BitInts can be made to have single-bit generic vector semantics. For
> >> >> example, if I want to initialize a _BitInt as a vector, I can't do:
> >> >>
> >> >> _BitInt (4) a = (_BitInt (4)){1, 0, 1, 1};
> >> >>
> >> >> as 'a' expects a scalar initialization.
> >> >>
> >> >> Or if I want to convert an int vector to a bit vector, I can't do
> >> >>
> >> >> v4si_p = v4si_a > v4si_b;
> >> >> _BitInt (4) vbool = __builtin_convertvector (v4si_p, _BitInt (4));
> >> >>
> >> >> Also semantics of conditionals with _BitInt behave like scalars
> >> >>
> >> >> _BitInt (4) p = a && b; // Here a and b are _BitInt (4), but they behave as scalars.
> >> >>
> >> >> Also, I can't do things like
> >> >>
> >> >> typedef _BitInt (2) vbool __attribute__((vector_size(sizeof (_BitInt (2)) * 4)));
> >> >>
> >> >> to force it to behave as a vector because _BitInt is disallowed here.
> >> >>
> >> >
> >> > All I'm trying to say is that when people want to use vector<bool> as
> >> > a large packed bitfield they can now use _BitInt instead. Of course
> >> > with a different (but portable) API.
> >> >
> >> > I don't see single-bit element vectors something as especially
> >> > useful with a "vector API". What's the use-case? (similar
> >> > for the two and four bit elements, with or without padding)
> >> >
> >>
> >> I'm trying to figure out if we had a portable (generic) way to represent
> >> predicate vectors (e.g. BitInts) in the front end, and had rules (or a
> >> vector API?) that cast from integer vectors acting as bools to BitInts,
> >> would it be more efficient to lower to target predicate modes (VNx16BI
> >> etc on targets that support n-bit mode predicates)? It could also
> >> possibly interoperate with target intrinsics better than int bool vectors.
> >
> > No, we don't have an existing way to represent predicate vectors. And no,
> > I don't think there's good evidence of necessity for supporting one
> > within the realm of GCC's generic vector extension. But there's plenty
> > of doubt a portable and performant way of doing this is possible.
>
> We'd like to be able to support things like:
>
>   svbool_t x, y, z;
>   x &= y | ~z;
>   y[0] = z[1];
>
> etc.
> And, for fixed-size variants of svbool_t, we'd like to support:
>
>   fixed_svbool_t x = { 1, 0, 1, 0 }; // + implicit zeros
>
> The hope was that we could do that as a two-step process:
>
> - add a generic way of representing packed boolean vectors
> - inherit that generic support for the SVE ACLE types
>
> It seemed unlikely that adding SVE ACLE support directly to the frontends
> would be acceptable. (E.g. direct target support in frontends was rejected
> for Altivec IIRC.)
>
> _BitInt doesn't seem like a good replacement since, like Tejas said,
> it doesn't support vector-style initialisation and indexing, and it
> isn't part of C++. The last one is a killer for us, since so much
> intrinsics code is written in C++ using abstraction layers.
>
> Also, things like __builtin_shuffle and __builtin_convertvector should be
> supported for vector booleans, but wouldn't (I guess) be natural
> operations on _BitInt.
>
> std::experimental::simd does support indexing of mask types, which
> suggests that there is some demand for it.
>
> At the moment, the implementation of that for SVE has to convert to an
> integer vector, index that, and convert back to a bool:
>
>   template <>
>   struct __sve_mask_type<2>
>   {
>     ...
>     typedef svuint16_t __sve_mask_vector_type
>       __attribute__((arm_sve_vector_bits(__ARM_FEATURE_SVE_BITS)));
>     ...
>     inline static bool
>     __sve_mask_get(type __active_mask, size_t __i)
>     { return __sve_mask_vector_type(svdup_u16_z(__active_mask, 1))[__i] != 0;}
>     ...
>   };
>
> It would be nice if it could just use:
>
>     inline static bool
>     __sve_mask_get(type __active_mask, size_t __i)
>     { return __active_mask[__i * 2]; }
>
> without the round trip through uint16_ts.
>
> Even better would be if __sve_mask_type<2> could use a 2-bits-per-element
> GNU-style boolean vector, so that the compiler has a better view of what's
> actually happening.
> But for me, the main point was to design the extension
> so that multi-bit elements could be added later, rather than being a
> requirement from day 1.

I would start with declaring svbool in the backend and making the vector
syntax work with that, and so avoid giving users a way to create "generic"
vector bools, precisely because we would first need to sit down and design
the interoperability.

Richard.

> Thanks,
> Richard
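
For reference, here is a minimal sketch of what the operations discussed
above (x &= y | ~z, element indexing, initialisers with implicit zeros)
look like today on an ordinary integer-element GNU vector used as a mask.
This is not the proposed packed boolean type and not svbool_t: the names
v4si, pack_mask, combine and example are made up for illustration, and
the one-bit-per-lane packing is only an assumption about what a "packed"
representation could store.

```c
#include <assert.h>

/* Four 32-bit lanes.  Comparisons on this type yield 0 or -1 per lane. */
typedef int v4si __attribute__((vector_size(16)));

/* Hypothetical helper: pack one bit per lane, lane 0 in bit 0. */
static unsigned pack_mask(v4si m)
{
    unsigned bits = 0;
    for (int i = 0; i < 4; i++)
        bits |= (unsigned)(m[i] != 0) << i;
    return bits;
}

/* Lane-wise x &= y | ~z on integer-element masks. */
static v4si combine(v4si x, v4si y, v4si z)
{
    x &= y | ~z;
    return x;
}

/* Worked example: returns the packed result of the combined mask. */
static unsigned example(void)
{
    v4si a = {1, 5, 3, 7};
    v4si b = {2, 4, 6, 6};
    v4si lt = a < b;          /* lanes: -1, 0, -1, 0 */
    v4si x = {0, -1};         /* lanes 2 and 3 are implicitly zero */
    v4si z = {-1, -1, 0, 0};

    x[2] = lt[0];             /* element copy, in the style of y[0] = z[1] */
    assert(pack_mask(lt) == 0x5u);
    return pack_mask(combine(x, lt, z));
}
```

Each mask lane here occupies a full 32 bits, so the loop in pack_mask is
the same kind of conversion step that a packed boolean vector would make
unnecessary.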