Sorry for responding late.

Richard Biener <rguent...@suse.de> writes:
>> > > > > > OK, so SVE VLS -msve-vector-bits=128 modes are indistinguishable
>> > > > > > from Adv. SIMD modes by the middle-end?
>> > > > >
>> > > > > I believe so, the ACLE types have an annotation on them to lift some
>> > > > > of the restrictions but the modes are the same.

Yeah, the modes are different but have the same properties (nunits,
size, etc.). But like you say, the tree-level types are different
even to the middle end, thanks to a target attribute, since the
Adv SIMD and SVE types have different ABIs.

>> > > > >
>> > > > > > Is there a way to distinguish them, say, by cost
>> > > > > > (target_reg_cost?)? Since any use of a SVE reg will require a
>> > > > > > predicate reg?
>> > > > > >
>> > > > >
>> > > > > We do have unpredicated SVE instructions, but yes costing could work.
>> > > > > Essentially what we're trying to do is find the cheapest mode to
>> > > > > perform the operation on.
>> > > > >
>> > > > > This could work.. But how would we incorporate it into the costing?
>> > > > > Part of the problem is that to iterate over similar modes with the
>> > > > > same element size likely requires some target input no? Or are you
>> > > > > saying we should only iterate over fixed size modes?
>> > > > >
>> > > > > Regards,
>> > > > > Tamar
>> > > > >
>> > > > > > I think we miss a critical bit of information in the middle-end
>> > > > > > here and I'm trying to see what piece of information that actually
>> > > > > > is. "find_subvector_type" doesn't seem to be it, it's maybe using
>> > > > > > that hidden bit of information for one specific use-case but it
>> > > > > > would be far better to have a way for the target to communicate
>> > > > > > the missing bit of information in a more generic way?
>> > > > > > We can then wrap a "find_subvector_type" around that.
>> > > >
>> > > > So for this one sth like targetm.mode_requires_predication ()? But
>> > > > as Tamar says above this really depends on the operation. But the
>> > > > optabs do _not_ expose this requirement (we have non-.COND_ADD for
>> > > > SVE modes), but you want to take advantage of this difference.
>> > > > Can we access insn attributes from optab entries?
>> > > > Could we add some "standard" attribute noting that an insn requires
>> > > > a predicate?
>> > > > But of course that likely depends on the alternative?

Personally, I think it would be a nice model if targets that only have
conditional instructions could define only the cond_* optab, and the
target-independent code would provide the all-true predicate where
necessary. That would directly give target-independent code more
information, but it would also give target-independent code more work
to do. Does that seem like a fair trade-off?

>> > > >
>> > >
>> > > We'd likely also require the mask that would be used, because I think
>> > > otherwise targetm.mode_requires_predication would be a bit ambiguous
>> > > for non-flag setting instructions or instructions that don’t do cross
>> > > lane operations.
>> > >
>> > > e.g. SVE has both COND_ADD and ADD. But the key here is that if we know
>> > > we'll access the bottom 64 or 128 bits we could use an Adv. SIMD ADD.
>> >
>> > But SVE ADD still requires a predicate register (with all lanes enabled),
>> > no? That's the whole point of the optimization we're discussing?
>> > I see the only problem with -msve-vector-bits=N where GET_MODE_SIZE
>> > is no longer a POLY_INT - otherwise that would be the easy
>> > way to identify Adv. SIMD vs. SVE and heuristically prefer
>> > fixed-size modes in the vectorizer when possible (for small known
>> > niter <= the fixed-size mode number of lanes). But with
>> > -msve-vector-bits=128 GET_MODE_SIZE for Adv. SIMD and SVE is equal(?),
>> > so we need another way to distinguish. Because even with
>> > -msve-vector-bits=128 you need the predicate register appropriately
>> > set up as I understand you are not altering the SVE HW config which
>> > would be also possible(?), but I'm not sure that would make it
>> > possible to have a predicate register less ADD instruction.
>> >
>> > What SVE register taking machine instructions do not explicitly/implicitly
>> > use one of the SVE predicate registers?
>>
>> Many, ADD for instance is this
>> https://developer.arm.com/documentation/ddi0602/2025-03/SVE-Instructions/ADD--vectors--unpredicated---Add-vectors--unpredicated--
>>
>> And SVE2 added many more. GCC already takes advantage of this and drops
>> predicates entirely when it can to avoid the dependency on the predicate
>> pipe.
>>
>> Those are actually different instructions not just aliases.
>
> I see. So this clearly is a feature on instructions then, not modes.
> In fact it might be profitable to use unpredicated add to avoid
> computing the loop mask for a specific element width completely even
> when that would require more operation for a wide SVE implementation.
>
> For the patch at hand I would suggest to re-post without a new target
> hook, ignoring the -msve-vector-bits complication for now and simply
> key on GET_MODE_SIZE being POLY_INT, having a vectorizer local helper
> like
>
> tree
> get_fixed_size_vectype (tree old_vectype, unsigned nlanes-upper-bound)

I can see the attraction of that, but it doesn't seem to be conceptually
a poly-int vs. fixed-size thing. If a new hook seems like too much,
maybe an alternative would be to pass an optional code_helper to
TARGET_VECTORIZE_RELATED_MODE? That's the hook that we already use
for switching between vector modes in a single piece of vectorisation.

Richard