Re: [Dwarf-discuss] ISSUE: tensor types. V3

Ben Woodard via Dwarf-discuss Tue, 25 Apr 2023 17:36:04 -0700


On 4/24/23 13:17, Todd Allen via Dwarf-discuss wrote:

On 4/24/23 13:27, Ben Woodard via Dwarf-discuss wrote:

As for NEON vs. SVE, is there a need to differentiate them?  And can it
not be done by shape of the type?

That one continues to be hard. ARM processors that support SVE also have
NEON registers which, like the Intel MMX, SSE, and AVX vector registers,
are architecturally specified as having a specific number of bits.
Handling those is trivial.


The weird thing about SVE registers (and the same thing also applies to
RVV) is that the number of bits is not architecturally defined and is
therefore unknown at compile time. The size of the registers can even
vary from hardware implementation to hardware implementation. So a
simple processor may only have a 128b wide SVE register while a monster
performance core may have 2048b wide SVE registers. The predicate
registers scale the same way. I think that it can even vary from core to
core within a CPU, sort of like Intel's P-cores vs E-cores. To be able to
even know how much a loop is vectorized you need to read a core specific
register that specifies how wide the vector registers are on this
particular core. Things like induction variables are incremented by the
constant in that core specific register divided by size of the type
being acted upon. So some of the techniques used to select lanes in
DWARF don't quite work the same way.
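
To make the arithmetic concrete, here is a minimal Python sketch of how
the lane count and the loop step depend on the runtime vector width. The
function names are hypothetical and only illustrate the division
described above; this is not taken from any compiler or debugger.

```python
def lanes_per_vector(vector_bits: int, element_bits: int) -> int:
    """Number of elements of the given width that fit in one vector register."""
    if element_bits <= 0 or vector_bits % element_bits != 0:
        raise ValueError("element width must evenly divide the vector width")
    return vector_bits // element_bits


def vector_iterations(n_elements: int, vector_bits: int, element_bits: int) -> int:
    """Loop iterations needed when the induction variable steps by one
    vector's worth of elements per iteration (ceiling division)."""
    lanes = lanes_per_vector(vector_bits, element_bits)
    return -(-n_elements // lanes)


# The same loop over 1000 32-bit elements on two hypothetical cores.
small_core = vector_iterations(1000, 128, 32)   # 4 lanes per iteration
big_core = vector_iterations(1000, 2048, 32)    # 64 lanes per iteration
```

The point is only that the step is a runtime quantity: the same binary
takes a different number of iterations depending on the core's vector
width.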

Just to make things even more difficult, when one of these registers is
spilled to memory, such as the stack, its size is unknown at compile time
and so any subsequent spilling has to determine the size that it takes
up. So any subsequent offsets need to use DWARF expressions that
reference the width of the vector.
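
As a rough illustration of why those offsets cannot be constants, the
sketch below computes the byte offsets of consecutive spilled vectors for
a given runtime vector width. The function name and the assumption that
slots are packed back to back are mine, for illustration only; a real
frame layout may differ.

```python
def spill_slot_offsets(num_slots: int, vector_bits: int) -> list[int]:
    """Byte offsets of consecutive vector spill slots, assuming the slots
    are packed back to back and each is one full vector wide."""
    vector_bytes = vector_bits // 8
    return [slot * vector_bytes for slot in range(num_slots)]


# The second and third slots move depending on the hardware vector width.
offsets_128 = spill_slot_offsets(3, 128)
offsets_512 = spill_slot_offsets(3, 512)
```

With 128-bit vectors the slots sit at 0, 16 and 32; with 512-bit vectors
they sit at 0, 64 and 128, so a location description has to multiply by
the runtime width rather than embed a constant.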

...and then there is SME which is like SVE but they are matrices rather
than vectors. The mind boggles.

So the variability of the vector size is the only significant difference
that you've identified?  If so, then I think the shape of the array type
probably is sufficient.  For SVE, the DW_TAG_subrange_type will have a
DW_AT_upper_bound which is a variable (reference or dwarf expr), or the
DW_TAG_array_type's DW_AT_{byte,bit}_size will be a variable, or both.
Meanwhile, NEON would use DW_AT_bit_size 128 (or DW_AT_byte_size 16) and
a constant DW_AT_upper_bound (128/bitsizeof(elementtype)).  That seems
like it very directly reflects the difference between the two vector types.

I went back and revisited the research that I did on behalf of customers
a few years back when customers first got access to SVE and started
debugging it. The state of the art has advanced since I did that work.

Back then we ran into problems because the only way to get the size of
the hardware vector was to read a core specific register. A big problem
was that if you were debugging something like a core file, you didn't
have access to that core specific register. There was no way to
reference the core specific register from DWARF.

Furthermore, while all the cores were the same on the systems that I was
looking at, it was architecturally allowed to have different sizes of
the vector registers depending on which core you were running on.

At the time, we realized that there needed to be some "magic", which
didn't exist yet, that provided the debugger with the width of the
vector. It was this complexity that really left me feeling that SVE
needed to be its own special thing.

At the time we discussed several options. One was pushing the size of
the vector into a normal variable so that it could be referenced by
DWARF; however, we didn't know how to make that work because it could
change depending on which core the code was executing on. There was also
a kernel problem associated with that: the information about where the
process was executing needed to be included in the crash dumps. There
was also a feeling that there was something wrong with this approach
because the only reason for the variable to exist would be to support
debugging, and keeping it up to date added overhead and probably needed
some kernel support.

Another idea we kicked around was giving the core specific register a
name and number in the register file so that DWARF could access it. This
broke ABI. At that time, that option was immediately shot down.

I wasn't able to give the customers a good answer. I didn't know how to
solve the problem. Word evidently got back to ARM and they wrote:
https://github.com/ARM-software/abi-aa/blob/main/aadwarf64/aadwarf64.rst#dwarf-register-names
The big innovation that made this possible is that ARM introduced a
"pseudo register" which they call VG that is specified to exist in the
execution environment. They even gave some examples of how the DWARF
should look for these types:
https://github.com/ARM-software/abi-aa/blob/main/aadwarf64/aadwarf64.rst#vector-types-beta
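
To show how a bound written in terms of VG could be evaluated, here is a
minimal Python sketch of a toy DWARF stack evaluator. It is an
illustration under stated assumptions, not ARM's or GCC's
implementation: I am assuming that VG is DWARF register 46 and that it
holds the vector length in 64-bit granules, which is my reading of the
AADWARF64 document linked above and should be checked there. The helper
names and the exact operator sequence are hypothetical.

```python
VG_REGNUM = 46  # assumption: VG's DWARF register number per AADWARF64


def evaluate(expr, registers):
    """Evaluate a tiny subset of DWARF stack operations."""
    stack = []
    for op, *args in expr:
        if op == "DW_OP_bregx":
            regnum, offset = args
            stack.append(registers[regnum] + offset)
        elif op == "DW_OP_constu":
            stack.append(args[0])
        elif op == "DW_OP_mul":
            top, second = stack.pop(), stack.pop()
            stack.append(second * top)
        elif op == "DW_OP_minus":
            top, second = stack.pop(), stack.pop()
            stack.append(second - top)  # second entry minus top entry
        else:
            raise ValueError(f"unsupported operation: {op}")
    return stack[-1]


def upper_bound_expr(element_bits):
    """Hypothetical upper-bound expression: VG * (64 / element_bits) - 1."""
    return [
        ("DW_OP_bregx", VG_REGNUM, 0),
        ("DW_OP_constu", 64 // element_bits),
        ("DW_OP_mul",),
        ("DW_OP_constu", 1),
        ("DW_OP_minus",),
    ]


def vg_from_vector_bits(vector_bits):
    """VG as assumed here: the vector length in 64-bit granules."""
    return vector_bits // 64


# A 256-bit implementation holds eight 32-bit elements, so the bound is 7.
bound_256 = evaluate(upper_bound_expr(32), {VG_REGNUM: vg_from_vector_bits(256)})
```

A debugger working from a core file would need VG to be recorded there
for this to work, which is exactly the open question about crash dumps.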

I haven't looked at how GCC implements the VG register yet. So I don't
know how it handles some of the problems that vexed us, like making sure
that VG exists in crash dumps and the ABI implications of that, and how
they ensure that VG is correct for the core they are executing on when a
process could move from one core to another with different vector sizes.
However, I'm going to look around in gcc/config/aarch64/aarch64.cc and
see if I can figure it out. The ARM guys are great, but I wouldn't be
surprised if I found some bugs there.

They also introduced a new wrinkle that I hadn't come across yet, and
that was SME streaming mode and how that can change the apparent size of
the vector register.

All of this leads me to the conclusion that you are in fact correct: we
don't need a special flavor of tensors to handle SVE. The complexity
that I knew was under the surface, which I felt would need some special
help in DWARF, got handled by pushing it into the runtime environment.
(My customers have mostly moved away from ARM but should they move back,
my gut feeling is that part of the map should be labeled "here be
dragons" and I expect to be chasing some tricky bugs.)


-ben

If all those things You argued that it still should be an enum, but
with only one "default" value defined.  And I guess any other values
that might be added later would be (or at least start as) vendor
extensions. It's peculiar, and I don't think we have that anywhere else
in the standard.

I guess that my point is that I'm fairly certain that SVE and RVV will
need special handling, and when the compilers start handling the matrix
types that the hardware is starting to support, they are going to need
some help as well.

If there's something more peculiar about the types inhabiting these
vector registers than "variable size", that might convince me.  But
merely being variable-sized doesn't.

If it ever became necessary, you can always add a 2nd attribute for it.
As an example, in our Ada compiler decades ago, we did this for
DW_AT_artificial.  It's just a flag, so either present or not-present.
We added a 2nd DW_AT_artificial_kind with a whole bunch of different
enums for the various kinds our compiler generated.  The point is you
still can get there even if DW_AT_tensor is just a flag.

Totally not opposed to that if that is the way that people want to
handle it. My only (admittedly weak) argument against doing it that way
is that there will now be two attributes rather than one, and the
space that it takes up. John DelSignore was just dealing with a program
that had 4.9GB of DWARF; it would be nice to keep it as compact as
possible. Of course most of that is likely location lists and template
instantiations and stuff like that, not a relatively rare case like
this. The cases where this shows up are likely going to be fairly rare.

Would this be an acceptable compromise for V4 of my proposal? I drop it
back to just being a flag for the time being. Then in a subsequent
submission (which may or may not be in the DWARF6 cycle -- but hopefully
is in time for DWARF6), if I find it necessary to make a flavor to
support SVE, RVV or SME, then my submission for that will include
changing DW_AT_tensor to require a constant that then references an
enum like I did above. If it comes out before DWARF6 is released then
great, we don't have to redefine anything. If it gets bumped to DWARF7
then we add a _kind attribute.

You can submit it in whichever form you prefer.  I supposed you were
soliciting comments here to get it in a form as close to acceptable as
possible before submitting it.  After you do, the committee will discuss
it, probably ad nauseam.  (And I'll be no exception.)  And changes may
happen then.  Seldom is it rubber stamp vs. reject.

Regards,
Todd


