I think most of what you are seeing is a mismatch between how a big struct
is passed under the calling convention and how it is handled inside a
function by SSA.
The calling convention lets larger structs be broken up and put in
registers, if there are enough argument registers for it (which is an
arch-dependent thing).
The total set of registers used is fixed, and those registers really can't
be used for anything else at the call point, so there's no danger in
overusing them.
Inside a function, we can have many more such structs and there's no
obvious way to pick which ones get registers and which don't.
```
type T struct { a, b, c, d, e int }
func f(x, y, z, p, q T) {}
```
Here it's obvious how to allocate registers: some prefix of the argument
list gets registers, the rest don't.
There's a fixed set of spill instructions needed to handle the rest.
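As a rough illustration of that prefix behavior, here is a toy model of argument register assignment. It is only a sketch: `arg` and `assign` are made-up names, the rule "an argument gets registers only if all of its pieces fit, otherwise it goes entirely to the stack" is a simplification, and the register counts 9 and 16 are assumptions standing in for two different architectures.

```go
package main

import "fmt"

// arg is a hypothetical description of one argument: its name and how many
// word-sized pieces it breaks into.
type arg struct {
	name  string
	words int
}

// assign walks the argument list in order. An argument gets registers only
// if all of its pieces fit in the registers that are still free; otherwise
// the whole argument is passed on the stack. This is a simplified model,
// not the compiler's actual code.
func assign(args []arg, numRegs int) (inRegs, onStack []string) {
	used := 0
	for _, a := range args {
		if used+a.words <= numRegs {
			used += a.words
			inRegs = append(inRegs, a.name)
		} else {
			onStack = append(onStack, a.name)
		}
	}
	return inRegs, onStack
}

func main() {
	// f(x, y, z, p, q T), where T has five int fields.
	args := []arg{{"x", 5}, {"y", 5}, {"z", 5}, {"p", 5}, {"q", 5}}
	for _, n := range []int{9, 16} { // assumed integer register counts
		r, s := assign(args, n)
		fmt.Printf("%d regs: in registers %v, on stack %v\n", n, r, s)
	}
}
```

With 9 registers only `x` fits; with 16, `x`, `y` and `z` do. Either way the split is decided once, by position in the argument list.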
Whereas if we had
```
func f() {
	var x, y, z, p, q T
	...
}
```
How do we decide which (parts of) variables get registers? How does that
compete with other, non-large-struct register demands?
Because we don't have great answers to these questions, we want to be
significantly more conservative in how many registers we let a single
variable consume.
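To make the current limit concrete, here is a user-level approximation of the size and field-count check from the `CanSSA` snippet quoted further down. It is a sketch only: `withinLimits`, `maxWords` and `maxFields` are hypothetical names, it uses `reflect` rather than the compiler's own type representation, and it ignores everything elided from the quoted snippet.

```go
package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

const (
	ptrSize   = unsafe.Sizeof(uintptr(0)) // pointer size on this platform
	maxWords  = 4                         // mirrors 4*types.PtrSize in the quoted snippet
	maxFields = 4                         // mirrors MaxStruct = 4 in the quoted snippet
)

// withinLimits reports whether v's type passes the two checks visible in the
// quoted snippet: total size and, for structs, number of fields.
// v must be a non-nil value.
func withinLimits(v any) bool {
	t := reflect.TypeOf(v)
	if t.Size() > maxWords*ptrSize {
		return false
	}
	if t.Kind() == reflect.Struct && t.NumField() > maxFields {
		return false
	}
	return true
}

type MegaInt struct {
	i1, i2, i3, i4, i5 int64
}

type A struct {
	s1, s2 string
	i1     int64
}

func main() {
	fmt.Println("MegaInt:", withinLimits(MegaInt{}))
	fmt.Println("A:      ", withinLimits(A{}))
	fmt.Println("[]int:  ", withinLimits([]int{}))
}
```

Both structs from this thread are over the size limit, while a slice (three words) is under it, which matches the comment in the quoted code about keeping slices registerizable.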
All that said, I'm sure there are cases where we could do better. In your
example, those spills are either dead or kind of silly.
On Thursday, December 12, 2024 at 6:13:23 AM UTC-8 Arseny Samoylov wrote:
> If we're concerned about register pressure, perhaps we should look at the
> total number of registers taken by arguments rather than just the size of
> the arguments. Consider the following example:
>
> ```
> type MegaInt struct {
> 	i1, i2, i3, i4, i5 int64
> }
>
> func foo(i1, i2, i3, i4, i5 int64) int64 {
> 	return i1 + i2 + i3 + i4 + i5
> }
>
> func bar(i MegaInt) int64 {
> 	return i.i1 + i.i2 + i.i3 + i.i4 + i.i5
> }
> ```
>
> This compiles to:
> ```
> TEXT command-line-arguments.foo(SB)
> 8b000021 ADD R0, R1, R1
> 8b010041 ADD R1, R2, R1
> 8b010061 ADD R1, R3, R1
> 8b010080 ADD R1, R4, R0
> d65f03c0 RET
>
> TEXT command-line-arguments.bar(SB)
> f90007e0 MOVD R0, 8(RSP)
> f9000be1 MOVD R1, 16(RSP)
> f9000fe2 MOVD R2, 24(RSP)
> f90013e3 MOVD R3, 32(RSP)
> f90017e4 MOVD R4, 40(RSP)
> f94007e5 MOVD 8(RSP), R5
> 8b0100a1 ADD R1, R5, R1
> 8b010041 ADD R1, R2, R1
> 8b010061 ADD R1, R3, R1
> 8b010080 ADD R1, R4, R0
> d65f03c0 RET
> ```
> On Thursday, 12 December 2024 at 12:53:50 UTC+3 Arseny Samoylov wrote:
>
>> Hi everybody!
>>
>> Recently, I noticed that there are some restrictions on the arguments
>> passed to functions in registers.
>>
>> For example, if `a` is a struct, it must have at most 4 fields, and
>> its size must be at most `4 * ptrsz`. You can find these restrictions in
>> `cmd/compile/internal/ssa/value.go` at line 590 in the `CanSSA` function:
>>
>> ```
>> // CanSSA reports whether values of type t can be represented as a Value.
>> func CanSSA(t *types.Type) bool {
>> 	types.CalcSize(t)
>> 	if t.Size() > int64(4*types.PtrSize) {
>> 		// 4*Widthptr is an arbitrary constant. We want it
>> 		// to be at least 3*Widthptr so slices can be registerized.
>> 		// Too big and we'll introduce too much register pressure.
>> 		return false
>> 	}
>> 	switch t.Kind() {
>> 	...
>> 	case types.TSTRUCT:
>> 		if t.NumFields() > MaxStruct { // MaxStruct = 4
>> 			return false
>> 		}
>> 	}
>> }
>> ```
>>
>> Consider the following example:
>>
>> ```
>> type A struct {
>> 	s1, s2 string
>> 	i1     int64
>> }
>>
>> func (a A) GetInt() int64 {
>> 	return a.i1
>> }
>> ```
>>
>> This compiles to:
>>
>> ```
>> f90007e0 MOVD R0, 8(RSP)
>> f9000be1 MOVD R1, 16(RSP)
>> f9000fe2 MOVD R2, 24(RSP)
>> f90013e3 MOVD R3, 32(RSP)
>> f90017e4 MOVD R4, 40(RSP)
>> aa0403e0 MOVD R4, R0
>> d65f03c0 RET
>> ```
>>
>> In the recently merged changes
>> [CL#611075](https://go-review.googlesource.com/c/go/+/611075/4) and
>> [CL#611076](https://go-review.googlesource.com/c/go/+/611076/6), support
>> was added for making structs with any number of fields SSA-able. With these
>> changes, I was able to remove the size restriction for structs that can be
>> SSA-ized.
>>
>> Without these restrictions, the above example compiles to:
>>
>> ```
>> f90007e0 MOVD R0, 8(RSP)
>> f9000fe2 MOVD R2, 24(RSP)
>> aa0403e0 MOVD R4, R0
>> d65f03c0 RET
>> ```
>>
>> So, I am wondering: why does the restriction on size exist in the first
>> place? It seems unreasonable to place the argument in registers only to
>> later push it to the stack. The comment mentions that it helps reduce
>> register pressure, but can't the register allocator decide to spill the
>> argument if necessary? Also, if we’re preemptively pushing the structure to
>> the stack, why not just pass it on the stack from the beginning?
>>
>> Thank you for your time and attention,
>> Arseny.
>
>