Hi Segher,

On 11/14/21 9:29 AM, Segher Boessenkool wrote:
> Hi!
>
> On Sun, Nov 14, 2021 at 08:17:41AM -0600, Bill Schmidt wrote:
>> On 11/11/21 10:50 AM, Bill Schmidt wrote:
>>> On 11/11/21 7:11 AM, Segher Boessenkool wrote:
>>>> void f(long x) { __builtin_set_texasr(x); }
>>>>
>>>> built with -m32 -mpowerpc64 gives (in the expand dump):
>>>>
>>>> void f (long int x)
>>>> {
>>>>   long long unsigned int _1;
>>>>
>>>> ;;   basic block 2, loop depth 0
>>>> ;;    pred:       ENTRY
>>>>   _1 = (long long unsigned int) x_2(D);
>>>>   __builtin_set_texasr (_1); [tail call]
>>>>   return;
>>>> ;;    succ:       EXIT
>>>>
>>>> }
>>>>
>>>> The builtins have a "long long" argument in the existing code, in this
>>>> configuration.  And this is not the same as "long" here.
>>> Hm, strange.  I'll have to go back and revisit this.  Something subtle 
>>> going on.
>>>
>> So, we have one of the more bizarre API situations here that I've ever seen.
>>
>> We have three 64-bit HTM registers:  TEXASR, TFHAR, and TFIAR.  We also have 
>> the
>> 32-bit TEXASRU, which is the upper half of TEXASR.  The documnted interfaces 
>> for
>> reading and modifying these registers are:
>>
>>   unsigned long __builtin_get_texasr (void);
>>   unsigned long __builtin_get_texasru (void);
>>   unsigned long __builtin_get_tfhar (void);
>>   unsigned long __builtin_get_tfiar (void);
>>
>>   void __builtin_set_texasr (unsigned long);
>>   void __builtin_set_texasru (unsigned long);
>>   void __builtin_set_tfhar (unsigned long);
>>   void __builtin_set_tfiar (unsigned long);
>>
>> In reality, these interfaces are defined this way for pure 32-bit and pure 
>> 64-bit,
>> but for -m32 -mpowerpc64 we have some grotesque hackery that overrides the
>> expected interfaces to be:
>>
>>   unsigned long long __builtin_get_texasr (void);
>>   unsigned long long __builtin_get_texasru (void);
>>   unsigned long long __builtin_get_tfhar (void);
>>   unsigned long long __builtin_get_tfiar (void);
>>
>>   void __builtin_set_texasr (unsigned long long);
>>   void __builtin_set_texasru (unsigned long long);
>>   void __builtin_set_tfhar (unsigned long long);
>>   void __builtin_set_tfiar (unsigned long long);
> Yes.  Everything in -m32 -mpowerpc64 should follow the 32-bit ABI.  If
> you consider these builtins part of the ABI (are they documented there?)
> then this is simply a bug.
>
>> An undocumented conditional API is a really, really bad idea, given that it
>> forces users of this interface for general code to #ifdef on the -m32
>> -mpowerpc64 condition.  Not to mention treating 32-bit registers the same as
>> 64-bit ones, and only modifying half the register on a 32-bit system.  (Is 
>> HTM
>> even supported on a 32-bit system?)
> There are no pure 32 bit CPUs that implement HTM, to my knowledge.  But
> of course HTM works fine with SF=0 (that is the reason TEXASRU exists!
> Compare to TB and TBU).
>
>> It would have likely been better to have one consistent interface, using
>> int for TEXASRU and long long for the others, even though that requires
>> dealing with two registers for the 32-bit case; but that's all water under
>> the bridge.  We have what we have.
> "long" for the others, actually.
>
> TFHAR and TFIAR hold code addresses.  TEXASR gets only the low 32 bits
> of the register read, that is why TEXASRU exists :-)

Yes, right - That makes sense, and is what is currently documented.

>
>> If I sound irritated, it's because, just for this case, I'll have to add a
>> bunch of extra machinery to track up to two prototypes for each builtin
>> function, and perform conditional initialization when it applies.  The one
>> good thing is that these already have a builtin attribute "htmspr" that I
>> can key off of to do the extra processing.
> Another option might be to finally fix this.  There still are shipping
> CPUs that support HTM ;-)
>
> And essentially no one uses -m32 -mpowerpc64 on Linux or AIX.  On Linux
> because ucontext_t and jmp_buf do not deal with the high half of the
> registers, and iiuc on AIX the kernel doesn't deal with it in context
> switches even.  Darwin does use it, but afaik no one runs Darwin on a
> CPU with HTM.

Agreed, this seems like an odd use case from the beginning.

>
>> And somebody ought to fix the misleading documentation...
> Yes.
>
> Do you want to fix this mess?  I will take a patch using "long" for
> all these registers and builtins (just like we have for essentially all
> other SPRs!)

Sure!  In fact, that's what my existing patch does.

Thanks -- I will use the time that frees up to take a look at how to get
the overloaded builtin name the user wrote to show up on error messages,
instead of translating it to the specific builtin name.  Then I'll
repost the remaining pieces of the testsuite patch and the 32-bit patch
with all outstanding issues resolved.

Thanks again for all the help.

Bill

>
>
> Segher

Reply via email to