libiberty D tuple demangling

2022-07-24 Thread Jan Beulich via Gcc
Hello,

while commit 3f30a274913b ("libiberty: Update D symbol demangling
for latest ABI spec") mentions in its description that tuple encoding
has changed, there's no real adjustment to dlang_parse_tuple() there,
nor are there any new (or replaced) test cases for that. Was this
simply overlooked?

Furthermore the current ABI specifies "B Parameters Z". As I don't
know what the old ABI said, I can only wonder whether the present
code decoding (in a loop) merely a Type (and not a Parameter) was
actually correct.

Thanks for any insight, Jan


Re: libiberty D tuple demangling

2022-07-25 Thread Jan Beulich via Gcc
On 25.07.2022 14:05, ibuc...@gdcproject.org wrote:
>> On 25/07/2022 08:45 CEST Jan Beulich  wrote:
>> while commit 3f30a274913b ("libiberty: Update D symbol demangling
>> for latest ABI spec") mentions in its description that tuple encoding
>> has changed, there's no real adjustment to dlang_parse_tuple() there,
>> nor are there any new (or replaced) test cases for that. Was this
>> simply overlooked?
> 
> Is there any specific example that fails to demangle, or are you just 
> skimming?

I'm merely looking at the code alongside the ABI spec.

> From what I recall, there is a couple places in the dlang_demangle parser 
> that handle ambiguities in a mangled symbol.  The ABI change only added a 
> terminating 'Z', which makes said code that handles ambiguity redundant - but 
> of course kept around so we handle both old and new symbols.

It's not just the addition of Z at the end but also the dropping of the
number of elements at the beginning, aiui. It's actually that aspect
which caught my attention, since the ABI doesn't talk about any number
there, but the code fetches one.

>> Furthermore the current ABI specifies "B Parameters Z". As I don't
>> know what the old ABI said, I can only wonder whether the present
>> code decoding (in a loop) merely a Type (and not a Parameter) was
>> actually correct.
>>
> 
> Do you think we should instead be calling dlang_function_args instead?
> 
> (Having a quick look at both, that does seem to be the case).

Well - with a number of elements specified, it might have needed to
be a function processing a single argument only. For the new ABI -
yes, that's the function I would have expected to be called.

Jan


Re: libiberty D tuple demangling

2022-07-25 Thread Jan Beulich via Gcc
On 25.07.2022 17:45, ibuc...@gdcproject.org wrote:
>> On 25/07/2022 14:13 CEST Jan Beulich  wrote:
>>
>>  
>> On 25.07.2022 14:05, ibuc...@gdcproject.org wrote:
 On 25/07/2022 08:45 CEST Jan Beulich  wrote:
 while commit 3f30a274913b ("libiberty: Update D symbol demangling
 for latest ABI spec") mentions in its description that tuple encoding
 has changed, there's no real adjustment to dlang_parse_tuple() there,
 nor are there any new (or replaced) test cases for that. Was this
 simply overlooked?
>>>
>>> Is there any specific example that fails to demangle, or are you just 
>>> skimming?
>>
>> I'm merely looking at the code alongside the ABI spec.
>>
>>> From what I recall, there is a couple places in the dlang_demangle parser 
>>> that handle ambiguities in a mangled symbol.  The ABI change only added a 
>>> terminating 'Z', which makes said code that handles ambiguity redundant - 
>>> but of course kept around so we handle both old and new symbols.
>>
>> It's not just the addition of Z at the end but also the dropping of the
>> number of elements at the beginning, aiui. It's actually that aspect
>> which caught my attention, since the ABI doesn't talk about any number
>> there, but the code fetches one.
>>
> 
> Went to have a look at docarchives, but it appears to be down (that's on me, 
> I have been meaning to migrate the service to new servers).
> 
> Yes, your right, the number was indeed dropped too from the ABI.
> 
> https://web.archive.org/web/20170812061158/https://dlang.org/spec/abi.html#TypeTuple
> 
> TypeTuple:
> B Number Parameters
> 
> https://dlang.org/spec/abi.html#TypeTuple
> 
> TypeTuple:
> B Parameters Z
> 
> However, it gets worse the more I stare at it. Looks like it was not 
> understood what 'Number' meant in the old ABI. I assumed it was the encoded 
> number of tuple elements - same as static arrays - however what I see in the 
> front-end is instead an encoded buffer length.
> 
> https://github.com/gcc-mirror/gcc/blob/releases/gcc-10/gcc/d/dmd/dmangle.c#L312-L313
> 
> So the loop should instead be more like:
> ---
>   unsigned long len;
> 
>   mangled = dlang_number (mangled, &len);
>   if (mangled == NULL)
> return NULL;
> 
>   string_append (decl, "Tuple!(");
> 
>   const char *endp = mangled + len;
>   int elements = 0;
>   while (mangled != endp)
> {
>   if (elements++)
> string_append (decl, ", ");
> 
>   mangled = dlang_type (decl, mangled, info);
>   if (mangled == NULL || mangled > endp)
>   return NULL;
> }
> 
>   string_append (decl, ")");
>   return mangled;
> ---

Oh. Then two of the testcases are actually wrong as well:

_D8demangle4testFB2OaaZv
_D8demangle4testFB3aDFZaaZv

I would have assumed they had been taken from observable output of a
compiler, ...

> On top of that, TypeTuple is a compile-time-only type - it never leaks to the 
> code generator - so the grammar entry in the ABI is frivolous (although 
> internally, that it gets a mangle at all would save some memory as duplicated 
> types are merged).

... but one way of reading this would make me infer that can't have
been the case.

Jan


Re: Problems when building NT kernel drivers with GCC / LD

2022-10-31 Thread Jan Beulich via Gcc
On 30.10.2022 02:06, Pali Rohár via Binutils wrote:
> * GCC or LD (not sure who) sets memory alignment characteristics
>   (IMAGE_SCN_ALIGN_MASK) into the sections of PE executable binary.
>   These characteristics should be only in COFF object files, not
>   executable binaries. Specially they should not be in NT kernel
>   drivers.

Like Martin pointed out in reply for another item, I'm pretty sure
this one was taken care of in bfd already (and iirc is in 2.39). You
fail to mention at all what versions of the various components you
use. I guess before reporting such a long list of issue you would
have wanted to test at least with the most recent releases of each
of the involved components. I wouldn't exclude some further items
could then be scratched off your list.

Jan


Re: Problems when building NT kernel drivers with GCC / LD

2022-11-20 Thread Jan Beulich via Gcc
On 20.11.2022 14:10, Pali Rohár wrote:
> On Saturday 05 November 2022 02:26:52 Pali Rohár wrote:
>> On Saturday 05 November 2022 01:57:49 Pali Rohár wrote:
>>> On Monday 31 October 2022 10:55:59 Jan Beulich wrote:
 On 30.10.2022 02:06, Pali Rohár via Binutils wrote:
> * GCC or LD (not sure who) sets memory alignment characteristics
>   (IMAGE_SCN_ALIGN_MASK) into the sections of PE executable binary.
>   These characteristics should be only in COFF object files, not
>   executable binaries. Specially they should not be in NT kernel
>   drivers.

 Like Martin pointed out in reply for another item, I'm pretty sure
 this one was taken care of in bfd already (and iirc is in 2.39). You
 fail to mention at all what versions of the various components you
 use.
>>>
>>> Ou, sorry for that. I take care to write issues in all details and
>>> totally forgot to write such important information like tool versions.
>>>
>>> Now I retested all issues on Debian 11 which has LD 2.35.2 and GCC
>>> 10.2.1 and all issues are there still valid except data characteristic
>>> IMAGE_SCN_CNT_INITIALIZED_DATA for code sections IMAGE_SCN_CNT_CODE.
>>>
>>> I can easily retest it with LD 2.39 and GCC 10.3.0 which is in Debian
>>> testing.
>>
>> Retested with LD 2.39 and GCC 10.3.0 which is in Debian testing and
>> following problems are additionally fixed: --exclude-all-symbols,
>> --dynamicbase and IMAGE_SCN_ALIGN_MASK (which you mentioned above). All
>> other still reminds.
>>
>> Do you need some other information?
> 
> Hello! I would like to ask if you need some other details or something
> else for these issues.

Well, generally speaking it might help if you could provide smallish
testcases for every item individually. But then, with you replying to
me specifically, perhaps you're wrongly assuming that I would be
planning to look into addressing any or all of these? My earlier reply
was merely to point out that _some_ work has already been done ...

Jan


Re: Problems when building NT kernel drivers with GCC / LD

2022-11-28 Thread Jan Beulich via Gcc
On 26.11.2022 20:04, Pali Rohár wrote:
> On Monday 21 November 2022 08:24:36 Jan Beulich wrote:
>> But then, with you replying to
>> me specifically, perhaps you're wrongly assuming that I would be
>> planning to look into addressing any or all of these? My earlier reply
>> was merely to point out that _some_ work has already been done ...
> 
> I added into CC also gcc, ld and mingw mailing list. If this is not
> enough, could you tell me who to contact about those issues?

That's probably enough, sure. I merely tried to set expectations right,
since you did reply To: me (and lists were only on Cc: - it being the
other way around would have demonstrated that you're not asking me
specifically).

Jan


Re: Problems when building NT kernel drivers with GCC / LD

2022-11-28 Thread Jan Beulich via Gcc
On 28.11.2022 09:40, Jonathan Wakely wrote:
> On Mon, 28 Nov 2022, 08:08 Jan Beulich via Gcc,  wrote:
> 
>> On 26.11.2022 20:04, Pali Rohár wrote:
>>> On Monday 21 November 2022 08:24:36 Jan Beulich wrote:
>>>> But then, with you replying to
>>>> me specifically, perhaps you're wrongly assuming that I would be
>>>> planning to look into addressing any or all of these? My earlier reply
>>>> was merely to point out that _some_ work has already been done ...
>>>
>>> I added into CC also gcc, ld and mingw mailing list. If this is not
>>> enough, could you tell me who to contact about those issues?
>>
>> That's probably enough, sure. I merely tried to set expectations right,
>> since you did reply To: me (and lists were only on Cc: - it being the
>> other way around would have demonstrated that you're not asking me
>> specifically).
>>
> 
> That's just how most mailers do "Reply All", I don't think it out implies
> anything.

I know mailers behave that way. But when replying you can adjust To:
vs Cc:. That's what I'm doing all the time (or at least I'm trying to
remember to do so), because it makes a difference to me whether mail
is sent To: me vs I'm only being Cc:-ed. Otherwise - why do we have
To: and Cc: as different categories?

> Removing the Cc list and *only* replying to you would be different.

Sure - that would have meant sending private mail, which is yet worse.

Jan


x86: making better use of vpternlog{d,q}

2023-05-24 Thread Jan Beulich via Gcc
Hello,

for a couple of years I was meaning to extend the use of these AVX512F
insns beyond the pretty minimalistic ones there are so far. Now that I've
got around to at least draft something, I ran into a couple of issues I
cannot explain. I'd like to start with understanding the unexpected
effects of a change to an existing insn I have made (reproduced at the
bottom). I certainly was prepared to observe testsuite failures, but it
ends up failing tests I didn't expect it would fail, and - upon looking
at sibling ones - also ends up leaving intact tests which I would expect
would then need adjustment (because of using the new alternative).

In particular (all mentioned tests are in gcc.target/i386/)
- avx512f-andn-si-zmm-1.c (and its AVX512VL counterparts) fails because
  for whatever reason generated code reverts back to using vpbroadcastd,
- avx512f-andn-di-zmm-1.c, otoh, is unaffected (i.e. continues to use
  vpandnq with embedded broadcast),
- avx512f-andn-si-zmm-2.c doesn't use the new 4th insn alternative when
  at the same time a made-up DI variant of the test (akin to what might
  be an avx512f-andn-di-zmm-2.c testcase) does.
IOW: How is SI mode element size different here from DI mode one? Is
there anything wrong with the 4th alternative I'm adding, or is this
hinting at some anomaly elsewhere?

Just to mention it, avx512f-andn-si-zmm-5.c similarly fails
unexpectedly, but I guess for the same reason (and there aren't AVX512VL
or DI mode element counterparts thereof).

Jan

--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -17019,11 +17019,11 @@
   "TARGET_AVX512F")
 
 (define_insn "*andnot3"
-  [(set (match_operand:VI 0 "register_operand" "=x,x,v")
+  [(set (match_operand:VI 0 "register_operand" "=x,x,v,v")
(and:VI
- (not:VI (match_operand:VI 1 "vector_operand" "0,x,v"))
- (match_operand:VI 2 "bcst_vector_operand" "xBm,xm,vmBr")))]
-  "TARGET_SSE"
+ (not:VI (match_operand:VI 1 "bcst_vector_operand" "0,x,v,mBr"))
+ (match_operand:VI 2 "bcst_vector_operand" "xBm,xm,vmBr,v")))]
+  "TARGET_SSE && (REG_P (operands[1]) || REG_P (operands[2]))"
 {
   char buf[64];
   const char *ops;
@@ -17090,6 +17090,11 @@
 case 2:
   ops = "v%s%s\t{%%2, %%1, %%0|%%0, %%1, %%2}";
   break;
+case 3:
+  tmp = "pternlog";
+  ssesuffix = "";
+  ops = "v%s%s\t{$0x44, %%1, %%2, %%0|%%0, %%2, %%1, $0x44}";
+  break;
 default:
   gcc_unreachable ();
 }
@@ -17098,7 +17103,7 @@
   output_asm_insn (buf, operands);
   return "";
 }
-  [(set_attr "isa" "noavx,avx,avx")
+  [(set_attr "isa" "noavx,avx,avx,avx512f")
(set_attr "type" "sselog")
(set (attr "prefix_data16")
  (if_then_else
@@ -17106,7 +17111,7 @@
(eq_attr "mode" "TI"))
(const_string "1")
(const_string "*")))
-   (set_attr "prefix" "orig,vex,evex")
+   (set_attr "prefix" "orig,vex,evex,evex")
(set (attr "mode")
(cond [(match_test "TARGET_AVX2")
 (const_string "")
@@ -17119,7 +17124,11 @@
(match_test "optimize_function_for_size_p (cfun)"))
 (const_string "V4SF")
  ]
- (const_string "")))])
+ (const_string "")))
+   (set (attr "enabled")
+   (if_then_else (eq_attr "alternative" "3")
+ (symbol_ref " == 64 ? TARGET_AVX512F : 
TARGET_AVX512VL")
+ (const_string "*")))])
 
 ;; PR target/100711: Split notl; vpbroadcastd; vpand as vpbroadcastd; vpandn
 (define_split


Re: x86: making better use of vpternlog{d,q}

2023-05-25 Thread Jan Beulich via Gcc
On 24.05.2023 11:01, Hongtao Liu wrote:
> On Wed, May 24, 2023 at 3:58 PM Jan Beulich via Gcc  wrote:
>>
>> Hello,
>>
>> for a couple of years I was meaning to extend the use of these AVX512F
>> insns beyond the pretty minimalistic ones there are so far. Now that I've
>> got around to at least draft something, I ran into a couple of issues I
>> cannot explain. I'd like to start with understanding the unexpected
>> effects of a change to an existing insn I have made (reproduced at the
>> bottom). I certainly was prepared to observe testsuite failures, but it
>> ends up failing tests I didn't expect it would fail, and - upon looking
>> at sibling ones - also ends up leaving intact tests which I would expect
>> would then need adjustment (because of using the new alternative).
>>
>> In particular (all mentioned tests are in gcc.target/i386/)
>> - avx512f-andn-si-zmm-1.c (and its AVX512VL counterparts) fails because
>>   for whatever reason generated code reverts back to using vpbroadcastd,
>> - avx512f-andn-di-zmm-1.c, otoh, is unaffected (i.e. continues to use
>>   vpandnq with embedded broadcast),
>> - avx512f-andn-si-zmm-2.c doesn't use the new 4th insn alternative when
>>   at the same time a made-up DI variant of the test (akin to what might
>>   be an avx512f-andn-di-zmm-2.c testcase) does.
>> IOW: How is SI mode element size different here from DI mode one? Is
>> there anything wrong with the 4th alternative I'm adding, or is this
>> hinting at some anomaly elsewhere?
> __m512i is defined as __v8di, when it's used for _mm512_andnot_epi32,
> it's explicitlt converted to (__v16si) and creates an extra subreg
> which is not needed for DImode cases.
> And pass_combine try to match the below pattern but failed due to the
> condition REG_P (operands[1]) || REG_P (operands[2]). Here I think you
> want register_operand instead of REG_P.

Thanks, this has indeed made things match my expectations wrt testsuite
results. Sadly similar adjustments for other (new) insns didn't make
any difference with the further issues I'm facing. I may therefore need
to ask more questions; I hope they're not going to be too dumb.

Jan


Re: RFC: Formalization of the Intel assembly syntax (PR53929)

2024-01-18 Thread Jan Beulich via Gcc
On 18.01.2024 06:34, LIU Hao wrote:
> My complete proposal can be found at 
> . 
> Some ideas actually 
> reflect the AT&T syntax. I hope it helps.

I'm sorry, but most of your proposal may even be considered for being
acceptable only if you would gain buy-off from the MASM guys. Anything
MASM treats as valid ought to be permitted by gas as well (within the
scope of certain divergence that cannot be changed in gas without
risking to break people's code). It could probably be considered to
introduce a "strict" mode of Intel syntax, following some / most of
what you propose; making this the default cannot be an option.

Commenting on individual aspects of your proposal is a little difficult,
as you didn't provide the proposal inline (and hence it cannot be easily
used as context in a reply). But to mention the imo worst aspect:
Declaring

mov eax, [rcx]

as invalid is a no-go. I also don't see how this would be related to the
issue at hand. What's in the square brackets may as well be a symbol
name, so requiring the "mode specifier" doesn't disambiguate things at
all.

Otoh the "offset" part of point 3 may be possible to accept even by
default, provided (didn't check) that current gas consistently rejects
that (as an invalid use of a register name).

One remark regarding the underlying pattern leading to the issue:
Personally I view it as questionable practice to have extern or static
variables in C code with names as short as register names are. Avoiding
them does not only avoid the issue here, but also is quite likely going
to improve the code (by having more descriptive variable names). And
automatic variables aren't affected aiui, so can remain short (after
all, commonly automatic variable names are as short as a single char).

That said, I can certainly also see how the introduction of new
registers can lead to new conflicts, which isn't nice. Iirc old 32-bit
MASM escaped this problem by requiring architecture extensions to be
explicitly enabled (may have changed in newer MASM). Gas, otoh, enables
everything by default (and I don't see how we could change that).

Jan


Re: RFC: Formalization of the Intel assembly syntax (PR53929)

2024-01-18 Thread Jan Beulich via Gcc
On 19.01.2024 02:42, LIU Hao wrote:
> In addition, `as -msyntax=intel -mnaked-reg` doesn't seem to be equivalent to 
> `.intel_syntax noprefix`:
> 
> $ as -msyntax=intel -mnaked-reg <<< 'mov eax, DWORD PTR gs:0x48' -o a.o
> {standard input}: Assembler messages:
> {standard input}:1: Error: invalid use of register
> 
> $ as <<< '.intel_syntax noprefix;  mov eax, DWORD PTR gs:0x48' -o a.o && 
> objdump -Mintel -d a.o
> ...
>  <.text>:
>0: 65 8b 04 25 48 00 00moveax,DWORD PTR gs:0x48

This (the error above) looks like a bug to me; I'll look into where this
odd difference in behavior is coming from.

Jan


Re: RFC: Formalization of the Intel assembly syntax (PR53929)

2024-01-19 Thread Jan Beulich via Gcc
On 18.01.2024 17:40, LIU Hao wrote:
> 在 2024-01-18 20:54, Jan Beulich 写道:
>> I'm sorry, but most of your proposal may even be considered for being
>> acceptable only if you would gain buy-off from the MASM guys. Anything
>> MASM treats as valid ought to be permitted by gas as well (within the
>> scope of certain divergence that cannot be changed in gas without
>> risking to break people's code). It could probably be considered to
>> introduce a "strict" mode of Intel syntax, following some / most of
>> what you propose; making this the default cannot be an option.
> 
> Thanks for your reply.
> 
> I have attached the Markdown source for that page, modified a few hours ago. 
> I am planning to make 
> some updates according to your advice tomorrow.

Just to mention it: Attaching is in no way better than providing a link,
commenting-wise.

> And yes, I am proposing a 'strict' mode, however not for humans, only for 
> compilers.
> 
> My first message references a GCC bug report, where the problematic symbol 
> `bx` comes from C source. 
> I have been aware of the `/APP` and `/NO_APP` markers in generated assembly, 
> so I suspect that GAS 
> should be able to tell which parts are generated from a compiler and which 
> parts are composed by 
> hand. The proposed strict mode may apply only to the output from GCC, which 
> are much more likely to 
> contain bad symbols, but are also more controllable on the GCC side.
> 
> I believe that skillful people who write x86 assembly have known that 
> `offset`, `shr`, `si` etc. are 
> 'bad' names for symbols. Therefore, it's like an issue there.
> 
> 
>> Commenting on individual aspects of your proposal is a little difficult,
>> as you didn't provide the proposal inline (and hence it cannot be easily
>> used as context in a reply). But to mention the imo worst aspect:
>> Declaring
>>
>>  mov eax, [rcx]
>>
>> as invalid is a no-go.
> 
> I agree. I am considering to declare the lack of a symbol as a special case.

Well, I took this as the simplest example. But clearly there should never
be a need for an assembly programmer to needlessly write "dword ptr" or
alike, when operand size is unambiguous. Limiting "strict mode" to compiler
output would take away concerns in this regard (as machine generated
assembly has no issue with uniformly adding such redundant specifiers, much
like in AT&T mode suffixes would typically be emitted even when not needed).
But I see a severe issue with your aim at confining strict mode to
compiler generated code only: In inline assembly (see your mentioning of
APP / NO_APP above) you still potentially reference C symbols. So the
ambiguities don't disappear in APP / NO_APP regions.

>> I also don't see how this would be related to the
>> issue at hand. What's in the square brackets may as well be a symbol
>> name, so requiring the "mode specifier" doesn't disambiguate things at
>> all.
> 
> If someone declares a variable called `rcx` in C, it has be translated to
> 
> mov eax, DWORD PTR rcx  # `movl rcx, %eax`
> 
> instead of
> 
> mov eax, DWORD PTR [rcx]# `movl (%rcx), %eax`

And an array happening to be indexed by rcx would then result in

mov eax, DWORD PTR rcx[rcx]# `movl rcx(%rcx), %eax`

? That's going to be confusing at best. I think this whole issue needs
taking care of differently, and iirc I did already suggest an alternative
in one of the bugzilla entries involved: Potentially ambiguous names
(which to a compiler may mean: all symbol names) ought to simply be
quoted, and it ought to be specified that quoted symbols are never
registers. Iirc this will require gas changes, yes, but it'll address all
ambiguities afaict.

Jan


Re: RFC: Formalization of the Intel assembly syntax (PR53929)

2024-01-22 Thread Jan Beulich via Gcc
On 20.01.2024 13:40, LIU Hao wrote:
> 在 2024-01-19 17:13, Jan Beulich 写道:
>> But I see a severe issue with your aim at confining strict mode to
>> compiler generated code only: In inline assembly (see your mentioning of
>> APP / NO_APP above) you still potentially reference C symbols. So the
>> ambiguities don't disappear in APP / NO_APP regions.
> 
> My suggestion is that people who write inline assembly should have been aware 
> of the existence of 
> bad names, and should have been careful to avoid them.
> 
> 
>> And an array happening to be indexed by rcx would then result in
>>
>>  mov eax, DWORD PTR rcx[rcx]# `movl rcx(%rcx), %eax`
>>
>> ? That's going to be confusing at best. 
> 
> This is always confusing, no matter how it is written.
> 
>> I think this whole issue needs
>> taking care of differently, and iirc I did already suggest an alternative
>> in one of the bugzilla entries involved: Potentially ambiguous names
>> (which to a compiler may mean: all symbol names) ought to simply be
>> quoted, and it ought to be specified that quoted symbols are never
>> registers. Iirc this will require gas changes, yes, but it'll address all
>> ambiguities afaict.
> 
> The OP of GCC PR53929 said that 'the problem does _not_ go away even if I 
> quote the symbol name by 
> hand in the assembly output' which was 12 years ago. I tried my local 
> installation and quoting the 
> symbol turned out to avoid the issue:
> 
> > as --version
> GNU assembler (GNU Binutils) 2.41.0.20240108
> 
> > cat test.s
> .intel_syntax noprefix
> lea rax, "bx"[rip]
> 
> > as test.s -o test.o
> 
> > objdump -d test.o
> test.o: file format pe-x86-64
> (...)
>0:   48 8d 05 00 00 00 00learax,[rip+0x0]# 7 
> <.text+0x7>
>7:   90  nop

Right, I did some work in that direction a while ago. But iirc there are
still cases left to be addressed.

Jan


Re: RFC: Formalization of the Intel assembly syntax (PR53929)

2024-01-23 Thread Jan Beulich via Gcc
On 23.01.2024 02:27, LIU Hao wrote:
> 在 2024-01-22 16:39, Jan Beulich 写道:
>> Right, I did some work in that direction a while ago. But iirc there are
>> still cases left to be addressed.
> 
> Attached is a draft patch for GCC, bootstrapped on {i686,x86_64}-w64-mingw32 
> with GCC 13.2 and 
> binutils 2.41.0.

Right, but this is very "draft". You can't blindly assume the gas you use
actually can deal with quotation.

> This addresses the issue when a bad name exists in the same translation unit. 
> In the case of an 
> external symbol there's still an error:
> 
> ```
> extern int bx;
> int get(const char* p) { return p[bx]; }
> ```
> 
> ```
> lh_mouse@lhmouse-pc ~/Desktop $ x86_64-w64-mingw32-gcc -S -o - -masm=intel 
> test.c | fgrep bx
>  mov rax, QWORD PTR .refptr.bx[rip]
>  .section.rdata$.refptr.bx, "dr"
>  .globl  .refptr.bx
> .refptr.bx:
>  .quad   bx

Sure, this one needs quoting then, too.

Jan

> lh_mouse@lhmouse-pc ~/Desktop $ x86_64-w64-mingw32-gcc  -masm=intel test.c | 
> fgrep bx
> C:\Users\lh_mouse\AppData\Local\Temp\ccuyuu6c.s: Assembler messages:
> C:\Users\lh_mouse\AppData\Local\Temp\ccuyuu6c.s:29: Error: invalid use of 
> register
> C:\Users\lh_mouse\AppData\Local\Temp\ccuyuu6c.s:29: Warning: register value 
> used as expression
> lh_mouse@lhmouse-pc ~/Desktop $
> ```
> 
> 
> 
> 



Re: RFC: Formalization of the Intel assembly syntax (PR53929)

2024-01-23 Thread Jan Beulich via Gcc
On 23.01.2024 10:00, LIU Hao wrote:
> 在 2024-01-23 16:38, Jan Beulich 写道:
>> Right, but this is very "draft". You can't blindly assume the gas you use
>> actually can deal with quotation.
> 
> Let's assume that for the time being, but there's something else; see below.
> 
> 
>>> .refptr.bx:
>>>   .quad   bx
>>
>> Sure, this one needs quoting then, too.
> 
> The attached patch contains `&& name[0] != '*'` with a reason: In the 
> function `assemble_name_raw` 
> in 'gcc/varasm.cc', if `name` starts with a `*`, then its remaining part is 
> output without 
> decoration. I have no idea what `*` means; this `.quad bx` thing apparently 
> results from something like
> 
> assemble_name_raw (file, "*bx");
> 
> Quoting this would break the i686 DWARF2 code, which may contain an 
> arithmetic expression like
> 
> .long LXXYY-1# "LXXYY" minus one
> 
> If it was quoted like `.long "LXXYY-1"`, it would mean something very 
> different and cause linker errors.

Hmm, that would suggest to me that the Dwarf code abuses the interface.
A "name" certainly shouldn't be an expression. And hence the result of
the example ought to be

 .long "LXXYY"-1# "LXXYY" minus one

Jan


Re: RFC: Formalization of the Intel assembly syntax (PR53929)

2024-01-23 Thread Jan Beulich via Gcc
On 23.01.2024 10:21, LIU Hao wrote:
> 在 2024-01-23 17:03, Jan Beulich 写道:
>> Hmm, that would suggest to me that the Dwarf code abuses the interface.
>> A "name" certainly shouldn't be an expression. And hence the result of
>> the example ought to be
>>
>>   .long "LXXYY"-1# "LXXYY" minus one
> 
> So I shouldn't have checked for `*` right?

I don't know.

> The calls to `output_addr_const()` are from `dw2_assemble_integer (int size, 
> rtx x)` in 
> 'gcc/dwarf2asm.cc'. Now I need some directives on how to fix this; parsing 
> the symbol seems awkward.

Indeed.

Jan


Re: CREL relocation format for ELF

2024-03-28 Thread Jan Beulich via Gcc
On 28.03.2024 08:43, Fangrui Song wrote:
> On Fri, Mar 22, 2024 at 6:51 PM Fangrui Song  wrote:
>>
>> On Thu, Mar 14, 2024 at 5:16 PM Fangrui Song  wrote:
>>>
>>> The relocation formats REL and RELA for ELF are inefficient. In a
>>> release build of Clang for x86-64, .rela.* sections consume a
>>> significant portion (approximately 20.9%) of the file size.
>>>
>>> I propose RELLEB, a new format offering significant file size
>>> reductions: 17.2% (x86-64), 16.5% (aarch64), and even 32.4% (riscv64)!
>>>
>>> Your thoughts on RELLEB are welcome!
>>>
>>> Detailed analysis:
>>> https://maskray.me/blog/2024-03-09-a-compact-relocation-format-for-elf
>>> generic ABI (ELF specification):
>>> https://groups.google.com/g/generic-abi/c/yb0rjw56ORw
>>> binutils feature request: 
>>> https://sourceware.org/bugzilla/show_bug.cgi?id=31475
>>> LLVM: 
>>> https://discourse.llvm.org/t/rfc-relleb-a-compact-relocation-format-for-elf/77600
>>>
>>> Implementation primarily involves binutils changes. Any volunteers?
>>> For GCC, a driver option like -mrelleb in my Clang prototype would be
>>> needed. The option instructs the assembler to use RELLEB.
>>
>> The format was tentatively named RELLEB. As I refine the original pure
>> LEB-based format, “RELLEB” might not be the most fitting name.
>>
>> I have switched to SHT_CREL/DT_CREL/.crel and updated
>> https://maskray.me/blog/2024-03-09-a-compact-relocation-format-for-elf
>> and
>> https://groups.google.com/g/generic-abi/c/yb0rjw56ORw/m/eiBcYxSfAQAJ
>>
>> The new format is simpler and better than RELLEB even in the absence
>> of the shifted offset technique.
>>
>> Dynamic relocations using CREL are even smaller than Android's packed
>> relocations.
>>
>> // encodeULEB128(uint64_t, raw_ostream &os);
>> // encodeSLEB128(int64_t, raw_ostream &os);
>>
>> Elf_Addr offsetMask = 8, offset = 0, addend = 0;
>> uint32_t symidx = 0, type = 0;
>> for (const Reloc &rel : relocs)
>>   offsetMask |= crels[i].r_offset;
>> int shift = std::countr_zero(offsetMask)
>> encodeULEB128(relocs.size() * 4 + shift, os);
>> for (const Reloc &rel : relocs) {
>>   Elf_Addr deltaOffset = (rel.r_offset - offset) >> shift;
>>   uint8_t b = deltaOffset * 8 + (symidx != rel.r_symidx) +
>>   (type != rel.r_type ? 2 : 0) + (addend != rel.r_addend ? 4 : 
>> 0);
>>   if (deltaOffset < 0x10) {
>> os << char(b);
>>   } else {
>> os << char(b | 0x80);
>> encodeULEB128(deltaOffset >> 4, os);
>>   }
>>   if (b & 1) {
>> encodeSLEB128(static_cast(rel.r_symidx - symidx), os);
>> symidx = rel.r_symidx;
>>   }
>>   if (b & 2) {
>> encodeSLEB128(static_cast(rel.r_type - type), os);
>> type = rel.r_type;
>>   }
>>   if (b & 4) {
>> encodeSLEB128(std::make_signed_t(rel.r_addend - addend), os);
>> addend = rel.r_addend;
>>   }
>> }
>>
>> ---
>>
>> While alternatives like PrefixVarInt (or a suffix-based variant) might
>> excel when encoding larger integers, LEB128 offers advantages when
>> most integers fit within one or two bytes, as it avoids the need for
>> shift operations in the common one-byte representation.
>>
>> While we could utilize zigzag encoding (i>>31) ^ (i<<1) to convert
>> SLEB128-encoded type/addend to use ULEB128 instead, the generate code
>> is inferior to or on par with SLEB128 for one-byte encodings.
> 
> 
> We can introduce a gas option --crel, then users can specify `gcc
> -Wa,--crel a.c` (-flto also gets -Wa, options).
> 
> I propose that we add another gas option --implicit-addends-for-data
> (does the name look good?) to allow non-code sections to use implicit
> addends to save space
> (https://sourceware.org/PR31567).
> Using implicit addends primarily benefits debug sections such as
> .debug_str_offsets, .debug_names, .debug_addr, .debug_line, but also
> data sections such as .eh_frame, .data., .data.rel.ro, .init_array.
> 
> -Wa,--implicit-addends-for-data can be used on its own (6.4% .o
> reduction in a clang -g -g0 -gpubnames build)

And this option will the switch from RELA to REL relocation sections,
effectively in violation of most ABIs I'm aware of?

Furthermore, why just data? x86 at least could benefit almost as much
for code. Hence maybe better --implicit-addends=data, with an
option for architectures to also permit --implicit-addends=text.

Jan

>   or together with
> CREL to achieve more incredible size reduction, one single byte for
> most .debug_* relocations!
> With CREL, concerns of debug section relocations will become a thing
> of the past.



Re: Patches submission policy change

2024-04-03 Thread Jan Beulich via Gcc
On 03.04.2024 10:22, Christophe Lyon wrote:
> Dear release managers and developers,
> 
> TL;DR: For the sake of improving precommit CI coverage and simplifying
> workflows, I’d like to request a patch submission policy change, so
> that we now include regenerated files. This was discussed during the
> last GNU toolchain office hours meeting [1] (2024-03-28).
> 
> Benefits or this change include:
> - Increased compatibility with precommit CI
> - No need to manually edit patches before submitting, thus the “git
> send-email” workflow is simplified
> - Patch reviewers can be confident that the committed patch will be
> exactly what they approved
> - Precommit CI can test exactly what has been submitted
> 
> Any concerns/objections?

Yes: Patch size. And no, not sending patches inline is bad practice.
Even assuming sending patches bi-modal (inline and as attachment) works
(please indicate whether that's the case), it would mean extra work on
the sending side.

Jan


Re: Patches submission policy change

2024-04-03 Thread Jan Beulich via Gcc
On 03.04.2024 10:45, Jakub Jelinek wrote:
> On Wed, Apr 03, 2024 at 10:22:24AM +0200, Christophe Lyon wrote:
>> Any concerns/objections?
> 
> I'm all for it, in fact I've been sending it like that myself for years
> even when the policy said not to.  In most cases, the diff for the
> regenerated files is very small and it helps even in patch review to
> actually check if the configure.ac/m4 etc. changes result just in the
> expected changes and not some unrelated ones (e.g. caused by user using
> wrong version of autoconf/automake etc.).
> There can be exceptions, e.g. when in GCC we update from a new version
> of Unicode, the regenerated ucnid.h diff can be large and
> uname2c.h can be huge, such that it can trigger the mailing list size
> limits even when the patch is compressed, see e.g.
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636427.html
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636426.html
> But I think most configure or Makefile changes should be pretty small,
> usual changes shouldn't rewrite everything in those files.

Which may then call for a policy saying "include generate script diff-s,
but don't include generated data file ones"? At least on the binutils
side, dealing (for CI) with what e.g. opcodes/*-gen produce ought to be
possible by having something along the lines of "maintainer mode light".

Jan


Re: Patches submission policy change

2024-04-03 Thread Jan Beulich via Gcc
On 03.04.2024 10:57, Richard Biener wrote:
> On Wed, 3 Apr 2024, Jan Beulich wrote:
>> On 03.04.2024 10:45, Jakub Jelinek wrote:
>>> On Wed, Apr 03, 2024 at 10:22:24AM +0200, Christophe Lyon wrote:
 Any concerns/objections?
>>>
>>> I'm all for it, in fact I've been sending it like that myself for years
>>> even when the policy said not to.  In most cases, the diff for the
>>> regenerated files is very small and it helps even in patch review to
>>> actually check if the configure.ac/m4 etc. changes result just in the
>>> expected changes and not some unrelated ones (e.g. caused by user using
>>> wrong version of autoconf/automake etc.).
>>> There can be exceptions, e.g. when in GCC we update from a new version
>>> of Unicode, the regenerated ucnid.h diff can be large and
>>> uname2c.h can be huge, such that it can trigger the mailing list size
>>> limits even when the patch is compressed, see e.g.
>>> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636427.html
>>> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636426.html
>>> But I think most configure or Makefile changes should be pretty small,
>>> usual changes shouldn't rewrite everything in those files.
>>
>> Which may then call for a policy saying "include generate script diff-s,
>> but don't include generated data file ones"? At least on the binutils
>> side, dealing (for CI) with what e.g. opcodes/*-gen produce ought to be
>> possible by having something along the lines of "maintainer mode light".
> 
> I'd say we should send generated files when it fits the mailing list
> limits (and possibly simply lift those limits?).

Well, that would allow patches making it through, but it would still
severely increase overall size. I'm afraid more people than not also
fail to cut down reply context, so we'd further see (needlessly) huge
replies to patches as well.

Additionally - how does one up front determine "fits the mailing list
limits"? My mail UI (Thunderbird) doesn't show me the size of a message
until I've actually sent it.

>  As a last resort
> do a series splitting the re-generation out (but I guess this would
> confuse the CI as well and of course for the push you want to squash
> again).

Yeah, unless the CI would only ever test full series, this wouldn't help.
It's also imo even more cumbersome than simply stripping the generated
file parts from emails.

Jan


Re: Patches submission policy change

2024-04-04 Thread Jan Beulich via Gcc
On 03.04.2024 15:11, Christophe Lyon wrote:
> On Wed, 3 Apr 2024 at 10:30, Jan Beulich  wrote:
>>
>> On 03.04.2024 10:22, Christophe Lyon wrote:
>>> Dear release managers and developers,
>>>
>>> TL;DR: For the sake of improving precommit CI coverage and simplifying
>>> workflows, I’d like to request a patch submission policy change, so
>>> that we now include regenerated files. This was discussed during the
>>> last GNU toolchain office hours meeting [1] (2024-03-28).
>>>
>>> Benefits or this change include:
>>> - Increased compatibility with precommit CI
>>> - No need to manually edit patches before submitting, thus the “git
>>> send-email” workflow is simplified
>>> - Patch reviewers can be confident that the committed patch will be
>>> exactly what they approved
>>> - Precommit CI can test exactly what has been submitted
>>>
>>> Any concerns/objections?
>>
>> Yes: Patch size. And no, not sending patches inline is bad practice.
> Not sure what you mean? Do you mean sending patches as attachments is
> bad practice?

Yes. It makes it difficult to reply to them (with proper reply context).

>> Even assuming sending patches bi-modal (inline and as attachment) works
>> (please indicate whether that's the case), it would mean extra work on
>> the sending side.
>>
> For the CI perspective, we use what patchwork is able to detect as patches.
> Looking at recent patches submissions, it seems patchwork is able to
> cope with the output of git format-patch/git send-email, as well as
> attachments.
> There are cases where patchwork is not able to detect the patch, but I
> don't know patchwork's exact specifications.

Question was though: If a patch was sent inline plus attachment, what
would CI use as the patch to apply? IOW would it be an option to
attach the un-stripped patch, while inlining the stripped one?

Jan