[Bug target/53975] New: [ia64] Target register of a speculative load moved to a branch register prior to the chk.s instruction

2012-07-15 Thread jakub at jermar dot eu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53975

 Bug #: 53975
   Summary: [ia64] Target register of a speculative load moved to
a branch register prior to the chk.s instruction
Classification: Unclassified
   Product: gcc
   Version: 4.7.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: ja...@jermar.eu


When building HelenOS/ia64 with gcc 4.7.1, we hit a problem in this code
sequence:

 4404570:   09 70 00 42 38 10   [MMI]   ld8.s r14=[r33]
 4404576:   a0 04 90 70 20 20   ld8.s r74=[r36]
 440457c:   19 00 00 90 mov r73=1;;
 4404580:   11 38 00 10 06 39   [MIB]   cmp.eq p7,p6=0,r8
 4404586:   60 70 04 80 83 03   mov b6=r14
 440458c:   a0 fd ff 4b   (p07) br.cond.dpnt.few 4404320
;;
 4404590:   c2 40 8a 19 00 21   [MII] (p06) adds r72=98,r12
 4404596:   00 00 00 02 00 40   nop.i 0x0;;
 440459c:   a3 04 00 01 chk.s.i r74,4404730

 44045a0:   08 b8 38 00 40 04   [MMI]   chk.s.m r14,4404710

 44045a6:   00 00 00 02 00 00   nop.m 0x0
 44045ac:   00 00 04 00 nop.i 0x0
 44045b0:   11 00 00 00 01 00   [MIB]   nop.m 0x0
 44045b6:   00 00 00 02 00 00   nop.i 0x0
 44045bc:   68 00 80 10 br.call.sptk.many b0=b6;;

Note that r14 is loaded speculatively and thus can have its NaT bit set as a
result of a deferred exception. This is checked and, if necessary, recovered by
the chk.s instruction at 0x44045a0. The problem seems to be that r14 is moved
to b6 at 0x4404586, prior to the chk.s instruction. If r14 indeed has the NaT
bit set, the mov instruction will raise the NaT Consumption vector exception.
It seems to me that the mov b6=r14 instruction should have been scheduled only
after the chk.s instruction, when we are sure r14's NaT bit is cleared.


[Bug target/53975] [ia64] Target register of a speculative load moved to a branch register prior to the chk.s instruction

2012-07-15 Thread jakub at jermar dot eu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53975

--- Comment #5 from Jakub Jermar  2012-07-16 06:18:55 
UTC ---
(In reply to comment #1)
> Without a test case on a platform that is supported in GCC, there isn't much
> anyone can do to help. Can you reproduce this on linux or hpux?
> 
> BTW, are there plans to contribute HelenOS support to FSF GCC?

Well, we are squatting on ia64-pc-linux-gnu-gcc to build HelenOS, so this is
reproducible on Linux (no HelenOS-specific support exists). The testcase is our
'image.boot' loader program, but it's kind of huge. Even the function which
contains this issue, i.e. printf_core(), is pretty huge and inlines lots of
other functions.


[Bug target/53975] [ia64] Target register of a speculative load moved to a branch register prior to the chk.s instruction

2012-07-15 Thread jakub at jermar dot eu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53975

--- Comment #6 from Jakub Jermar  2012-07-16 06:28:29 
UTC ---
(In reply to comment #4)
> Ah, of course the "move branch register" instruction faults if the NaT bit of
> the source register is set. So the recovery code is irrelevant, and this could
> be a GCC bug. Need a test case to investigate, though...

Exactly. The problem is that the NaT bit cannot propagate any further when the
new destination is a branch register and so the exception can no longer remain
deferred.

As for the test case, once you have the toolchain in place, the easiest way to
reproduce this is simply to build HelenOS and disassemble the image.boot binary
around the addresses above. I'd be more than happy to provide assistance with
this. If tinkering with the entire HelenOS is not plausible, I can try to
separate at least the printf_core() into a separately buildable testcase.


[Bug target/53975] [ia64] Target register of a speculative load moved to a branch register prior to the chk.s instruction

2012-07-16 Thread jakub at jermar dot eu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53975

--- Comment #7 from Jakub Jermar  2012-07-16 21:25:47 
UTC ---
Created attachment 27805
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27805
Reproducible testcase

Here is a trimmed down reproducible testcase which exhibits the problem. The
tarball contains a README file with instructions.


[Bug target/53975] [ia64] Target register of a speculative load moved to a branch register prior to the chk.s instruction

2012-07-19 Thread jakub at jermar dot eu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53975

--- Comment #11 from Jakub Jermar  2012-07-19 18:06:16 
UTC ---
For HelenOS, we started to hit this only after we switched from gcc 4.6.3 to
4.7.1.


[Bug target/66660] [ia64] Speculative load not checked before use, leading to a NaT Consumption Vector interruption

2016-02-09 Thread jakub at jermar dot eu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66660

--- Comment #7 from Jakub Jermar  ---
Thank you, I will get back to you once I am done with the test.

[Bug target/66660] [ia64] Speculative load not checked before use, leading to a NaT Consumption Vector interruption

2016-02-09 Thread jakub at jermar dot eu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66660

--- Comment #8 from Jakub Jermar  ---
Yep, verified the patch fixes the problem (and reverting the patch reintroduces
it) with gcc 5.3.0 and mainline HelenOS. Thanks again!

[Bug target/66660] [ia64] Speculative load not checked before use, leading to a NaT Consumption Vector interruption

2015-10-27 Thread jakub at jermar dot eu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66660

--- Comment #2 from Jakub Jermar  ---
Has there been any progress on this front?


[Bug target/66660] [ia64] Speculative load not checked before use, leading to a NaT Consumption Vector interruption

2015-12-17 Thread jakub at jermar dot eu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66660

--- Comment #4 from Jakub Jermar  ---
Thanks for looking into this. Let me know if you need to verify the fix, when
it's ready.

[Bug c/66660] New: [ia64] Speculative load not checked before use, leading to a NaT Consumption Vector interruption

2015-06-24 Thread jakub at jermar dot eu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66660

Bug ID: 66660
   Summary: [ia64] Speculative load not checked before use,
leading to a NaT Consumption Vector interruption
   Product: gcc
   Version: 5.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jakub at jermar dot eu
  Target Milestone: ---

Created attachment 35848
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35848&action=edit
Reproducible test case.

I hit what appears to be a bug in handling speculative loads on Itanium in GCC
5.1.0. I am going to attach a simplified testcase derived from the latest
HelenOS sources, which reproduces the issue.

In the Ski IA-64 simulator running HelenOS, I can see the following sequence of
events wrt. r23:

 4407ff0:   09 b0 00 00 00 21   [MMI]   mov r22=r0
 4407ff6:   70 01 88 70 20 00   ld8.s r23=[r34] <= speculative load, sets the NaT bit for r23
 4407ffc:   e0 08 2a 00 mov.i ar.lc=14;;
 4408000:   02 00 00 00 01 00   [MII]   nop.m 0x0
 4408006:   60 01 58 22 00 00   zxt2 r22=r22;;
 440800c:   82 78 20 79 shl r16=r8,r15
 4408010:   03 00 00 00 01 00   [MII]   nop.m 0x0
 4408016:   f0 78 38 80 3c e0   shr.u r15=r14,r15;;
 440801c:   01 78 44 00 zxt2 r15=r15;;
 4408020:   18 70 fc 21 3f 23   [MMB]   adds r14=-1,r16
 4408026:   00 78 74 10 23 00   st2 [r29]=r15
 440802c:   00 00 00 20 nop.b 0x0
 4408030:   01 00 50 34 98 11   [MII]   st8 [r26]=r20
 4408036:   50 11 3c 3c 29 60   extr.u r21=r15,1,31
 440803c:   12 78 b0 80 and r19=1,r15;;
 4408040:   0b c8 64 1c 0c 20   [MMI]   and r25=r25,r14;;
 4408046:   00 00 00 02 00 c0   nop.m 0x0
 440804c:   01 c8 44 00 zxt2 r14=r25;;
 4408050:   08 00 00 00 01 00   [MMI]   nop.m 0x0
 4408056:   60 72 98 00 40 00   add r38=r14,r38
 440805c:   00 00 04 00 nop.i 0x0
 4408060:   03 70 00 22 38 10   [MII]   ld8.s r14=[r17]
 4408066:   00 00 00 02 00 20   nop.i 0x0;;
 440806c:   05 30 59 00 sxt4 r41=r38;;
 4408070:   08 00 00 00 01 00   [MMI]   nop.m 0x0
 4408076:   00 00 00 02 00 00   nop.m 0x0
 440807c:   00 00 04 00 nop.i 0x0
 4408080:   09 80 00 24 38 10   [MMI]   ld8.s r16=[r18]
 4408086:   00 00 00 02 00 c0   nop.m 0x0
 440808c:   00 a0 1c e4 cmp.eq p6,p7=0,r20;;
 4408090:   f1 a0 fc 29 3f 23   [MIB] (p07) adds r20=-1,r20
 4408096:   00 00 00 02 80 03   nop.i 0x0
 440809c:   a0 00 00 43   (p07) br.cond.dpnt.few 4408130
;;
...
 4408130:   01 00 00 00 01 00   [MII]   nop.m 0x0
 4408136:   f0 00 54 22 00 00   zxt2 r15=r21
 440813c:   32 b1 38 80 or r16=r19,r22;;
 4408140:   00 00 3c 3a 88 11   [MII]   st2 [r29]=r15
 4408146:   60 81 f8 9c 29 00   dep.z r22=r16,1,15
 440814c:   02 80 44 00 zxt2 r16=r16
 4408150:   18 00 50 34 98 11   [MMB]   st8 [r26]=r20
 4408156:   00 00 00 02 00 00   nop.m 0x0
 440815c:   00 00 00 20 nop.b 0x0
 4408160:   11 00 00 00 01 00   [MIB]   nop.m 0x0
 4408166:   70 b9 60 00 40 00   add r23=r23,r24 <= NaT bit not yet consumed
 440816c:   30 00 00 40 br.few 4408190
;;
...
 4408190:   10 00 00 00 01 00   [MIB]   nop.m 0x0
 4408196:   60 01 58 22 00 00   zxt2 r22=r22
 440819c:   00 00 00 20 nop.b 0x0
 44081a0:   09 b8 00 2e 08 10   [MMI]   ld2 r23=[r23]   <= NaT consumption vector

In short, after doing the speculative load to r23, a conditional branch is
taken to a code path which uses the speculatively-loaded register without first
running chk.s on it. So if the register has its NaT bit set (such as after a
deferred exception), there is going to be a NaT consumption vector
interruption at address 44081a0 above.

The issue did not show with GCC 4.8.1, but shows with GCC 5.1.0. Don't know
about the versions in between. The issue is also somewhat similar to an already
fixed Bug ID #53975. The same workaround applies, i.e.
-fno-selective-scheduling -fno-selective-scheduling2 will prevent the bug from
occurring.


[Bug inline-asm/36558] New: "V" inline asm constraint not working on sparc64

2008-06-17 Thread jakub at jermar dot eu
The following program:

---8<---
static volatile long last;

main()
{
long a, b;

a = last;
b = a + 1;
asm volatile ("casx %0, %2, %1\n" : "+V" (last), "+r" (b) :"r" (a));
}
--->8---

will not compile.  The error message printed is:

[EMAIL PROTECTED]:~/x$ /usr/local/sparc64/bin/sparc64-linux-gnu-gcc -c test.c
test.c: In function 'main':
test.c:9: error: inconsistent operand constraints in an 'asm'

Even though the use of the "m" constraint will fix this testcase, "m" cannot be
used in general, because it allows the operand to be offsettable.  The casx
instruction will not tolerate an offset. According to the gcc info page, "V"
should be just like "m", but not offsettable. I wonder why "V" does not work
when "m" does in this case.

Moreover, there are more or less ugly workarounds for this, but this has been
bugging me for some time and I think it should be fixed.

The gcc has been configured using:
configure --target=sparc64-linux-gnu --prefix=/usr/local/sparc64
--program-prefix=sparc64-linux-gnu- --with-gnu-as --with-gnu-ld --disable-nls
--disable-threads --enable-languages=c,objc,c++,obj-c++ --disable-multilib
--disable-libgcj --without-headers --disable-shared

Thanks,
Jakub


-- 
   Summary: "V" inline asm constraint not working on sparc64
   Product: gcc
   Version: 4.3.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: inline-asm
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: jakub at jermar dot eu
 GCC build triplet: 4.3.1
  GCC host triplet: i486-linux-gnu
GCC target triplet: sparc64-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36558



[Bug target/112604] New: [ia64] Output register not preserved after a branch is not taken

2023-11-18 Thread jakub at jermar dot eu via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112604

Bug ID: 112604
   Summary: [ia64] Output register not preserved after a branch is
not taken
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jakub at jermar dot eu
  Target Milestone: ---

After an upgrade to GCC 13.2 the HelenOS VFS server started to crash on IA-64
(see http://www.helenos.org/ticket/864). I looked into the issue and the
problem seems to be that in the following code snippet:

fibril_mutex_lock(&vfs_data->lock);
if (!vfs_data->files) {
    vfs_data->files = malloc(VFS_MAX_OPEN_FILES * sizeof(vfs_file_t *));
    if (!vfs_data->files) {
        fibril_mutex_unlock(&vfs_data->lock);
        return false;
    }
    memset(vfs_data->files, 0, VFS_MAX_OPEN_FILES * sizeof(vfs_file_t *));
}
fibril_mutex_unlock(&vfs_data->lock);

The output argument prepared for the possible call to malloc (1024 in this
case) destroys the argument for fibril_mutex_unlock() if the branch is not
taken.

In assembly it looks like this:

40001a00 <_vfs_fd_alloc>:
40001a00:   08 48 39 18 80 05   [MMI]   alloc
r41=ar.pfs,14,12,0
40001a06:   c0 02 80 00 42 00   mov r44=r32
40001a0c:   05 00 c4 00 mov r40=b0
40001a10:   09 38 01 41 00 21   [MMI]   adds r39=64,r32
40001a16:   a0 02 04 00 42 40   mov r42=r1
40001a1c:   04 10 41 00 zxt1 r34=r34;;
40001a20:   11 28 fd 01 00 24   [MIB]   mov r37=127
40001a26:   b0 02 04 65 00 00   mov.i r43=ar.lc
40001a2c:   e8 b4 01 50 br.call.sptk.many
b0=4001cf00 ;;
40001a30:   08 60 01 40 00 21   [MMI]   mov r44=r32
40001a36:   e0 00 9c 30 20 20   ld8 r14=[r39]
40001a3c:   00 50 01 84 mov r1=r42
40001a40:   0a 68 05 00 00 24   [MMI]   mov r45=1;;
40001a46:   c0 02 00 10 48 e0   mov r44=1024
40001a4c:   00 70 18 e4 cmp.eq p7,p6=0,r14
40001a50:   16 00 00 00 00 c8   [BBB]   nop.b 0x0
40001a56:   01 f0 01 80 21 00 (p07) br.cond.dpnt.few
40001e30 <_vfs_fd_alloc+0x430>
40001a5c:   10 00 00 40 br.few
40001a60 <_vfs_fd_alloc+0x60>
40001a60:   11 00 00 00 01 00   [MIB]   nop.m 0x0
40001a66:   00 00 00 02 00 00   nop.i 0x0
40001a6c:   28 ba 01 50 br.call.sptk.many
b0=4001d480 ;;
40001a70:   08 60 01 40 00 21   [MMI]   mov r44=r32

The out0 is r44 in this context. Note how it is first correctly restored to the
mutex address at address 1a30 after the fibril_mutex_lock call. But then this
value is not used and gets rewritten to 1024 at address 1a46 in preparation for
a possible branch and a consequent call to malloc. If the branch is taken, the
register is restored properly (not shown here), but if the branch is not taken
at address 1a56, the call to fibril_mutex_unlock at address 1a6c is made with a
wrong value of r44.

We used the following command line to compile the above snippet:
/usr/local/cross/bin/ia64-helenos-gcc -Iuspace/srv_vfs.p -Iuspace
-I../../../uspace -fdiagnostics-color=always -D_FILE_OFFSET_BITS=64 -Wall
-Winvalid-pch -Wextra -std=gnu11 -imacros
/home/jermar/software/HelenOS/helenos/build_all/ia64/ski/config.h -O3
-fexec-charset=UTF-8 -finput-charset=UTF-8 -D_HELENOS_SOURCE
-Wa,--fatal-warnings -Wall -Wextra -Wwrite-strings -Wunknown-pragmas
-Wno-unused-parameter -pipe -ffunction-sections -fdata-sections -fno-common
-fdebug-prefix-map=/home/jermar/software/HelenOS/helenos/=
-fdebug-prefix-map=../../= -Wsystem-headers -Werror -Wmissing-prototypes
-Werror-implicit-function-declaration -Wno-missing-braces
-Wno-missing-field-initializers -Wno-unused-parameter -Wno-clobbered
-Wno-nonnull-compare -fno-builtin-strftime -isystem../../../common/include
-isystem../../../abi/include -isystem../../../abi/arch/ia64/include
-isystem../../../uspace/lib/c/arch/ia64/include
-isystem../../../uspace/lib/c/include -D__LE__ -fno-unwind-tables -MD -MQ
uspace/srv_vfs.p/srv_vfs_vfs_file.c.o -MF
uspace/srv_vfs.p/srv_vfs_vfs_file.c.o.d -o
uspace/srv_vfs.p/srv_vfs_vfs_file.c.o -c ../../../uspace/srv/vfs/vfs_file.c

$ /usr/local/cross/bin/ia64-helenos-gcc -v
Using built-in specs

[Bug target/112604] [ia64] Output register not preserved after a branch is not taken

2023-11-18 Thread jakub at jermar dot eu via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112604

--- Comment #2 from Jakub Jermar  ---
Created attachment 56632
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56632&action=edit
Requested object and preprocessed source files

Here are the bad and good object and preprocessed source files for vfs_file.c.
Note that the difference between the good and bad versions is -O3 (bad) vs. -O2
(good), but it was the same compiler. I am not sure this is going to be very
helpful, as with -O2 the code is quite different and the offending function is
not inlined. If desired, I may try to go back to GCC 8.2 (the last good version
known to me) and try to provide a good file generated with the same compiler
flags. Let me know if this would be more useful.

I also attached the entire binary of the VFS server, both good and bad
versions. Note these are HelenOS binaries and need to be run in the environment
of the HelenOS operating system, which might not be practical for you.

[Bug target/112604] [ia64] Output register not preserved after a branch is not taken

2023-11-18 Thread jakub at jermar dot eu via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112604

--- Comment #3 from Jakub Jermar  ---
Created attachment 56633
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56633&action=edit
Updated requested good object file for vfs_file.c

Seems like -O3 -fno-unswitch-loops makes the bug go away. See the new
attachment for a better example of the good binary and the object file. With
this optimization disabled, the argument for malloc/calloc is set only after
the branch is taken, so it does not destroy the argument for
fibril_mutex_unlock():

40001a00 <_vfs_fd_alloc>:
40001a00:   08 48 3d 1a 80 05   [MMI]   alloc
r41=ar.pfs,15,13,0
40001a06:   d0 02 80 00 42 60   mov r45=r32
40001a0c:   05 00 cc 00 mov r43=pr
40001a10:   09 38 01 41 00 21   [MMI]   adds r39=64,r32
40001a16:   a0 02 04 00 42 40   mov r42=r1
40001a1c:   04 10 41 00 zxt1 r34=r34;;
40001a20:   11 80 00 44 91 39   [MIB]   cmp4.eq
p16,p17=0,r34
40001a26:   80 02 00 62 00 00   mov r40=b0
40001a2c:   68 b2 01 50 br.call.sptk.many
b0=4001cc80 ;;
40001a30:   08 68 01 40 00 21   [MMI]   mov r45=r32
40001a36:   44 02 00 00 42 20 (p16) mov r36=r0
40001a3c:   00 50 01 84 mov r1=r42
40001a40:   09 70 00 4e 18 10   [MMI]   ld8 r14=[r39]
40001a46:   54 02 00 00 42 c0 (p16) mov r37=r0
40001a4c:   15 00 00 90 mov r46=1;;
40001a50:   38 22 e1 01 07 64   [MMB] (p17) mov r36=1016
40001a56:   54 fa 03 00 48 00 (p17) mov r37=127
40001a5c:   00 00 00 20 nop.b 0x0
40001a60:   11 38 00 1c 06 39   [MIB]   cmp.eq p7,p6=0,r14
40001a66:   c0 02 04 65 80 03   mov.i r44=ar.lc
40001a6c:   f0 03 00 43   (p07) br.cond.dpnt.few
40001e50 <_vfs_fd_alloc+0x450>;;
40001a70:   10 00 00 00 01 00   [MIB]   nop.m 0x0
40001a76:   00 00 00 02 00 00   nop.i 0x0
40001a7c:   10 00 00 40 br.few
40001a80 <_vfs_fd_alloc+0x80>
40001a80:   11 00 00 00 01 00   [MIB]   nop.m 0x0
40001a86:   00 00 00 02 00 00   nop.i 0x0
40001a8c:   88 b7 01 50 br.call.sptk.many
b0=4001d200 ;;
40001a90:   11 68 01 40 00 21   [MIB]   mov r45=r32

40001e50:   11 68 01 00 08 24   [MIB]   mov r45=1024
40001e56:   00 00 00 02 00 00   nop.i 0x0
40001e5c:   f8 bd 00 50 br.call.sptk.many
b0=4000dc40 ;;