[Bug go/95876] New: Error in compiling gcc-11-20200621 with gcc-10 without -g

2020-06-24 Thread 570070308 at qq dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95876

Bug ID: 95876
   Summary: Error in compiling gcc-11-20200621 with gcc-10 without
-g
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: go
  Assignee: ian at airs dot com
  Reporter: 570070308 at qq dot com
CC: cmang at google dot com
  Target Milestone: ---

Created attachment 48781
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48781&action=edit
fullscreenlog and a sh file

This is my host gcc version:
ig@ig-vmware71:~$ gcc-10 -v
Using built-in specs.
COLLECT_GCC=gcc-10
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/10/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa:hsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 10.1.0-3ubuntu1'
--with-bugurl=file:///usr/share/doc/gcc-10/README.Bugs
--enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-10
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new
--enable-gnu-unique-object --disable-vtable-verify --enable-plugin
--enable-default-pie --with-system-zlib --enable-libphobos-checking=release
--with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch
--disable-werror --with-arch-32=i686 --with-abi=m64
--with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic
--enable-offload-targets=nvptx-none=/build/gcc-10-JWjDxk/gcc-10-10.1.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JWjDxk/gcc-10-10.1.0/debian/tmp-gcn/usr,hsa
--without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu
--host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 10.1.0 (Ubuntu 10.1.0-3ubuntu1)

The status:
Successfully building gcc-11-20200614 with gcc-10 by default.
Successfully building gcc-11-20200614 with gcc-10 without -g.
Successfully building gcc-11-20200621 with gcc-10 by default.
Failed building gcc-11-20200621 with gcc-10 without -g.


Part of the error log:
mv -f .deps/matmul_c8.Tpo .deps/matmul_c8.Plo
f="context.o"; if test ! -f $f; then f="./.libs/context.o"; fi;
x86_64-linux-gnu-objcopy -j .go_export $f context.s-gox.tmp; /bin/bash
../../../../gcc-11-20200621/libgo/mvifdiff.sh context.s-gox.tmp `echo
context.s-gox | sed -e 's/s-gox/gox/'`
echo timestamp > context.s-gox
f="crypto/cipher.o"; if test ! -f $f; then f="crypto/.libs/cipher.o"; fi;
x86_64-linux-gnu-objcopy -j .go_export $f crypto/cipher.s-gox.tmp; /bin/bash
../../../../gcc-11-20200621/libgo/mvifdiff.sh crypto/cipher.s-gox.tmp `echo
crypto/cipher.s-gox | sed -e 's/s-gox/gox/'`
echo timestamp > crypto/cipher.s-gox
f="crypto/sha512.o"; if test ! -f $f; then f="crypto/.libs/sha512.o"; fi;
x86_64-linux-gnu-objcopy -j .go_export $f crypto/sha512.s-gox.tmp; /bin/bash
../../../../gcc-11-20200621/libgo/mvifdiff.sh crypto/sha512.s-gox.tmp `echo
crypto/sha512.s-gox | sed -e 's/s-gox/gox/'`
during GIMPLE pass: vrp
../../../gcc-11-20200621/libgo/go/golang.org/x/crypto/ed25519/internal/edwards25519/edwards25519.go:
In function ‘edwards25519.GeDoubleScalarMultVartime’:
../../../gcc-11-20200621/libgo/go/golang.org/x/crypto/ed25519/internal/edwards25519/edwards25519.go:879:1:
internal compiler error: Segmentation fault
  879 | func GeDoubleScalarMultVartime(r *ProjectiveGroupElement, a *[32]byte,
A *ExtendedGroupElement, b *[32]byte) {
  | ^
echo timestamp > crypto/sha512.s-gox
f="crypto/ed25519/internal/edwards25519.o"; if test ! -f $f; then
f="crypto/ed25519/internal/.libs/edwards25519.o"; fi; x86_64-linux-gnu-objcopy
-j .go_export $f crypto/ed25519/internal/edwards25519.s-gox.tmp; /bin/bash
../../../../gcc-11-20200621/libgo/mvifdiff.sh
crypto/ed25519/internal/edwards25519.s-gox.tmp `echo
crypto/ed25519/internal/edwards25519.s-gox | sed -e 's/s-gox/gox/'`
Please submit a full bug report,
with preprocessed source if appropriate.
See <https://gcc.gnu.org/bugs/> for instructions.
make[4]: *** [Makefile:2870:
golang.org/x/crypto/ed25519/internal/edwards25519.lo] Error 1
make[4]: *** Waiting for unfinished jobs

The log was generated by `script ../screen.log`. It seems that the colored
diagnostics generated by the compiler turned into garbage characters in it. The
log is too long to copy from the shell, so I will upload the full log file as
an attached archive (rar) becau

[Bug c++/95911] New: [8/9/10/11] returning && makes an error without any warning

2020-06-26 Thread 570070308 at qq dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95911

Bug ID: 95911
   Summary: [8/9/10/11] returning && makes an error without any
warning
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 570070308 at qq dot com
  Target Milestone: ---

Created attachment 48789
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48789&action=edit
the .ii file

#include <cstdio>
#include <utility>

class A
{
public:
    A()
    {
        printf("A create.\n");
    }
    A(const A&a)
    {
        printf("A copy create.\n");
    }
    A(A&&a)
    {
        printf("A move create.\n");
    }
    ~A()
    {
        printf("A delete.\n");
    }
};
A newA()
{
    A a;
    return a;
}
A&& bug(A&& x)
{
    printf("bug\n");
    return std::move(x);
}
int main()
{
    A &&a=newA();  //ok
    printf("--\n");
    A &&b=bug(newA()); //error
    printf("--\n");
    A c=bug(newA());   //ok
    printf("--\n");
    return 0;
}


Running the code, the result is:
A create.
--
A create.
bug
A delete.
--
A create.
bug
A move create.
A delete.
--
A delete.
A delete.


The main function creates three objects of class A, but only two of them are
destroyed at the end of main. The compiler gives no warning: I tried -Wall
-Wextra and still got none. I also tried -fno-elide-constructors, -O3 and -O0,
and the result is the same. In my understanding of C++, the second A object
should not be destroyed that early. When the function bug is applied twice (see
the .ii file), you can see that the object is not destroyed so early. If it
really should be destroyed there, the code is dangerous and the compiler should
give a warning. There is no warning from g++ 8/9/10/11, and none from clang++
either.
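
The output above is consistent with the C++ lifetime rules: the lifetime of a temporary is extended only when a reference binds to it directly, not when the reference comes back from a function call. The following is a sketch for illustration, not code from the report; `Counted`, `make` and `pass_through` are hypothetical names, and `pass_through` has the same shape as `bug`.

```cpp
#include <utility>

// Hypothetical helper type: tracks how many instances are currently alive.
struct Counted {
    static int alive;
    Counted() { ++alive; }
    Counted(const Counted&) { ++alive; }
    Counted(Counted&&) noexcept { ++alive; }
    ~Counted() { --alive; }
};
int Counted::alive = 0;

Counted make() { return Counted{}; }

// Same shape as bug() in the report: hands back a reference to its argument.
Counted&& pass_through(Counted&& x) { return std::move(x); }

// Direct binding: the temporary lives as long as the reference r.
int alive_after_direct_binding() {
    Counted&& r = make();
    return Counted::alive;
}

// Binding through a call: the temporary dies at the end of the declaration
// statement, so r dangles and must not be used.
int alive_after_binding_through_call() {
    Counted&& r = pass_through(make());
    return Counted::alive;
}

// Initializing an object: c is move-constructed before the temporary dies.
int alive_after_moving_into_object() {
    Counted c = pass_through(make());
    return Counted::alive;
}
```

Here `alive_after_binding_through_call()` returns 0, which corresponds to the early "A delete." line in the output above, while the other two functions return 1.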

[Bug libstdc++/96240] New: Error in building gcc-11 with --disable-shared

2020-07-19 Thread 570070308 at qq dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96240

Bug ID: 96240
   Summary: Error in building gcc-11 with --disable-shared
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 570070308 at qq dot com
  Target Milestone: ---

Created attachment 48892
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48892&action=edit
The full log file

Building gcc-11-20200705 and gcc-11-20200712 succeeds without --disable-shared
or with --enabled-shared.
Building gcc-11-20200705 and gcc-11-20200712 fails with --disable-shared.

My system is Ubuntu 20.10. The host gcc is gcc 9.3.

The configure command is:
../gcc-11-20200712/configure -v
--enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,lto
--prefix=/home/ig/temp2 --with-gcc-major-version-only --program-suffix=-11
--program-prefix=x86_64-linux-gnu- (--disable-shared or --enabled-shared)
--enable-linker-build-id --without-included-gettext --enable-threads=posix
--enable-nls --enable-clocale=gnu --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object
--disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib
--enable-libphobos-checking=release --with-target-system-zlib=auto
--enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686
--with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib
--with-tune=generic --enable-offload-targets=nvptx-none,amdgcn-amdhsa,hsa
--without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu
--host=x86_64-linux-gnu --target=x86_64-linux-gnu

Part of the error log:
/usr/bin/x86_64-linux-gnu-ld:
/home/ig/temp/x86_64-linux-gnu/libstdc++-v3/src/.libs/libstdc++.a(eh_aux_runtime.o):
relocation R_X86_64_PC32 against symbol `_ZTISt8bad_cast' can not be used when
making a shared object; recompile with -fPIC
/usr/bin/x86_64-linux-gnu-ld: final link failed: bad value
make[9]: Nothing to be done for 'all'.
make[9]: Leaving directory
'/home/ig/temp/x86_64-linux-gnu/32/libstdc++-v3/src/c++11'
Making all in c++17
collect2: error: ld returned 1 exit status
make[3]: *** [Makefile:561: libcc1.la] Error 1

For the full error log, see the attachment I uploaded.

[Bug c/96317] New: [8/9/10/11] Int compare optimizations make some errors

2020-07-25 Thread 570070308 at qq dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96317

Bug ID: 96317
   Summary: [8/9/10/11] Int compare optimizations make some errors
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 570070308 at qq dot com
  Target Milestone: ---

For the code:

#include <stdio.h>

int main(void)
{
    signed int a=2147483647;
    if( (signed int)( (signed int)a + (signed int)1 ) < (signed int)2147483647 )
    {
        printf("111\n");
    }
    if( (signed int)( (signed int)a + (signed int)1 ) < (signed int)2147483646 )
    {
        printf("222\n");
    }
    signed int b=2147483646;
    if( (signed int)( (signed int)a + (signed int)1 ) < (signed int)b )
    {
        printf("333\n");
    }
    return 0;
}

The result is:
111
333

I have checked the assembly files. It seems that the compiler optimizes
a+1<2147483646 to a<=2147483644. There are similar situations for comparisons
with <=, > and >=. I think it would be better to change this behavior or to emit
a warning.
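
Signed integer overflow is undefined behavior in C and C++, so the compiler is allowed to assume that `a + 1` never wraps and to fold the comparison accordingly. Below is a minimal sketch of two ways to write the check without overflowing. The helper names `sum_less_than` and `add_would_overflow` are hypothetical, and the first helper assumes that `long long` is wider than `int`, as on x86_64 Linux.

```cpp
#include <climits>

// Compare a + b against limit in a wider type, so the addition cannot overflow.
// Assumption: long long is wider than int on the target.
bool sum_less_than(int a, int b, int limit) {
    return static_cast<long long>(a) + b < limit;
}

// Report whether a + b would leave the range of int, using only
// subtractions that cannot overflow themselves.
bool add_would_overflow(int a, int b) {
    if (b > 0) return a > INT_MAX - b;
    if (b < 0) return a < INT_MIN - b;
    return false;
}
```

With `a` equal to 2147483647, `sum_less_than(a, 1, limit)` is false for all three limits used in the report, because the mathematical sum 2147483648 is larger than any `int`.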

[Bug libstdc++/98233] New: A small bug in stl

2020-12-10 Thread 570070308 at qq dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98233

Bug ID: 98233
   Summary: A small bug in stl
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 570070308 at qq dot com
  Target Milestone: ---

In the code
```
#include <cstdio>
#include <vector>
struct A
{
public:
    int a;
    std::vector<A> m;
};
int main()
{
    A x;
    x.m.emplace_back();
    x.a=13;
    x.m[0].a=9;
    printf("%d\n",x.m[0].a);
    x.m[0]=x;
    printf("%d\n",x.m[0].m[0].a);
    return 0;
}
```
I expect the output to be 9 and 9, but the result is 9 and 13.
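
One way to get 9 and 9 is to copy `x` before assigning it to its own element. The sketch below assumes (the report does not state this) that the 13 comes from the implicit copy assignment copying `a` before `m`, so that `x.m[0].a` is already overwritten by the time `x.m` is read. `Node` and `assign_snapshot_to_first_child` are hypothetical names.

```cpp
#include <utility>
#include <vector>

struct Node {
    int a = 0;
    std::vector<Node> m;   // vector of an incomplete type: guaranteed since C++17
};

// Assign a snapshot of x to x.m[0], so the source of the assignment is not
// modified while it is being copied.
int assign_snapshot_to_first_child() {
    Node x;
    x.m.emplace_back();
    x.a = 13;
    x.m[0].a = 9;
    Node snapshot = x;              // deep copy taken before anything changes
    x.m[0] = std::move(snapshot);   // source and target no longer alias
    return x.m[0].m[0].a;
}
```

The function returns 9, because the nested element is taken from the snapshot, which still holds the original value.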

[Bug c++/98261] New: Wrong optimize for virtual function

2020-12-13 Thread 570070308 at qq dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98261

Bug ID: 98261
   Summary: Wrong optimize for virtual function
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 570070308 at qq dot com
  Target Milestone: ---

In
```
#include <cstdio>
class A
{
public:
    virtual void print()
    {
        printf("A\n");
    }
};
class B
{
public:
    virtual void print()
    {
        printf("B\n");
    }
};
int main()
{
    A a;
    B b;
    A *pa=&a;
    pa->print();
    pa=(A *)&b;
    pa->print();
    return 0;
}
```
Compilation succeeds.
The program runs normally at -O0, but at -O3 it prints a lot of A and then
breaks down.
[Bug c++/98261] Wrong optimize for virtual function

2020-12-13 Thread 570070308 at qq dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98261

--- Comment #3 from 。 <570070308 at qq dot com> ---
(In reply to Jonathan Wakely from comment #2)
> The bug reporting guidelines tell you to try -fsanitize=undefined before
> reporting a bug. That would have told you your code is wrong:
> 
> A
> d.C:26:14: runtime error: member call on address 0x7ffe71274e18 which does
> not point to an object of type 'A'
> 0x7ffe71274e18: note: object is of type 'B'
>  00 00 00 00  28 20 40 00 00 00 00 00  40 20 40 00 00 00 00 00  18 4e 27 71
> fe 7f 00 00  00 00 00 00
>   ^~~
>   vptr for 'B'
> B

Thanks, I will pay attention to that next time.

[Bug middle-end/108441] New: [12.2] Maybe missed optimization: loading an 16-bit integer value from .rodata instead of an immediate store

2023-01-18 Thread 570070308 at qq dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108441

Bug ID: 108441
   Summary: [12.2] Maybe missed optimization: loading an 16-bit
integer value from .rodata instead of an immediate
store
   Product: gcc
   Version: 12.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 570070308 at qq dot com
  Target Milestone: ---

For the C code:
```c
#include <stdint.h>

extern struct __attribute__((packed))
{
    uint8_t size;
    uint8_t pad;
    uint16_t sec_num;
    uint16_t offset;
    uint16_t segment;
    uint64_t sec_id;
} ldap;

//uint16_t x __attribute__((aligned(4096)));

void kkk()
{
    ldap.size = 16;
    ldap.pad = 0;
    //x = 16;
}
```

gcc-12.2 compiles it at -O2, -O3 or -Ofast to:

```
        .globl  kkk
        .type   kkk, @function
kkk:
        movzwl  .LC0(%rip), %eax
        movw    %ax, ldap(%rip)
        ret
        .size   kkk, .-kkk
        .section        .rodata.cst2,"aM",@progbits,2
        .align 2
.LC0:
        .byte   16
        .byte   0

It seems strange to load the value 16 into %eax from the .rodata section. There
is no such problem with gcc-11.
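
On a little-endian target such as x86_64, the two adjacent byte stores are equivalent to a single 16-bit store, which is why an immediate store would be enough. A small sketch of that arithmetic follows; `merged_store_value` is a hypothetical helper name.

```cpp
#include <cstdint>

// Value that a little-endian 16-bit store must write so that the low byte
// lands in `size` (offset 0) and the high byte lands in `pad` (offset 1).
constexpr std::uint16_t merged_store_value(std::uint8_t size, std::uint8_t pad) {
    return static_cast<std::uint16_t>(size | (pad << 8));
}

// size = 16 and pad = 0 merge into the 16-bit immediate 16.
static_assert(merged_store_value(16, 0) == 16, "merged value should be 16");
```

This matches the two `.byte` entries at `.LC0` in the listing above: 16 followed by 0.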

gcc version:
```
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/12/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 12.2.0-3ubuntu1'
--with-bugurl=file:///usr/share/doc/gcc-12/README.Bugs
--enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-12
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new
--enable-gnu-unique-object --disable-vtable-verify --enable-plugin
--enable-default-pie --with-system-zlib --enable-libphobos-checking=release
--with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch
--disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64
--with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic
--enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr
--enable-offload-defaulted --without-cuda-driver --enable-checking=release
--build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.2.0 (Ubuntu 12.2.0-3ubuntu1)
```

The gcc is installed by apt on Ubuntu 22.10.

See more details at:
https://stackoverflow.com/questions/75154687/is-this-a-missed-optimization-in-gcc-loading-an-16-bit-integer-value-from-roda

[Bug middle-end/108441] [12.2] Maybe missed optimization: loading an 16-bit integer value from .rodata instead of an immediate store

2023-01-18 Thread 570070308 at qq dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108441

--- Comment #1 from 。 <570070308 at qq dot com> ---
When compiling with `-fno-tree-slp-vectorize`, it seems to be better:
```
kkk:
        movl    $16, %eax
        movw    %ax, ldap(%rip)
        ret
```

[Bug c/104763] New: [12.0] Generate wrong assembly code

2022-03-02 Thread 570070308 at qq dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104763

Bug ID: 104763
   Summary: [12.0] Generate wrong assembly code
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 570070308 at qq dot com
  Target Milestone: ---

file test.c:
```
#include <stddef.h>

void move_up()
{
    for ( size_t* i=(size_t *)(0xb8000+160*24); ; )
    {
        *i=0x0700070007000700;
        //if ( i == (size_t *)(0xb8000+160*24+7*sizeof(size_t)) )
        if ( i == (volatile size_t *)(0xb8000+160*24)+2 )
        {
            break;
        }
        ++i;
    }
    while (1){}
}
```
Compiling with `gcc -O1 test.c -S` (or with -O2 or -O3) gives this assembly
code:
```
move_up:
.LFB0:
        .cfi_startproc
        endbr64
        movabsq $504410854964332288, %rax
        movq    %rax, 757512
        movq    %rax, 757520
.L2:
        jmp     .L2
        .cfi_endproc
```

But the correct assembly code should be:
```
move_up:
        movabsq $504410854964332288, %rax
        movq    %rax, 757504
        movq    %rax, 757512
        movq    %rax, 757520
.L2:
        jmp     .L2
```

I tried compiling with -fno-strict-aliasing -fwrapv
-fno-aggressive-loop-optimizations -Wall -Wextra and the result is the same.

This is the full gcc compile info:
```
ig@ig-virtual-machine:~/os/myos/temp$ gcc -O1 test.c -S -v
Using built-in specs.
COLLECT_GCC=gcc
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu
12-20220222-1ubuntu1' --with-bugurl=file:///usr/share/doc/gcc-12/README.Bugs
--enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-12
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new
--enable-gnu-unique-object --disable-vtable-verify --enable-plugin
--enable-default-pie --with-system-zlib --enable-libphobos-checking=release
--with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch
--disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64
--with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic
--enable-offload-targets=nvptx-none=/build/gcc-12-TT8eTw/gcc-12-12-20220222/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-TT8eTw/gcc-12-12-20220222/debian/tmp-gcn/usr
--enable-offload-defaulted --without-cuda-driver --enable-checking=release
--build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.0.1 20220222 (experimental) [master r12-7325-g2f59f067610]
(Ubuntu 12-20220222-1ubuntu1) 
COLLECT_GCC_OPTIONS='-O1' '-S' '-v' '-mtune=generic' '-march=x86-64'
 /usr/lib/gcc/x86_64-linux-gnu/12/cc1 -quiet -v -imultiarch x86_64-linux-gnu
test.c -quiet -dumpbase test.c -dumpbase-ext .c -mtune=generic -march=x86-64
-O1 -version -o test.s -fasynchronous-unwind-tables -fstack-protector-strong
-Wformat -Wformat-security -fstack-clash-protection -fcf-protection
GNU C17 (Ubuntu 12-20220222-1ubuntu1) version 12.0.1 20220222 (experimental)
[master r12-7325-g2f59f067610] (x86_64-linux-gnu)
compiled by GNU C version 12.0.1 20220222 (experimental) [master
r12-7325-g2f59f067610], GMP version 6.2.1, MPFR version 4.1.0, MPC version
1.2.1, isl version isl-0.24-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring nonexistent directory "/usr/local/include/x86_64-linux-gnu"
ignoring nonexistent directory "/usr/lib/gcc/x86_64-linux-gnu/12/include-fixed"
ignoring nonexistent directory
"/usr/lib/gcc/x86_64-linux-gnu/12/../../../../x86_64-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/lib/gcc/x86_64-linux-gnu/12/include
 /usr/local/include
 /usr/include/x86_64-linux-gnu
 /usr/include
End of search list.
GNU C17 (Ubuntu 12-20220222-1ubuntu1) version 12.0.1 20220222 (experimental)
[master r12-7325-g2f59f067610] (x86_64-linux-gnu)
compiled by GNU C version 12.0.1 20220222 (experimental) [master
r12-7325-g2f59f067610], GMP version 6.2.1, MPFR version 4.1.0, MPC version
1.2.1, isl version isl-0.24-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: b0de683ccdd6a31c725d9273cea9e1f8
COMPILER_PATH=/usr/lib/gcc/x86_64-linux-gnu/12/:/usr/lib/gcc/x86_64-linux-gnu/12/:/usr/lib/gcc/x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/12/:/usr/lib/gcc/x86_64-linux-gnu/
LIBRARY_PATH=/usr/lib/gcc/x86_64-linux-gnu/12/:/usr/lib/gcc/x86_64-linux-gnu/12/../../../x86_64-linux-gnu/:/usr/lib/gc
```

[Bug c/104763] [12.0] Generate wrong assembly code

2022-03-02 Thread 570070308 at qq dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104763

--- Comment #1 from 。 <570070308 at qq dot com> ---
Changing `*i=0x0700070007000700;` to `*(volatile size_t *)i=0x0700070007000700;`
fixes it.
This was my mistake.
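
The change described in this comment can be wrapped in a helper that routes every store through a volatile lvalue. This is only a sketch: `fill_cells` is a hypothetical name, it takes the base pointer as a parameter so that it can be tried on ordinary memory, and it assumes (as GCC does) that a store through a volatile-qualified lvalue is treated as a volatile access.

```cpp
#include <cstddef>
#include <cstdint>

// Write `value` into `count` consecutive 64-bit cells starting at `base`.
// Every store goes through a volatile lvalue, so the compiler is expected
// to emit each one even if the memory is never read back.
void fill_cells(std::uint64_t* base, std::size_t count, std::uint64_t value) {
    volatile std::uint64_t* cell = base;
    for (std::size_t i = 0; i < count; ++i) {
        cell[i] = value;
    }
}
```

In the setting of the report, `base` would be the address `0xb8000+160*24` cast to a pointer and `count` would be 3; that call is not shown here because it only makes sense on the reporter's target.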

[Bug middle-end/104763] [12.0] Generate wrong assembly code

2022-03-02 Thread 570070308 at qq dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104763

--- Comment #3 from 。 <570070308 at qq dot com> ---
(In reply to Jakub Jelinek from comment #2)
> Can't reproduce with -O2, with -O1 there are 2 stores instead of 3
> before the endless loop starting with
> r9-384-gf1bcb061d172ca7e3bdcc46476b20776382a2974

Change `+2` to `+7` and the -O3 output is wrong too.

test.c:
```
#include <stddef.h>

void move_up()
{
    for ( size_t* i=(size_t *)(0xb8000+160*24); ; )
    {
        *i=0x0700070007000700;
        if ( i == (size_t *)(0xb8000+160*24)+7 )
        {
            break;
        }
        ++i;
    }
    while (1){}
}
```
assembly with -O3:
```
move_up:
.LFB0:
        .cfi_startproc
        endbr64
        movabsq $504410854964332288, %rax
        movq    %rax, 757512
        movq    %rax, 757520
        movq    %rax, 757528
        movq    %rax, 757536
        movq    %rax, 757544
        movq    %rax, 757552
        movq    %rax, 757560
.L2:
        jmp     .L2
        .cfi_endproc
```

[Bug c/104786] New: [12.0]internal compiler error with extern asm

2022-03-04 Thread 570070308 at qq dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104786

Bug ID: 104786
   Summary: [12.0]internal compiler error with extern asm
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 570070308 at qq dot com
  Target Milestone: ---

Created attachment 52565
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52565&action=edit
/tmp/ccusgsXN.out

log:
```
ig@ig-virtual-machine:~/temp$ gcc-12 test5.c -S -o 2.s -freport-bug
-fno-builtin -v
Using built-in specs.
COLLECT_GCC=gcc-12
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu
12-20220222-1ubuntu1' --with-bugurl=file:///usr/share/doc/gcc-12/README.Bugs
--enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-12
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new
--enable-gnu-unique-object --disable-vtable-verify --enable-plugin
--enable-default-pie --with-system-zlib --enable-libphobos-checking=release
--with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch
--disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64
--with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic
--enable-offload-targets=nvptx-none=/build/gcc-12-TT8eTw/gcc-12-12-20220222/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-TT8eTw/gcc-12-12-20220222/debian/tmp-gcn/usr
--enable-offload-defaulted --without-cuda-driver --enable-checking=release
--build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.0.1 20220222 (experimental) [master r12-7325-g2f59f067610]
(Ubuntu 12-20220222-1ubuntu1) 
COLLECT_GCC_OPTIONS='-S' '-o' '2.s' '-freport-bug' '-fno-builtin' '-v'
'-mtune=generic' '-march=x86-64'
 /usr/lib/gcc/x86_64-linux-gnu/12/cc1 -quiet -v -imultiarch x86_64-linux-gnu
test5.c -quiet -dumpbase 2.c -dumpbase-ext .c -mtune=generic -march=x86-64
-version -freport-bug -fno-builtin -o 2.s -fasynchronous-unwind-tables
-fstack-protector-strong -Wformat -Wformat-security -fstack-clash-protection
-fcf-protection
GNU C17 (Ubuntu 12-20220222-1ubuntu1) version 12.0.1 20220222 (experimental)
[master r12-7325-g2f59f067610] (x86_64-linux-gnu)
compiled by GNU C version 12.0.1 20220222 (experimental) [master
r12-7325-g2f59f067610], GMP version 6.2.1, MPFR version 4.1.0, MPC version
1.2.1, isl version isl-0.24-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring nonexistent directory "/usr/local/include/x86_64-linux-gnu"
ignoring nonexistent directory "/usr/lib/gcc/x86_64-linux-gnu/12/include-fixed"
ignoring nonexistent directory
"/usr/lib/gcc/x86_64-linux-gnu/12/../../../../x86_64-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/lib/gcc/x86_64-linux-gnu/12/include
 /usr/local/include
 /usr/include/x86_64-linux-gnu
 /usr/include
End of search list.
GNU C17 (Ubuntu 12-20220222-1ubuntu1) version 12.0.1 20220222 (experimental)
[master r12-7325-g2f59f067610] (x86_64-linux-gnu)
compiled by GNU C version 12.0.1 20220222 (experimental) [master
r12-7325-g2f59f067610], GMP version 6.2.1, MPFR version 4.1.0, MPC version
1.2.1, isl version isl-0.24-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: b0de683ccdd6a31c725d9273cea9e1f8
test5.c: In function ‘memcpy’:
test5.c:23:10: warning: passing argument 1 of ‘test’ discards ‘const’ qualifier
from pointer target type [-Wdiscarded-qualifiers]
   23 | test(si);
  |  ^~
test5.c:3:18: note: expected ‘void *’ but argument is of type ‘const void *’
3 | void test(void * a);
  |   ~~~^
during RTL pass: expand
test5.c:11:9: internal compiler error: in assign_stack_temp_for_type, at
function.cc:798
   11 | __asm__ inline (
  | ^~~
0x63d27e assign_stack_temp_for_type(machine_mode, poly_int<1u, long>,
tree_node*)
../../src/gcc/function.cc:798
0x90cb33 assign_temp(tree_node*, int, int)
../../src/gcc/function.cc:1018
0x7aa202 expand_asm_stmt
../../src/gcc/cfgexpand.cc:3332
0x7adccc expand_gimple_stmt_1
../../src/gcc/cfgexpand.cc:3861
0x7adccc expand_gimple_stmt
../../src/gcc/cfgexpand.cc:4028
0x7b2a77 expand_gimple_basic_block
../../src/gcc/cfgexpand.cc:6069
0x7b
```

[Bug c/104786] [12.0]internal compiler error with extern asm

2022-03-04 Thread 570070308 at qq dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104786

--- Comment #1 from 。 <570070308 at qq dot com> ---
gcc-9 crashes too.

[Bug c/104804] New: [12.0] x86_64 Extended asm always failed with "+=m" in Output Operands and wrong error info

2022-03-06 Thread 570070308 at qq dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104804

Bug ID: 104804
   Summary: [12.0] x86_64 Extended asm always failed with "+=m" in
Output Operands and wrong error info
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 570070308 at qq dot com
  Target Milestone: ---

When "+&m" is used in the output operands, compilation always fails; "+&r",
however, is accepted.

The error message also seems to be wrong:
```
error: input operand constraint contains ‘&’
```
It is an output operand constraint, not an input one.


test2.c:
```
ig@ig-virtual-machine:~/temp$ cat test2.c
int main()
{
char a, b;
__asm__
(
 "\n"
 :"+&m"( a )
 :"r"( b )
 :"cc"
 );
return 0;
}
ig@ig-virtual-machine:~/temp$
```
compile error log:
```
ig@ig-virtual-machine:~/temp$ gcc-12 test2.c -S  -v
Using built-in specs.
COLLECT_GCC=gcc-12
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu
12-20220302-1ubuntu1' --with-bugurl=file:///usr/share/doc/gcc-12/README.Bugs
--enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-12
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new
--enable-gnu-unique-object --disable-vtable-verify --enable-plugin
--enable-default-pie --with-system-zlib --enable-libphobos-checking=release
--with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch
--disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64
--with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic
--enable-offload-targets=nvptx-none=/build/gcc-12-VlTCdr/gcc-12-12-20220302/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-VlTCdr/gcc-12-12-20220302/debian/tmp-gcn/usr
--enable-offload-defaulted --without-cuda-driver --enable-checking=release
--build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.0.1 20220302 (experimental) [master r12-7448-g58394373a70]
(Ubuntu 12-20220302-1ubuntu1) 
COLLECT_GCC_OPTIONS='-S' '-v' '-mtune=generic' '-march=x86-64'
 /usr/lib/gcc/x86_64-linux-gnu/12/cc1 -quiet -v -imultiarch x86_64-linux-gnu
test2.c -quiet -dumpbase test2.c -dumpbase-ext .c -mtune=generic -march=x86-64
-version -o test2.s -fasynchronous-unwind-tables -fstack-protector-strong
-Wformat -Wformat-security -fstack-clash-protection -fcf-protection
GNU C17 (Ubuntu 12-20220302-1ubuntu1) version 12.0.1 20220302 (experimental)
[master r12-7448-g58394373a70] (x86_64-linux-gnu)
compiled by GNU C version 12.0.1 20220302 (experimental) [master
r12-7448-g58394373a70], GMP version 6.2.1, MPFR version 4.1.0, MPC version
1.2.1, isl version isl-0.24-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring nonexistent directory "/usr/local/include/x86_64-linux-gnu"
ignoring nonexistent directory "/usr/lib/gcc/x86_64-linux-gnu/12/include-fixed"
ignoring nonexistent directory
"/usr/lib/gcc/x86_64-linux-gnu/12/../../../../x86_64-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/lib/gcc/x86_64-linux-gnu/12/include
 /usr/local/include
 /usr/include/x86_64-linux-gnu
 /usr/include
End of search list.
GNU C17 (Ubuntu 12-20220302-1ubuntu1) version 12.0.1 20220302 (experimental)
[master r12-7448-g58394373a70] (x86_64-linux-gnu)
compiled by GNU C version 12.0.1 20220302 (experimental) [master
r12-7448-g58394373a70], GMP version 6.2.1, MPFR version 4.1.0, MPC version
1.2.1, isl version isl-0.24-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 3929bb57dd80f5cc2d4f3202c349e2dc
test2.c: In function ‘main’:
test2.c:4:5: error: input operand constraint contains ‘&’
4 | __asm__
  | ^~~
ig@ig-virtual-machine:~/temp$
```

and this is ok:
```
int main()
{
char a, b;
__asm__
(
 "\n"
 :"+&r"( a )
 :"r"( b )
 :"cc"
 );
return 0;
}
```

This gcc was installed by apt on Ubuntu 22.04.

[Bug c/104804] [12.0] x86_64 Extended asm always failed with "+&m" in Output Operands and wrong error info

2022-03-06 Thread 570070308 at qq dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104804

--- Comment #2 from 。 <570070308 at qq dot com> ---
(In reply to Jakub Jelinek from comment #1)
> +m is handled as =m with corresponding m, early clobber for that doesn't
> make sense, on one side you require that the input is the same as the
> output, on the other hand you require that no input can be equal to the
> output because the output might be overwritten before the inputs are read.
> Just don't do that.

Thanks for the reply.

According to the docs at
https://gcc.gnu.org/onlinedocs/gcc/Modifiers.html#Modifiers and
https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#OutputOperands, my
understanding is that if an output operand ('+' or '=') is written before
reading from any '+' output operand or input operand other than that operand
itself, '&' should be added to it.
So if +m is handled as =m with a corresponding m, I think +&m should be handled
as +m and =m too.
For example, in:
```
__asm__(
"read a"
"write a"
"read b"
:"+&m"(a)
:"g"(b)
:
);
```
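For comparison, the spellings that GCC does accept can be sketched as follows. This is only a minimal sketch: it assumes an x86-64 target for the asm path (other targets fall back to plain C), and the names `add_in_place_mem`, `add_in_place_reg` and `demo_add_in_place` are hypothetical. The first function uses "+m" without '&'; the second uses "+&r", as in the test case reported as OK.

```c
#include <assert.h>

/* Read-modify-write on a memory operand: "+m" without '&'. */
static int add_in_place_mem(int value, int addend)
{
#if defined(__x86_64__)
    __asm__ ("addl %1, %0"
             : "+m"(value)      /* read-write memory operand */
             : "r"(addend)      /* register input */
             : "cc");
#else
    value += addend;            /* portable fallback, no inline asm */
#endif
    return value;
}

/* Read-modify-write on a register operand marked early-clobber: "+&r". */
static int add_in_place_reg(int value, int addend)
{
#if defined(__x86_64__)
    __asm__ ("addl %1, %0"
             : "+&r"(value)     /* never shares a register with an input */
             : "r"(addend)
             : "cc");
#else
    value += addend;            /* portable fallback, no inline asm */
#endif
    return value;
}

/* Usage example: both variants compute the same sum. */
static void demo_add_in_place(void)
{
    assert(add_in_place_mem(2, 3) == 5);
    assert(add_in_place_reg(2, 3) == 5);
}
```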

[Bug c/104805] New: [12.0] x86_64 Extended asm may use rbp register to input/output even thougth "rbp" is in the clobber list when "rsp" and "rbp" are both in the in the clobber list

2022-03-06 Thread 570070308 at qq dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104805

Bug ID: 104805
   Summary: [12.0] x86_64 Extended asm may use rbp register to
input/output even thougth "rbp" is in the clobber list
when "rsp" and "rbp" are both in the in the clobber
list
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 570070308 at qq dot com
  Target Milestone: ---

test.c:
```
void kkk()
{
char a;
__asm__ volatile (
"%0"
:"+m"(a)
:
: "rsp","rbp"
);
}
```

assembly code:
```
#APP
# 4 "test.c" 1
-9(%rbp)
# 0 "" 2
#NO_APP
```

As you can see, it uses `-9(%rbp)` to represent `a`. According to
https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Clobbers-and-Scratch-Registers,
rbp should not be used to represent any operand because it is in the
clobber list.

compile log:
```
ig@ig-virtual-machine:~/temp$ gcc-12 -S test.c -O1 -v
Using built-in specs.
COLLECT_GCC=gcc-12
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu
12-20220302-1ubuntu1' --with-bugurl=file:///usr/share/doc/gcc-12/README.Bugs
--enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-12
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new
--enable-gnu-unique-object --disable-vtable-verify --enable-plugin
--enable-default-pie --with-system-zlib --enable-libphobos-checking=release
--with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch
--disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64
--with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic
--enable-offload-targets=nvptx-none=/build/gcc-12-VlTCdr/gcc-12-12-20220302/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-VlTCdr/gcc-12-12-20220302/debian/tmp-gcn/usr
--enable-offload-defaulted --without-cuda-driver --enable-checking=release
--build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.0.1 20220302 (experimental) [master r12-7448-g58394373a70]
(Ubuntu 12-20220302-1ubuntu1) 
COLLECT_GCC_OPTIONS='-S' '-O1' '-v' '-mtune=generic' '-march=x86-64'
 /usr/lib/gcc/x86_64-linux-gnu/12/cc1 -quiet -v -imultiarch x86_64-linux-gnu
test.c -quiet -dumpbase test.c -dumpbase-ext .c -mtune=generic -march=x86-64
-O1 -version -o test.s -fasynchronous-unwind-tables -fstack-protector-strong
-Wformat -Wformat-security -fstack-clash-protection -fcf-protection
GNU C17 (Ubuntu 12-20220302-1ubuntu1) version 12.0.1 20220302 (experimental)
[master r12-7448-g58394373a70] (x86_64-linux-gnu)
compiled by GNU C version 12.0.1 20220302 (experimental) [master
r12-7448-g58394373a70], GMP version 6.2.1, MPFR version 4.1.0, MPC version
1.2.1, isl version isl-0.24-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring nonexistent directory "/usr/local/include/x86_64-linux-gnu"
ignoring nonexistent directory "/usr/lib/gcc/x86_64-linux-gnu/12/include-fixed"
ignoring nonexistent directory
"/usr/lib/gcc/x86_64-linux-gnu/12/../../../../x86_64-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/lib/gcc/x86_64-linux-gnu/12/include
 /usr/local/include
 /usr/include/x86_64-linux-gnu
 /usr/include
End of search list.
GNU C17 (Ubuntu 12-20220302-1ubuntu1) version 12.0.1 20220302 (experimental)
[master r12-7448-g58394373a70] (x86_64-linux-gnu)
compiled by GNU C version 12.0.1 20220302 (experimental) [master
r12-7448-g58394373a70], GMP version 6.2.1, MPFR version 4.1.0, MPC version
1.2.1, isl version isl-0.24-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 3929bb57dd80f5cc2d4f3202c349e2dc
test.c: In function ‘kkk’:
test.c:4:5: warning: listing the stack pointer register ‘rsp’ in a clobber list
is deprecated [-Wdeprecated]
4 | __asm__ volatile (
  | ^~~
test.c:4:5: note: the value of the stack pointer after an ‘asm’ statement must
be the same as it was before the statement
COMPILER_PATH=/usr/lib/gcc/x86_64-linux-gnu/12/:/usr/lib/gcc/x86_64-linux-gnu/12/:/usr/lib/gcc/x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/12/:/usr/lib/gcc/x86_64-linux-gnu/
L

[Bug c/104805] [12.0] x86_64 Extended asm may use rbp register to input/output even thougth "rbp" is in the clobber list when "rsp" and "rbp" are both in the in the clobber list

2022-03-06 Thread 570070308 at qq dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104805

--- Comment #2 from 。 <570070308 at qq dot com> ---
(In reply to Jakub Jelinek from comment #1)
> Clobber of "rsp" makes no sense, you can't change the value of the stack
> pointer in inline asm without restoring it back before the end of inline asm.

I know that changing rsp is dangerous, and that gcc gives a warning if "rsp" is
listed as a clobber. I have never changed rsp in my own inline asm code, nor
put "rsp" in the clobber list. I am only trying to find bugs and make gcc
better.

What about:

__asm__
(
"pushq %%rax\n\t"
"popq %%rax"
:
:
:"rsp"
);

[Bug c/104804] [12.0] x86_64 Extended asm always failed with "+&m" in Output Operands and wrong error info

2022-03-06 Thread 570070308 at qq dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104804

--- Comment #4 from 。 <570070308 at qq dot com> ---
(In reply to Jakub Jelinek from comment #3)

> What exactly are you trying to achieve (because & on your testcase makes no
> sense at all, the other input is "r" and therefore can't ever match the
> memory)?

Yes, on x86 that's true.
But suppose there is an instruction set that supports
```
 movq ((%rax)), %rax
```

And consider test.c:
```
void kkk()
{
char a;
char *pa=&a;
char **ppa=&pa;
__asm__ (
""
:"+m"(pa)
:"m"(a)
:
);
}
```

gcc could use %rax to represent ppa, (%rax) to represent pa, and ((%rax)) to
represent a; then "+&m"(pa) would make a difference.

[Bug c/104805] [12.0] x86_64 Extended asm may use rbp register to input/output even thougth "rbp" is in the clobber list when "rsp" and "rbp" are both in the in the clobber list

2022-03-06 Thread 570070308 at qq dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104805

--- Comment #3 from 。 <570070308 at qq dot com> ---
(In reply to Jakub Jelinek from comment #1)
> Clobber of "rsp" makes no sense, you can't change the value of the stack
> pointer in inline asm without restoring it back before the end of inline asm.

Also, this bug is about "rbp", not "rsp". GCC uses "rbp" to represent an input
operand even though "rbp" is in the clobber list.

[Bug c/104805] [12.0] x86_64 Extended asm may use rbp register to input/output even thougth "rbp" is in the clobber list when "rsp" and "rbp" are both in the in the clobber list

2022-03-06 Thread 570070308 at qq dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104805

--- Comment #5 from 。 <570070308 at qq dot com> ---
(In reply to Jakub Jelinek from comment #4)
> rbp is hard frame pointer, so depending on whether the function needs a
> frame pointer (at -O0 I think all functions do), the register isn't
> available for use (and therefore for clobbering) in inline asm.
> Only in functions where it isn't needed, it is not fixed then and can be
> used for other purposes.

So you have explained why "rbp" can't be in the clobber list with -O0, and why
it can be in the clobber list with -O1, -O2 or -O3 (when the function doesn't
need a frame pointer). I understand this now.

But when "rbp" is accepted in the clobber list, it should not be used to
represent any input/output operand, according to the docs, because the user may
change %rbp and thereby make the input/output operands wrong.

For example:
```
void kkk()
{
char a;
__asm__ volatile (
"writing %%rbp\n\t"
// %0 may now refer to the wrong memory because %rbp has changed,
// for example if -9(%rbp) represents char a
"reading %0\n\t"
"writing %0"
:"+m"(a)
:
: "rsp","rbp"
);
}
```

I have done a lot of experiments: if a register is listed in the clobber list,
it is never used to represent an input/output operand, and the docs say so
too. Only "rbp" is an exception.

https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Clobbers-and-Scratch-Registers
```
When the compiler selects which registers to use to represent input and output
operands, it does not use any of the clobbered registers. As a result,
clobbered registers are available for any use in the assembler code.
```
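The documented behavior quoted above can be illustrated with a register other than rbp. This is a minimal sketch, assuming an x86-64 target for the asm path (plain C is used elsewhere); `add_via_scratch` and `demo_add_via_scratch` are hypothetical names. Because "rcx" is listed as a clobber, the asm can use it as a scratch register without risk of it also holding %0 or %1.

```c
#include <assert.h>

/* Adds b to a, staging b in %rcx, which is declared as clobbered. */
static long add_via_scratch(long a, long b)
{
#if defined(__x86_64__)
    __asm__ ("movq %1, %%rcx\n\t"   /* scratch = b */
             "addq %%rcx, %0"       /* a += scratch */
             : "+r"(a)
             : "r"(b)
             : "rcx", "cc");        /* rcx is ours to overwrite */
#else
    a += b;                         /* portable fallback, no inline asm */
#endif
    return a;
}

/* Usage example. */
static void demo_add_via_scratch(void)
{
    assert(add_via_scratch(40, 2) == 42);
}
```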

[Bug middle-end/104763] [12 Regression] Generate wrong assembly code

2022-03-07 Thread 570070308 at qq dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104763

--- Comment #8 from 。 <570070308 at qq dot com> ---
(In reply to Richard Biener from comment #7)
> Note that the case of an endless loop is somewhat special since the store
> is dead there since there is no way to reach a load from that point with
> C standard methods.  So one could also argue the optimization is valid
> and this bug is invalid.
> 
> How did you end up with this case?

I agree that this bug is invalid.
I write to memory at 0xb8000 because that is the video memory, and what is
written there is displayed on the screen.
However, gcc doesn't know that, and I think it doesn't need to. Gcc can simply
treat all reads and writes to memory as useless unless they affect subsequent
function calls or the function's return. This way gcc can optimize the code
better.
Accordingly, since while(1){} never returns and never calls a function, gcc
can treat all reads and writes to 0xb8000 before while(1){} as useless.
One way to tell gcc that reads and writes to 0xb8000 have an effect is to
change `*i` to `*(volatile size_t *)i`, and that really works.

So I think:
test.c:
```
void move_up()
{
for ( size_t* i=(size_t *)(0xb8000+160*24); ; )
{
*i=0x0700070007000700;
if ( i == (size_t *)(0xb8000+160*24)+2 )
{
break;
}
++i;
}
while (1){}
}
```
compile to
```
move_up:
jmp move_up
```
and test.c:
```
void move_up()
{
for ( size_t* i=(size_t *)(0xb8000+160*24); ; )
{
*(volatile size_t *)i=0x0700070007000700;
if ( i == (size_t *)(0xb8000+160*24)+2 )
{
break;
}
++i;
}
while (1){}
}
```
compile to:
```
move_up:
movabsq $504410854964332288, %rax
movq   %rax, 757504
movq   %rax, 757512
movq   %rax, 757520
.L2:
jmp .L2
```
is the best.
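The volatile-store pattern from this comment can be sketched in a form that is testable on ordinary memory. This is only a sketch: `fill_cells` and `demo_fill_cells` are hypothetical names, the destination is passed in as a parameter rather than hard-coded, and `uint64_t` stands in for the 64-bit `size_t` used above. On the bare-metal target described here, the call would presumably be `fill_cells((volatile uint64_t *)(0xb8000 + 160 * 24), 3, 0x0700070007000700)`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stores `value` into `count` consecutive 64-bit cells. Because the
   destination is volatile-qualified, each store is an observable side
   effect that the compiler has to emit, even if an endless loop follows. */
static void fill_cells(volatile uint64_t *cells, size_t count, uint64_t value)
{
    for (size_t i = 0; i < count; ++i)
        cells[i] = value;
}

/* Usage example on an ordinary buffer standing in for video memory. */
static void demo_fill_cells(void)
{
    uint64_t row[3] = { 0, 0, 0 };

    fill_cells(row, 3, 0x0700070007000700ULL);
    assert(row[0] == 0x0700070007000700ULL);
    assert(row[2] == 0x0700070007000700ULL);
}
```

The constants in the generated assembly match the source: 0x0700070007000700 is 504410854964332288 in decimal, and 0xb8000+160*24 is 757504.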

[Bug c/105311] New: [12]Still generate memset even with -fno-tree-loop-distribute-patterns

2022-04-19 Thread 570070308 at qq dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105311

Bug ID: 105311
   Summary: [12]Still generate memset even with
-fno-tree-loop-distribute-patterns
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 570070308 at qq dot com
  Target Milestone: ---

I have read https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888; according
to it, adding `-fno-tree-loop-distribute-patterns` prevents gcc from
calling memset. However, gcc still calls memset for the code below:
```test.c
#include <stdint.h>
struct Page_Table_Page
{
uint64_t pts[511][512];
};

void init_ptp(struct Page_Table_Page*const ptp)
{
*ptp=(struct Page_Table_Page){{{0}}};
}
```
compile with:
```
gcc-12 -O3 test.c -S -fno-tree-loop-distribute-patterns -fno-builtin-memset
-fno-builtin -nodefaultlibs -nostdlib -ffreestanding
```

```test.s
init_ptp:
.LFB24:
.cfi_startproc
endbr64
subq   $8, %rsp
.cfi_def_cfa_offset 16
movl   $2093056, %edx
xorl   %esi, %esi
call   memset@PLT
addq   $8, %rsp
.cfi_def_cfa_offset 8
ret
```
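The GCC manual states that GCC requires even a freestanding environment to provide memcpy, memmove, memset and memcmp, which may explain why a large aggregate assignment can still become a memset call. If such a call has to be avoided, one possible workaround is to zero the table through a volatile-qualified pointer, since volatile stores cannot be merged into a library call. This is a sketch only, likely slower than memset; `init_ptp_no_libcall` and `demo_init_ptp` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct Page_Table_Page
{
    uint64_t pts[511][512];
};

/* Zeroes the table one word at a time through a volatile pointer, so each
   store has to be emitted individually. */
static void init_ptp_no_libcall(struct Page_Table_Page *const ptp)
{
    const size_t rows = sizeof ptp->pts / sizeof ptp->pts[0];
    const size_t cols = sizeof ptp->pts[0] / sizeof ptp->pts[0][0];

    for (size_t r = 0; r < rows; ++r)
    {
        volatile uint64_t *row = ptp->pts[r];
        for (size_t c = 0; c < cols; ++c)
            row[c] = 0;
    }
}

/* Usage example: dirty the first and last entries, then clear the table. */
static void demo_init_ptp(void)
{
    static struct Page_Table_Page table;

    table.pts[0][0] = 1;
    table.pts[510][511] = 2;
    init_ptp_no_libcall(&table);
    assert(table.pts[0][0] == 0);
    assert(table.pts[510][511] == 0);
}
```

The size seen in the assembly above is consistent with the struct: 511 * 512 * 8 bytes is 2093056.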

The gcc-12 is installed by apt on Ubuntu 22.04.
The full compile log:
```
$ gcc-12 -O3 test.c -S -fno-tree-loop-distribute-patterns -fno-builtin-memset
-fno-builtin -nodefaultlibs -nostdlib -ffreestanding -v
Using built-in specs.
COLLECT_GCC=gcc-12
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu
12-20220319-1ubuntu1' --with-bugurl=file:///usr/share/doc/gcc-12/README.Bugs
--enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-12
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new
--enable-gnu-unique-object --disable-vtable-verify --enable-plugin
--enable-default-pie --with-system-zlib --enable-libphobos-checking=release
--with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch
--disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64
--with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic
--enable-offload-targets=nvptx-none=/build/gcc-12-OcsLtf/gcc-12-12-20220319/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-OcsLtf/gcc-12-12-20220319/debian/tmp-gcn/usr
--enable-offload-defaulted --without-cuda-driver --enable-checking=release
--build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.0.1 20220319 (experimental) [master r12-7719-g8ca61ad148f]
(Ubuntu 12-20220319-1ubuntu1) 
COLLECT_GCC_OPTIONS='-O3' '-S' '-fno-tree-loop-distribute-patterns'
'-fno-builtin-memset' '-fno-builtin' '-nodefaultlibs' '-nostdlib'
'-ffreestanding' '-v' '-mtune=generic' '-march=x86-64'
 /usr/lib/gcc/x86_64-linux-gnu/12/cc1 -quiet -v -imultiarch x86_64-linux-gnu
test.c -quiet -dumpbase test.c -dumpbase-ext .c -mtune=generic -march=x86-64
-O3 -version -fno-tree-loop-distribute-patterns -fno-builtin-memset
-fno-builtin -ffreestanding -o test.s -fasynchronous-unwind-tables -Wformat
-Wformat-security -fstack-clash-protection -fcf-protection
GNU C17 (Ubuntu 12-20220319-1ubuntu1) version 12.0.1 20220319 (experimental)
[master r12-7719-g8ca61ad148f] (x86_64-linux-gnu)
compiled by GNU C version 12.0.1 20220319 (experimental) [master
r12-7719-g8ca61ad148f], GMP version 6.2.1, MPFR version 4.1.0, MPC version
1.2.1, isl version isl-0.24-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring nonexistent directory "/usr/local/include/x86_64-linux-gnu"
ignoring nonexistent directory "/usr/lib/gcc/x86_64-linux-gnu/12/include-fixed"
ignoring nonexistent directory
"/usr/lib/gcc/x86_64-linux-gnu/12/../../../../x86_64-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/lib/gcc/x86_64-linux-gnu/12/include
 /usr/local/include
 /usr/include/x86_64-linux-gnu
 /usr/include
End of search list.
GNU C17 (Ubuntu 12-20220319-1ubuntu1) version 12.0.1 20220319 (experimental)
[master r12-7719-g8ca61ad148f] (x86_64-linux-gnu)
compiled by GNU C version 12.0.1 20220319 (experimental) [master
r12-7719-g8ca61ad148f], GMP version 6.2.1, MPFR version 4.1.0, MPC version
1.2.1, isl version isl-0.24-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 200a3

[Bug middle-end/105342] New: [Extended Asm]Memory barrier geater than a function call

2022-04-21 Thread 570070308 at qq dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105342

Bug ID: 105342
   Summary: [Extended Asm]Memory barrier geater than a function
call
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 570070308 at qq dot com
  Target Milestone: ---

This is an enhancement request, not a bug.
According to the docs, using the "memory" clobber effectively forms a read/write
memory barrier for the compiler. In my tests, the barrier formed by an extended
asm is even broader than that of a function call: it also acts as a barrier for
memory in the function's own stack frame. I think this is unnecessary, and it
can even produce more complex code when functions are inlined.


For example in the test case:
```test.c
#include <stddef.h>

extern unsigned long int x[512];

void test(long int,long int,long int,long int,long int);
void test1(long int);
void test2();

int kkk()
{
unsigned long int k[512];
for (size_t i=0; i<512; ++i )
{
k[i]=x[i];
}
test1(k[0]);
k[1]=3;
k[2]=3;
k[3]=3;
k[4]=3;
k[5]=3;
test2();
test(k[1], k[2], k[3], k[4], k[5]);
return 0;
}
```
and
```test2.c
void test2()
{
__asm__ volatile
(""
 :
 :
 :"memory"
 );
}
```
compile with 
```
gcc-12 -fno-stack-protector -fcf-protection=none
-fno-asynchronous-unwind-tables -mgeneral-regs-only -O3 -S test.c test2.c
```
this generates
```test.s:
kkk:
subq   $4096, %rsp
orq    $0, (%rsp)
subq   $8, %rsp
leaq   x(%rip), %rsi
movl   $512, %ecx
movq   %rsp, %rdi
rep movsq
movq   (%rsp), %rdi
call   test1@PLT
xorl   %eax, %eax
call   test2@PLT
movl   $3, %r8d
movl   $3, %ecx
movl   $3, %edx
movl   $3, %esi
movl   $3, %edi
call   test@PLT
xorl   %eax, %eax
addq   $4104, %rsp
ret
```
```test2.s
test2:
ret
```
The kkk's assembly code looks neat.

However, if I put the contents of test2.c in test.c, then it will generate:
```test.s
kkk:
subq   $4096, %rsp
orq    $0, (%rsp)
subq   $8, %rsp
leaq   x(%rip), %rsi
movl   $512, %ecx
movq   %rsp, %rdi
rep movsq
movq   (%rsp), %rdi
call   test1@PLT
movq   $3, 8(%rsp)
movq   $3, 16(%rsp)
movq   $3, 24(%rsp)
movq   $3, 32(%rsp)
movq   $3, 40(%rsp)
movq   40(%rsp), %r8
movq   32(%rsp), %rcx
movq   24(%rsp), %rdx
movq   16(%rsp), %rsi
movq   8(%rsp), %rdi
call   test@PLT
xorl   %eax, %eax
addq   $4104, %rsp
ret
test2:
ret
```
The compiler automatically inlines test2() and treats k[1], k[2],
k[3], k[4], k[5] as affected by the barrier in the extended asm, so inlining
test2() is even slower than not inlining it.
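Where the memory touched by the asm is known, a narrower alternative to the "memory" clobber is to name just that object as a memory operand. The following is a minimal sketch with empty asm templates, so it does not depend on the target; `barrier_all`, `barrier_one`, `sum_after_barrier` and `demo_barriers` are hypothetical names. It does not cover the linked-list case, where the set of objects is not known in advance.

```c
#include <assert.h>

/* Full compiler barrier: all memory is assumed to be read and written. */
static inline void barrier_all(void)
{
    __asm__ volatile ("" : : : "memory");
}

/* Narrow barrier: only *obj is assumed to be read and written by the asm. */
static inline void barrier_one(unsigned long *obj)
{
    __asm__ volatile ("" : "+m"(*obj));
}

static unsigned long sum_after_barrier(unsigned long *shared)
{
    unsigned long local[2] = { 3, 4 };

    /* local[] is not named as an operand, so the compiler may keep it in
       registers or fold it to constants across this asm. */
    barrier_one(shared);
    return local[0] + local[1] + *shared;
}

/* Usage example: both barriers leave the observable values unchanged. */
static void demo_barriers(void)
{
    unsigned long shared = 5;

    barrier_all();
    assert(sum_after_barrier(&shared) == 12);
}
```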

The gcc-12 is installed by apt on ubuntu 22.04. Full compile log:
```
$ gcc-12 -fno-stack-protector -fcf-protection=none
-fno-asynchronous-unwind-tables -mgeneral-regs-only -O3 -S test.c test2.c -v
Using built-in specs.
COLLECT_GCC=gcc-12
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu
12-20220319-1ubuntu1' --with-bugurl=file:///usr/share/doc/gcc-12/README.Bugs
--enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-12
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new
--enable-gnu-unique-object --disable-vtable-verify --enable-plugin
--enable-default-pie --with-system-zlib --enable-libphobos-checking=release
--with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch
--disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64
--with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic
--enable-offload-targets=nvptx-none=/build/gcc-12-OcsLtf/gcc-12-12-20220319/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-OcsLtf/gcc-12-12-20220319/debian/tmp-gcn/usr
--enable-offload-defaulted --without-cuda-driver --enable-checking=release
--build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.0.1 20220319 (experimental) [master r12-7719-g8ca61ad148f]
(Ubuntu 12-20220319-1ubuntu1) 
COLLECT_GCC_OPTIONS='-fno-stack-protector' '-fcf-protection=none'
'-fno-asynchronous-unwind-table

[Bug middle-end/105342] [Extended Asm]Memory barrier geater than a function call

2022-04-21 Thread 570070308 at qq dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105342

--- Comment #4 from 。 <570070308 at qq dot com> ---
(In reply to Richard Biener from comment #1)
> Is it really important though?

The doc says that "The asm statement allows you to include assembly
instructions directly within C code. This may help you to maximize performance
in time-sensitive code or to access assembly instructions that are not readily
available to C programs.", and I use extended asm, rather than writing a whole
function in assembly, precisely to maximize performance.

I try my best to avoid the "memory" clobber, but in some cases I have to use
it, for example when using asm to operate on a list structure like `struct
list_head` in Linux. It is impossible to list all the list elements in the
input/output operands.

[Bug web/107494] New: -ffinite-loops is not enable by default

2022-11-01 Thread 570070308 at qq dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107494

Bug ID: 107494
   Summary: -ffinite-loops is not enable by default
   Product: gcc
   Version: 12.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: web
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 570070308 at qq dot com
  Target Milestone: ---

According to https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

```
-ffinite-loops
Assume that a loop with an exit will eventually take the exit and not loop
indefinitely. This allows the compiler to remove loops that otherwise have no
side-effects, not considering eventual endless looping as such.

This option is enabled by default at -O2 for C++ with -std=c++11 or higher.
```


However, it is not enabled by default at -O2 for C++ with -std=c++11 or
higher:

```
ig@ig-virtual-machine:~/igc$ /usr/lib/gcc/x86_64-linux-gnu/12/cc1plus
-std=c++11 -O2 --help | grep finite-loops
  -ffinite-loops                 [disabled]
ig@ig-virtual-machine:~/igc$ /usr/lib/gcc/x86_64-linux-gnu/12/cc1plus
-std=gnu++23 -Ofast --help | grep finite-loops
  -ffinite-loops                 [disabled]
```
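As an illustration of what the quoted option text permits, consider a loop that has an exit and no side effects. This is a hedged sketch with a hypothetical function name, `collatz_reaches_one`: under the finite-loops assumption a compiler may drop the loop and reduce the function to `return 1`, while without that assumption the loop has to be kept, because for some inputs (for example n == 0) it never exits.

```c
#include <assert.h>

/* The loop has an exit (n == 1) and no side effects, so a compiler that
   assumes finite loops may remove it entirely. For n == 0 the loop never
   terminates when it is actually executed. */
static int collatz_reaches_one(unsigned long n)
{
    while (n != 1)
        n = (n & 1) ? 3 * n + 1 : n / 2;
    return 1;
}

/* Usage example with an input for which the loop terminates. */
static void demo_collatz(void)
{
    assert(collatz_reaches_one(6) == 1);
}
```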

gcc version:
```
ig@ig-virtual-machine:~/igc$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/12/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 12.2.0-3ubuntu1'
--with-bugurl=file:///usr/share/doc/gcc-12/README.Bugs
--enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-12
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new
--enable-gnu-unique-object --disable-vtable-verify --enable-plugin
--enable-default-pie --with-system-zlib --enable-libphobos-checking=release
--with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch
--disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64
--with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic
--enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr
--enable-offload-defaulted --without-cuda-driver --enable-checking=release
--build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.2.0 (Ubuntu 12.2.0-3ubuntu1) 
ig@ig-virtual-machine:~/igc$
```

[Bug target/81036] -fcall-saved-X does not work for floating-point registers

2022-11-05 Thread 570070308 at qq dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81036

--- Comment #2 from 。 <570070308 at qq dot com> ---
-fcall-saved-xmm0 does not work with gcc 12.2 either; target and host are both
x86-64.

[Bug c/105510] New: [12] error: initializer element is not constant

2022-05-06 Thread 570070308 at qq dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105510

Bug ID: 105510
   Summary: [12] error: initializer element is not constant
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 570070308 at qq dot com
  Target Milestone: ---

clang compiles this successfully, but gcc can't, and neither can MSVC. I'm not
sure this is a bug.
test.c:
```
struct Test2
{
long int x;
long int y;
};

struct Test
{
long int x;
struct Test2 t;
};

struct Test t=(struct Test){1, (struct Test2){3, 4}};
```
gcc version: gcc-11.3 or gcc-12.0.1

[Bug c/105510] error: initializer element is not constant

2022-05-09 Thread 570070308 at qq dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105510

--- Comment #2 from 。 <570070308 at qq dot com> ---
(In reply to Richard Biener from comment #1)
> As a workaround it works with
> 
> struct Test t=(struct Test){1, {3, 4}};
> 
> I don't think it your way of writing is actually valid though.

Yes, I'm not sure that this is a bug. I hit this error when using macros to
initialize a struct. For example:
```test.c
struct Test2
{
long int x;
long int y;
};

struct Test
{
long int x;
struct Test2 t;
};

#define TEST2_INIT(x, y) ((struct Test2){x, y})

#define TEST_INIT(x) ((struct Test){x, TEST2_INIT(1, 2)})

struct Test test=TEST_INIT(0);  // error
```
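One way to keep the macro approach while staying within standard constant initializers is to have the macros expand to brace-enclosed initializer lists rather than compound literals. This is a sketch of that variant, not the only option; `demo_test_init` is a hypothetical name. The trade-off is that the macros can then only be used as initializers, not as expressions.

```c
#include <assert.h>

struct Test2
{
    long int x;
    long int y;
};

struct Test
{
    long int x;
    struct Test2 t;
};

/* Expand to brace-enclosed initializer lists instead of compound literals,
   so the result is usable where a constant initializer is required. */
#define TEST2_INIT(a, b) { (a), (b) }
#define TEST_INIT(a)     { (a), TEST2_INIT(1, 2) }

static struct Test test = TEST_INIT(0);

/* Usage example: the file-scope object holds the expected values. */
static void demo_test_init(void)
{
    assert(test.x == 0);
    assert(test.t.x == 1);
    assert(test.t.y == 2);
}
```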

[Bug middle-end/109484] New: [Wrong Code][inline-asm] output operands overlap with output

2023-04-12 Thread 570070308 at qq dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109484

Bug ID: 109484
   Summary: [Wrong Code][inline-asm] output operands overlap with
output
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 570070308 at qq dot com
  Target Milestone: ---

For code:
```c
void kkk(void **const pp)
{
void *temp;
__asm__ volatile (
"movq   %1, %0\n\t"
"movq   $0, %1"
:"=r"(temp), "+m"(*pp)
:
:);
__asm__ volatile(""::"D"(temp):);
}
```

After compiling with -O3:
```assemble
kkk:
movq   (%rdi), %rdi
movq   $0, (%rdi)   # %rdi already overwritten; aborts if %rdi == NULL
ret
```



I think there's nothing wrong with this C code according to the gcc inline asm docs:
``` From GCC DOC
GCC may allocate the output operand in the same register as an unrelated input
operand, on the assumption that the assembler code consumes its inputs before
producing outputs. 
```
The asm does read *pp first, and only then writes the outputs.

I think that, according to gcc's docs, an output operand (without '&') can only
overlap with input operands.
``` From GCC DOC
Operands using the ‘+’ constraint modifier count as two operands (that is, both
as input and output) towards the total maximum of 30 operands per asm
statement.

Use the ‘&’ constraint modifier (see Modifiers) on all output operands that
must not overlap an input. Otherwise, GCC may allocate the output operand in
the same register as an unrelated input operand, on the assumption that the
assembler code consumes its inputs before producing outputs. This assumption
may be false if the assembler code actually consists of more than one
instruction.

The same problem can occur if one output parameter (a) allows a register
constraint and another output parameter (b) allows a memory constraint. The
code generated by GCC to access the memory address in b can contain registers
which might be shared by a, and GCC considers those registers to be inputs to
the asm. As above, GCC assumes that such input registers are consumed before
any outputs are written. This assumption may result in incorrect behavior if
the asm statement writes to a before using b. Combining the ‘&’ modifier with
the register constraint on a ensures that modifying a does not affect the
address referenced by b. Otherwise, the location of b is undefined if a is
modified before using b.
```

[Bug middle-end/109484] [Wrong Code][inline-asm] output operands overlap with output

2023-04-12 Thread 570070308 at qq dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109484

--- Comment #2 from 。 <570070308 at qq dot com> ---
(In reply to Richard Biener from comment #1)
> but you clobber 'temp' early and fail to indicate that so GCC allocates the
> same register as part of the "+m" output.

The requirements you describe are not reflected in the documentation. The
documentation only says that `GCC assumes that the assembler code consumes its
inputs before producing outputs`, and this code fits that assumption. First, it
reads the input from %1, then it writes the output to %0, then it writes the
output to %1. No output is written before the inputs are read.

[Bug middle-end/109484] [Wrong Code][inline-asm] output operands overlap with output

2023-04-12 Thread 570070308 at qq dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109484

--- Comment #5 from 。 <570070308 at qq dot com> ---
(In reply to Richard Biener from comment #3)
> (In reply to 。 from comment #2)
> > (In reply to Richard Biener from comment #1)
> > > but you clobber 'temp' early and fail to indicate that so GCC allocates
> > > the same register as part of the "+m" output.
> > 
> > The requirements you describe are not reflected in the documentation. The
> > documentation only says that `GCC assumes that the assembler code consumes
> > its inputs before producing outputs`, and this code fits that assumption.
> > First, it reads the input from %1, then it writes the output to %0, then it
> > writes the output to %1. No output is written before the inputs are read.
> 
> You first write to 'temp' and then read from it.  The wording applies to the
> assigned register / address, _not_ to the C variables mapped.
> 
> Note I'm not an expert here and I wonder if an output operand is the
> appropriate way to create a scratch register for arbitrary use.

The source of the second instruction is the immediate $0, not operand %0.

[Bug middle-end/109484] [Wrong Code][inline-asm] output operands overlap with output

2023-04-12 Thread 570070308 at qq dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109484

--- Comment #6 from 。 <570070308 at qq dot com> ---
A better testcase:
```c
void kkk(void **const pp)
{
void *temp;
__asm__ volatile (
"movq   $0xff, %0\n\t"
"movq   $0xff, %1"
:"=r"(temp), "=m"(*pp)
:
:);
__asm__ volatile(""::"D"(temp):);
}
```

After compiling:
```assemble
kkk:
movq   $0xff, %rdi
movq   $0xff, (%rdi)   # abort
ret
```
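For reference, the documentation passage quoted earlier in this thread suggests combining '&' with the register constraint so that writing the register output cannot affect the address of the memory operand. Applied to the first test case, that could look like the sketch below. It assumes an x86-64 target for the asm path (plain C elsewhere), and `take_and_clear` and `demo_take_and_clear` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the old value of *pp and stores NULL in its place. */
static void *take_and_clear(void **const pp)
{
    void *temp;
#if defined(__x86_64__)
    __asm__ volatile (
        "movq   %1, %0\n\t"     /* temp = *pp */
        "movq   $0, %1"         /* *pp = NULL, address of *pp still intact */
        : "=&r"(temp),          /* '&': not shared with the address of %1 */
          "+m"(*pp));
#else
    temp = *pp;                 /* portable fallback, no inline asm */
    *pp = NULL;
#endif
    return temp;
}

/* Usage example. */
static void demo_take_and_clear(void)
{
    int object = 0;
    void *slot = &object;

    assert(take_and_clear(&slot) == &object);
    assert(slot == NULL);
}
```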