[Bug sanitizer/61422] New: False Asan positive in libopus
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61422

Bug ID: 61422
Summary: False Asan positive in libopus
Product: gcc
Version: 4.10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: sanitizer
Assignee: unassigned at gcc dot gnu.org
Reporter: m.zakirov at samsung dot com
CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org, jakub at gcc dot gnu.org, kcc at gcc dot gnu.org

The bug is reproducible with a simple test like:

//Issue was taken from libopus, ffmpeg library
#include <string.h>

#define NLSF_QUANT_DEL_DEC_STATES 4
#define MAX_LPC_ORDER 16

int main()
{
    int ind_min_max = 1, ind_max_min = 3;
    char ind[ NLSF_QUANT_DEL_DEC_STATES ][ MAX_LPC_ORDER ];
    __asm ("\n" : "=m"(ind_min_max), "=m"(ind_max_min));
    memcpy( ind[ ind_max_min ], ind[ ind_min_max ], MAX_LPC_ORDER * sizeof( char ) );
    return 0;
}

Output will be:

==20809==ERROR: AddressSanitizer: unknown-crash on address 0x7fff92804b00 at pc 0x4f934d bp 0x7fff92804a20 sp 0x7fff92804a00
WRITE of size 16 at 0x7fff92804b00 thread T0
    #0 0x4f934c in main (/home/mzakirov/proj/found_bugs/asan_bug/res.out+0x4f934c)
    #1 0x7f32dd7d776c in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2176c)
    #2 0x40c308 (/home/mzakirov/proj/found_bugs/asan_bug/res.out+0x40c308)

Address 0x7fff92804b00 is located in stack of thread T0 at offset 208 in frame
    #0 0x4f921b in main (/home/mzakirov/proj/found_bugs/asan_bug/res.out+0x4f921b)

  This frame has 3 object(s):
    [32, 36) 'ind_min_max'
    [96, 100) 'ind_max_min'
    [160, 224) 'ind' <== Memory access at offset 208 is inside this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: unknown-crash ??:0 main
Shadow bytes around the buggy address:
  0x1000724f8910: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000724f8920: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000724f8930: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000724f8940: 00 00 00 00 00 00 f1 f1 f1 f1 04 f4 f4 f4 f2 f2
  0x1000724f8950: f2 f2 04 f4 f4 f4 f2 f2 f2 f2 00 00 00 00 00 00
=>0x1000724f8960:[00]00 f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00
  0x1000724f8970: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000724f8980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000724f8990: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000724f89a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000724f89b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:     fa
  Heap right redzone:    fb
  Freed heap region:     fd
  Stack left redzone:    f1
  Stack mid redzone:     f2
  Stack right redzone:   f3
  Stack partial redzone: f4
  Stack after return:    f5
  Stack use after scope: f8
  Global redzone:        f9
  Global init order:     f6
  Poisoned by user:      f7
  Container overflow:    fc
  ASan internal:         fe
==20809==ABORTING
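
The numbers in the ASan output can be cross-checked by hand. The sketch below is only an illustration, not part of the report: it assumes the default x86_64 Linux ASan mapping (shadow = (address >> 3) + 0x7fff8000), and the helper names shadow_address and access_in_object are hypothetical.

```c
#include <stdint.h>

/* Assumed default ASan mapping on x86_64 Linux: one shadow byte covers
   8 application bytes, and the shadow region starts at 0x7fff8000. */
#define ASAN_SHADOW_SCALE  3
#define ASAN_SHADOW_OFFSET 0x7fff8000ULL

/* Hypothetical helper: shadow address that describes an application address. */
static uint64_t shadow_address(uint64_t addr)
{
    return (addr >> ASAN_SHADOW_SCALE) + ASAN_SHADOW_OFFSET;
}

/* Hypothetical helper: returns 1 if the access [access, access + size)
   lies entirely inside the object at frame offsets [obj_begin, obj_end). */
static int access_in_object(unsigned access, unsigned size,
                            unsigned obj_begin, unsigned obj_end)
{
    return access >= obj_begin && access + size <= obj_end;
}
```

With these assumptions, shadow_address(0x7fff92804b00) gives 0x1000724f8960, the row marked "=>" in the dump. The row ind[3] starts at frame offset 160 + 3 * 16 = 208, and access_in_object(208, 16, 160, 224) is true, so the 16-byte write ends exactly at the end of 'ind' and stays in bounds.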
[Bug sanitizer/61422] False Asan positive in libopus
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61422

--- Comment #1 from Marat Zakirov ---
Created attachment 32896
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=32896&action=edit
Proposed patch

Only tested with asan testsuite on x64.
[Bug sanitizer/61422] False Asan positive in libopus
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61422 --- Comment #3 from Marat Zakirov --- I fix it.
[Bug sanitizer/61422] False Asan positive in libopus
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61422

--- Comment #5 from Marat Zakirov ---
Thank you for your quick response, Jakub.
I took this issue from the existing ffmpeg source, so the test is just a truncated version. The following fails with my 4.10 build unless the fix I found is applied:

$ cat test.c
#define N 4
#define M 16

int main ()
{
  int i = 1, j = 3;
  char ind[N][M];
  __builtin_memset( ind, 0, M * N);
  __asm ("\n" : "+m"(i), "+m"(j));
  __builtin_memcpy (ind[j], ind[i], M * 1);
  return 0;
}

$ gcc test.c -fsanitize=address -static-libasan
$ ./a.out
[Bug sanitizer/61422] False Asan positive in libopus
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61422

--- Comment #6 from Marat Zakirov ---
Created attachment 32898
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=32898&action=edit
Proposed patch

Try this. It is mostly the same. No additional patches are needed. I hope it's reproducible.
[Bug translation/61561] New: arm gcc internal error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61561

Bug ID: 61561
Summary: arm gcc internal error
Product: gcc
Version: 4.10.0
Status: UNCONFIRMED
Severity: major
Priority: P3
Component: translation
Assignee: unassigned at gcc dot gnu.org
Reporter: m.zakirov at samsung dot com

Created attachment 32973
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=32973&action=edit
Proposed patch

To reproduce the issue:

1) Configure gcc with arm as the target and x86 as the host.

$ ./configure --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=arm-v7a15v5r2-linux-gnueabi --prefix=/home/mzakirov/proj/gcc_arm_ref/arm-v7a15v5r2 --with-sysroot=/your_arm/sys-root

2) make -j6
3) make install
4) Compile the following:

$ cat ex.c
int dummy(int a);
char a;

void mmm (void)
{
  char dyn[ dummy(3) ];

  a = (char)&dyn[0];
}

$ gcc -O3 ex.c -o ex.o
ex.c: In function ‘mmm’:
ex.c:8:7: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
   a = (char)&dyn[0];
       ^
ex.c:9:1: internal compiler error: in check_rtl, at lra.c:1919
 }
 ^
0x8689e5 check_rtl
        /home/mzakirov/proj/gcc_arm_ref/build.arm.cortex-a15/sources/gcc_1/gcc/lra.c:1919
0x86bde5 lra(_IO_FILE*)
        /home/mzakirov/proj/gcc_arm_ref/build.arm.cortex-a15/sources/gcc_1/gcc/lra.c:2309
0x82b02e do_reload
        /home/mzakirov/proj/gcc_arm_ref/build.arm.cortex-a15/sources/gcc_1/gcc/ira.c:5325
0x82b02e execute
        /home/mzakirov/proj/gcc_arm_ref/build.arm.cortex-a15/sources/gcc_1/gcc/ira.c:5486
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.

Analysis showed that the problem is in an instruction like:

(insn 26 12 18 2 (set (reg:QI 2 r2)
        (reg:QI 13 sp)) ex.c:8 205 {*arm_movqi_insn}
     (nil))

GCC's CSE propagates the sp register into the (char/short) cast of a pointer (a 32-bit value), and the reload phase fails because gcc does not find an applicable template for loading a byte or word from sp. Adding a correct template to arm.md fixes the issue. See the attached patch.

Note: this issue appeared while compiling the Linux kernel for ARM.
[Bug c/65088] New: Does GCC has load/store widening phase?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65088

Bug ID: 65088
Summary: Does GCC has load/store widening phase?
Product: gcc
Version: 5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: m.zakirov at samsung dot com

This example suggests that it doesn't.

$ cat t2.c
int a[2];
int b[2];

int main ()
{
  b[0] = a[0];
  b[1] = a[1];
  return 0;
}

$ gcc t2.c -O3 -S
$ cat t2.s
...
main:
.LFB0:
        .cfi_startproc
        movl    a(%rip), %eax
        movl    %eax, b(%rip)
        movl    a+4(%rip), %eax
        movl    %eax, b+4(%rip)
        xorl    %eax, %eax
        ret
        .cfi_endproc

gcc version is:
commit 71464ecd3a554b889c3bbc53d8874fc532bdf953
Author: trippels
Date: Mon Jan 12 07:53:10 2015 +
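
For illustration, the widening the report asks about can be written by hand. This is a sketch under stated assumptions, not GCC's behaviour: it assumes sizeof(int) == 4, so that each array is exactly 8 bytes, and the function names copy_narrow and copy_wide are hypothetical.

```c
#include <stdint.h>
#include <string.h>

int a[2];
int b[2];

/* The widened form below only works if each array is exactly 8 bytes. */
_Static_assert(sizeof a == sizeof(uint64_t), "a[] must be 8 bytes");

/* The form from the report: two separate 32-bit loads and stores. */
static void copy_narrow(void)
{
    b[0] = a[0];
    b[1] = a[1];
}

/* Hand-widened form: one 64-bit load followed by one 64-bit store.
   memcpy with a constant size keeps the code free of aliasing problems. */
static void copy_wide(void)
{
    uint64_t tmp;
    memcpy(&tmp, a, sizeof tmp);
    memcpy(b, &tmp, sizeof tmp);
}
```

Both functions leave b equal to a. Compilers typically expand a fixed-size 8-byte memcpy into a single move, so copy_wide is roughly what a load/store widening pass would be expected to produce from copy_narrow.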
[Bug c/65088] Does GCC has load/store widening phase?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65088

--- Comment #3 from Marat Zakirov ---
> I think this has been discussed on the gcc mailing list

Marek, could you please share the resulting conclusion, at least for the x86 platform? Why didn't GCC's RTL passes on x86 fold these loads/stores?
[Bug sanitizer/63806] New: #UBSAN ignores signed char possible overflow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63806

Bug ID: 63806
Summary: #UBSAN ignores signed char possible overflow
Product: gcc
Version: 5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: sanitizer
Assignee: unassigned at gcc dot gnu.org
Reporter: m.zakirov at samsung dot com
CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org, jakub at gcc dot gnu.org, kcc at gcc dot gnu.org

For the following example, GCC with ubsan does not construct UBSAN_CHECK_ADD for the signed char return value.

signed char a;
signed char b;

signed char foo ()
{
  return a + b;
}

Dump after ubsan:

foo ()
{
  signed char a.0_2;
  unsigned char a.1_3;
  signed char b.2_4;
  unsigned char b.3_5;
  unsigned char _6;
  signed char _7;

  <bb 2>:
  a.0_2 = a;
  a.1_3 = (unsigned char) a.0_2;
  b.2_4 = b;
  b.3_5 = (unsigned char) b.2_4;
  _6 = a.1_3 + b.3_5;
  _7 = (signed char) _6;
  return _7;

}

Command line to reproduce:

gcc -O3 t.c -fsanitize=signed-integer-overflow -fdump-tree-ubsan -S
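
One point worth keeping in mind when reading the dump: in C, operands narrower than int are promoted to int before the addition, so a + b on two signed char values cannot overflow in the addition itself. Only the conversion of the result back to signed char can lose information, and that conversion is implementation-defined rather than undefined. The sketch below shows a manual range check; the helper name sum_fits_schar is hypothetical.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 when a + b is representable as
   signed char, 0 when the narrowing conversion would change the value. */
static int sum_fits_schar(signed char a, signed char b)
{
    int wide = a + b;   /* both operands are promoted to int first */
    return wide >= SCHAR_MIN && wide <= SCHAR_MAX;
}
```

For example, sum_fits_schar(100, 100) is 0 because the int result 200 does not fit in signed char, while sum_fits_schar(1, 2) is 1.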
[Bug target/43725] Poor instructions selection, scheduling and registers allocation for ARM NEON intrinsics
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43725

Marat Zakirov changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |joseph at codesourcery dot com,
                   |                            |m.zakirov at samsung dot com

--- Comment #7 from Marat Zakirov ---
Another neon alloc issue.

Code:

#include <stdint.h>
#include <arm_neon.h>

extern uint16x8x4_t m0;
extern uint16x8x4_t m1;

void foo(uint16_t * in_ptr)
{
  uint16x8x4_t t0, t1;
  t0 = vld4q_u16((uint16_t *)&in_ptr[0 ]);
  t1 = vld4q_u16((uint16_t *)&in_ptr[64]);
  t0.val[0] *= 333;
  t0.val[1] *= 333;
  t0.val[2] *= 333;
  t0.val[3] *= 333;
  t1.val[0] *= 333;
  t1.val[1] *= 333;
  t1.val[2] *= 333;
  t1.val[3] *= 333;
  m0 = t0;
  m1 = t1;
}

Asm file:

        .vsave {d8, d9, d10, d11, d12, d13, d14, d15}
        add     r1, r0, #160
        vld4.16 {d8, d10, d12, d14}, [r0]
        add     r0, r0, #32
        .pad #64
        sub     sp, sp, #64
        vld4.16 {d16, d18, d20, d22}, [r2]
        movw    r3, #:lower16:m1
        movw    r2, #:lower16:m0
        vldr    d6, .L3
        vldr    d7, .L3+8
        movt    r3, #:upper16:m1
        movt    r2, #:upper16:m0
        vld4.16 {d9, d11, d13, d15}, [r0]
        vld4.16 {d17, d19, d21, d23}, [r1]
        vmul.i16        q12, q3, q4
        vstmia  sp, {d16-d23}           <<< *
        vld1.64 {d4-d5}, [sp:64]        <<< *
        vmul.i16        q13, q3, q5     <<< **
        vmul.i16        q9, q3, q9
        vmul.i16        q14, q3, q6     <<< **
        vmul.i16        q10, q3, q10
        vmul.i16        q8, q3, q2      <<< **, ***
        vmul.i16        q15, q3, q7     <<< **
        vmul.i16        q11, q3, q11
        vstmia  r2, {d24-d31}
        vstmia  r3, {d16-d23}
        add     sp, sp, #64
        @ sp needed
        fldmfdd sp!, {d8-d15}
        bx      lr

So my questions are:
1) Why do we need * and why did the compiler use q2 in ***?
2) Why didn't the compiler reuse registers q5, q6, q2, q7 in **?

Command line:
cc1 -quiet -v t.c -quiet -dumpbase t.c -mfpu=neon -mcpu=cortex-a15 -mfloat-abi=softfp -marm -mtls-dialect=gnu -auxbase-strip t.s -O3 -Wno-error=unused-local-typedefs -version -fdump-tree-all -fdump-rtl-all -funwind-tables -o t.s

gcc version = 4.10.0
--build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=arm-v7a15v5r2-linux-gnueabi

--Marat
[Bug target/61561] arm gcc internal error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61561

Marat Zakirov changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #5 from Marat Zakirov ---
fixed
[Bug sanitizer/61875] ATRIBUTE_NONNULL macro error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61875

Marat Zakirov changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |m.zakirov at samsung dot com

--- Comment #2 from Marat Zakirov ---
We had the same problem in Tizen. But my question to the asan and gcc maintainers is wider. Should asan in gcc build with the -fexceptions option? Or should it ignore this option? Or fail, as it does for now?
[Bug regression/61887] New: vect.exp UNRESOLVED tests
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61887

Bug ID: 61887
Summary: vect.exp UNRESOLVED tests
Product: gcc
Version: 4.10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: regression
Assignee: unassigned at gcc dot gnu.org
Reporter: m.zakirov at samsung dot com

I found that some tests from vect.exp have UNRESOLVED status in the current compiler version because the dump file names produced by the compiler and the names checked by the tests are out of sync.

Example: test bb-slp-10.c expects the name

bb-slp-10.c.124t.slp

but it gets

bb-slp-10.c.124t.slp2

Open bb-slp-10.c and change "slp" to "slp2" in

...
/* { dg-final { scan-tree-dump-times "unsupported alignment in basic block." 1 "slp" { xfail vect_element_align } } } */
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" { target vect_element_align } } } */

The test will then pass, or at least it won't have UNRESOLVED status.

Another UNRESOLVED example is vect-105-big-array.c, and generally all tests that use scan-tree-dump-times together with the -flto option. -flto makes gcc create a file named

vect-105-big-array.exe.ltrans0.114t.vect

which is obviously not supported either.

Configuration:
/home/mzakirov/proj/gcc_unalign/build.arm.cortex-a15/sources/gcc_1/configure --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=arm-linux-gnueabi --with-interwork --enable-long-long --enable-languages=c,c++,fortran --enable-shared --with-gnu-as --with-gnu-ld --with-arch=armv7-a

Run tests:
make -k check RUNTESTFLAGS='vect.exp'

--Marat
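
The mismatch can be modelled with a small glob check. This is only a sketch: it assumes the test harness looks for a dump file named "<source>.<three-digit pass number>t.<suffix>", which is inferred from the names quoted in this report. The real lookup lives in the testsuite's Tcl code and may differ, and the helper name dump_name_matches is hypothetical.

```c
#include <fnmatch.h>
#include <stdio.h>

/* Hypothetical model of the dump lookup: returns 1 if dump_file has the
   form "<source>.<NNN>t.<suffix>", 0 otherwise (also on truncation). */
static int dump_name_matches(const char *source, const char *suffix,
                             const char *dump_file)
{
    char pattern[256];
    int n = snprintf(pattern, sizeof pattern, "%s.[0-9][0-9][0-9]t.%s",
                     source, suffix);
    if (n < 0 || (size_t)n >= sizeof pattern)
        return 0;
    /* POSIX fnmatch returns 0 when the string matches the pattern. */
    return fnmatch(pattern, dump_file, 0) == 0;
}
```

Under this model, the suffix "slp" does not match bb-slp-10.c.124t.slp2 while "slp2" does, and no suffix built from vect-105-big-array.c can match the LTO name vect-105-big-array.exe.ltrans0.114t.vect, because the prefix before the pass number differs.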
[Bug regression/61887] vect.exp UNRESOLVED tests
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61887

--- Comment #1 from Marat Zakirov ---
This issue applies to ARM.
[Bug target/43725] Poor instructions selection, scheduling and registers allocation for ARM NEON intrinsics
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43725

--- Comment #8 from Marat Zakirov ---
UPDATE

With a small fix you can get much better code...

transpose_16x16:
        .fnstart
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        add     r2, r0, #128
        vld4.16 {d24, d26, d28, d30}, [r0]
        add     r1, r0, #160
        vld4.16 {d16, d18, d20, d22}, [r2]
        add     r0, r0, #32
        movw    r3, #:lower16:m1
        vldr    d6, .L2
        vldr    d7, .L2+8(in CSE)
        movw    r2, #:lower16:m0
        movt    r3, #:upper16:m1
        movt    r2, #:upper16:m0
        vld4.16 {d25, d27, d29, d31}, [r0]
        vld4.16 {d17, d19, d21, d23}, [r1]
        vmul.i16        q12, q3, q12
        vmul.i16        q8, q3, q8
        vmul.i16        q13, q3, q13
        vmul.i16        q9, q3, q9
        vmul.i16        q14, q3, q14
        vmul.i16        q10, q3, q10
        vmul.i16        q15, q3, q15
        vmul.i16        q11, q3, q11
        vstmia  r2, {d24-d31}
        vstmia  r3, {d16-d23}
        bx      lr
.L3:

About the fix:

I discovered that the GCC register allocator has 'weak' support for stream (in my case NEON) registers. RA treats stream registers as unsplittable ranges, so if one register of a range becomes free, GCC does not reuse it until the whole range becomes free. That is actually OK, but... I found that the GCC CSE phase performs partial substitution on register ranges, and this leads to a terrible increase in register pressure.

Example

Before CSE

a = b
a0 = a0 * 3
a1 = a1 * 3
a2 = a2 * 3
a3 = a3 * 3

After

a = b
a0 = b0 * 3
a1 = a1 * 3 <<< *
a2 = a2 * 3
a3 = a3 * 3

CSE does not substitute b1 for a1, because at that point (*) a0 has already been redefined, so a != b any more. Yet a1 == b1 still holds; unfortunately, CSE, like RA, does not know how to handle parts of register ranges. And I am not sure that it really is 'unfortunately', because

a0 = b0 * 3
a1 = b1 * 3
a2 = b2 * 3
a3 = b3 * 3

also requires twice as many stream registers as it really needs. My solution here is to forbid CSE for XImode registers.
[Bug target/43725] Poor instructions selection, scheduling and registers allocation for ARM NEON intrinsics
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43725

--- Comment #9 from Marat Zakirov ---
I used the following patch:

diff --git a/gcc/cse.c b/gcc/cse.c
index 34f9364..a9e0442 100644
--- a/gcc/cse.c
+++ b/gcc/cse.c
@@ -2862,6 +2862,9 @@ canon_reg (rtx x, rtx insn)
            || ! REGNO_QTY_VALID_P (REGNO (x)))
          return x;
 
+if (GET_MODE (x) == XImode)
+  return x;
+
        q = REG_QTY (REGNO (x));
        ent = &qty_table[q];
        first = ent->first_reg;
diff --git a/gcc/fwprop.c b/gcc/fwprop.c
index 547fcd6..eadc729 100644
--- a/gcc/fwprop.c
+++ b/gcc/fwprop.c
@@ -1317,6 +1317,9 @@ forward_propagate_and_simplify (df_ref use, rtx def_insn, rtx def_set)
   if (!new_rtx)
     return false;
 
+  if (GET_MODE (reg) == XImode)
+    return false;
+
   return try_fwprop_subst (use, loc, new_rtx, def_insn, set_reg_equal);
 }
[Bug sanitizer/61422] False Asan positive in libopus
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61422

Marat Zakirov changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #8 from Marat Zakirov ---
Already fixed by Gribov commit.
[Bug sanitizer/61875] ATRIBUTE_NONNULL macro error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61875

--- Comment #6 from Marat Zakirov ---
Created attachment 33446
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33446&action=edit
Proposed patch

According to https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00061.html, I think this bug should be closed as invalid. If you want libsanitizer to assert when it encounters -fexceptions, you may use the attached patch.