Re: [RFC6 PATCH v6 00/21] ILP32 for ARM64 - LTP results
On Fri, Apr 22, 2016 at 8:37 PM, Zhangjian (Bamvor) wrote:
> Hi, Yury
>
>
> On 2016/4/6 6:44, Yury Norov wrote:
>>
>> There are about 20 failing tests of 782 in lite scenario.
>> float_bessel
>> float_exp_log
>> float_iperb
>> float_power
>> float_trigo
>> pipeio_1
>> pipeio_3
>> pipeio_5
>> pipeio_8
>> abort01
>> clone02
>> kill11
>> mmap16
>> open12
>> pause01
>> rename11
>> rmdir02
>> umount2_01
>> umount2_02
>> umount2_03
>> utime06
>> mtest06
>>
>> The list is rough because some tests fail not every time.
>>
>> Tests abort01 and kill11 fail for lp64 too, so maybe there's
>> a reason unrelated to ilp32 itself.
>>
>> float_xxx tests fail because they call unwind() from signal context,
>> and GCC for ilp32 has problem with it, as Andrew told.
>
> Is there some progress about this issue. When we talk about unwind
> functions, do you mean the function in libgcc?
>
> We encountered another issue(abort not segfault) which also called
> pthread_cancel(). The test code is in the attachment. Here is the
> backtrace:

Yes, this is a known issue. I have a GCC patch to fix it. Basically,
REG_VALUE_IN_UNWIND_CONTEXT needs to be defined while building libgcc
to support the correct unwind information.
I will be posting a GCC patch to fix this tomorrow. This was a bug
even in the original set of ilp32 patches; I was only finally able to
sit down and fix it today.

Thanks,
Andrew

>
> ```
> Program received signal SIGABRT, Aborted.
> [Switching to Thread 0xf77ee330 (LWP 2958)]
> 0x0040f5bc in raise (sig=sig@entry=6)
>     at ../sysdeps/unix/sysv/linux/raise.c:55
> 55 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
> (gdb) bt
> #0 0x0040f5bc in raise (sig=sig@entry=6)
>     at ../sysdeps/unix/sysv/linux/raise.c:55
> #1 0x0040f884 in abort () at abort.c:89
> #2 0x004073b4 in uw_update_context_1 (
>     context=context@entry=0xf77ec820, fs=fs@entry=0xf77ebec8)
>     at /home/GCC-Build/p660/p660_build_dir/src/gcc-4.9/libgcc/unwind-dw2.c:1430
> #3 0x004078c0 in uw_update_context (context=context@entry=0xf77ec820,
>     fs=fs@entry=0xf77ebec8)
>     at /home/GCC-Build/p660/p660_build_dir/src/gcc-4.9/libgcc/unwind-dw2.c:1506
> #4 0x00407a9c in uw_advance_context (fs=0xf77ebec8, context=0xf77ec820)
>     at /home/GCC-Build/p660/p660_build_dir/src/gcc-4.9/libgcc/unwind-dw2.c:1529
> #5 _Unwind_ForcedUnwind_Phase2 (exc=exc@entry=0xf77ee580,
>     context=context@entry=0xf77ec820)
>     at /home/GCC-Build/p660/p660_build_dir/src/gcc-4.9/libgcc/unwind.inc:185
> #6 0x00408228 in _Unwind_ForcedUnwind (exc=0xf77ee580,
>     stop=stop@entry=0x405440 , stop_argument=0xf77eddd8)
>     at /home/GCC-Build/p660/p660_build_dir/src/gcc-4.9/libgcc/unwind.inc:207
> #7 0x004055c4 in __pthread_unwind (buf=)
>     at unwind.c:126
> #8 0x004050b4 in __do_cancel () at ./pthreadP.h:283
> #9 sigcancel_handler (sig=, si=,
>     ctx=) at nptl-init.c:225
> ---Type to continue, or q to quit---
> #10
> #11 0x in ?? ()
> #12 0x00423084 in __select (nfds=-1, readfds=,
>     writefds=, exceptfds=, timeout=0x0)
>     at ../sysdeps/unix/sysv/linux/generic/select.c:45
> #13 0x00400604 in TEST_TaskDelay (
>     uiMillSecs=)
>     at test-cancel.c:18
> #14 0x00400680 in printids (
>     s=)
>     at test-cancel.c:38
> #15 0x004006d0 in thr_fn (
>     arg=)
>     at test-cancel.c:49
> #16 0x00401b28 in start_thread (arg=0x4a3000) at pthread_create.c:335
> #17 0x00401b28 in start_thread (arg=0x4a3000) at pthread_create.c:335
> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
> ```
>
> Such abort is raise by the following code:
> ```
> static void
> uw_update_context_1 (struct _Unwind_Context *context, _Unwind_FrameState *fs)
> {
>   //...
>   /* Compute this frame's CFA. */
>   switch (fs->regs.cfa_how)
>     {
>     case CFA_REG_OFFSET:
>       cfa = _Unwind_GetPtr (&orig_context, fs->regs.cfa_reg);
>       cfa += fs->regs.cfa_offset;
>       break;
>
>     case CFA_EXP:
>       {
>         const unsigned char *exp = fs->regs.cfa_exp;
>         _uleb128_t len;
>
>         exp = read_uleb128 (exp, &len);
>         cfa = (void *) (_Unwind_Ptr)
>           execute_stack_op (exp, exp + len, &orig_context, 0);
>         break;
>       }
>
>     default:
>       gcc_unreachable ();
>     }
>   context->cfa = cfa;
>   //...
> }
> ```
>
> Any suggestion is appreciated.
>
> CC gcc mailing list. Sorry if it is off topic.
>
> Regards
>
> Bamvor
>
>> pipeio_x tests are very unstable and may fail randomly. I strongly
>> suspect race conditions, as they all work like a charm if pinned to
>> single CPU with taskset. Probably, race is the reason of clone02 too.
>> Though I'm not sure, is the race in kernel, glibc or test itself.
>>
>> But I know for sure that pause01 fails due to test design:
>> if (setitimer(ITIMER_REAL, &it, NULL)) /
Where to find global var declaration
Hello,

I tried to add a new global declaration of a pointer and I expected to
see it in varpool nodes, but it does not appear there.

    ustackptr = build_decl (UNKNOWN_LOCATION,
                            VAR_DECL, get_identifier ("ustackptr"),
                            build_pointer_type(void_type_node));
    TREE_ADDRESSABLE (ustackptr) = 1;
    TREE_USED (ustackptr) = 1;
    rest_of_decl_compilation (ustackptr, 1, 0);

and

    struct varpool_node *node;
    FOR_EACH_VARIABLE (node) {
        fprintf(stdout, "%s\n", get_name(node->decl));
    }

Thanks!
RE: Question on TARGET_MMX and X86_TUNE_GENERAL_REGS_SSE_SPILL
Hi,

> -----Original Message-----
> From: Ilya Enkovich [mailto:enkovich@gmail.com]
> Sent: Tuesday, April 26, 2016 7:09 PM
> To: Kumar, Venkataramanan
> Cc: vmaka...@redhat.com; gcc@gcc.gnu.org; Uros Bizjak (ubiz...@gmail.com)
> Subject: Re: Question on TARGET_MMX and X86_TUNE_GENERAL_REGS_SSE_SPILL
>
> 2016-04-14 8:39 GMT+03:00 Kumar, Venkataramanan:
> > Hi,
> >
> > X86_TUNE_GENERAL_REGS_SSE_SPILL: Try to spill general regs to SSE regs instead of memory.
> >
> > I tried enabling the above tuning with -march=bdver4 -Ofast -mtune-ctrl=general_regs_sse_spill.
> > I did not find any code differences.
> >
> > Looking at the below code to enable this tune, mmx ISA needs to be turned off.
> >
> > static reg_class_t
> > ix86_spill_class (reg_class_t rclass, machine_mode mode) {
> >   if (TARGET_SSE && TARGET_GENERAL_REGS_SSE_SPILL && ! TARGET_MMX
> >       && (mode == SImode || (TARGET_64BIT && mode == DImode))
> >       && rclass != NO_REGS && INTEGER_CLASS_P (rclass))
> >     return ALL_SSE_REGS;
> >   return NO_REGS;
> > }
> >
> > All processor variants enable MMX by default and why we need to switch off mmx?
>
> That really looks weird to me. I ran SPEC2006 on Ofast + LTO with and
> without -mno-mmx and -mno-mmx gives (Haswell machine):
>
> SPEC2006INT :+0.30%
> SPEC2006FP :+0.60%
> SPEC2006ALL :+0.48%
>
> Which is quite surprising for disabling a hardware feature hardly used
> anywhere now.

As I said, without MMX (-mno-mmx) the X86_TUNE_GENERAL_REGS_SSE_SPILL
tuning may be active now. I am not sure whether there is any other reason.

> Thanks,
> Ilya
>
> > Thanks and regards,
> > Venkat.

Regards,
Venkat.
GCC 6.1 Released
After slightly more than a year since the last major GCC release, we
are proud to announce a new major GCC release, 6.1.

GCC 6.1 is a major release containing substantial new functionality not
available in GCC 5.x or previous GCC releases.

The C++ frontend now defaults to the C++14 standard instead of the
C++98 it defaulted to previously. Older C++ code might require either
explicitly compiling with a selected older C++ standard or some code
adjustment; see http://gcc.gnu.org/gcc-6/porting_to.html for details.
The experimental C++17 support has been enhanced in this release.

This release features various improvements in the emitted diagnostics,
including improved locations, location ranges, suggestions for
misspelled identifiers, option names etc., and fix-it hints. A couple
of new warnings have been added as well.

The OpenMP 4.5 specification is fully supported in this new release,
and the compiler can be configured for OpenMP offloading to Intel
XeonPhi Knights Landing and AMD HSAIL. The OpenACC 2.0a specification
support has been much improved, with offloading to NVidia PTX.

The optimizers have been improved, with improvements appearing in all
of intra-procedural optimizations, inter-procedural optimizations, link
time optimizations and various target backends.

See

  https://gcc.gnu.org/gcc-6/changes.html

for more information about changes in GCC 6.1.

This release is available from the FTP servers listed here:

  http://www.gnu.org/order/ftp.html

The release is in the gcc/gcc-6.1.0/ subdirectory.

If you encounter difficulties using GCC 6.1, please do not contact me
directly. Instead, please visit http://gcc.gnu.org for information
about getting help.

Driving a leading free software project such as the GNU Compiler
Collection would not be possible without support from its many
contributors: not only its developers, but especially its regular
testers and users, who contribute to its high quality.

The list of individuals is too large to thank individually!
GCC 6.1.1 Status Report (2015-05-27)
Status
======

GCC 6.1 has been released, branches/gcc-6-branch now identifies itself
as 6.1.1 and is now open again under the usual release branch rules
(regression fixes and documentation fixes only).

The next release, 6.2, should be released in about two or three months
from now, unless something very urgent forces us to release earlier.

Quality Data
============

Priority          #   Change from last report
--------        ---   -----------------------
P1                0   +-  0
P2               79   -   1
P3               15   +   6
P4              100   +   1
P5               29   +-  0
--------        ---   -----------------------
Total P1-P3      94   +   5
Total           224   +   6

Previous Report
===============

https://gcc.gnu.org/ml/gcc/2016-04/msg00103.html
Re: Question on TARGET_MMX and X86_TUNE_GENERAL_REGS_SSE_SPILL
2016-04-27 14:35 GMT+03:00 Kumar, Venkataramanan:
> Hi ,
>
>> -----Original Message-----
>> From: Ilya Enkovich [mailto:enkovich@gmail.com]
>> Sent: Tuesday, April 26, 2016 7:09 PM
>> To: Kumar, Venkataramanan
>> Cc: vmaka...@redhat.com; gcc@gcc.gnu.org; Uros Bizjak (ubiz...@gmail.com)
>> Subject: Re: Question on TARGET_MMX and X86_TUNE_GENERAL_REGS_SSE_SPILL
>>
>> 2016-04-14 8:39 GMT+03:00 Kumar, Venkataramanan:
>> > Hi,
>> >
>> > X86_TUNE_GENERAL_REGS_SSE_SPILL: Try to spill general regs to SSE regs instead of memory.
>> >
>> > I tried enabling the above tuning with -march=bdver4 -Ofast -mtune-ctrl=general_regs_sse_spill.
>> > I did not find any code differences.
>> >
>> > Looking at the below code to enable this tune, mmx ISA needs to be turned off.
>> >
>> > static reg_class_t
>> > ix86_spill_class (reg_class_t rclass, machine_mode mode) {
>> >   if (TARGET_SSE && TARGET_GENERAL_REGS_SSE_SPILL && ! TARGET_MMX
>> >       && (mode == SImode || (TARGET_64BIT && mode == DImode))
>> >       && rclass != NO_REGS && INTEGER_CLASS_P (rclass))
>> >     return ALL_SSE_REGS;
>> >   return NO_REGS;
>> > }
>> >
>> > All processor variants enable MMX by default and why we need to switch off mmx?
>>
>> That really looks weird to me. I ran SPEC2006 on Ofast + LTO with and
>> without -mno-mmx and -mno-mmx gives (Haswell machine):
>>
>> SPEC2006INT :+0.30%
>> SPEC2006FP :+0.60%
>> SPEC2006ALL :+0.48%
>>
>> Which is quite surprising for disabling a hardware feature hardly used
>> anywhere now.
>
> As I said without mmx (-mno-mmx), the tune X86_TUNE_GENERAL_REGS_SSE_SPILL may be active now.
> Not sure if there are any other reason.

Surely that should be the main reason I see a performance gain.
So I want to ask the same question as you did: why does this important
performance feature require MMX to be disabled? This restriction has
existed from the very start of X86_TUNE_GENERAL_REGS_SSE_SPILL (at
least in trunk), and there are no comments on why we have it.

Did you try to remove !TARGET_MMX and see what happens?

Thanks,
Ilya

>> Thanks,
>> Ilya
>>
>> > Thanks and regards,
>> > Venkat.
>
> Regards,
> Venkat.
Update gcc 7.0.0 status on main page?
Hi all,

the web page at http://gcc.gnu.org still links to the gcc 7 status
report from March 10, but there is a more recent one from April 15.
Could this please be updated?

Cheers,
Martin Reinecke
Re: Where to find global var declaration
On Wed, 2016-04-27 at 12:34 +0300, Cristina Georgiana Opriceana wrote:
> Hello,
>
> I tried to add a new global declaration of a pointer and I expected
> to see it in varpool nodes, but it does not appear there.
>
> ustackptr = build_decl (UNKNOWN_LOCATION,
>                         VAR_DECL, get_identifier ("ustackptr"),
>                         build_pointer_type(void_type_node));
> TREE_ADDRESSABLE (ustackptr) = 1;
> TREE_USED (ustackptr) = 1;
> rest_of_decl_compilation (ustackptr, 1, 0);
>
> and
>
> struct varpool_node *node;
> FOR_EACH_VARIABLE (node) {
>     fprintf(stdout, "%s\n", get_name(node->decl));
> }

FWIW, in the jit "frontend", I wasn't aware of
rest_of_decl_compilation. Instead I have the following code for
creating a global variable, which calls varpool_node::get_create and
varpool_node::finalize_decl directly on the VAR_DECL instance.

That said, maybe rest_of_decl_compilation is the best approach, but I'm
not sure why it isn't working for you. (I'm not an expert at this, I
copied from the C frontend and hacked it up till it worked).

This is from gcc/jit/jit-playback.c (which has a family of wrapper
classes around "tree", but hopefully the idea is clear):

/* Construct a playback::lvalue instance (wrapping a tree). */

playback::lvalue *
playback::context::
new_global (location *loc,
            enum gcc_jit_global_kind kind,
            type *type,
            const char *name)
{
  gcc_assert (type);
  gcc_assert (name);
  tree inner = build_decl (UNKNOWN_LOCATION, VAR_DECL,
                           get_identifier (name),
                           type->as_tree ());
  TREE_PUBLIC (inner) = (kind != GCC_JIT_GLOBAL_INTERNAL);
  DECL_COMMON (inner) = 1;
  switch (kind)
    {
    default:
      gcc_unreachable ();

    case GCC_JIT_GLOBAL_EXPORTED:
      TREE_STATIC (inner) = 1;
      break;

    case GCC_JIT_GLOBAL_INTERNAL:
      TREE_STATIC (inner) = 1;
      break;

    case GCC_JIT_GLOBAL_IMPORTED:
      DECL_EXTERNAL (inner) = 1;
      break;
    }

  if (loc)
    set_tree_location (inner, loc);

  varpool_node::get_create (inner);

  varpool_node::finalize_decl (inner);

  m_globals.safe_push (inner);

  return new lvalue (this, inner);
}

Hope this is helpful
Dave
RE: Question on TARGET_MMX and X86_TUNE_GENERAL_REGS_SSE_SPILL
Hi,

> -----Original Message-----
> From: Ilya Enkovich [mailto:enkovich@gmail.com]
> Sent: Wednesday, April 27, 2016 5:35 PM
> To: Kumar, Venkataramanan
> Cc: vmaka...@redhat.com; gcc@gcc.gnu.org; Uros Bizjak (ubiz...@gmail.com)
> Subject: Re: Question on TARGET_MMX and X86_TUNE_GENERAL_REGS_SSE_SPILL
>
> 2016-04-27 14:35 GMT+03:00 Kumar, Venkataramanan:
> > Hi ,
> >
> >> -----Original Message-----
> >> From: Ilya Enkovich [mailto:enkovich@gmail.com]
> >> Sent: Tuesday, April 26, 2016 7:09 PM
> >> To: Kumar, Venkataramanan
> >> Cc: vmaka...@redhat.com; gcc@gcc.gnu.org; Uros Bizjak (ubiz...@gmail.com)
> >> Subject: Re: Question on TARGET_MMX and X86_TUNE_GENERAL_REGS_SSE_SPILL
> >>
> >> 2016-04-14 8:39 GMT+03:00 Kumar, Venkataramanan:
> >> > Hi,
> >> >
> >> > X86_TUNE_GENERAL_REGS_SSE_SPILL: Try to spill general regs to SSE regs instead of memory.
> >> >
> >> > I tried enabling the above tuning with -march=bdver4 -Ofast -mtune-ctrl=general_regs_sse_spill.
> >> > I did not find any code differences.
> >> >
> >> > Looking at the below code to enable this tune, mmx ISA needs to be turned off.
> >> >
> >> > static reg_class_t
> >> > ix86_spill_class (reg_class_t rclass, machine_mode mode) {
> >> >   if (TARGET_SSE && TARGET_GENERAL_REGS_SSE_SPILL && ! TARGET_MMX
> >> >       && (mode == SImode || (TARGET_64BIT && mode == DImode))
> >> >       && rclass != NO_REGS && INTEGER_CLASS_P (rclass))
> >> >     return ALL_SSE_REGS;
> >> >   return NO_REGS;
> >> > }
> >> >
> >> > All processor variants enable MMX by default and why we need to switch off mmx?
> >>
> >> That really looks weird to me. I ran SPEC2006 on Ofast + LTO with
> >> and without -mno-mmx and -mno-mmx gives (Haswell machine):
> >>
> >> SPEC2006INT :+0.30%
> >> SPEC2006FP :+0.60%
> >> SPEC2006ALL :+0.48%
> >>
> >> Which is quite surprising for disabling a hardware feature hardly
> >> used anywhere now.
> >
> > As I said without mmx (-mno-mmx), the tune X86_TUNE_GENERAL_REGS_SSE_SPILL may be active now.
> > Not sure if there are any other reason.
>
> Surely that should be the main reason I see performance gain.
> So I want to ask the same question as you did: why does this important
> performance feature requires disabled MMX. This restriction exists from the
> very start of X86_TUNE_GENERAL_REGS_SSE_SPILL existence (at least in
> trunk) and no comments on why we have this restriction.

I was told by Uros that the TARGET_MMX check is there to prevent
intreg <-> MMX moves that clobber stack registers.

> Did you try to remove !TARGET_MMX and see what happens?

Yes, I tried on SPEC2006 but did not find any benefit.

> Thanks,
> Ilya
>
> >> Thanks,
> >> Ilya
> >>
> >> > Thanks and regards,
> >> > Venkat.

Regards,
Venkat.
Re: Where to find global var declaration
On Wed, Apr 27, 2016 at 4:54 PM, David Malcolm wrote:
> On Wed, 2016-04-27 at 12:34 +0300, Cristina Georgiana Opriceana wrote:
>> Hello,
>>
>> I tried to add a new global declaration of a pointer and I expected
>> to see it in varpool nodes, but it does not appear there.
>>
>> ustackptr = build_decl (UNKNOWN_LOCATION,
>>                         VAR_DECL, get_identifier ("ustackptr"),
>>                         build_pointer_type(void_type_node));
>> TREE_ADDRESSABLE (ustackptr) = 1;
>> TREE_USED (ustackptr) = 1;
>> rest_of_decl_compilation (ustackptr, 1, 0);
>>
>> and
>>
>> struct varpool_node *node;
>> FOR_EACH_VARIABLE (node) {
>>     fprintf(stdout, "%s\n", get_name(node->decl));
>> }
>
> FWIW, in the jit "frontend", I wasn't aware of
> rest_of_decl_compilation. Instead I have the following code for
> creating a global variable, which calls varpool_node::get_create and
> varpool_node::finalize_decl directly on the VAR_DECL instance.
>
> That said, maybe rest_of_decl_compilation is the best approach, but I'm
> not sure why it isn't working for you. (I'm not an expert at this, I
> copied from the C frontend and hacked it up till it worked).
>
> This is from gcc/jit/jit-playback.c (which has a family of wrapper classes
> around "tree", but hopefully the idea is clear):
>
> /* Construct a playback::lvalue instance (wrapping a tree). */
>
> playback::lvalue *
> playback::context::
> new_global (location *loc,
>             enum gcc_jit_global_kind kind,
>             type *type,
>             const char *name)
> {
>   gcc_assert (type);
>   gcc_assert (name);
>   tree inner = build_decl (UNKNOWN_LOCATION, VAR_DECL,
>                            get_identifier (name),
>                            type->as_tree ());
>   TREE_PUBLIC (inner) = (kind != GCC_JIT_GLOBAL_INTERNAL);
>   DECL_COMMON (inner) = 1;
>   switch (kind)
>     {
>     default:
>       gcc_unreachable ();
>
>     case GCC_JIT_GLOBAL_EXPORTED:
>       TREE_STATIC (inner) = 1;
>       break;
>
>     case GCC_JIT_GLOBAL_INTERNAL:
>       TREE_STATIC (inner) = 1;
>       break;
>
>     case GCC_JIT_GLOBAL_IMPORTED:
>       DECL_EXTERNAL (inner) = 1;
>       break;
>     }
>
>   if (loc)
>     set_tree_location (inner, loc);
>
>   varpool_node::get_create (inner);
>
>   varpool_node::finalize_decl (inner);
>
>   m_globals.safe_push (inner);
>
>   return new lvalue (this, inner);
> }
>
>
> Hope this is helpful

I've checked rest_of_decl_compilation against your steps and
apparently I had not set the storage to be static. I thought it would
automatically be set to 1 for global vars.

Thanks!
Cristina

> Dave
Re: Question on TARGET_MMX and X86_TUNE_GENERAL_REGS_SSE_SPILL
2016-04-27 17:06 GMT+03:00 Kumar, Venkataramanan:
> Hi,
>
>> -----Original Message-----
>> From: Ilya Enkovich [mailto:enkovich@gmail.com]
>> Sent: Wednesday, April 27, 2016 5:35 PM
>> To: Kumar, Venkataramanan
>> Cc: vmaka...@redhat.com; gcc@gcc.gnu.org; Uros Bizjak (ubiz...@gmail.com)
>> Subject: Re: Question on TARGET_MMX and X86_TUNE_GENERAL_REGS_SSE_SPILL
>>
>> 2016-04-27 14:35 GMT+03:00 Kumar, Venkataramanan:
>> > Hi ,
>> >
>> >> -----Original Message-----
>> >> From: Ilya Enkovich [mailto:enkovich@gmail.com]
>> >> Sent: Tuesday, April 26, 2016 7:09 PM
>> >> To: Kumar, Venkataramanan
>> >> Cc: vmaka...@redhat.com; gcc@gcc.gnu.org; Uros Bizjak (ubiz...@gmail.com)
>> >> Subject: Re: Question on TARGET_MMX and X86_TUNE_GENERAL_REGS_SSE_SPILL
>> >>
>> >> 2016-04-14 8:39 GMT+03:00 Kumar, Venkataramanan:
>> >> > Hi,
>> >> >
>> >> > X86_TUNE_GENERAL_REGS_SSE_SPILL: Try to spill general regs to SSE regs instead of memory.
>> >> >
>> >> > I tried enabling the above tuning with -march=bdver4 -Ofast -mtune-ctrl=general_regs_sse_spill.
>> >> > I did not find any code differences.
>> >> >
>> >> > Looking at the below code to enable this tune, mmx ISA needs to be turned off.
>> >> >
>> >> > static reg_class_t
>> >> > ix86_spill_class (reg_class_t rclass, machine_mode mode) {
>> >> >   if (TARGET_SSE && TARGET_GENERAL_REGS_SSE_SPILL && ! TARGET_MMX
>> >> >       && (mode == SImode || (TARGET_64BIT && mode == DImode))
>> >> >       && rclass != NO_REGS && INTEGER_CLASS_P (rclass))
>> >> >     return ALL_SSE_REGS;
>> >> >   return NO_REGS;
>> >> > }
>> >> >
>> >> > All processor variants enable MMX by default and why we need to switch off mmx?
>> >>
>> >> That really looks weird to me. I ran SPEC2006 on Ofast + LTO with
>> >> and without -mno-mmx and -mno-mmx gives (Haswell machine):
>> >>
>> >> SPEC2006INT :+0.30%
>> >> SPEC2006FP :+0.60%
>> >> SPEC2006ALL :+0.48%
>> >>
>> >> Which is quite surprising for disabling a hardware feature hardly
>> >> used anywhere now.
>> >
>> > As I said without mmx (-mno-mmx), the tune X86_TUNE_GENERAL_REGS_SSE_SPILL may be active now.
>> > Not sure if there are any other reason.
>>
>> Surely that should be the main reason I see performance gain.
>> So I want to ask the same question as you did: why does this important
>> performance feature requires disabled MMX. This restriction exists from the
>> very start of X86_TUNE_GENERAL_REGS_SSE_SPILL existence (at least in
>> trunk) and no comments on why we have this restriction.
>
> I was told by Uros, that using TARGET_MMX is to prevent intreg <-> MMX moves
> that clobber stack registers.

ix86_spill_class is supposed to return a register class to be used
to store general purpose registers. It returns ALL_SSE_REGS, which
doesn't intersect with the MMX_REGS class. So I don't see why
intreg <-> MMX moves may appear. And if those moves do appear, we
should fix that, not disable the whole feature.

@Uros, do you have a comment here?

Thanks,
Ilya

>> Did you try to remove !TARGET_MMX and see what happens?
>>
> Yes, I tried on SPEC2006 but did not find any benefit.
>
>> Thanks,
>> Ilya
>>
>> >> Thanks,
>> >> Ilya
>> >>
>> >> > Thanks and regards,
>> >> > Venkat.
>
> Regards,
> Venkat.
Re: Question on TARGET_MMX and X86_TUNE_GENERAL_REGS_SSE_SPILL
On Wed, Apr 27, 2016 at 4:26 PM, Ilya Enkovich wrote:

>>> >> > X86_TUNE_GENERAL_REGS_SSE_SPILL: Try to spill general regs to SSE regs instead of memory.
>>> >> >
>>> >> > I tried enabling the above tuning with -march=bdver4 -Ofast -mtune-ctrl=general_regs_sse_spill.
>>> >> > I did not find any code differences.
>>> >> >
>>> >> > Looking at the below code to enable this tune, mmx ISA needs to be turned off.
>>> >> >
>>> >> > static reg_class_t
>>> >> > ix86_spill_class (reg_class_t rclass, machine_mode mode) {
>>> >> >   if (TARGET_SSE && TARGET_GENERAL_REGS_SSE_SPILL && ! TARGET_MMX
>>> >> >       && (mode == SImode || (TARGET_64BIT && mode == DImode))
>>> >> >       && rclass != NO_REGS && INTEGER_CLASS_P (rclass))
>>> >> >     return ALL_SSE_REGS;
>>> >> >   return NO_REGS;
>>> >> > }
>>> >> >
>>> >> > All processor variants enable MMX by default and why we need to switch off mmx?
>>> >>
>>> >> That really looks weird to me. I ran SPEC2006 on Ofast + LTO with
>>> >> and without -mno-mmx and -mno-mmx gives (Haswell machine):
>>> >>
>>> >> SPEC2006INT :+0.30%
>>> >> SPEC2006FP :+0.60%
>>> >> SPEC2006ALL :+0.48%
>>> >>
>>> >> Which is quite surprising for disabling a hardware feature hardly
>>> >> used anywhere now.
>>> >
>>> > As I said without mmx (-mno-mmx), the tune X86_TUNE_GENERAL_REGS_SSE_SPILL may be active now.
>>> > Not sure if there are any other reason.
>>>
>>> Surely that should be the main reason I see performance gain.
>>> So I want to ask the same question as you did: why does this important
>>> performance feature requires disabled MMX. This restriction exists from the
>>> very start of X86_TUNE_GENERAL_REGS_SSE_SPILL existence (at least in
>>> trunk) and no comments on why we have this restriction.
>>
>> I was told by Uros, that using TARGET_MMX is to prevent intreg <-> MMX
>> moves that clobber stack registers.
>
> ix86_spill_class is supposed to return a register class to be used
> to store general purpose registers. It returns ALL_SSE_REGS which
> doesn't intersect with MMX_REGS class. So I don't see why
> intreg <-> MMX moves may appear. And if those moves appear we should
> fix it, not disable the whole feature.
>
> @Uros, do you have a comment here?

Looking at the implementation of ix86_spill_class, the TARGET_MMX
check really looks too restrictive. However, we need to check
TARGET_SSE2 and TARGET_INTERUNIT_MOVES instead, otherwise the
movq xmm <-> intreg pattern gets disabled.

This change should be OK then, but just in case, an SSE2-enabled
-mfpmath=i387 32bit SPEC run should uncover unwanted MMX instructions.

Uros.
Re: Question on TARGET_MMX and X86_TUNE_GENERAL_REGS_SSE_SPILL
On Wed, Apr 27, 2016 at 4:39 PM, Uros Bizjak wrote:
> On Wed, Apr 27, 2016 at 4:26 PM, Ilya Enkovich wrote:
>
>>>> >> > X86_TUNE_GENERAL_REGS_SSE_SPILL: Try to spill general regs to SSE regs instead of memory.
>>>> >> >
>>>> >> > I tried enabling the above tuning with -march=bdver4 -Ofast -mtune-ctrl=general_regs_sse_spill.
>>>> >> > I did not find any code differences.
>>>> >> >
>>>> >> > Looking at the below code to enable this tune, mmx ISA needs to be turned off.
>>>> >> >
>>>> >> > static reg_class_t
>>>> >> > ix86_spill_class (reg_class_t rclass, machine_mode mode) {
>>>> >> >   if (TARGET_SSE && TARGET_GENERAL_REGS_SSE_SPILL && ! TARGET_MMX
>>>> >> >       && (mode == SImode || (TARGET_64BIT && mode == DImode))
>>>> >> >       && rclass != NO_REGS && INTEGER_CLASS_P (rclass))
>>>> >> >     return ALL_SSE_REGS;
>>>> >> >   return NO_REGS;
>>>> >> > }
>>>> >> >
>>>> >> > All processor variants enable MMX by default and why we need to switch off mmx?
>>>> >>
>>>> >> That really looks weird to me. I ran SPEC2006 on Ofast + LTO with
>>>> >> and without -mno-mmx and -mno-mmx gives (Haswell machine):
>>>> >>
>>>> >> SPEC2006INT :+0.30%
>>>> >> SPEC2006FP :+0.60%
>>>> >> SPEC2006ALL :+0.48%
>>>> >>
>>>> >> Which is quite surprising for disabling a hardware feature hardly
>>>> >> used anywhere now.
>>>> >
>>>> > As I said without mmx (-mno-mmx), the tune X86_TUNE_GENERAL_REGS_SSE_SPILL may be active now.
>>>> > Not sure if there are any other reason.
>>>>
>>>> Surely that should be the main reason I see performance gain.
>>>> So I want to ask the same question as you did: why does this important
>>>> performance feature requires disabled MMX. This restriction exists from the
>>>> very start of X86_TUNE_GENERAL_REGS_SSE_SPILL existence (at least in
>>>> trunk) and no comments on why we have this restriction.
>>>
>>> I was told by Uros, that using TARGET_MMX is to prevent intreg <-> MMX
>>> moves that clobber stack registers.
>>
>> ix86_spill_class is supposed to return a register class to be used
>> to store general purpose registers. It returns ALL_SSE_REGS which
>> doesn't intersect with MMX_REGS class. So I don't see why
>> intreg <-> MMX moves may appear. And if those moves appear we should
>> fix it, not disable the whole feature.
>>
>> @Uros, do you have a comment here?
>
> Looking at the implementation of ix86_spill_class, TARGET_MMX check
> really looks too restrictive. However, we need to check TARGET_SSE2
> and TARGET_INTERUNIT_MOVES instead, otherwise movq xmm <-> intreg
> pattern gets disabled

I'm testing following patch:

--cut here--
Index: i386.c
===================================================================
--- i386.c (revision 235516)
+++ i386.c (working copy)
@@ -53560,9 +53560,12 @@
 static reg_class_t
 ix86_spill_class (reg_class_t rclass, machine_mode mode)
 {
-  if (TARGET_SSE && TARGET_GENERAL_REGS_SSE_SPILL && ! TARGET_MMX
+  if (TARGET_GENERAL_REGS_SSE_SPILL
+      && TARGET_SSE2
+      && TARGET_INTER_UNIT_MOVES_TO_VEC
+      && TARGET_INTER_UNIT_MOVES_FROM_VEC
       && (mode == SImode || (TARGET_64BIT && mode == DImode))
-      && rclass != NO_REGS && INTEGER_CLASS_P (rclass))
+      && INTEGER_CLASS_P (rclass))
     return ALL_SSE_REGS;
   return NO_REGS;
 }
--cut here--

Uros.
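
To make the effect of the patch in this message concrete, the old and new conditions can be modelled as plain boolean predicates. This is only a sketch under stated assumptions: `struct target_flags`, `enum toy_mode` and the function names are hypothetical stand-ins for GCC's TARGET_* macros and machine modes, INTEGER_CLASS_P (rclass) is reduced to a boolean argument, and the old `rclass != NO_REGS` test is not modelled (rclass is assumed to be a non-empty class).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's TARGET_* macros.  In GCC these are
   derived from the command line and the selected CPU, not from a struct.  */
struct target_flags
{
  bool sse;
  bool sse2;
  bool mmx;
  bool is_64bit;
  bool general_regs_sse_spill;
  bool inter_unit_moves_to_vec;
  bool inter_unit_moves_from_vec;
};

/* Toy replacement for machine_mode; only the modes the check names.  */
enum toy_mode { TOY_SIMODE, TOY_DIMODE, TOY_OTHER_MODE };

/* Mode test shared by both versions of the condition.  */
static bool
mode_ok (const struct target_flags *t, enum toy_mode mode)
{
  return mode == TOY_SIMODE || (t->is_64bit && mode == TOY_DIMODE);
}

/* Condition before the patch: requires MMX to be disabled.
   INTEGER_CLASS is the caller's answer to INTEGER_CLASS_P (rclass).  */
bool
spills_to_sse_old (const struct target_flags *t, enum toy_mode mode,
                   bool integer_class)
{
  return t->sse && t->general_regs_sse_spill && !t->mmx
         && mode_ok (t, mode) && integer_class;
}

/* Condition after the patch: MMX no longer matters, but SSE2 and both
   inter-unit move tunings are required.  */
bool
spills_to_sse_new (const struct target_flags *t, enum toy_mode mode,
                   bool integer_class)
{
  return t->general_regs_sse_spill && t->sse2
         && t->inter_unit_moves_to_vec && t->inter_unit_moves_from_vec
         && mode_ok (t, mode) && integer_class;
}

/* With MMX enabled, only the new condition selects SSE registers.  */
void
check_mmx_enabled_case (void)
{
  struct target_flags t = {
    .sse = true, .sse2 = true, .mmx = true, .is_64bit = true,
    .general_regs_sse_spill = true,
    .inter_unit_moves_to_vec = true, .inter_unit_moves_from_vec = true
  };
  assert (!spills_to_sse_old (&t, TOY_DIMODE, true));
  assert (spills_to_sse_new (&t, TOY_DIMODE, true));
}
```

Calling `check_mmx_enabled_case ()` exercises the case discussed in the thread: with MMX on, the old predicate never chooses SSE registers as the spill class, while the new one does when SSE2 and the inter-unit move tunings are available.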
Re: [RFC6 PATCH v6 00/21] ILP32 for ARM64 - LTP results
On Wed, Apr 27, 2016 at 12:30 AM, Andrew Pinski wrote:
> On Fri, Apr 22, 2016 at 8:37 PM, Zhangjian (Bamvor) wrote:
>> Hi, Yury
>>
>>
>> On 2016/4/6 6:44, Yury Norov wrote:
>>>
>>> There are about 20 failing tests of 782 in lite scenario.
>>> float_bessel
>>> float_exp_log
>>> float_iperb
>>> float_power
>>> float_trigo
>>> pipeio_1
>>> pipeio_3
>>> pipeio_5
>>> pipeio_8
>>> abort01
>>> clone02
>>> kill11
>>> mmap16
>>> open12
>>> pause01
>>> rename11
>>> rmdir02
>>> umount2_01
>>> umount2_02
>>> umount2_03
>>> utime06
>>> mtest06
>>>
>>> The list is rough because some tests fail not every time.
>>>
>>> Tests abort01 and kill11 fail for lp64 too, so maybe there's
>>> a reason unrelated to ilp32 itself.
>>>
>>> float_xxx tests fail because they call unwind() from signal context,
>>> and GCC for ilp32 has problem with it, as Andrew told.
>>
>> Is there some progress about this issue. When we talk about unwind
>> functions, do you mean the function in libgcc?
>>
>> We encountered another issue(abort not segfault) which also called
>> pthread_cancel(). The test code is in the attachment. Here is the
>> backtrace:
>
> Yes this was a known issue I knew about. I have a patch GCC to fix
> this. Basically REG_VALUE_IN_UNWIND_CONTEXT needs to be defined while
> building libgcc to support the correct unwind information.
> I will be posting a GCC patch to fix this tomorrow. This was a bug
> even in the original set of ilp32 patches. I only finally was able to
> sit down and fix it today.

Here is the link to the GCC patch which I said was going to submit today:
https://gcc.gnu.org/ml/gcc-patches/2016-04/msg01726.html

Thanks,
Andrew

>
>
> Thanks,
> Andrew
>
>>
>> ```
>> Program received signal SIGABRT, Aborted.
>> [Switching to Thread 0xf77ee330 (LWP 2958)]
>> 0x0040f5bc in raise (sig=sig@entry=6)
>>     at ../sysdeps/unix/sysv/linux/raise.c:55
>> 55 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
>> (gdb) bt
>> #0 0x0040f5bc in raise (sig=sig@entry=6)
>>     at ../sysdeps/unix/sysv/linux/raise.c:55
>> #1 0x0040f884 in abort () at abort.c:89
>> #2 0x004073b4 in uw_update_context_1 (
>>     context=context@entry=0xf77ec820, fs=fs@entry=0xf77ebec8)
>>     at /home/GCC-Build/p660/p660_build_dir/src/gcc-4.9/libgcc/unwind-dw2.c:1430
>> #3 0x004078c0 in uw_update_context (context=context@entry=0xf77ec820,
>>     fs=fs@entry=0xf77ebec8)
>>     at /home/GCC-Build/p660/p660_build_dir/src/gcc-4.9/libgcc/unwind-dw2.c:1506
>> #4 0x00407a9c in uw_advance_context (fs=0xf77ebec8, context=0xf77ec820)
>>     at /home/GCC-Build/p660/p660_build_dir/src/gcc-4.9/libgcc/unwind-dw2.c:1529
>> #5 _Unwind_ForcedUnwind_Phase2 (exc=exc@entry=0xf77ee580,
>>     context=context@entry=0xf77ec820)
>>     at /home/GCC-Build/p660/p660_build_dir/src/gcc-4.9/libgcc/unwind.inc:185
>> #6 0x00408228 in _Unwind_ForcedUnwind (exc=0xf77ee580,
>>     stop=stop@entry=0x405440 , stop_argument=0xf77eddd8)
>>     at /home/GCC-Build/p660/p660_build_dir/src/gcc-4.9/libgcc/unwind.inc:207
>> #7 0x004055c4 in __pthread_unwind (buf=)
>>     at unwind.c:126
>> #8 0x004050b4 in __do_cancel () at ./pthreadP.h:283
>> #9 sigcancel_handler (sig=, si=,
>>     ctx=) at nptl-init.c:225
>> ---Type to continue, or q to quit---
>> #10
>> #11 0x in ?? ()
>> #12 0x00423084 in __select (nfds=-1, readfds=,
>>     writefds=, exceptfds=, timeout=0x0)
>>     at ../sysdeps/unix/sysv/linux/generic/select.c:45
>> #13 0x00400604 in TEST_TaskDelay (
>>     uiMillSecs=)
>>     at test-cancel.c:18
>> #14 0x00400680 in printids (
>>     s=)
>>     at test-cancel.c:38
>> #15 0x004006d0 in thr_fn (
>>     arg=)
>>     at test-cancel.c:49
>> #16 0x00401b28 in start_thread (arg=0x4a3000) at pthread_create.c:335
>> #17 0x00401b28 in start_thread (arg=0x4a3000) at pthread_create.c:335
>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>> ```
>>
>> Such abort is raise by the following code:
>> ```
>> static void
>> uw_update_context_1 (struct _Unwind_Context *context, _Unwind_FrameState *fs)
>> {
>>   //...
>>   /* Compute this frame's CFA. */
>>   switch (fs->regs.cfa_how)
>>     {
>>     case CFA_REG_OFFSET:
>>       cfa = _Unwind_GetPtr (&orig_context, fs->regs.cfa_reg);
>>       cfa += fs->regs.cfa_offset;
>>       break;
>>
>>     case CFA_EXP:
>>       {
>>         const unsigned char *exp = fs->regs.cfa_exp;
>>         _uleb128_t len;
>>
>>         exp = read_uleb128 (exp, &len);
>>         cfa = (void *) (_Unwind_Ptr)
>>           execute_stack_op (exp, exp + len, &orig_context, 0);
>>         break;
>>       }
>>
>>     default:
>>       gcc_unreachable ();
>>     }
>>   context->cfa = cfa;
>>   //...
>> }
>> ```
>>
>> Any suggestion is appreciated.
>>
>> CC gcc mailing list. Sorry if it is off topic.
>>
>> Regards
>>
>> Bamvor
>>
>>> pipeio_x tests are
gcc-4.9-20160427 is now available
Snapshot gcc-4.9-20160427 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.9-20160427/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.9 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_9-branch revision 235537

You'll find:

  gcc-4.9-20160427.tar.bz2    Complete GCC

  MD5=f525275b0d646be9cb2293ac219a325e
  SHA1=71f295cd00023419e161513460633b52aa9f24ba

Diffs from 4.9-20160420 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.9
link is updated and a message is sent to the gcc list. Please do not use
a snapshot before it has been announced that way.
Re: SafeStack proposal in GCC
On 04/13/2016 07:01 AM, Cristina Georgiana Opriceana wrote:
> Hello,
>
> I bring to your attention SafeStack, part of a bigger research project - CPI/CPS [1], which offers complete protection against stack-based control flow hijacks. I am interested in developing SafeStack for GCC and I would like to ask for your feedback on this proposal.
>
> SafeStack is a security mechanism that protects against stack based control flow attacks, while also keeping a low runtime overhead - it prevents all stack-based attacks in the RIPE benchmark, and has just 0.05% overhead on average on SPEC CPU2006 benchmarks [2]. SafeStack has been recently merged into the Clang/LLVM mainline [3].
>
> Its design is based on the separation of stack-allocated memory objects in two regions: the safe stack, where we keep the return addresses, spilled registers and local variables proved to be only accessed in a safe way by a static analysis pass at compilation, and the regular region, where we move everything else. With this separation and randomization-based isolation of the safe stack, we ensure that no overflows from the unsafe stack can overwrite sensitive data from the safe stack. Further on, the isolation mechanism can be improved to use hardware segment protection or hardware extensions, such as Intel Memory Protection Keys.
>
> We aim to extend all of CPI into the GNU userland, but start with a SafeStack port in GCC. In GCC, we propose a design composed of an instrumentation module (implemented as a GIMPLE pass) and a runtime library.
>
> The instrumentation pass will perform static analysis to discover stack objects that are only accessed in a safe way. It will also insert code that allocates a stack frame for the rest of the objects, those that did not satisfy the safety condition. The pass will run independently, after GIMPLE lowering, scheduled on the all_passes list and after other optimizations, such as dead code elimination. Then, all accesses to unsafe objects have to be re-written, based on the new stack base and offset in the unsafe stack.
>
> In the first phase of the implementation, the unsafe stack will be allocated on the heap, and we will rely on ASLR for the isolation. The runtime support will have to deal with unsafe stack allocation - a hook in the pthread create/destroy functions to create per-thread stack regions. This runtime support might be reused from the Clang implementation.

This all sounds good. And I'd definitely look to re-use the runtime and perhaps tests from Clang.

Jeff
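
For readers who want a concrete picture of the safe/unsafe split described in the proposal, here is a rough hand-written sketch in C. It is not what the Clang pass emits and not what a GCC GIMPLE pass would emit. Every name in it (UNSAFE_STACK_SIZE, unsafe_stack_init, unsafe_stack_in_use, copy_original, copy_instrumented) is made up for illustration, and the unsafe region is simply malloc'ed per thread, as in the first implementation phase mentioned above.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical size; not taken from any real SafeStack runtime. */
#define UNSAFE_STACK_SIZE (64 * 1024)

/* Per-thread unsafe stack: a heap region with a pointer that grows down. */
static _Thread_local char *unsafe_stack_base;
static _Thread_local char *unsafe_stack_ptr;

/* Allocate the calling thread's unsafe stack on first use.
   Returns 0 on success, -1 if the allocation fails. */
int unsafe_stack_init(void)
{
    if (unsafe_stack_base != NULL)
        return 0;
    unsafe_stack_base = malloc(UNSAFE_STACK_SIZE);
    if (unsafe_stack_base == NULL)
        return -1;
    unsafe_stack_ptr = unsafe_stack_base + UNSAFE_STACK_SIZE;
    return 0;
}

/* Release the calling thread's unsafe stack (what a thread-exit hook
   would do in a real runtime). */
void unsafe_stack_destroy(void)
{
    free(unsafe_stack_base);
    unsafe_stack_base = NULL;
    unsafe_stack_ptr = NULL;
}

/* Bytes currently allocated on the calling thread's unsafe stack. */
size_t unsafe_stack_in_use(void)
{
    if (unsafe_stack_base == NULL)
        return 0;
    return UNSAFE_STACK_SIZE - (size_t)(unsafe_stack_ptr - unsafe_stack_base);
}

/* As written by the programmer. 'len' is only read and written directly,
   so it would count as safe. The address of 'buf' is handed to memcpy,
   so it would count as unsafe. */
size_t copy_original(const char *src)
{
    char buf[32];
    size_t len = strlen(src);

    if (len >= sizeof buf)
        len = sizeof buf - 1;
    memcpy(buf, src, len);
    buf[len] = '\0';
    return strlen(buf);
}

/* The same function after a by-hand version of the rewrite: 'buf' now
   lives in the unsafe region, 'len' stays on the ordinary stack next to
   the return address. Returns 0 if the unsafe stack cannot be set up. */
size_t copy_instrumented(const char *src)
{
    if (unsafe_stack_init() != 0)
        return 0;

    char *saved = unsafe_stack_ptr;   /* prologue: remember the old top */
    unsafe_stack_ptr -= 32;           /* carve out the frame for 'buf' */
    char *buf = unsafe_stack_ptr;

    size_t len = strlen(src);
    if (len >= 32)
        len = 31;
    memcpy(buf, src, len);
    buf[len] = '\0';
    size_t result = strlen(buf);

    unsafe_stack_ptr = saved;         /* epilogue: pop the unsafe frame */
    return result;
}
```

An overflow of buf in copy_instrumented would run into other unsafe objects on the heap region rather than into the saved return address, which is the property the proposal relies on. A real pass would also have to handle alloca, setjmp/longjmp and exceptions, none of which this sketch attempts.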
How to avoid instrumenting functions in a particular section?
Is it possible to avoid instrumenting functions (-finstrument-functions) if they are in a particular section?
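
One possible approach, as far as I know: GCC has no switch that excludes functions by section, but the no_instrument_function attribute can be attached to the same declarations that carry the section attribute, for instance through one macro. The sketch below assumes an ELF target. The section name .text.quiet, the macro IN_QUIET_SECTION and the helper functions are hypothetical names chosen for the example. The hooks only run when the file is built with -finstrument-functions.

```c
#include <stddef.h>

/* Counts entry-hook invocations. Stays 0 unless this file is compiled
   with -finstrument-functions. */
static volatile unsigned long enter_calls;

/* The hooks GCC calls under -finstrument-functions. They must not be
   instrumented themselves, otherwise they would recurse. */
void __cyg_profile_func_enter(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));
void __cyg_profile_func_exit(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));

void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
    enter_calls++;
}

void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
}

/* Hypothetical helper macro: every function placed in the section also
   opts out of instrumentation. noinline keeps the call visible. */
#define IN_QUIET_SECTION \
    __attribute__((section(".text.quiet"), no_instrument_function, noinline))

IN_QUIET_SECTION int quiet_add(int a, int b)
{
    return a + b;
}

/* Reports how many times the entry hook fired while quiet_add ran.
   Not instrumented itself, so the measurement is not disturbed. */
__attribute__((no_instrument_function))
unsigned long hook_calls_during_quiet_add(void)
{
    unsigned long before = enter_calls;

    (void)quiet_add(2, 3);
    return enter_calls - before;
}
```

If the functions of that section happen to live in their own source files or share a name pattern, -finstrument-functions-exclude-file-list and -finstrument-functions-exclude-function-list are an alternative that needs no source changes. Both match on substrings of file names or function names, not on sections.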