Re: [RFC6 PATCH v6 00/21] ILP32 for ARM64 - LTP results

2016-04-27 Thread Andrew Pinski
On Fri, Apr 22, 2016 at 8:37 PM, Zhangjian (Bamvor)
 wrote:
> Hi, Yury
>
>
> On 2016/4/6 6:44, Yury Norov wrote:
>>
>> There are about 20 failing tests of 782 in lite scenario.
>> float_bessel
>> float_exp_log
>> float_iperb
>> float_power
>> float_trigo
>> pipeio_1
>> pipeio_3
>> pipeio_5
>> pipeio_8
>> abort01
>> clone02
>> kill11
>> mmap16
>> open12
>> pause01
>> rename11
>> rmdir02
>> umount2_01
>> umount2_02
>> umount2_03
>> utime06
>> mtest06
>>
>> The list is rough because some tests fail not every time.
>>
>> Tests abort01 and kill11 fail for lp64 too, so maybe there's
>> a reason unrelated to ilp32 itself.
>>
>> float_xxx tests fail because they call unwind() from signal context,
>> and GCC for ilp32 has problem with it, as Andrew told.
>
> Is there some progress about this issue. When we talk about unwind
> functions, do you mean the function in libgcc?
>
> We encountered another issue(abort not segfault) which also called
> pthread_cancel(). The test code is in the attachment. Here is the
> backtrace:

Yes, this is an issue I knew about.  I have a GCC patch to fix
it.  Basically, REG_VALUE_IN_UNWIND_CONTEXT needs to be defined while
building libgcc to support the correct unwind information.
I will be posting a GCC patch to fix this tomorrow.  This was a bug
even in the original set of ilp32 patches.  I was only finally able to
sit down and fix it today.


Thanks,
Andrew
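
The attachment mentioned in the quoted report is not part of this archive.
As a minimal sketch of the kind of test described (a thread blocked in
select() that gets cancelled with pthread_cancel()), something like the
following could be used.  The name thr_fn is borrowed from the backtrace;
task_delay and run_cancel_test are hypothetical stand-ins, and the timings
are arbitrary assumptions.

```c
#include <pthread.h>
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical stand-in for the TEST_TaskDelay frame in the backtrace:
   sleep via select(), which is a cancellation point.  */
static void
task_delay (unsigned int millisecs)
{
  struct timeval tv;

  tv.tv_sec = millisecs / 1000;
  tv.tv_usec = (millisecs % 1000) * 1000;
  select (0, NULL, NULL, NULL, &tv);
}

/* Thread body: loop forever, so the thread can only end by cancellation.  */
static void *
thr_fn (void *arg)
{
  (void) arg;
  for (;;)
    task_delay (100);
}

/* Start the thread, cancel it while it sleeps, and join it.
   Returns 0 if the thread reported PTHREAD_CANCELED, nonzero otherwise.  */
int
run_cancel_test (void)
{
  pthread_t tid;
  void *result = NULL;

  if (pthread_create (&tid, NULL, thr_fn, NULL) != 0)
    return 1;
  task_delay (200);             /* let the thread reach select() */
  if (pthread_cancel (tid) != 0)
    return 2;
  if (pthread_join (tid, &result) != 0)
    return 3;
  return result == PTHREAD_CANCELED ? 0 : 4;
}
```

Call run_cancel_test () from main and build with -pthread.  With a working
unwinder it should return 0; on a toolchain with the unwind problem
discussed here, the process would presumably abort inside the unwinder
instead, as in the backtrace quoted below.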

>
> ```
> Program received signal SIGABRT, Aborted.
> [Switching to Thread 0xf77ee330 (LWP 2958)]
> 0x0040f5bc in raise (sig=sig@entry=6)
> at ../sysdeps/unix/sysv/linux/raise.c:55
> 55  ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
> (gdb) bt
> #0  0x0040f5bc in raise (sig=sig@entry=6)
> at ../sysdeps/unix/sysv/linux/raise.c:55
> #1  0x0040f884 in abort () at abort.c:89
>
> #2  0x004073b4 in uw_update_context_1 (
> context=context@entry=0xf77ec820, fs=fs@entry=0xf77ebec8)
> at /home/GCC-Build/p660/p660_build_dir/src/gcc-4.9/libgcc/unwind-dw2.c:1430
>
> #3  0x004078c0 in uw_update_context
> (context=context@entry=0xf77ec820,
> fs=fs@entry=0xf77ebec8)
> at
> /home/GCC-Build/p660/p660_build_dir/src/gcc-4.9/libgcc/unwind-dw2.c:1506
> #4  0x00407a9c in uw_advance_context (fs=0xf77ebec8,
> context=0xf77ec820)
> at
> /home/GCC-Build/p660/p660_build_dir/src/gcc-4.9/libgcc/unwind-dw2.c:1529
> #5  _Unwind_ForcedUnwind_Phase2 (exc=exc@entry=0xf77ee580,
> context=context@entry=0xf77ec820)
> at /home/GCC-Build/p660/p660_build_dir/src/gcc-4.9/libgcc/unwind.inc:185
> #6  0x00408228 in _Unwind_ForcedUnwind (exc=0xf77ee580,
> stop=stop@entry=0x405440 , stop_argument=0xf77eddd8)
> at /home/GCC-Build/p660/p660_build_dir/src/gcc-4.9/libgcc/unwind.inc:207
> #7  0x004055c4 in __pthread_unwind (buf=)
> at unwind.c:126
> #8  0x004050b4 in __do_cancel () at ./pthreadP.h:283
> #9  sigcancel_handler (sig=, si=,
> ctx=) at nptl-init.c:225
> ---Type  to continue, or q  to quit---
> #10 
>
> #11 0x in ?? ()
>
> #12 0x00423084 in __select (nfds=-1, readfds=,
> writefds=, exceptfds=, timeout=0x0)
> at ../sysdeps/unix/sysv/linux/generic/select.c:45
> #13 0x00400604 in TEST_TaskDelay (
> uiMillSecs=)
> at test-cancel.c:18
> #14 0x00400680 in printids (
> s=)
> at test-cancel.c:38
> #15 0x004006d0 in thr_fn (
> arg=)
> at test-cancel.c:49
> #16 0x00401b28 in start_thread (arg=0x4a3000) at
> pthread_create.c:335
> #17 0x00401b28 in start_thread (arg=0x4a3000) at
> pthread_create.c:335
> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
> ```
>
> Such abort is raise by the following code:
> ```
> static void
> uw_update_context_1 (struct _Unwind_Context *context, _Unwind_FrameState
> *fs)
> {
> //...
>   /* Compute this frame's CFA.  */
>   switch (fs->regs.cfa_how)
> {
> case CFA_REG_OFFSET:
>   cfa = _Unwind_GetPtr (&orig_context, fs->regs.cfa_reg);
>   cfa += fs->regs.cfa_offset;
>   break;
>
> case CFA_EXP:
>   {
> const unsigned char *exp = fs->regs.cfa_exp;
> _uleb128_t len;
>
> exp = read_uleb128 (exp, &len);
> cfa = (void *) (_Unwind_Ptr)
>   execute_stack_op (exp, exp + len, &orig_context, 0);
> break;
>   }
>
> default:
>   gcc_unreachable ();
> }
>   context->cfa = cfa;
> //...
> }
> ```
>
> Any suggestion is appreciated.
>
> CC gcc mailing list. Sorry if it is off topic.
>
> Regards
>
> Bamvor
>
>
>
>
>> pipeio_x tests are very unstable and may fail randomly. I strongly
>> suspect race conditions, as they all work like a charm if pinned to
>> single CPU with taskset. Probably, race is the reason of clone02 too.
>> Though I'm not sure, is the race in kernel, glibc or test itself.
>>
>> But I know for sure that pause01 fails due to test design:
>> if (setitimer(ITIMER_REAL, &it, NULL)) /

Where to find global var declaration

2016-04-27 Thread Cristina Georgiana Opriceana
Hello,

I tried to add a new global declaration of a pointer and I expected to
see it in varpool nodes, but it does not appear there.

ustackptr = build_decl (UNKNOWN_LOCATION,
 VAR_DECL, get_identifier ("ustackptr"),
 build_pointer_type(void_type_node));
TREE_ADDRESSABLE (ustackptr) = 1;
TREE_USED (ustackptr) = 1;
rest_of_decl_compilation (ustackptr, 1, 0);

and

struct varpool_node *node;
FOR_EACH_VARIABLE (node) {
fprintf(stdout, "%s\n", get_name(node->decl));
}

Thanks!


RE: Question on TARGET_MMX and X86_TUNE_GENERAL_REGS_SSE_SPILL

2016-04-27 Thread Kumar, Venkataramanan
Hi ,

> -Original Message-
> From: Ilya Enkovich [mailto:enkovich@gmail.com]
> Sent: Tuesday, April 26, 2016 7:09 PM
> To: Kumar, Venkataramanan 
> Cc: vmaka...@redhat.com; gcc@gcc.gnu.org; Uros Bizjak
> (ubiz...@gmail.com) 
> Subject: Re: Question on TARGET_MMX and
> X86_TUNE_GENERAL_REGS_SSE_SPILL
> 
> 2016-04-14 8:39 GMT+03:00 Kumar, Venkataramanan
> :
> > Hi,
> >
> > X86_TUNE_GENERAL_REGS_SSE_SPILL: Try to spill general regs to SSE regs
> instead of memory.
> >
> > I tried enabling the above tuning with -march=bdver4 -Ofast -mtune-
> ctrl=general_regs_sse_spill.
> > I did not find any code differences.
> >
> > Looking at the below code to enable this tune,  mmx ISA needs to be turned
> off.
> >
> > static reg_class_t
> > ix86_spill_class (reg_class_t rclass, machine_mode mode) {
> >   if (TARGET_SSE && TARGET_GENERAL_REGS_SSE_SPILL && !
> TARGET_MMX
> >   && (mode == SImode || (TARGET_64BIT && mode == DImode))
> >   && rclass != NO_REGS && INTEGER_CLASS_P (rclass))
> > return ALL_SSE_REGS;
> >   return NO_REGS;
> > }
> >
> > All processor variants enable MMX by default  and why we need to switch
> off mmx?
> 
> That really looks weird to me.  I ran SPEC2006 on Ofast + LTO with and
> without -mno-mmx and -mno-mmx gives (Haswell machine):
> 
> SPEC2006INT :+0.30%
> SPEC2006FP  :+0.60%
> SPEC2006ALL :+0.48%
> 
> Which is quite surprising for disabling a hardware feature hardly used
> anywhere now.

As I said, without MMX (-mno-mmx) the X86_TUNE_GENERAL_REGS_SSE_SPILL tuning
may be active now.
I am not sure whether there is any other reason.

> 
> 
> Thanks,
> Ilya
> 
> >
> > Thanks and regards,
> > Venkat.

Regards,
Venkat.


GCC 6.1 Released

2016-04-27 Thread Jakub Jelinek
After slightly more than a year since the last major GCC release, we are
proud to announce a new major GCC release, 6.1.

GCC 6.1 is a major release containing substantial new
functionality not available in GCC 5.x or previous GCC releases.

The C++ frontend now defaults to the C++14 standard instead of the C++98
standard it defaulted to previously.  Older C++ code might require either
explicitly compiling with a selected older C++ standard or some code
adjustment; see http://gcc.gnu.org/gcc-6/porting_to.html for details.  The
experimental C++17 support has been enhanced in this release.

This release features various improvements in the emitted diagnostics,
including improved locations, location ranges, suggestions for misspelled
identifiers, option names etc., and fix-it hints.  A couple of new warnings
have been added as well.

The OpenMP 4.5 specification is fully supported in this new release; the
compiler can be configured for OpenMP offloading to Intel XeonPhi Knights
Landing and AMD HSAIL.  The OpenACC 2.0a specification support has been
much improved, with offloading to NVidia PTX.

The optimizers have been improved, with improvements appearing in all of
intra-procedural optimizations, inter-procedural optimizations,
link time optimizations and various target backends.

See

  https://gcc.gnu.org/gcc-6/changes.html

for more information about changes in GCC 6.1.

This release is available from the FTP servers listed here:

 http://www.gnu.org/order/ftp.html

The release is in gcc/gcc-6.1.0/ subdirectory.

If you encounter difficulties using GCC 6.1, please do not contact me
directly.  Instead, please visit http://gcc.gnu.org for information
about getting help.

Driving a leading free software project such as the GNU Compiler Collection
would not be possible without support from its many contributors:
not only its developers, but especially its regular testers
and users, who contribute to its high quality.  The list of individuals
is too large to thank individually!


GCC 6.1.1 Status Report (2016-04-27)

2016-04-27 Thread Jakub Jelinek
Status
======

GCC 6.1 has been released, branches/gcc-6-branch now identifies itself as
6.1.1 and is now open again under the usual release branch rules (regression
fixes and documentation fixes only).
The next release, 6.2, should be released in about two or three months
from now, unless something very urgent forces us to release earlier.

Quality Data
============

Priority          #   Change from last report
--------        ---   -----------------------
P1                0   +-  0
P2               79   -   1
P3               15   +   6
P4              100   +   1
P5               29   +-  0
--------        ---   -----------------------
Total P1-P3      94   +   5
Total           224   +   6

Previous Report
===============

https://gcc.gnu.org/ml/gcc/2016-04/msg00103.html


Re: Question on TARGET_MMX and X86_TUNE_GENERAL_REGS_SSE_SPILL

2016-04-27 Thread Ilya Enkovich
2016-04-27 14:35 GMT+03:00 Kumar, Venkataramanan :
> Hi ,
>
>> -Original Message-
>> From: Ilya Enkovich [mailto:enkovich@gmail.com]
>> Sent: Tuesday, April 26, 2016 7:09 PM
>> To: Kumar, Venkataramanan 
>> Cc: vmaka...@redhat.com; gcc@gcc.gnu.org; Uros Bizjak
>> (ubiz...@gmail.com) 
>> Subject: Re: Question on TARGET_MMX and
>> X86_TUNE_GENERAL_REGS_SSE_SPILL
>>
>> 2016-04-14 8:39 GMT+03:00 Kumar, Venkataramanan
>> :
>> > Hi,
>> >
>> > X86_TUNE_GENERAL_REGS_SSE_SPILL: Try to spill general regs to SSE regs
>> instead of memory.
>> >
>> > I tried enabling the above tuning with -march=bdver4 -Ofast -mtune-
>> ctrl=general_regs_sse_spill.
>> > I did not find any code differences.
>> >
>> > Looking at the below code to enable this tune,  mmx ISA needs to be turned
>> off.
>> >
>> > static reg_class_t
>> > ix86_spill_class (reg_class_t rclass, machine_mode mode) {
>> >   if (TARGET_SSE && TARGET_GENERAL_REGS_SSE_SPILL && !
>> TARGET_MMX
>> >   && (mode == SImode || (TARGET_64BIT && mode == DImode))
>> >   && rclass != NO_REGS && INTEGER_CLASS_P (rclass))
>> > return ALL_SSE_REGS;
>> >   return NO_REGS;
>> > }
>> >
>> > All processor variants enable MMX by default  and why we need to switch
>> off mmx?
>>
>> That really looks weird to me.  I ran SPEC2006 on Ofast + LTO with and
>> without -mno-mmx and -mno-mmx gives (Haswell machine):
>>
>> SPEC2006INT :+0.30%
>> SPEC2006FP  :+0.60%
>> SPEC2006ALL :+0.48%
>>
>> Which is quite surprising for disabling a hardware feature hardly used
>> anywhere now.
>
> As I said without mmx (-mno-mmx), the tune X86_TUNE_GENERAL_REGS_SSE_SPILL 
> may be active now.
> Not sure if there are any other reason.

Surely that should be the main reason I see a performance gain.
So I want to ask the same question as you did: why does this
important performance feature require MMX to be disabled?  This
restriction has existed from the very start of
X86_TUNE_GENERAL_REGS_SSE_SPILL (at least in trunk), and there are
no comments on why we have it.

Did you try to remove !TARGET_MMX and see what happens?

Thanks,
Ilya

>
>>
>>
>> Thanks,
>> Ilya
>>
>> >
>> > Thanks and regards,
>> > Venkat.
>
> Regards,
> Venkat.


Update gcc 7.0.0 status on main page?

2016-04-27 Thread Martin Reinecke
Hi all,

the web page at http://gcc.gnu.org still links to the gcc 7 status
report from March 10, but there is a more recent one from April 15.
Could this please be updated?

Cheers,
  Martin Reinecke


Re: Where to find global var declaration

2016-04-27 Thread David Malcolm
On Wed, 2016-04-27 at 12:34 +0300, Cristina Georgiana Opriceana wrote:
> Hello,
> 
> I tried to add a new global declaration of a pointer and I expected
> to
> see it in varpool nodes, but it does not appear there.
> 
> ustackptr = build_decl (UNKNOWN_LOCATION,
>  VAR_DECL, get_identifier ("ustackptr"),
>  build_pointer_type(void_type_node));
> TREE_ADDRESSABLE (ustackptr) = 1;
> TREE_USED (ustackptr) = 1;
> rest_of_decl_compilation (ustackptr, 1, 0);
> 
> and
> 
> struct varpool_node *node;
> FOR_EACH_VARIABLE (node) {
> fprintf(stdout, "%s\n", get_name(node->decl));
> }

FWIW, in the jit "frontend", I wasn't aware of
rest_of_decl_compilation. Instead I have the following code for
creating a global variable, which calls varpool_node::get_create and
varpool_node::finalize_decl directly on the VAR_DECL instance.

That said, maybe rest_of_decl_compilation is the best approach, but I'm
not sure why it isn't working for you.  (I'm not an expert at this, I
copied from the C frontend and hacked it up till it worked).

This is from gcc/jit/jit-playback.c (which has a family of wrapper classes 
around "tree", but hopefully the idea is clear):

/* Construct a playback::lvalue instance (wrapping a tree).  */

playback::lvalue *
playback::context::
new_global (location *loc,
enum gcc_jit_global_kind kind,
type *type,
const char *name)
{
  gcc_assert (type);
  gcc_assert (name);
  tree inner = build_decl (UNKNOWN_LOCATION, VAR_DECL,
   get_identifier (name),
   type->as_tree ());
  TREE_PUBLIC (inner) = (kind != GCC_JIT_GLOBAL_INTERNAL);
  DECL_COMMON (inner) = 1;
  switch (kind)
{
default:
  gcc_unreachable ();

case GCC_JIT_GLOBAL_EXPORTED:
  TREE_STATIC (inner) = 1;
  break;

case GCC_JIT_GLOBAL_INTERNAL:
  TREE_STATIC (inner) = 1;
  break;

case GCC_JIT_GLOBAL_IMPORTED:
  DECL_EXTERNAL (inner) = 1;
  break;
}

  if (loc)
set_tree_location (inner, loc);

  varpool_node::get_create (inner);

  varpool_node::finalize_decl (inner);

  m_globals.safe_push (inner);

  return new lvalue (this, inner);
}


Hope this is helpful
Dave


RE: Question on TARGET_MMX and X86_TUNE_GENERAL_REGS_SSE_SPILL

2016-04-27 Thread Kumar, Venkataramanan
Hi, 

> -Original Message-
> From: Ilya Enkovich [mailto:enkovich@gmail.com]
> Sent: Wednesday, April 27, 2016 5:35 PM
> To: Kumar, Venkataramanan 
> Cc: vmaka...@redhat.com; gcc@gcc.gnu.org; Uros Bizjak
> (ubiz...@gmail.com) 
> Subject: Re: Question on TARGET_MMX and
> X86_TUNE_GENERAL_REGS_SSE_SPILL
> 
> 2016-04-27 14:35 GMT+03:00 Kumar, Venkataramanan
> :
> > Hi ,
> >
> >> -Original Message-
> >> From: Ilya Enkovich [mailto:enkovich@gmail.com]
> >> Sent: Tuesday, April 26, 2016 7:09 PM
> >> To: Kumar, Venkataramanan 
> >> Cc: vmaka...@redhat.com; gcc@gcc.gnu.org; Uros Bizjak
> >> (ubiz...@gmail.com) 
> >> Subject: Re: Question on TARGET_MMX and
> >> X86_TUNE_GENERAL_REGS_SSE_SPILL
> >>
> >> 2016-04-14 8:39 GMT+03:00 Kumar, Venkataramanan
> >> :
> >> > Hi,
> >> >
> >> > X86_TUNE_GENERAL_REGS_SSE_SPILL: Try to spill general regs to SSE
> >> > regs
> >> instead of memory.
> >> >
> >> > I tried enabling the above tuning with -march=bdver4 -Ofast -mtune-
> >> ctrl=general_regs_sse_spill.
> >> > I did not find any code differences.
> >> >
> >> > Looking at the below code to enable this tune,  mmx ISA needs to be
> >> > turned
> >> off.
> >> >
> >> > static reg_class_t
> >> > ix86_spill_class (reg_class_t rclass, machine_mode mode) {
> >> >   if (TARGET_SSE && TARGET_GENERAL_REGS_SSE_SPILL && !
> >> TARGET_MMX
> >> >   && (mode == SImode || (TARGET_64BIT && mode == DImode))
> >> >   && rclass != NO_REGS && INTEGER_CLASS_P (rclass))
> >> > return ALL_SSE_REGS;
> >> >   return NO_REGS;
> >> > }
> >> >
> >> > All processor variants enable MMX by default  and why we need to
> >> > switch
> >> off mmx?
> >>
> >> That really looks weird to me.  I ran SPEC2006 on Ofast + LTO with
> >> and without -mno-mmx and -mno-mmx gives (Haswell machine):
> >>
> >> SPEC2006INT :+0.30%
> >> SPEC2006FP  :+0.60%
> >> SPEC2006ALL :+0.48%
> >>
> >> Which is quite surprising for disabling a hardware feature hardly
> >> used anywhere now.
> >
> > As I said without mmx (-mno-mmx), the tune
> X86_TUNE_GENERAL_REGS_SSE_SPILL may be active now.
> > Not sure if there are any other reason.
> 
> Surely that should be the main reason I see performance gain.
> So I want to ask the same question as you did: why does this important
> performance feature requires disabled MMX.  This restriction exists from the
> very start of X86_TUNE_GENERAL_REGS_SSE_SPILL existence (at least in
> trunk) and no comments on why we have this restriction.

I was told by Uros that the TARGET_MMX check is there to prevent
intreg <-> MMX moves that clobber stack registers.

> 
> Did you try to remove !TARGET_MMX and see what happens?
> 
Yes, I tried on SPEC2006 but did not find any benefit.

> Thanks,
> Ilya
> 
> >
> >>
> >>
> >> Thanks,
> >> Ilya
> >>
> >> >
> >> > Thanks and regards,
> >> > Venkat.
> >

 Regards,
Venkat.


Re: Where to find global var declaration

2016-04-27 Thread Cristina Georgiana Opriceana
On Wed, Apr 27, 2016 at 4:54 PM, David Malcolm  wrote:
> On Wed, 2016-04-27 at 12:34 +0300, Cristina Georgiana Opriceana wrote:
>> Hello,
>>
>> I tried to add a new global declaration of a pointer and I expected
>> to
>> see it in varpool nodes, but it does not appear there.
>>
>> ustackptr = build_decl (UNKNOWN_LOCATION,
>>  VAR_DECL, get_identifier ("ustackptr"),
>>  build_pointer_type(void_type_node));
>> TREE_ADDRESSABLE (ustackptr) = 1;
>> TREE_USED (ustackptr) = 1;
>> rest_of_decl_compilation (ustackptr, 1, 0);
>>
>> and
>>
>> struct varpool_node *node;
>> FOR_EACH_VARIABLE (node) {
>> fprintf(stdout, "%s\n", get_name(node->decl));
>> }
>
> FWIW, in the jit "frontend", I wasn't aware of
> rest_of_decl_compilation. Instead I have the following code for
> creating a global variable, which calls varpool_node::get_create and
> varpool_node::finalize_decl directly on the VAR_DECL instance.
>
> That said, maybe rest_of_decl_compilation is the best approach, but I'm
> not sure why it isn't working for you.  (I'm not an expert at this, I
> copied from the C frontend and hacked it up till it worked).
>
> This is from gcc/jit/jit-playback.c (which has a family of wrapper classes 
> around "tree", but hopefully the idea is clear):
>
> /* Construct a playback::lvalue instance (wrapping a tree).  */
>
> playback::lvalue *
> playback::context::
> new_global (location *loc,
> enum gcc_jit_global_kind kind,
> type *type,
> const char *name)
> {
>   gcc_assert (type);
>   gcc_assert (name);
>   tree inner = build_decl (UNKNOWN_LOCATION, VAR_DECL,
>get_identifier (name),
>type->as_tree ());
>   TREE_PUBLIC (inner) = (kind != GCC_JIT_GLOBAL_INTERNAL);
>   DECL_COMMON (inner) = 1;
>   switch (kind)
> {
> default:
>   gcc_unreachable ();
>
> case GCC_JIT_GLOBAL_EXPORTED:
>   TREE_STATIC (inner) = 1;
>   break;
>
> case GCC_JIT_GLOBAL_INTERNAL:
>   TREE_STATIC (inner) = 1;
>   break;
>
> case GCC_JIT_GLOBAL_IMPORTED:
>   DECL_EXTERNAL (inner) = 1;
>   break;
> }
>
>   if (loc)
> set_tree_location (inner, loc);
>
>   varpool_node::get_create (inner);
>
>   varpool_node::finalize_decl (inner);
>
>   m_globals.safe_push (inner);
>
>   return new lvalue (this, inner);
> }
>
>
> Hope this is helpful

I've compared my rest_of_decl_compilation call against your steps and
apparently I had missed setting the storage to static.  I thought it
would be set to 1 automatically for global vars.

Thanks!
Cristina

> Dave


Re: Question on TARGET_MMX and X86_TUNE_GENERAL_REGS_SSE_SPILL

2016-04-27 Thread Ilya Enkovich
2016-04-27 17:06 GMT+03:00 Kumar, Venkataramanan :
> Hi,
>
>> -Original Message-
>> From: Ilya Enkovich [mailto:enkovich@gmail.com]
>> Sent: Wednesday, April 27, 2016 5:35 PM
>> To: Kumar, Venkataramanan 
>> Cc: vmaka...@redhat.com; gcc@gcc.gnu.org; Uros Bizjak
>> (ubiz...@gmail.com) 
>> Subject: Re: Question on TARGET_MMX and
>> X86_TUNE_GENERAL_REGS_SSE_SPILL
>>
>> 2016-04-27 14:35 GMT+03:00 Kumar, Venkataramanan
>> :
>> > Hi ,
>> >
>> >> -Original Message-
>> >> From: Ilya Enkovich [mailto:enkovich@gmail.com]
>> >> Sent: Tuesday, April 26, 2016 7:09 PM
>> >> To: Kumar, Venkataramanan 
>> >> Cc: vmaka...@redhat.com; gcc@gcc.gnu.org; Uros Bizjak
>> >> (ubiz...@gmail.com) 
>> >> Subject: Re: Question on TARGET_MMX and
>> >> X86_TUNE_GENERAL_REGS_SSE_SPILL
>> >>
>> >> 2016-04-14 8:39 GMT+03:00 Kumar, Venkataramanan
>> >> :
>> >> > Hi,
>> >> >
>> >> > X86_TUNE_GENERAL_REGS_SSE_SPILL: Try to spill general regs to SSE
>> >> > regs
>> >> instead of memory.
>> >> >
>> >> > I tried enabling the above tuning with -march=bdver4 -Ofast -mtune-
>> >> ctrl=general_regs_sse_spill.
>> >> > I did not find any code differences.
>> >> >
>> >> > Looking at the below code to enable this tune,  mmx ISA needs to be
>> >> > turned
>> >> off.
>> >> >
>> >> > static reg_class_t
>> >> > ix86_spill_class (reg_class_t rclass, machine_mode mode) {
>> >> >   if (TARGET_SSE && TARGET_GENERAL_REGS_SSE_SPILL && !
>> >> TARGET_MMX
>> >> >   && (mode == SImode || (TARGET_64BIT && mode == DImode))
>> >> >   && rclass != NO_REGS && INTEGER_CLASS_P (rclass))
>> >> > return ALL_SSE_REGS;
>> >> >   return NO_REGS;
>> >> > }
>> >> >
>> >> > All processor variants enable MMX by default  and why we need to
>> >> > switch
>> >> off mmx?
>> >>
>> >> That really looks weird to me.  I ran SPEC2006 on Ofast + LTO with
>> >> and without -mno-mmx and -mno-mmx gives (Haswell machine):
>> >>
>> >> SPEC2006INT :+0.30%
>> >> SPEC2006FP  :+0.60%
>> >> SPEC2006ALL :+0.48%
>> >>
>> >> Which is quite surprising for disabling a hardware feature hardly
>> >> used anywhere now.
>> >
>> > As I said without mmx (-mno-mmx), the tune
>> X86_TUNE_GENERAL_REGS_SSE_SPILL may be active now.
>> > Not sure if there are any other reason.
>>
>> Surely that should be the main reason I see performance gain.
>> So I want to ask the same question as you did: why does this important
>> performance feature requires disabled MMX.  This restriction exists from the
>> very start of X86_TUNE_GENERAL_REGS_SSE_SPILL existence (at least in
>> trunk) and no comments on why we have this restriction.
>
> I was told by Uros,  that using TARGET_MMX is to prevent intreg <-> MMX moves 
> that clobber stack registers.

ix86_spill_class is supposed to return a register class to be used
to store general purpose registers.  It returns ALL_SSE_REGS which
doesn't intersect with MMX_REGS class.  So I don't see why
intreg <-> MMX moves may appear.  And if those moves appear we should
fix it, not disable the whole feature.

@Uros, do you have a comment here?

Thanks,
Ilya

>
>>
>> Did you try to remove !TARGET_MMX and see what happens?
>>
> Yes, I tried on SPEC2006 but did not find any benefit.
>
>> Thanks,
>> Ilya
>>
>> >
>> >>
>> >>
>> >> Thanks,
>> >> Ilya
>> >>
>> >> >
>> >> > Thanks and regards,
>> >> > Venkat.
>> >
>
>  Regards,
> Venkat.


Re: Question on TARGET_MMX and X86_TUNE_GENERAL_REGS_SSE_SPILL

2016-04-27 Thread Uros Bizjak
On Wed, Apr 27, 2016 at 4:26 PM, Ilya Enkovich  wrote:

>>> >> > X86_TUNE_GENERAL_REGS_SSE_SPILL: Try to spill general regs to SSE
>>> >> > regs
>>> >> instead of memory.
>>> >> >
>>> >> > I tried enabling the above tuning with -march=bdver4 -Ofast -mtune-
>>> >> ctrl=general_regs_sse_spill.
>>> >> > I did not find any code differences.
>>> >> >
>>> >> > Looking at the below code to enable this tune,  mmx ISA needs to be
>>> >> > turned
>>> >> off.
>>> >> >
>>> >> > static reg_class_t
>>> >> > ix86_spill_class (reg_class_t rclass, machine_mode mode) {
>>> >> >   if (TARGET_SSE && TARGET_GENERAL_REGS_SSE_SPILL && !
>>> >> TARGET_MMX
>>> >> >   && (mode == SImode || (TARGET_64BIT && mode == DImode))
>>> >> >   && rclass != NO_REGS && INTEGER_CLASS_P (rclass))
>>> >> > return ALL_SSE_REGS;
>>> >> >   return NO_REGS;
>>> >> > }
>>> >> >
>>> >> > All processor variants enable MMX by default  and why we need to
>>> >> > switch
>>> >> off mmx?
>>> >>
>>> >> That really looks weird to me.  I ran SPEC2006 on Ofast + LTO with
>>> >> and without -mno-mmx and -mno-mmx gives (Haswell machine):
>>> >>
>>> >> SPEC2006INT :+0.30%
>>> >> SPEC2006FP  :+0.60%
>>> >> SPEC2006ALL :+0.48%
>>> >>
>>> >> Which is quite surprising for disabling a hardware feature hardly
>>> >> used anywhere now.
>>> >
>>> > As I said without mmx (-mno-mmx), the tune
>>> X86_TUNE_GENERAL_REGS_SSE_SPILL may be active now.
>>> > Not sure if there are any other reason.
>>>
>>> Surely that should be the main reason I see performance gain.
>>> So I want to ask the same question as you did: why does this important
>>> performance feature requires disabled MMX.  This restriction exists from the
>>> very start of X86_TUNE_GENERAL_REGS_SSE_SPILL existence (at least in
>>> trunk) and no comments on why we have this restriction.
>>
>> I was told by Uros,  that using TARGET_MMX is to prevent intreg <-> MMX 
>> moves that clobber stack registers.
>
> ix86_spill_class is supposed to return a register class to be used
> to store general purpose registers.  It returns ALL_SSE_REGS which
> doesn't intersect with MMX_REGS class.  So I don't see why
> intreg <-> MMX moves may appear.  And if those moves appear we should
> fix it, not disable the whole feature.
>
> @Uros, do you have a comment here?

Looking at the implementation of ix86_spill_class, the TARGET_MMX check
really looks too restrictive. However, we need to check TARGET_SSE2
and TARGET_INTERUNIT_MOVES instead, otherwise the movq xmm <-> intreg
pattern gets disabled.

This change should be OK then, but just in case, an SSE2-enabled
-mfpmath=i387 32-bit SPEC run should uncover unwanted MMX instructions.

Uros.


Re: Question on TARGET_MMX and X86_TUNE_GENERAL_REGS_SSE_SPILL

2016-04-27 Thread Uros Bizjak
On Wed, Apr 27, 2016 at 4:39 PM, Uros Bizjak  wrote:
> On Wed, Apr 27, 2016 at 4:26 PM, Ilya Enkovich  wrote:
>
>>>> >> > X86_TUNE_GENERAL_REGS_SSE_SPILL: Try to spill general regs to SSE
>>>> >> > regs
>>>> >> instead of memory.
>>>> >> >
>>>> >> > I tried enabling the above tuning with -march=bdver4 -Ofast -mtune-
>>>> >> ctrl=general_regs_sse_spill.
>>>> >> > I did not find any code differences.
>>>> >> >
>>>> >> > Looking at the below code to enable this tune,  mmx ISA needs to be
>>>> >> > turned
>>>> >> off.
>>>> >> >
>>>> >> > static reg_class_t
>>>> >> > ix86_spill_class (reg_class_t rclass, machine_mode mode) {
>>>> >> >   if (TARGET_SSE && TARGET_GENERAL_REGS_SSE_SPILL && !
>>>> >> TARGET_MMX
>>>> >> >   && (mode == SImode || (TARGET_64BIT && mode == DImode))
>>>> >> >   && rclass != NO_REGS && INTEGER_CLASS_P (rclass))
>>>> >> > return ALL_SSE_REGS;
>>>> >> >   return NO_REGS;
>>>> >> > }
>>>> >> >
>>>> >> > All processor variants enable MMX by default  and why we need to
>>>> >> > switch
>>>> >> off mmx?
>>>> >>
>>>> >> That really looks weird to me.  I ran SPEC2006 on Ofast + LTO with
>>>> >> and without -mno-mmx and -mno-mmx gives (Haswell machine):
>>>> >>
>>>> >> SPEC2006INT :+0.30%
>>>> >> SPEC2006FP  :+0.60%
>>>> >> SPEC2006ALL :+0.48%
>>>> >>
>>>> >> Which is quite surprising for disabling a hardware feature hardly
>>>> >> used anywhere now.
>>>> >
>>>> > As I said without mmx (-mno-mmx), the tune
>>>> X86_TUNE_GENERAL_REGS_SSE_SPILL may be active now.
>>>> > Not sure if there are any other reason.
>>>>
>>>> Surely that should be the main reason I see performance gain.
>>>> So I want to ask the same question as you did: why does this important
>>>> performance feature requires disabled MMX.  This restriction exists from
>>>> the
>>>> very start of X86_TUNE_GENERAL_REGS_SSE_SPILL existence (at least in
>>>> trunk) and no comments on why we have this restriction.
>>>
>>> I was told by Uros,  that using TARGET_MMX is to prevent intreg <-> MMX 
>>> moves that clobber stack registers.
>>
>> ix86_spill_class is supposed to return a register class to be used
>> to store general purpose registers.  It returns ALL_SSE_REGS which
>> doesn't intersect with MMX_REGS class.  So I don't see why
>> intreg <-> MMX moves may appear.  And if those moves appear we should
>> fix it, not disable the whole feature.
>>
>> @Uros, do you have a comment here?
>
> Looking at the implementation of ix86_spill_class, TARGET_MMX check
> really looks too restrictive. However, we need to check TARGET_SSE2
> and TARGET_INTERUNIT_MOVES instead, otherwise movq xmm <-> intreg
> pattern gets disabled

I'm testing the following patch:

--cut here--
Index: i386.c
===================================================================
--- i386.c  (revision 235516)
+++ i386.c  (working copy)
@@ -53560,9 +53560,12 @@
 static reg_class_t
 ix86_spill_class (reg_class_t rclass, machine_mode mode)
 {
-  if (TARGET_SSE && TARGET_GENERAL_REGS_SSE_SPILL && ! TARGET_MMX
+  if (TARGET_GENERAL_REGS_SSE_SPILL
+  && TARGET_SSE2
+  && TARGET_INTER_UNIT_MOVES_TO_VEC
+  && TARGET_INTER_UNIT_MOVES_FROM_VEC
   && (mode == SImode || (TARGET_64BIT && mode == DImode))
-  && rclass != NO_REGS && INTEGER_CLASS_P (rclass))
+  && INTEGER_CLASS_P (rclass))
 return ALL_SSE_REGS;
   return NO_REGS;
 }
--cut here--

Uros.
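
To make the effect of the changed condition easier to see, here is a
minimal standalone sketch that models the predicate before and after the
patch above.  It is not GCC code: struct target_model, the *_model enums
and integer_class_p are hypothetical stand-ins for the TARGET_* macros,
the register classes and INTEGER_CLASS_P, and treating only GENERAL_REGS
as an integer class is a simplifying assumption.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's register classes and machine modes.  */
enum reg_class_model { NO_REGS, GENERAL_REGS, ALL_SSE_REGS };
enum mode_model { SImode, DImode, OTHER_MODE };

/* Hypothetical stand-in for the TARGET_* flags tested by the hook.  */
struct target_model
{
  bool sse, sse2, mmx, is_64bit;
  bool general_regs_sse_spill;
  bool inter_unit_moves_to_vec, inter_unit_moves_from_vec;
};

/* Simplified model of INTEGER_CLASS_P: NO_REGS is not an integer class.  */
static bool
integer_class_p (enum reg_class_model rclass)
{
  return rclass == GENERAL_REGS;
}

static bool
mode_ok (const struct target_model *t, enum mode_model mode)
{
  return mode == SImode || (t->is_64bit && mode == DImode);
}

/* Condition before the patch: requires MMX to be disabled.  */
enum reg_class_model
spill_class_before (const struct target_model *t,
                    enum reg_class_model rclass, enum mode_model mode)
{
  if (t->sse && t->general_regs_sse_spill && !t->mmx
      && mode_ok (t, mode)
      && rclass != NO_REGS && integer_class_p (rclass))
    return ALL_SSE_REGS;
  return NO_REGS;
}

/* Condition after the patch: requires SSE2 and inter-unit moves in both
   directions, and no longer looks at MMX.  */
enum reg_class_model
spill_class_after (const struct target_model *t,
                   enum reg_class_model rclass, enum mode_model mode)
{
  if (t->general_regs_sse_spill
      && t->sse2
      && t->inter_unit_moves_to_vec
      && t->inter_unit_moves_from_vec
      && mode_ok (t, mode)
      && integer_class_p (rclass))
    return ALL_SSE_REGS;
  return NO_REGS;
}
```

In this model, a target with MMX, SSE2 and both inter-unit move tunings
enabled gets NO_REGS from spill_class_before but ALL_SSE_REGS from
spill_class_after, which matches the observation in this thread that the
tuning had no effect unless -mno-mmx was given.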


Re: [RFC6 PATCH v6 00/21] ILP32 for ARM64 - LTP results

2016-04-27 Thread Andrew Pinski
On Wed, Apr 27, 2016 at 12:30 AM, Andrew Pinski  wrote:
> On Fri, Apr 22, 2016 at 8:37 PM, Zhangjian (Bamvor)
>  wrote:
>> Hi, Yury
>>
>>
>> On 2016/4/6 6:44, Yury Norov wrote:
>>>
>>> There are about 20 failing tests of 782 in lite scenario.
>>> float_bessel
>>> float_exp_log
>>> float_iperb
>>> float_power
>>> float_trigo
>>> pipeio_1
>>> pipeio_3
>>> pipeio_5
>>> pipeio_8
>>> abort01
>>> clone02
>>> kill11
>>> mmap16
>>> open12
>>> pause01
>>> rename11
>>> rmdir02
>>> umount2_01
>>> umount2_02
>>> umount2_03
>>> utime06
>>> mtest06
>>>
>>> The list is rough because some tests fail not every time.
>>>
>>> Tests abort01 and kill11 fail for lp64 too, so maybe there's
>>> a reason unrelated to ilp32 itself.
>>>
>>> float_xxx tests fail because they call unwind() from signal context,
>>> and GCC for ilp32 has problem with it, as Andrew told.
>>
>> Is there some progress about this issue. When we talk about unwind
>> functions, do you mean the function in libgcc?
>>
>> We encountered another issue(abort not segfault) which also called
>> pthread_cancel(). The test code is in the attachment. Here is the
>> backtrace:
>
> Yes this was a known issue I knew about.  I have a patch GCC to fix
> this.  Basically REG_VALUE_IN_UNWIND_CONTEXT needs to be defined while
> building libgcc to support the correct unwind information.
> I will be posting a GCC patch to fix this tomorrow.  This was a bug
> even in the original set of ilp32 patches.  I only finally was able to
> sit down and fix it today.

Here is the link to the GCC patch which I said I was going to submit today:
https://gcc.gnu.org/ml/gcc-patches/2016-04/msg01726.html

Thanks,
Andrew

>
>
> Thanks,
> Andrew
>
>>
>> ```
>> Program received signal SIGABRT, Aborted.
>> [Switching to Thread 0xf77ee330 (LWP 2958)]
>> 0x0040f5bc in raise (sig=sig@entry=6)
>> at ../sysdeps/unix/sysv/linux/raise.c:55
>> 55  ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
>> (gdb) bt
>> #0  0x0040f5bc in raise (sig=sig@entry=6)
>> at ../sysdeps/unix/sysv/linux/raise.c:55
>> #1  0x0040f884 in abort () at abort.c:89
>>
>> #2  0x004073b4 in uw_update_context_1 (
>> context=context@entry=0xf77ec820, fs=fs@entry=0xf77ebec8)
>> at /home/GCC-Build/p660/p660_build_dir/src/gcc-4.9/libgcc/unwind-dw2.c:1430
>>
>> #3  0x004078c0 in uw_update_context
>> (context=context@entry=0xf77ec820,
>> fs=fs@entry=0xf77ebec8)
>> at
>> /home/GCC-Build/p660/p660_build_dir/src/gcc-4.9/libgcc/unwind-dw2.c:1506
>> #4  0x00407a9c in uw_advance_context (fs=0xf77ebec8,
>> context=0xf77ec820)
>> at
>> /home/GCC-Build/p660/p660_build_dir/src/gcc-4.9/libgcc/unwind-dw2.c:1529
>> #5  _Unwind_ForcedUnwind_Phase2 (exc=exc@entry=0xf77ee580,
>> context=context@entry=0xf77ec820)
>> at /home/GCC-Build/p660/p660_build_dir/src/gcc-4.9/libgcc/unwind.inc:185
>> #6  0x00408228 in _Unwind_ForcedUnwind (exc=0xf77ee580,
>> stop=stop@entry=0x405440 , stop_argument=0xf77eddd8)
>> at /home/GCC-Build/p660/p660_build_dir/src/gcc-4.9/libgcc/unwind.inc:207
>> #7  0x004055c4 in __pthread_unwind (buf=)
>> at unwind.c:126
>> #8  0x004050b4 in __do_cancel () at ./pthreadP.h:283
>> #9  sigcancel_handler (sig=, si=,
>> ctx=) at nptl-init.c:225
>> #10 
>>
>> #11 0x in ?? ()
>>
>> #12 0x00423084 in __select (nfds=-1, readfds=,
>> writefds=, exceptfds=, timeout=0x0)
>> at ../sysdeps/unix/sysv/linux/generic/select.c:45
>> #13 0x00400604 in TEST_TaskDelay (
>> uiMillSecs=)
>> at test-cancel.c:18
>> #14 0x00400680 in printids (
>> s=)
>> at test-cancel.c:38
>> #15 0x004006d0 in thr_fn (
>> arg=)
>> at test-cancel.c:49
>> #16 0x00401b28 in start_thread (arg=0x4a3000) at
>> pthread_create.c:335
>> #17 0x00401b28 in start_thread (arg=0x4a3000) at
>> pthread_create.c:335
>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>> ```
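
The attachment is not reproduced in this archive. As a rough sketch only, a test with the same call chain as frames #12 to #15 might look like the following. The function names come from the backtrace; the bodies, the delay values and the run_cancel_test() helper are assumptions made for illustration, not the original test code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/select.h>

/* Sleep by calling select() with no descriptors.  select() is a
   cancellation point, so a pending cancel is acted on here.  */
static void TEST_TaskDelay(unsigned int uiMillSecs)
{
    struct timeval tv;

    tv.tv_sec = uiMillSecs / 1000;
    tv.tv_usec = (uiMillSecs % 1000) * 1000;
    (void)select(0, NULL, NULL, NULL, &tv);
}

/* Assumed body: the original presumably printed thread ids.  */
static void printids(const char *s)
{
    (void)s;
    TEST_TaskDelay(100);
}

/* Thread body: loop forever so the thread can only end by cancellation.  */
static void *thr_fn(void *arg)
{
    (void)arg;
    for (;;)
        printids("new thread");
    return NULL;
}

/* Hypothetical driver: returns 1 if the thread was cancelled and joined
   cleanly, 0 otherwise.  With a broken unwinder the process would abort
   inside pthread_cancel() handling instead of returning.  */
static int run_cancel_test(void)
{
    pthread_t tid;
    void *res = NULL;
    int rc = pthread_create(&tid, NULL, thr_fn, NULL);

    assert(rc == 0);
    TEST_TaskDelay(50);            /* let the thread block in select() */
    if (pthread_cancel(tid) != 0)
        return 0;
    if (pthread_join(tid, &res) != 0)
        return 0;
    return res == PTHREAD_CANCELED;
}
```

On a working toolchain run_cancel_test() should return 1. Cancellation in glibc goes through _Unwind_ForcedUnwind, which is why an unwinder problem in libgcc shows up in a test that never throws an exception itself.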
>>
>> The abort is raised by the following code:
>> ```
>> static void
>> uw_update_context_1 (struct _Unwind_Context *context,
>>                      _Unwind_FrameState *fs)
>> {
>>   //...
>>   /* Compute this frame's CFA.  */
>>   switch (fs->regs.cfa_how)
>>     {
>>     case CFA_REG_OFFSET:
>>       cfa = _Unwind_GetPtr (&orig_context, fs->regs.cfa_reg);
>>       cfa += fs->regs.cfa_offset;
>>       break;
>>
>>     case CFA_EXP:
>>       {
>>         const unsigned char *exp = fs->regs.cfa_exp;
>>         _uleb128_t len;
>>
>>         exp = read_uleb128 (exp, &len);
>>         cfa = (void *) (_Unwind_Ptr)
>>           execute_stack_op (exp, exp + len, &orig_context, 0);
>>         break;
>>       }
>>
>>     default:
>>       gcc_unreachable ();
>>     }
>>   context->cfa = cfa;
>>   //...
>> }
>> ```
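
To make the shape of the quoted switch concrete, here is a reduced standalone model of the CFA computation. It is not libgcc code: the type and function names are invented for this sketch, the register file is a plain array, and instead of calling abort() the model returns a status so the failing path can be observed. It does not claim to identify which check actually fired on ILP32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the libgcc frame-state fields (hypothetical).  */
enum cfa_rule { CFA_UNSET, CFA_REG_OFFSET, CFA_EXP };

struct cfa_state {
    enum cfa_rule how;   /* rule selected by the CFI program */
    unsigned reg;        /* base register for CFA_REG_OFFSET */
    intptr_t offset;     /* byte offset added to that register */
};

enum cfa_status { CFA_OK, CFA_NOT_MODELLED, CFA_WOULD_ABORT };

/* Compute the canonical frame address for one frame in this model.
   CFA_WOULD_ABORT marks the paths where the model gives up, which
   corresponds to the default: case in the quoted switch.  */
static enum cfa_status
model_compute_cfa(const uintptr_t *regs, size_t nregs,
                  const struct cfa_state *fs, uintptr_t *cfa_out)
{
    assert(regs != NULL && fs != NULL && cfa_out != NULL);

    switch (fs->how) {
    case CFA_REG_OFFSET:
        /* Out-of-range register: treated as a failure in this model.  */
        if (fs->reg >= nregs)
            return CFA_WOULD_ABORT;
        *cfa_out = regs[fs->reg] + (uintptr_t)fs->offset;
        return CFA_OK;

    case CFA_EXP:
        /* DWARF expression evaluation is left out of the sketch.  */
        return CFA_NOT_MODELLED;

    default:
        /* No usable CFA rule for this frame.  */
        return CFA_WOULD_ABORT;
    }
}
```

With regs[31] = 0x1000 and a CFA_REG_OFFSET rule of register 31 plus 16, the model yields a CFA of 0x1010; with the rule left at CFA_UNSET it reports CFA_WOULD_ABORT.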
>>
>> Any suggestion is appreciated.
>>
>> CC gcc mailing list. Sorry if it is off topic.
>>
>> Regards
>>
>> Bamvor
>>
>>
>>
>>
>>> pipeio_x tests are

gcc-4.9-20160427 is now available

2016-04-27 Thread gccadmin
Snapshot gcc-4.9-20160427 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.9-20160427/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.9 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_9-branch 
revision 235537

You'll find:

 gcc-4.9-20160427.tar.bz2 Complete GCC

  MD5=f525275b0d646be9cb2293ac219a325e
  SHA1=71f295cd00023419e161513460633b52aa9f24ba

Diffs from 4.9-20160420 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.9
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: SafeStack proposal in GCC

2016-04-27 Thread Jeff Law

On 04/13/2016 07:01 AM, Cristina Georgiana Opriceana wrote:

Hello,

I bring to your attention SafeStack, part of a bigger research project
- CPI/CPS [1], which offers complete protection against stack-based
control flow hijacks. I am interested in developing SafeStack for GCC
and I would like to ask for your feedback on this proposal.

SafeStack is a security mechanism that protects against stack-based
control flow attacks while keeping runtime overhead low: it
prevents all stack-based attacks in the RIPE benchmark and has just
0.05% overhead on average on the SPEC CPU2006 benchmarks [2]. SafeStack
was recently merged into Clang/LLVM mainline [3].

Its design is based on separating stack-allocated memory objects
into two regions: the safe stack, where we keep return addresses,
spilled registers and the local variables that a compile-time static
analysis pass proves are only accessed in a safe way, and the
regular region, where we move everything else.

With this separation and randomization-based isolation of the safe stack,
we ensure that no overflow from the unsafe stack can overwrite
sensitive data on the safe stack. Later, the isolation
mechanism can be improved to use hardware segment protection or
hardware extensions, such as Intel Memory Protection Keys.

We aim to extend all of CPI into the GNU userland, but start with a
SafeStack port in GCC.

In GCC, we propose a design composed of an instrumentation module
(implemented as a GIMPLE pass) and a runtime library.

The instrumentation pass will perform static analysis to discover
stack objects that are only accessed in a safe way. It will also
insert code that allocates a stack frame for the rest of the objects,
those that did not satisfy the safety condition. The pass will run
independently, after GIMPLE lowering, scheduled on the all_passes list
and after other optimizations, such as dead code elimination. Then,
all accesses to unsafe objects have to be re-written, based on the new
stack base and offset in the unsafe stack. In the first phase of the
implementation, the unsafe stack will be allocated on the heap, and we
will rely on ASLR for the isolation.
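
As an illustration of the rewriting described above, here is a hand-written C approximation of what the instrumentation might do to one function. Everything in it is an assumption made for the sketch: the unsafe_sp and unsafe_base variables, the helper names and the frame size are hypothetical, and a real pass would operate on GIMPLE rather than on source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-thread unsafe stack, heap-allocated as in the first
   phase of the proposal.  It grows downwards from the end of the block.  */
static _Thread_local char *unsafe_base;
static _Thread_local char *unsafe_sp;

static void unsafe_stack_init(size_t size)
{
    unsafe_base = malloc(size);
    assert(unsafe_base != NULL);
    unsafe_sp = unsafe_base + size;
}

static void unsafe_stack_destroy(void)
{
    free(unsafe_base);
    unsafe_base = unsafe_sp = NULL;
}

/* What the programmer wrote.  */
static size_t copy_len_original(const char *src)
{
    char buf[64];   /* address is passed to strncpy: not provably safe */
    size_t n;       /* only read and written directly: stays on safe stack */

    strncpy(buf, src, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';
    n = strlen(buf);
    return n;
}

/* Roughly what the function could look like after instrumentation.  */
static size_t copy_len_instrumented(const char *src)
{
    char *saved_sp = unsafe_sp;   /* prologue: open an unsafe frame */
    char *buf;
    size_t n;

    unsafe_sp -= 64;
    buf = unsafe_sp;              /* buf now lives on the unsafe stack */

    strncpy(buf, src, 63);
    buf[63] = '\0';
    n = strlen(buf);

    unsafe_sp = saved_sp;         /* epilogue: release the unsafe frame */
    return n;
}
```

An overflow of buf in the instrumented version can only run into other unsafe-stack data, not into the return address, which stays on the regular (safe) stack.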

The runtime support will have to deal with unsafe stack allocation - a
hook in the pthread create/destroy functions to create per-thread
stack regions. This runtime support might be reused from the Clang
implementation.

This all sounds good.  And I'd definitely look to re-use the runtime and
perhaps tests from Clang.


Jeff



How to avoid instrumenting function in a particular section?

2016-04-27 Thread Vanush Vaswani
Is it possible to avoid instrumenting functions
(-finstrument-functions) if they are in a particular section?