Re: vectorization cost macro TARGET_VECTORIZE_ADD_STMT_COST

2015-08-03 Thread Richard Biener
On Sun, Aug 2, 2015 at 4:13 PM, Ajit Kumar Agarwal wrote:
> All:
>
> The following macro determines the statement cost that is added to the
> vectorization cost.
>
> #define TARGET_VECTORIZE_ADD_STMT_COST.
>
> In the implementation of the above macro, the following is done for many
> architectures that support vectorization, such as i386 and ARM.
>
> if (where == vect_body && stmt_info && stmt_in_inner_loop_p (stmt_info))
> count *= 50;  /* FIXME.  */
>
> I have the following questions.
>
> 1. Why was the multiplication factor of 50 chosen?

It's a wild guess.  See tree-vect-loop.c:vect_get_single_scalar_iteration_cost.

> 2. The comment mentions that the inner loop, relative to the loop being
> vectorized, is given more weight. If more weight is added to the inner
> loop of the loop being vectorized, the chance of vectorizing the inner
> loop decreases. Why is the inner loop cost increased relative to the
> loop being vectorized?

In fact adding more weight to the inner loop increases the chance of
vectorizing it (if vectorizing the inner loop is profitable).
Both scalar and vector cost get biased by a factor of 50 (we assume 50
iterations of the inner loop for one iteration of the
outer loop), so a non-profitable vectorization in the outer loop can
be offset by profitable inner loop vectorization.

Yes, '50' can be improved if we actually know the iteration count of
the inner loop or if we have profile-feedback.
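To make the arithmetic concrete, here is a minimal sketch of the weighting described above. It is not the GCC cost hook: weighted_cost and the statement costs used with it are made up for illustration, and the vectorization factor is ignored.

```c
/* Assumed trip count of the inner loop per outer iteration.  This is the
   '50' guess from the cost hook, not a measured value.  */
#define INNER_LOOP_WEIGHT 50

/* Hypothetical helper: cost of one outer-loop iteration, given the cost
   of the outer-only statements and the cost of the inner-loop body.  The
   same weight is applied whether the costs passed in are scalar or
   vector costs.  */
static int
weighted_cost (int outer_stmt_cost, int inner_stmt_cost)
{
  return outer_stmt_cost + INNER_LOOP_WEIGHT * inner_stmt_cost;
}
```

With invented costs where vectorizing makes the outer statements more expensive (4 to 10) and the inner body cheaper (8 to 3), the unweighted sums are 12 for scalar and 13 for vector, so vectorization looks unprofitable. With the weight applied, weighted_cost (4, 8) is 404 and weighted_cost (10, 3) is 160, so the inner-loop gain outweighs the outer-loop loss.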

Richard.


> Thanks & Regards
> Ajit


ctype_members.cc Comparison Always True

2015-08-03 Thread Joel Sherrill

Hi

Just noticed this building the head for arm-rtems4.11. Should
the first comparison be eliminated and, maybe, a comment added?

ctype_members.cc:216:14: warning: comparison of unsigned expression >= 0 is 
always true [-Wtype-limits]
 if (__wc >= 0 && __wc < 128 && _M_narrow_ok)
  ^
ctype_members.cc: In member function 'virtual const wchar_t* 
std::ctype::do_narrow(const wchar_t*, const wchar_t*, char, char*) 
const':
ctype_members.cc:230:14: warning: comparison of unsigned expression >= 0 is 
always true [-Wtype-limits]
if (*__lo >= 0 && *__lo < 128)
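The warning says wchar_t is unsigned on this target, so the ">= 0" half of the test can never be false there. As an illustration only (this is not libstdc++ code, and in_ascii_range is a hypothetical name), a range check can be written so that it behaves the same whether wchar_t is signed or unsigned:

```c
#include <stddef.h>   /* wchar_t */

/* Hypothetical helper: returns nonzero when WC is in [0, 128).
   Converting to an unsigned type first maps any negative value (possible
   only where wchar_t is signed) to a very large positive one, so a single
   upper-bound test covers both ends of the range and no unsigned value is
   ever compared against zero.  */
static int
in_ascii_range (wchar_t wc)
{
  return (unsigned long) wc < 128UL;
}
```

Whether the original two-sided comparison should be kept for targets with a signed wchar_t is a separate question for the maintainers.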

--joel


emutls.c: warnings on master

2015-08-03 Thread Joel Sherrill

Hi

I saw these warnings in the build of some RTEMS targets on
the gcc head.

/home/joel/test-gcc/gcc/libgcc/emutls.c:159:7: warning: implicit declaration of 
function 'calloc' [-Wimplicit-function-declaration]
/home/joel/test-gcc/gcc/libgcc/emutls.c:159:13: warning: incompatible implicit 
declaration of built-in function 'calloc'
/home/joel/test-gcc/gcc/libgcc/emutls.c:171:7: warning: implicit declaration of 
function 'realloc' [-Wimplicit-function-declaration]
/home/joel/test-gcc/gcc/libgcc/emutls.c:171:13: warning: incompatible implicit 
declaration of built-in function 'realloc'

What magic is needed so that emutls.c includes the header that declares
calloc and realloc?
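For reference, both warnings in each pair are the generic symptom of calling calloc or realloc with no prototype in scope. In standard C the two functions are declared in <stdlib.h>. The sketch below only shows that general pattern. grow_slots is a hypothetical helper, not code from emutls.c, and how the libgcc build arranges for the header to be included is not covered here.

```c
#include <stdlib.h>   /* declares calloc and realloc */

/* Hypothetical helper: grow an array of pointers from OLD_N to NEW_N
   slots, with every new slot set to NULL.  Returns NULL on allocation
   failure, in which case the old array is left untouched.  */
static void **
grow_slots (void **slots, size_t old_n, size_t new_n)
{
  void **p;
  size_t i;

  if (slots == NULL)
    return calloc (new_n, sizeof (void *));

  p = realloc (slots, new_n * sizeof (void *));
  if (p == NULL)
    return NULL;
  for (i = old_n; i < new_n; i++)
    p[i] = NULL;
  return p;
}
```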

--joel


Re: Controlling instruction alternative selection

2015-08-03 Thread Jim Wilson
On 07/30/2015 09:54 PM, Paul Shortis wrote:
> Resulting in ...
> error: unable to find a register to spill in class ‘GP_REGS’
> 
> Enabling lra and inspecting the rtl dump indicates that both
> alternatives (R and r) seem to be equally appealing to the allocator so
> it chooses 'R' and fails.

The problem isn't in lra, it is in reload.  You want lra to use the
three address instruction, but you then want reload to use the two
address alternative.

> Using constraint disparaging (?R) eradicates the errors, but of course
> that causes the 'R' three address alternative to never be used.

You want to disparage the three address alternative in reload, but not
in lra.  There is a special code for that: you can use ^ instead of ? to
make that happen.  That may or may not help though.

There is also a hook TARGET_CLASS_LIKELY_SPILLED_P which might help.
You should try defining this to return true for the 'R' class if it
doesn't already.
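A rough sketch of what such a hook definition could look like follows. The hook takes a reg_class_t and returns bool. Everything else here is an assumption: the port prefix foo_, the class name R_REGS for the 'R' constraint, and the stand-in enum and typedef, which exist only so the fragment compiles outside a GCC tree.

```c
#include <stdbool.h>

/* Stand-ins for definitions a real port gets from its own headers and
   from GCC.  R_REGS is a hypothetical name for the class behind 'R'.  */
enum reg_class { NO_REGS, R_REGS, GP_REGS, ALL_REGS };
typedef int reg_class_t;

/* Sketch of the hook: report the small 'R' class as likely to be
   spilled, and every other class as not.  */
static bool
foo_class_likely_spilled_p (reg_class_t rclass)
{
  return rclass == R_REGS;
}

/* In a real port these lines would sit with the other target hook
   definitions, before the targetm initializer.  */
#undef TARGET_CLASS_LIKELY_SPILLED_P
#define TARGET_CLASS_LIKELY_SPILLED_P foo_class_likely_spilled_p
```

Whether this changes the allocator's choice for this port would need to be checked against the rtl dumps.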

Jim



CFI directives and dynamic stack alignment

2015-08-03 Thread Steve Ellcey

I don't know if there are any CFI experts out there but I am working on
dynamic stack alignment for MIPS.  I think I have it working in the 'normal'
case but when I try to do stack unwinding through a routine with an aligned
stack, then I have problems.  I was wondering if someone can help me understand
what CFI directives to generate to allow stack unwinding.  Using
gcc.dg/cleanup-8.c as an example (because it fails with my stack alignment
code), if I generate code with no dynamic stack alignment (but forcing the
use of the frame pointer), the routine fn2 looks like this on MIPS:

fn2:
.frame  $fp,32,$31  # vars= 0, regs= 2/0, args= 16, gp= 8
.mask   0xc000,-4
.fmask  0x,0
.set    noreorder
.set    nomacro
lui $2,%hi(null)
addiu   $sp,$sp,-32
.cfi_def_cfa_offset 32
lw  $2,%lo(null)($2)
sw  $fp,24($sp)
.cfi_offset 30, -8
move    $fp,$sp
.cfi_def_cfa_register 30
sw  $31,28($sp)
.cfi_offset 31, -4
jal abort
sb  $0,0($2)

There are .cfi directives when incrementing the stack pointer, saving the
frame pointer, and copying the stack pointer to the frame pointer.

When I generate code to dynamically align the stack my code looks like
this:

fn2:
.frame  $fp,32,$31  # vars= 0, regs= 2/0, args= 16, gp= 8
.mask   0xc000,-4
.fmask  0x,0
.set    noreorder
.set    nomacro
lui $2,%hi(null)
li  $3,-16  # 0xfff0
lw  $2,%lo(null)($2)
and $sp,$sp,$3
addiu   $sp,$sp,-32
.cfi_def_cfa_offset 32
sw  $fp,24($sp)
.cfi_offset 30, -8
move    $fp,$sp
.cfi_def_cfa_register 30
sw  $31,28($sp)
.cfi_offset 31, -4
jal abort
sb  $0,0($2)

The 'and' instruction is where the stack gets aligned and if I remove that
one instruction, everything works.  I think I need to put out some new CFI
pseudo-ops to handle this but I am not sure what they should be.  I am just
not very familiar with the CFI directives.
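As an illustration of why the existing directives stop describing the frame once the 'and' is present, here is a small model of the two prologues above. It does not say which directives should be emitted. It assumes the CFA is the value of $sp on entry to fn2, and both function names are hypothetical.

```c
#include <stdint.h>

/* The rule an unwinder derives from ".cfi_def_cfa_offset 32" followed by
   ".cfi_def_cfa_register 30": CFA = $fp + 32.  */
static uint32_t
cfa_from_rule (uint32_t fp)
{
  return fp + 32;
}

/* Value of $fp after the prologue, given $sp on entry.  A nonzero ALIGN
   models "and $sp,$sp,$3" with $3 = -16 ahead of "addiu $sp,$sp,-32".  */
static uint32_t
fp_after_prologue (uint32_t entry_sp, int align)
{
  uint32_t sp = entry_sp;

  if (align)
    sp &= ~(uint32_t) 15;   /* and   $sp,$sp,$3  */
  sp -= 32;                 /* addiu $sp,$sp,-32 */
  return sp;                /* move  $fp,$sp     */
}
```

For an entry $sp of 0x7fff0008, the unaligned prologue gives a rule result equal to the entry $sp. With the 'and', the rule yields 0x7fff0000, 8 bytes below the real entry value, because the amount the 'and' removed is not recorded anywhere. The ".cfi_offset" save slots are expressed relative to the CFA, so they shift by the same amount.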

I looked at ix86_emit_save_reg_using_mov where there is some special
code for handling the drap register and for saving registers on a 
realigned stack but I don't really understand what they are trying 
to do.

Any help?

Steve Ellcey
sell...@imgtec.com

P.S. For completeness' sake I have attached my current dynamic
 alignment changes in case anyone wants to see them.

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 4f9a31d..386c2ce 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -5737,6 +5737,29 @@ expand_stack_alignment (void)
   gcc_assert (targetm.calls.get_drap_rtx != NULL);
   drap_rtx = targetm.calls.get_drap_rtx ();
 
+  /* I am not doing this in get_drap_rtx because we are also calling
+ that from expand_function_end in order to get/set the drap_reg
+ and vdrap_reg variables and doing these instructions at that
+ point is not working.   */
+
+  if (drap_rtx != NULL_RTX)
+{
+  rtx_insn *insn, *seq;
+
+  start_sequence ();
+  emit_move_insn (crtl->vdrap_reg, crtl->drap_reg);
+  seq = get_insns ();
+  insn = get_last_insn ();
+  end_sequence ();
+  emit_insn_at_entry (seq);
+  if (!optimize)
+{
+  add_reg_note (insn, REG_CFA_SET_VDRAP, crtl->vdrap_reg);
+  RTX_FRAME_RELATED_P (insn) = 1;
+}
+}
+
+
   /* stack_realign_drap and drap_rtx must match.  */
   gcc_assert ((stack_realign_drap != 0) == (drap_rtx != NULL));
 
diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index ce21a0f..b6ab30a 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -746,6 +746,8 @@ static const struct attribute_spec mips_attribute_table[] = {
   { "use_shadow_register_set",	0, 0, false, true,  true, NULL, false },
   { "keep_interrupts_masked",	0, 0, false, true,  true, NULL, false },
   { "use_debug_exception_return", 0, 0, false, true,  true, NULL, false },
+  { "align_stack", 0, 0, true, false, false, NULL, false },
+  { "no_align_stack", 0, 0, true, false, false, NULL, false },
   { NULL,	   0, 0, false, false, false, NULL, false }
 };
 
@@ -1528,6 +1530,61 @@ mips_merge_decl_attributes (tree olddecl, tree newdecl)
 			   DECL_ATTRIBUTES (newdecl));
 }
 
+static bool
+mips_cfun_has_msa_p (void)
+{
+  /* For now, for testing, assume all functions use MSA
+ (and thus need alignment).  */
+#if 0
+  if (!cfun || !TARGET_MSA)
+return FALSE;
+
+  for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
+{
+  if (MSA_SUPPORTED_MODE_P (GET_MODE (insn)))
+	return TRUE;
+}
+
+  return FALSE;
+#else
+  return TRUE;
+#endif
+}
+
+bool
+mips_align_stack_p (void)
+{
+  bool want_alignment = TARGET_ALIGN_STACK && mips_cfun_has_msa_p ();
+
+  if (current_function_decl)
+{
+  tree attr = DECL_ATTRIBUTES (cu