Re: [PATCH][i386]Fix PR 57756

2013-10-12 Thread Uros Bizjak
On Sat, Oct 12, 2013 at 2:16 AM, Sriraman Tallam  wrote:

> Ping.

 This looks nice.  I suppose you grepped for effected targets and rs6000
 was the only one besides i386.

 This is ok given target maintainers do not object within 24h and the test
 successfully bootstrapped and passed regression tests.
>>>
>>> I changed this patch  quite a bit after I realized I missed many other
>>> places where global_options field was accessed. The two specific
>>> changes are:
>>>
>>> * Particularly, ix86_function_specific_save and
>>> ix86_function_specific_restore had to be changed to copy to a specific
>>> gcc_options structure than to global_options.
>>> * Function ix86_option_override_internal accesses global_options via
>>> macros, like TARGET_64BIT etc. This had to be changed. So, I created
>>> new macros to accept a parameter and named them like TARGET_64BIT_P
>>> and replaced all the accesses in i386.c
>>>
>>> I also marked as TODO a copy in function ix86_function_specific_save.
>>> Here, the cl_target_option structure has a ix86_target_flags_explicit
>>> field whereas the global_options structure does not have one.
>>> Previously, this used to get saved to the global_options target_flags
>>> field but this did not make sense to me. I have commented this line
>>> out. Please let me know if a new field needs to be added.
>>>
>>> This patch passes bootstrap and all tests. Please take another look.

AFAICS, the patch was approved by Richard on 23. September. None of
the target maintainers have had any further objections to it.

Regarding the TODO in i386.c: some options depend on and enable other
options, for example -msse3 enables SSE2, etc, although user didn't
explicitly add -msse2 to the options. This is not the case with global
options. Since this field is used extensively through i386.c, I'd say
to play safe and save it.

So, x86 part is OK with the above change. Middle end is already
approved and rs6000 maintainer didn't object the approval, so please
go ahead and commit the patch.

Thanks,
Uros.


Re: PATCH: PR target/58690: internal compiler error: in copy_to_mode_reg, at explow.c:641

2013-10-12 Thread Uros Bizjak
On Fri, Oct 11, 2013 at 8:38 PM, H.J. Lu  wrote:

> In x32, when a TLS address is in DImode and Pmode is SImode,
> copy_addr_to_reg will fail.  This patch adds ix86_copy_addr_to_reg
> to first copy DImode address into a DImode register and then to generate
> SImode SUBREG in this case.  Tested on x32.  OK for trunk?
>
> 2013-10-11  H.J. Lu  
>
> PR target/58690
> * config/i386/i386.c (ix86_copy_addr_to_reg): New function.
> (ix86_expand_movmem): Replace copy_addr_to_reg with
> ix86_copy_addr_to_reg.
> (ix86_expand_setmem): Likewise.
>
> gcc/testsuite/
>
> 2013-10-11  H.J. Lu  
>
> PR target/58690
> * gcc.target/i386/pr58690.c: New test
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 37c1bec..e39d63db 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -22076,6 +22076,21 @@ counter_mode (rtx count_exp)
>return SImode;
>  }
>
> +/* Copy the address to a register in Pmode.  Support x32 TLS address in
> +   DImode and Pmode in SImode.  */

Copy the address to a Pmode register.  This is used for x32 to truncate
DImode TLS address to a SImode register.

> +static rtx
> +ix86_copy_addr_to_reg (rtx addr)
> +{
> +  if (GET_MODE (addr) != Pmode)
> +{
> +  gcc_assert (GET_MODE (addr) == DImode && Pmode == SImode);
> +  return gen_rtx_SUBREG (SImode, copy_to_mode_reg (DImode, addr), 0);
> +}
> +  else
> +return copy_addr_to_reg (addr);
> +}

No negative conditions please. Just switch arms of the if clause.

OK with these changes.

Do we also need this patch on 4.8?

Thanks,
Uros.


Re: [PATCH] Reassociate X == CST1 || X == CST2 if popcount (CST2 - CST1) == 1 into ((X - CST1) & ~(CST2 - CST1)) == 0

2013-10-12 Thread Jakub Jelinek
On Sat, Oct 12, 2013 at 10:08:12AM +0800, Zhenqiang Chen wrote:
> As you had mentioned, the transition in this patch does not reduce
> instructions. But the preexisting optimization does. So we prefer to do the
> preexisting optimization first. E.g.
> 
> X == 10 || X == 12 || X == 26

Ok, that makes sense.

@@ -2279,6 +2275,71 @@ optimize_range_tests (enum tree_code opcode,
}
 }
 
+  /* Optimize X == CST1 || X == CST2
+ if popcount (CST2 - CST1) == 1 into
+ ((X - CST1) & ~(CST2 - CST1)) == 0.  */

Mention here also similarly to the other comment that it works also
with ranges.

+  if (BRANCH_COST (optimize_function_for_speed_p (cfun), false) >= 2)
+for (i = first; i < length; i++)
+  {
+   tree lowi, highi, lowj, highj, type, tem1, tem2, mask;
+
+   if (ranges[i].exp == NULL_TREE || ranges[i].in_p)
+ continue;
+   type = TREE_TYPE (ranges[i].exp);
+   if (!INTEGRAL_TYPE_P (type))
+ continue;
+   lowi = ranges[i].low;
+   if (lowi == NULL_TREE)
+ continue;

The other loop has here:
  if (lowi == NULL_TREE)
lowi = TYPE_MIN_VALUE (type);
instead, which is better, can you please change it?

+   highi = ranges[i].high;
+   if (highi == NULL_TREE)
+ continue;
+   for (j = i + 1; j < length && j < i + 64; j++)
+ {
+   lowj = ranges[j].low;
+   if (lowj == NULL_TREE)
+ continue;
+   highj = ranges[j].high;
+   if (highj == NULL_TREE)
+ continue;

The other loop has
if (highj == NULL_TREE)
  highj = TYPE_MAX_VALUE (type);
here instead.

+   if (ranges[j].exp == NULL_TREE || ranges[j].in_p
+   || (ranges[i].exp != ranges[j].exp))
+ continue;

Can you please move this test the lowj = assignment, and
remove the ranges[j].exp == NULL_TREE test (both here and
in the earlier loop)?  I mean, we've checked at the
beginning of for (i = first; i < length; i++) loop that
ranges[i].exp is not NULL_TREE, and if ranges[j].exp is NULL_TREE,
it will be different than ranges[i].exp.
So
if (ranges[i].exp != ranges[j].exp || ranges[j].in_p)
  continue;
(and please also collapse the two checks in the first loop,
so that both look similar).

+   /* Check lowj > highi.  */
+   tem1 = fold_binary (GT_EXPR, boolean_type_node,
+  lowj, highi);
+   if (tem1 == NULL_TREE || !integer_onep (tem1))
+ continue;
+   /* Check highi - lowi == highj - lowj.  */
+   tem1 = fold_binary (MINUS_EXPR, type, highi, lowi);
+   if (tem1 == NULL_TREE || TREE_CODE (tem1) != INTEGER_CST)
+ continue;
+   tem2 = fold_binary (MINUS_EXPR, type, highj, lowj);
+   if (tem2 == NULL_TREE || TREE_CODE (tem2) != INTEGER_CST)
+ continue;
+   if (!tree_int_cst_equal (tem1, tem2))
+ continue;
+   /* Check popcount (lowj - lowi) == 1.  */
+   tem1 = fold_binary (MINUS_EXPR, type, lowj, lowi);
+   if (tem1 == NULL_TREE || TREE_CODE (tem1) != INTEGER_CST)
+ continue;
+   if (tree_log2 (tem1) < 0)
+ continue;
+   mask = fold_build1 (BIT_NOT_EXPR, type, tem1);
+   tem1 = fold_binary (MINUS_EXPR, type, ranges[i].exp, lowi);
+   tem1 = fold_build2 (BIT_AND_EXPR, type, tem1, mask);
+   lowi = build_int_cst (type, 0);

Please use lowj instead of lowi here.  Because, if update_range_test
fails, we continue looking for further matches in following ranges,
and while lowj will be computed again, lowi will be assumed to contain
the low bound of the first range.

+   if (update_range_test (ranges + i, ranges + j, 1, opcode, ops, tem1,
+  ranges[i].in_p, lowi, tem2,

And here too.

Or, alternatively to avoid duplication, you could put the whole
for (i = first; i < length; i++)
loop from the first optimization into another
int i;
  for (pass = 0; pass < (BRANCH_COST (optimize_function_for_speed_p (cfun),
  false) >= 2) + 1; pass++)
  {
  }
loop, use whatever is common in the loop unconditionally, and just
conditionalize the rest on pass == 0 vs. pass == 1.
Or maybe even better just move this whole loop into a new function,
with ranges, opcode, first, length arguments plus another (bool?) argument
which of the two optimizations you want to perform, and return the bool
any_changes.  Then you wouldn't run into line length issues that you could
run into with the extra loop.

--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-33.c
@@ -0,0 +1,38 @@
+/* { dg-do run { target { ! "m68k*-*-* mmix*-*-* mep*-*-* bfin*-*-* v850*-*-* 
picochip*-*-* moxie*-*-* cris*-*-* m32c*-*-* fr30*-*-* mcore*-*-* powerpc*-*-* 
xtensa*-*-*"} } } */
+
+/* { dg-options "-O2 -fno-inline -fdump-tree-reassoc1-details" } */
+/* { dg-additional-options "-mbranch-cost=2" { target a

Re: Apply attribute returns_nonnull in libiberty

2013-10-12 Thread Marc Glisse

On Fri, 11 Oct 2013, DJ Delorie wrote:


Alternatively, you could ask the other projects if they're ok with
changing the xmalloc rule...


Is there an official list where all the users of libiberty can be 
contacted?


--
Marc Glisse


[committed] OpenMP 4.0 affinity fixes and improvements (PR libgomp/58691)

2013-10-12 Thread Jakub Jelinek
Hi!

I've committed these changes to trunk.
The first two hunks hopefully fixes warnings (turned into -Werror) with
older glibc headers, the rest are changes requested in PR58691
- OMP_PROC_BIND default value is implementation defined, and Tobias
suggested a better implementation defined default - if OMP_PLACES
or GOMP_CPU_AFFINITY exist in environment and are successfully parsed,
then the default is omp_proc_bind_true, otherwise omp_proc_bind_false.
Explicit OMP_PROC_BIND=false makes those two env vars just parsed for
basic syntax errors, but doesn't actually build places lists from them.

2013-10-12  Jakub Jelinek  

PR libgomp/58691
* config/linux/proc.c (gomp_cpuset_popcount): Add unused attribute
to check variable.
(gomp_init_num_threads): Move i variable declaration into
#ifdef CPU_ALLOC_SIZE block.
* config/linux/affinity.c (gomp_affinity_init_level): Test
gomp_places_list_len == 0 rather than gomp_places_list == 0
when checking for topology reading error.
* team.c (gomp_team_start): Don't handle bind == omp_proc_bind_false.
* env.c (parse_affinity): Add ignore argument, if true, don't populate
gomp_places_list, only parse env var and always return false.
(parse_places_var): Likewise.  Don't check gomp_global_icv.bind_var.
(initialize_env): Always parse OMP_PLACES and GOMP_CPU_AFFINITY env
vars, default to OMP_PROC_BIND=true if OMP_PROC_BIND wasn't specified
and either of these variables were parsed correctly into a places
list.

--- libgomp/config/linux/proc.c.jj  2013-10-11 11:23:59.0 +0200
+++ libgomp/config/linux/proc.c 2013-10-11 23:17:18.533108120 +0200
@@ -59,7 +59,7 @@ gomp_cpuset_popcount (unsigned long cpus
   size_t i;
   unsigned long ret = 0;
   extern int check[sizeof (cpusetp->__bits[0]) == sizeof (unsigned long int)
-  ? 1 : -1];
+  ? 1 : -1] __attribute__((unused));
 
   for (i = 0; i < cpusetsize / sizeof (cpusetp->__bits[0]); i++)
 {
@@ -94,7 +94,6 @@ gomp_init_num_threads (void)
gomp_cpusetp);
   if (ret == 0)
{
- unsigned long i;
  /* Count only the CPUs this process can use.  */
  gomp_global_icv.nthreads_var
= gomp_cpuset_popcount (gomp_cpuset_size, gomp_cpusetp);
@@ -102,6 +101,7 @@ gomp_init_num_threads (void)
break;
  gomp_get_cpuset_size = gomp_cpuset_size;
 #ifdef CPU_ALLOC_SIZE
+ unsigned long i;
  for (i = gomp_cpuset_size * 8; i; i--)
if (CPU_ISSET_S (i - 1, gomp_cpuset_size, gomp_cpusetp))
  break;
--- libgomp/config/linux/affinity.c.jj  2013-10-11 11:23:59.0 +0200
+++ libgomp/config/linux/affinity.c 2013-10-11 23:35:05.566620759 +0200
@@ -309,7 +309,7 @@ gomp_affinity_init_level (int level, uns
fclose (f);
  }
  }
-  if (gomp_places_list == 0)
+  if (gomp_places_list_len == 0)
{
  if (!quiet)
gomp_error ("Error reading %s topology",
--- libgomp/team.c.jj   2013-10-11 11:23:59.0 +0200
+++ libgomp/team.c  2013-10-11 23:27:52.928843235 +0200
@@ -339,8 +339,6 @@ gomp_team_start (void (*fn) (void *), vo
 
   if (__builtin_expect (gomp_places_list != NULL, 0))
 {
-  if (bind == omp_proc_bind_false)
-   bind = omp_proc_bind_true;
   /* Depending on chosen proc_bind model, set subpartition
 for the master thread and initialize helper variables
 P and optionally S, K and/or REST used by later place
@@ -348,9 +346,6 @@ gomp_team_start (void (*fn) (void *), vo
   p = thr->place - 1;
   switch (bind)
{
-   case omp_proc_bind_false:
- bind = omp_proc_bind_true;
- /* FALLTHRU */
case omp_proc_bind_true:
case omp_proc_bind_close:
  if (nthreads > thr->ts.place_partition_len)
--- libgomp/env.c.jj2013-10-11 11:23:59.0 +0200
+++ libgomp/env.c   2013-10-12 00:10:02.670107433 +0200
@@ -548,7 +548,7 @@ parse_one_place (char **envp, bool *nega
 }
 
 static bool
-parse_places_var (const char *name)
+parse_places_var (const char *name, bool ignore)
 {
   char *env = getenv (name), *end;
   bool any_negate = false;
@@ -604,6 +604,10 @@ parse_places_var (const char *name)
  if (*env != '\0')
goto invalid;
}
+
+  if (ignore)
+   return false;
+
   return gomp_affinity_init_level (level, count, false);
 }
 
@@ -634,7 +638,7 @@ parse_places_var (const char *name)
 }
   while (1);
 
-  if (gomp_global_icv.bind_var == omp_proc_bind_false)
+  if (ignore)
 return false;
 
   gomp_places_list_len = 0;
@@ -911,7 +915,7 @@ parse_wait_policy (void)
present and it was successfully parsed.  */
 
 static bool
-parse_affinity (void)
+parse_affinity (bool ignore)
 {
   char *env, *end, *start;
   int pass;
@@ -928,6 +932,9 @@ parse_aff

Re: [PATCH i386] Use ordered comparisons for rounding builtins when -mno-ieee-fp

2013-10-12 Thread Uros Bizjak
Hello!

> 2013-10-11  Alexander Monakov  
>
> * config/i386/i386.c (ix86_expand_sse_compare_and_jump): Use mode
> provided by ix86_fp_compare_mode instead of CCFPUmode.
>
> testsuite/:
> * gcc.target/i386/builtin-ucmp.c: New test.

OK for mainline.

Thanks,
Uros.


Re: [AArch64] Fix early-clobber operands to vtbx[1,3]

2013-10-12 Thread James Greenhalgh
On Fri, Oct 11, 2013 at 07:52:48PM +0100, Marcus Shawcroft wrote:
> > 2013-10-11  James Greenhalgh  
> >
> > * config/aarch64/arm_neon.h
> > (vtbx<1,3>_8): Fix register constriants.
> >
> > OK?
> 
> OK, and back port to 4.8 please.
> /Marcus
> 

Hi Marcus,

I've committed this as revision 203478, but 4.8 is currently
frozen for release, so Jakub (+CC) will have to approve it.

This patch is small, not very controversial and only affects
the AArch64 tree.

Otherwise, I'll backport this when 4.8 opens again.

Thanks,
James
>From ba67f60eb238b71c55cc4363f5061b6e6810990a Mon Sep 17 00:00:00 2001
From: James Greenhalgh 
Date: Fri, 13 Sep 2013 17:18:23 +0100
Subject: [AArch64] Fix early-clobber operands to vtbx[1,3]
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="1.8.3-rc0"

This is a multi-part message in MIME format.
--1.8.3-rc0
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: 8bit


Hi,

The vtbx intrinsics are implemented in assembly without noting
that their tmp1 operand is early-clobber. This can, when the
wind blows the wrong way, result in us making a total mess of
the state of registers.

Fix by marking the required operands as early-clobber.

Regression tested against aarch64.exp with no problems.

OK?

Thanks,
James

---
2013-10-11  James Greenhalgh  

	* config/aarch64/arm_neon.h
	(vtbx<1,3>_8): Fix register constriants.


--1.8.3-rc0
Content-Type: text/x-patch; name="0001-AArch64-Fix-early-clobber-operands-to-vtbx-1-3.patch"
Content-Transfer-Encoding: 8bit
Content-Disposition: attachment; filename="0001-AArch64-Fix-early-clobber-operands-to-vtbx-1-3.patch"

diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index 482d7d0..f7c9db6 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -15636,7 +15636,7 @@ vtbx1_s8 (int8x8_t r, int8x8_t tab, int8x8_t idx)
 	   "cmhs %0.8b, %3.8b, %0.8b\n\t"
 	   "tbl %1.8b, {%2.16b}, %3.8b\n\t"
 	   "bsl %0.8b, %4.8b, %1.8b\n\t"
-   : "+w"(result), "=w"(tmp1)
+   : "+w"(result), "=&w"(tmp1)
: "w"(temp), "w"(idx), "w"(r)
: /* No clobbers */);
   return result;
@@ -15652,7 +15652,7 @@ vtbx1_u8 (uint8x8_t r, uint8x8_t tab, uint8x8_t idx)
 	   "cmhs %0.8b, %3.8b, %0.8b\n\t"
 	   "tbl %1.8b, {%2.16b}, %3.8b\n\t"
 	   "bsl %0.8b, %4.8b, %1.8b\n\t"
-   : "+w"(result), "=w"(tmp1)
+   : "+w"(result), "=&w"(tmp1)
: "w"(temp), "w"(idx), "w"(r)
: /* No clobbers */);
   return result;
@@ -15668,7 +15668,7 @@ vtbx1_p8 (poly8x8_t r, poly8x8_t tab, uint8x8_t idx)
 	   "cmhs %0.8b, %3.8b, %0.8b\n\t"
 	   "tbl %1.8b, {%2.16b}, %3.8b\n\t"
 	   "bsl %0.8b, %4.8b, %1.8b\n\t"
-   : "+w"(result), "=w"(tmp1)
+   : "+w"(result), "=&w"(tmp1)
: "w"(temp), "w"(idx), "w"(r)
: /* No clobbers */);
   return result;
@@ -15723,7 +15723,7 @@ vtbx3_s8 (int8x8_t r, int8x8x3_t tab, int8x8_t idx)
 	   "cmhs %0.8b, %3.8b, %0.8b\n\t"
 	   "tbl %1.8b, {v16.16b - v17.16b}, %3.8b\n\t"
 	   "bsl %0.8b, %4.8b, %1.8b\n\t"
-   : "+w"(result), "=w"(tmp1)
+   : "+w"(result), "=&w"(tmp1)
: "Q"(temp), "w"(idx), "w"(r)
: "v16", "v17", "memory");
   return result;
@@ -15742,7 +15742,7 @@ vtbx3_u8 (uint8x8_t r, uint8x8x3_t tab, uint8x8_t idx)
 	   "cmhs %0.8b, %3.8b, %0.8b\n\t"
 	   "tbl %1.8b, {v16.16b - v17.16b}, %3.8b\n\t"
 	   "bsl %0.8b, %4.8b, %1.8b\n\t"
-   : "+w"(result), "=w"(tmp1)
+   : "+w"(result), "=&w"(tmp1)
: "Q"(temp), "w"(idx), "w"(r)
: "v16", "v17", "memory");
   return result;
@@ -15761,7 +15761,7 @@ vtbx3_p8 (poly8x8_t r, poly8x8x3_t tab, uint8x8_t idx)
 	   "cmhs %0.8b, %3.8b, %0.8b\n\t"
 	   "tbl %1.8b, {v16.16b - v17.16b}, %3.8b\n\t"
 	   "bsl %0.8b, %4.8b, %1.8b\n\t"
-   : "+w"(result), "=w"(tmp1)
+   : "+w"(result), "=&w"(tmp1)
: "Q"(temp), "w"(idx), "w"(r)
: "v16", "v17", "memory");
   return result;

--1.8.3-rc0--




Re: [AArch64] Fix early-clobber operands to vtbx[1,3]

2013-10-12 Thread Jakub Jelinek
On Sat, Oct 12, 2013 at 08:57:51AM +0100, James Greenhalgh wrote:
> I've committed this as revision 203478, but 4.8 is currently
> frozen for release, so Jakub (+CC) will have to approve it.

This is ok for 4.8.2.

Jakub


Re: PATCH: PR target/58690: internal compiler error: in copy_to_mode_reg, at explow.c:641

2013-10-12 Thread H.J. Lu
On Sat, Oct 12, 2013 at 12:21 AM, Uros Bizjak  wrote:
> On Fri, Oct 11, 2013 at 8:38 PM, H.J. Lu  wrote:
>
>> In x32, when a TLS address is in DImode and Pmode is SImode,
>> copy_addr_to_reg will fail.  This patch adds ix86_copy_addr_to_reg
>> to first copy DImode address into a DImode register and then to generate
>> SImode SUBREG in this case.  Tested on x32.  OK for trunk?
>>
>> 2013-10-11  H.J. Lu  
>>
>> PR target/58690
>> * config/i386/i386.c (ix86_copy_addr_to_reg): New function.
>> (ix86_expand_movmem): Replace copy_addr_to_reg with
>> ix86_copy_addr_to_reg.
>> (ix86_expand_setmem): Likewise.
>>
>> gcc/testsuite/
>>
>> 2013-10-11  H.J. Lu  
>>
>> PR target/58690
>> * gcc.target/i386/pr58690.c: New test
>>
>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>> index 37c1bec..e39d63db 100644
>> --- a/gcc/config/i386/i386.c
>> +++ b/gcc/config/i386/i386.c
>> @@ -22076,6 +22076,21 @@ counter_mode (rtx count_exp)
>>return SImode;
>>  }
>>
>> +/* Copy the address to a register in Pmode.  Support x32 TLS address in
>> +   DImode and Pmode in SImode.  */
>
> Copy the address to a Pmode register.  This is used for x32 to truncate
> DImode TLS address to a SImode register.
>
>> +static rtx
>> +ix86_copy_addr_to_reg (rtx addr)
>> +{
>> +  if (GET_MODE (addr) != Pmode)
>> +{
>> +  gcc_assert (GET_MODE (addr) == DImode && Pmode == SImode);
>> +  return gen_rtx_SUBREG (SImode, copy_to_mode_reg (DImode, addr), 0);
>> +}
>> +  else
>> +return copy_addr_to_reg (addr);
>> +}
>
> No negative conditions please. Just switch arms of the if clause.
>
> OK with these changes.

This is what I checked in.

> Do we also need this patch on 4.8?

Yes, it is also needed for 4.8.  OK for 4.8 after a few days?

> Thanks,
> Uros.

Thanks.

-- 
H.J.
--
gcc/

2013-10-12  H.J. Lu  

PR target/58690
* config/i386/i386.c (ix86_copy_addr_to_reg): New function.
(ix86_expand_movmem): Replace copy_addr_to_reg with
ix86_copy_addr_to_reg.
(ix86_expand_setmem): Likewise.

gcc/testsuite/

2013-10-12  H.J. Lu  

PR target/58690
* gcc.target/i386/pr58690.c: New test

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 37c1bec..aa15bc5 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -22076,6 +22076,21 @@ counter_mode (rtx count_exp)
   return SImode;
 }

+/* Copy the address to a Pmode register.  This is used for x32 to
+   truncate DImode TLS address to a SImode register. */
+
+static rtx
+ix86_copy_addr_to_reg (rtx addr)
+{
+  if (GET_MODE (addr) == Pmode)
+return copy_addr_to_reg (addr);
+  else
+{
+  gcc_assert (GET_MODE (addr) == DImode && Pmode == SImode);
+  return gen_rtx_SUBREG (SImode, copy_to_mode_reg (DImode, addr), 0);
+}
+}
+
 /* When SRCPTR is non-NULL, output simple loop to move memory
pointer to SRCPTR to DESTPTR via chunks of MODE unrolled UNROLL times,
overall size is COUNT specified in bytes.  When SRCPTR is NULL, output the
@@ -23032,8 +23047,8 @@ ix86_expand_movmem (rtx dst, rtx src, rtx count_exp, rtx
 align_exp,

   if (!count)
 count_exp = copy_to_mode_reg (GET_MODE (count_exp), count_exp);
-  destreg = copy_addr_to_reg (XEXP (dst, 0));
-  srcreg = copy_addr_to_reg (XEXP (src, 0));
+  destreg = ix86_copy_addr_to_reg (XEXP (dst, 0));
+  srcreg = ix86_copy_addr_to_reg (XEXP (src, 0));

   unroll_factor = 1;
   move_mode = word_mode;
@@ -23436,7 +23451,7 @@ ix86_expand_setmem (rtx dst, rtx count_exp, rtx val_exp,
 rtx align_exp,

   if (!count)
 count_exp = copy_to_mode_reg (counter_mode (count_exp), count_exp);
-  destreg = copy_addr_to_reg (XEXP (dst, 0));
+  destreg = ix86_copy_addr_to_reg (XEXP (dst, 0));

   move_mode = word_mode;
   unroll_factor = 1;
diff --git a/gcc/testsuite/gcc.target/i386/pr58690.c b/gcc/testsuite/gcc.target/
i386/pr58690.c
new file mode 100644
index 000..87a87cc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr58690.c
@@ -0,0 +1,14 @@
+/* { dg-do compile { target { ! { ia32 } } } } */
+/* { dg-require-effective-target maybe_x32 } */
+/* { dg-options "-O2 -mx32 -maddress-mode=short" } */
+
+struct gomp_thread
+{
+  char foo[41];
+};
+extern __thread struct gomp_thread gomp_tls_data;
+void
+foo (void)
+{
+  __builtin_memset (&gomp_tls_data, '\0', sizeof (gomp_tls_data));
+}
gnu-6:pts/14[22]> cat /tmp/pr58690.patch /export/gnu/import/git/src
gcc/

2013-10-12  H.J. Lu  

PR target/58690
* config/i386/i386.c (ix86_copy_addr_to_reg): New function.
(ix86_expand_movmem): Replace copy_addr_to_reg with
ix86_copy_addr_to_reg.
(ix86_expand_setmem): Likewise.

gcc/testsuite/

2013-10-12  H.J. Lu  

PR target/58690
* gcc.target/i386/pr58690.c: New test

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 37c1bec..aa15bc5 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -22076,6 +22076,21 @@ counter_mode (rtx count_exp)
   return SImode;
 }

+/

Re: PATCH: PR target/58690: internal compiler error: in copy_to_mode_reg, at explow.c:641

2013-10-12 Thread Uros Bizjak
On Sat, Oct 12, 2013 at 4:57 PM, H.J. Lu  wrote:

>>> In x32, when a TLS address is in DImode and Pmode is SImode,
>>> copy_addr_to_reg will fail.  This patch adds ix86_copy_addr_to_reg
>>> to first copy DImode address into a DImode register and then to generate
>>> SImode SUBREG in this case.  Tested on x32.  OK for trunk?
>>>
>>> 2013-10-11  H.J. Lu  
>>>
>>> PR target/58690
>>> * config/i386/i386.c (ix86_copy_addr_to_reg): New function.
>>> (ix86_expand_movmem): Replace copy_addr_to_reg with
>>> ix86_copy_addr_to_reg.
>>> (ix86_expand_setmem): Likewise.
>>>
>>> gcc/testsuite/
>>>
>>> 2013-10-11  H.J. Lu  
>>>
>>> PR target/58690
>>> * gcc.target/i386/pr58690.c: New test
>
>> Do we also need this patch on 4.8?
>
> Yes, it is also needed for 4.8.  OK for 4.8 after a few days?

OK after the tree opens for check-ins again.

Thanks,
Uros.


Re: [C++ Patch] PR 58466

2013-10-12 Thread Paolo Carlini

On 10/11/2013 09:31 PM, Paolo Carlini wrote:

On 10/11/2013 08:29 PM, Paolo Carlini wrote:
Are we maybe failing to bump processing_template_decl somewhere while 
processing the specialization?

... I'm finishing testing this, already past g++.dg/dg.exp...

In the meanwhile testing completed successfully.

Paolo.


Re: [PATCH PING] Move edge_within_scc to ipa-utils.c

2013-10-12 Thread Jan Hubicka
> Ping.
OK,
thanks!
Honza
> 
> Thanks,
> 
> Martin
> 
> On Wed, Sep 11, 2013 at 03:02:02PM +0200, Martin Jambor wrote:
> > Hi,
> > 
> > edge_within_scc should really be a part of API accompanying
> > ipa_reduced_postorder just like ipa_get_nodes_in_cycle and so this
> > patch moves it to ipa-utils.c and gives it the ipa_ prefix.
> > 
> > Bootstrapped and tested on x86_64-linux.  OK for trunk?
> > 
> > Thanks,
> > 
> > Martin
> > 
> > 
> > 
> > 2013-09-10  Martin Jambor  
> > 
> > * ipa-utils.h (ipa_edge_within_scc): Declare.
> > * ipa-cp.c (edge_within_scc): Moved...
> > * ipa-utils.c (ipa_edge_within_scc): ...here.  Updated all callers.
> > 
> > Index: src/gcc/ipa-utils.c
> > ===
> > --- src.orig/gcc/ipa-utils.c
> > +++ src/gcc/ipa-utils.c
> > @@ -253,6 +253,22 @@ ipa_get_nodes_in_cycle (struct cgraph_no
> >return v;
> >  }
> >  
> > +/* Return true iff the CS is an edge within a strongly connected component 
> > as
> > +   computed by ipa_reduced_postorder.  */
> > +
> > +bool
> > +ipa_edge_within_scc (struct cgraph_edge *cs)
> > +{
> > +  struct ipa_dfs_info *caller_dfs = (struct ipa_dfs_info *) 
> > cs->caller->symbol.aux;
> > +  struct ipa_dfs_info *callee_dfs;
> > +  struct cgraph_node *callee = cgraph_function_node (cs->callee, NULL);
> > +
> > +  callee_dfs = (struct ipa_dfs_info *) callee->symbol.aux;
> > +  return (caller_dfs
> > + && callee_dfs
> > + && caller_dfs->scc_no == callee_dfs->scc_no);
> > +}
> > +
> >  struct postorder_stack
> >  {
> >struct cgraph_node *node;
> > Index: src/gcc/ipa-utils.h
> > ===
> > --- src.orig/gcc/ipa-utils.h
> > +++ src/gcc/ipa-utils.h
> > @@ -42,6 +42,7 @@ int ipa_reduced_postorder (struct cgraph
> >   bool (*ignore_edge) (struct cgraph_edge *));
> >  void ipa_free_postorder_info (void);
> >  vec ipa_get_nodes_in_cycle (struct cgraph_node *);
> > +bool ipa_edge_within_scc (struct cgraph_edge *);
> >  int ipa_reverse_postorder (struct cgraph_node **);
> >  tree get_base_var (tree);
> >  void ipa_merge_profiles (struct cgraph_node *dst,
> > Index: src/gcc/ipa-cp.c
> > ===
> > --- src.orig/gcc/ipa-cp.c
> > +++ src/gcc/ipa-cp.c
> > @@ -287,22 +287,6 @@ ipa_lat_is_single_const (struct ipcp_lat
> >  return true;
> >  }
> >  
> > -/* Return true iff the CS is an edge within a strongly connected component 
> > as
> > -   computed by ipa_reduced_postorder.  */
> > -
> > -static inline bool
> > -edge_within_scc (struct cgraph_edge *cs)
> > -{
> > -  struct ipa_dfs_info *caller_dfs = (struct ipa_dfs_info *) 
> > cs->caller->symbol.aux;
> > -  struct ipa_dfs_info *callee_dfs;
> > -  struct cgraph_node *callee = cgraph_function_node (cs->callee, NULL);
> > -
> > -  callee_dfs = (struct ipa_dfs_info *) callee->symbol.aux;
> > -  return (caller_dfs
> > - && callee_dfs
> > - && caller_dfs->scc_no == callee_dfs->scc_no);
> > -}
> > -
> >  /* Print V which is extracted from a value in a lattice to F.  */
> >  
> >  static void
> > @@ -957,7 +941,7 @@ add_value_to_lattice (struct ipcp_lattic
> >for (val = lat->values; val; val = val->next)
> >  if (values_equal_for_ipcp_p (val->value, newval))
> >{
> > -   if (edge_within_scc (cs))
> > +   if (ipa_edge_within_scc (cs))
> >   {
> > struct ipcp_value_source *s;
> > for (s = val->sources; s ; s = s->next)
> > @@ -1030,7 +1014,7 @@ propagate_vals_accross_pass_through (str
> >   are arithmetic functions with circular dependencies, there is infinite
> >   number of them and we would just make lattices bottom.  */
> >if ((ipa_get_jf_pass_through_operation (jfunc) != NOP_EXPR)
> > -  and edge_within_scc (cs))
> > +  && ipa_edge_within_scc (cs))
> >  ret = set_lattice_contains_variable (dest_lat);
> >else
> >  for (src_val = src_lat->values; src_val; src_val = src_val->next)
> > @@ -1061,7 +1045,7 @@ propagate_vals_accross_ancestor (struct
> >struct ipcp_value *src_val;
> >bool ret = false;
> >  
> > -  if (edge_within_scc (cs))
> > +  if (ipa_edge_within_scc (cs))
> >  return set_lattice_contains_variable (dest_lat);
> >  
> >for (src_val = src_lat->values; src_val; src_val = src_val->next)
> > @@ -2129,7 +2113,7 @@ propagate_constants_topo (struct topo_in
> >   struct cgraph_edge *cs;
> >  
> >   for (cs = v->callees; cs; cs = cs->next_callee)
> > -   if (edge_within_scc (cs)
> > +   if (ipa_edge_within_scc (cs)
> > && propagate_constants_accross_call (cs))
> >   push_node_to_stack (topo, cs->callee);
> >   v = pop_node_from_stack (topo);
> > @@ -2146,7 +2130,7 @@ propagate_constants_topo (struct topo_in
> > estimate_local_effects (v);
> > add_all_node_vals_to_toposort (v);
> > for (cs = v->callees; cs; cs = cs->next_callee)
> > - if (!

[C++ Patch] PR 58700

2013-10-12 Thread Paolo Carlini

Hi,

this ICE on invalid, a 4.8/4.9 Regression, simply started when 
build_lang_decl_loc was introduced, which wants a location as first 
argument: when the declarator is null, we can't pass the location as 
declarator->id_loc. For now input_location can do, I think, restores the 
old behavior.


Tested x86_64-linux.

Thanks,
Paolo.

/


/cp
2013-10-14  Paolo Carlini  

PR c++/58700
* decl.c (grokdeclarator): Don't try to pass declarator->id_loc
to build_lang_decl_loc when declarator is null.

/testsuite
2013-10-14  Paolo Carlini  

PR c++/58700
* g++.dg/parse/bitfield4.C: New.
Index: cp/decl.c
===
--- cp/decl.c   (revision 203495)
+++ cp/decl.c   (working copy)
@@ -10638,7 +10638,9 @@ grokdeclarator (const cp_declarator *declarator,
  {
/* C++ allows static class members.  All other work
   for this is done by grokfield.  */
-   decl = build_lang_decl_loc (declarator->id_loc,
+   decl = build_lang_decl_loc (declarator
+   ? declarator->id_loc
+   : input_location,
VAR_DECL, unqualified_id, type);
set_linkage_for_static_data_member (decl);
/* Even if there is an in-class initialization, DECL
Index: testsuite/g++.dg/parse/bitfield4.C
===
--- testsuite/g++.dg/parse/bitfield4.C  (revision 0)
+++ testsuite/g++.dg/parse/bitfield4.C  (working copy)
@@ -0,0 +1,6 @@
+// PR c++/58700
+
+struct A
+{
+  static int : 4;  // { dg-error "bit-field" }
+};


Go patch committed: Fix import of struct with embedded builtin type

2013-10-12 Thread Ian Lance Taylor
This patch to the Go frontend fixes the handling of an imported struct
with an embedded builtin type.  We used to permit the importing package
to refer to the field, which is wrong because builtin types are not
exported.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.  Will commit to 4.8
branch when it opens.

Ian

diff -r f6abee26f176 go/import.h
--- a/go/import.h	Fri Oct 11 15:51:49 2013 -0700
+++ b/go/import.h	Sat Oct 12 20:57:20 2013 -0700
@@ -149,6 +149,11 @@
   location() const
   { return this->location_; }
 
+  // Return the package we are importing.
+  Package*
+  package() const
+  { return this->package_; }
+
   // Return the next character.
   int
   peek_char()
diff -r f6abee26f176 go/types.cc
--- a/go/types.cc	Fri Oct 11 15:51:49 2013 -0700
+++ b/go/types.cc	Sat Oct 12 20:57:20 2013 -0700
@@ -5258,6 +5258,17 @@
 	}
 	  Type* ftype = imp->read_type();
 
+	  // We don't pack the names of builtin types.  In
+	  // Struct_field::is_field_name we cope with a hack.  Now we
+	  // need another hack so that we don't accidentally think
+	  // that an embedded builtin type is accessible from another
+	  // package (we know that all the builtin types are not
+	  // exported).
+	  if (ftype->named_type() != NULL
+	  && ftype->named_type()->is_builtin())
+	name = ('.' + imp->package()->pkgpath() + '.'
+		+ ftype->named_type()->name());
+
 	  Struct_field sf(Typed_identifier(name, ftype, imp->location()));
 
 	  if (imp->peek_char() == ' ')


RE: [PATCH, PR 57748] Check for out of bounds access, Part 2

2013-10-12 Thread Bernd Edlinger
Hi,

>> That would definitely be a good move. Maybe someone should approve it?
>
> Yes, but I agree that we ought to restrict it to trailing zero-sized arrays.
> That would help to allay Jakub's concerns about possible ABI fallouts.
>
> --
> Eric Botcazou

Other uses of zero-sized arrays in structures and unions are
at least questionable, and should be rejected. 

While this construct produces an error:

struct s
{
   int a[];
   float b;
};

error: flexible array member not at end of struct

This does not even produce a waning:

struct s
{
   int a[0];
   float b;
};

Would you agree that this "error: flexible array member"
should also be emitted for a zero-sized array member,
maybe as "error: zero-sized array member not at end of struct"?


Regards
Bernd.