Re: [PATCH][i386]Fix PR 57756
On Sat, Oct 12, 2013 at 2:16 AM, Sriraman Tallam wrote: > Ping. This looks nice. I suppose you grepped for effected targets and rs6000 was the only one besides i386. This is ok given target maintainers do not object within 24h and the test successfully bootstrapped and passed regression tests. >>> >>> I changed this patch quite a bit after I realized I missed many other >>> places where global_options field was accessed. The two specific >>> changes are: >>> >>> * Particularly, ix86_function_specific_save and >>> ix86_function_specific_restore had to be changed to copy to a specific >>> gcc_options structure than to global_options. >>> * Function ix86_option_override_internal accesses global_options via >>> macros, like TARGET_64BIT etc. This had to be changed. So, I created >>> new macros to accept a parameter and named them like TARGET_64BIT_P >>> and replaced all the accesses in i386.c >>> >>> I also marked as TODO a copy in function ix86_function_specific_save. >>> Here, the cl_target_option structure has a ix86_target_flags_explicit >>> field whereas the global_options structure does not have one. >>> Previously, this used to get saved to the global_options target_flags >>> field but this did not make sense to me. I have commented this line >>> out. Please let me know if a new field needs to be added. >>> >>> This patch passes bootstrap and all tests. Please take another look. AFAICS, the patch was approved by Richard on 23. September. None of the target maintainers have had any further objections to it. Regarding the TODO in i386.c: some options depend on and enable other options, for example -msse3 enables SSE2, etc, although user didn't explicitly add -msse2 to the options. This is not the case with global options. Since this field is used extensively through i386.c, I'd say to play safe and save it. So, x86 part is OK with the above change. Middle end is already approved and rs6000 maintainer didn't object the approval, so please go ahead and commit the patch. Thanks, Uros.
Re: PATCH: PR target/58690: internal compiler error: in copy_to_mode_reg, at explow.c:641
On Fri, Oct 11, 2013 at 8:38 PM, H.J. Lu wrote: > In x32, when a TLS address is in DImode and Pmode is SImode, > copy_addr_to_reg will fail. This patch adds ix86_copy_addr_to_reg > to first copy DImode address into a DImode register and then to generate > SImode SUBREG in this case. Tested on x32. OK for trunk? > > 2013-10-11 H.J. Lu > > PR target/58690 > * config/i386/i386.c (ix86_copy_addr_to_reg): New function. > (ix86_expand_movmem): Replace copy_addr_to_reg with > ix86_copy_addr_to_reg. > (ix86_expand_setmem): Likewise. > > gcc/testsuite/ > > 2013-10-11 H.J. Lu > > PR target/58690 > * gcc.target/i386/pr58690.c: New test > > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c > index 37c1bec..e39d63db 100644 > --- a/gcc/config/i386/i386.c > +++ b/gcc/config/i386/i386.c > @@ -22076,6 +22076,21 @@ counter_mode (rtx count_exp) >return SImode; > } > > +/* Copy the address to a register in Pmode. Support x32 TLS address in > + DImode and Pmode in SImode. */ Copy the address to a Pmode register. This is used for x32 to truncate DImode TLS address to a SImode register. > +static rtx > +ix86_copy_addr_to_reg (rtx addr) > +{ > + if (GET_MODE (addr) != Pmode) > +{ > + gcc_assert (GET_MODE (addr) == DImode && Pmode == SImode); > + return gen_rtx_SUBREG (SImode, copy_to_mode_reg (DImode, addr), 0); > +} > + else > +return copy_addr_to_reg (addr); > +} No negative conditions please. Just switch arms of the if clause. OK with these changes. Do we also need this patch on 4.8? Thanks, Uros.
Re: [PATCH] Reassociate X == CST1 || X == CST2 if popcount (CST2 - CST1) == 1 into ((X - CST1) & ~(CST2 - CST1)) == 0
On Sat, Oct 12, 2013 at 10:08:12AM +0800, Zhenqiang Chen wrote: > As you had mentioned, the transition in this patch does not reduce > instructions. But the preexisting optimization does. So we prefer to do the > preexisting optimization first. E.g. > > X == 10 || X == 12 || X == 26 Ok, that makes sense. @@ -2279,6 +2275,71 @@ optimize_range_tests (enum tree_code opcode, } } + /* Optimize X == CST1 || X == CST2 + if popcount (CST2 - CST1) == 1 into + ((X - CST1) & ~(CST2 - CST1)) == 0. */ Mention here also similarly to the other comment that it works also with ranges. + if (BRANCH_COST (optimize_function_for_speed_p (cfun), false) >= 2) +for (i = first; i < length; i++) + { + tree lowi, highi, lowj, highj, type, tem1, tem2, mask; + + if (ranges[i].exp == NULL_TREE || ranges[i].in_p) + continue; + type = TREE_TYPE (ranges[i].exp); + if (!INTEGRAL_TYPE_P (type)) + continue; + lowi = ranges[i].low; + if (lowi == NULL_TREE) + continue; The other loop has here: if (lowi == NULL_TREE) lowi = TYPE_MIN_VALUE (type); instead, which is better, can you please change it? + highi = ranges[i].high; + if (highi == NULL_TREE) + continue; + for (j = i + 1; j < length && j < i + 64; j++) + { + lowj = ranges[j].low; + if (lowj == NULL_TREE) + continue; + highj = ranges[j].high; + if (highj == NULL_TREE) + continue; The other loop has if (highj == NULL_TREE) highj = TYPE_MAX_VALUE (type); here instead. + if (ranges[j].exp == NULL_TREE || ranges[j].in_p + || (ranges[i].exp != ranges[j].exp)) + continue; Can you please move this test the lowj = assignment, and remove the ranges[j].exp == NULL_TREE test (both here and in the earlier loop)? I mean, we've checked at the beginning of for (i = first; i < length; i++) loop that ranges[i].exp is not NULL_TREE, and if ranges[j].exp is NULL_TREE, it will be different than ranges[i].exp. So if (ranges[i].exp != ranges[j].exp || ranges[j].in_p) continue; (and please also collapse the two checks in the first loop, so that both look similar). + /* Check lowj > highi. */ + tem1 = fold_binary (GT_EXPR, boolean_type_node, + lowj, highi); + if (tem1 == NULL_TREE || !integer_onep (tem1)) + continue; + /* Check highi - lowi == highj - lowj. */ + tem1 = fold_binary (MINUS_EXPR, type, highi, lowi); + if (tem1 == NULL_TREE || TREE_CODE (tem1) != INTEGER_CST) + continue; + tem2 = fold_binary (MINUS_EXPR, type, highj, lowj); + if (tem2 == NULL_TREE || TREE_CODE (tem2) != INTEGER_CST) + continue; + if (!tree_int_cst_equal (tem1, tem2)) + continue; + /* Check popcount (lowj - lowi) == 1. */ + tem1 = fold_binary (MINUS_EXPR, type, lowj, lowi); + if (tem1 == NULL_TREE || TREE_CODE (tem1) != INTEGER_CST) + continue; + if (tree_log2 (tem1) < 0) + continue; + mask = fold_build1 (BIT_NOT_EXPR, type, tem1); + tem1 = fold_binary (MINUS_EXPR, type, ranges[i].exp, lowi); + tem1 = fold_build2 (BIT_AND_EXPR, type, tem1, mask); + lowi = build_int_cst (type, 0); Please use lowj instead of lowi here. Because, if update_range_test fails, we continue looking for further matches in following ranges, and while lowj will be computed again, lowi will be assumed to contain the low bound of the first range. + if (update_range_test (ranges + i, ranges + j, 1, opcode, ops, tem1, + ranges[i].in_p, lowi, tem2, And here too. Or, alternatively to avoid duplication, you could put the whole for (i = first; i < length; i++) loop from the first optimization into another int i; for (pass = 0; pass < (BRANCH_COST (optimize_function_for_speed_p (cfun), false) >= 2) + 1; pass++) { } loop, use whatever is common in the loop unconditionally, and just conditionalize the rest on pass == 0 vs. pass == 1. Or maybe even better just move this whole loop into a new function, with ranges, opcode, first, length arguments plus another (bool?) argument which of the two optimizations you want to perform, and return the bool any_changes. Then you wouldn't run into line length issues that you could run into with the extra loop. --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-33.c @@ -0,0 +1,38 @@ +/* { dg-do run { target { ! "m68k*-*-* mmix*-*-* mep*-*-* bfin*-*-* v850*-*-* picochip*-*-* moxie*-*-* cris*-*-* m32c*-*-* fr30*-*-* mcore*-*-* powerpc*-*-* xtensa*-*-*"} } } */ + +/* { dg-options "-O2 -fno-inline -fdump-tree-reassoc1-details" } */ +/* { dg-additional-options "-mbranch-cost=2" { target a
Re: Apply attribute returns_nonnull in libiberty
On Fri, 11 Oct 2013, DJ Delorie wrote: Alternatively, you could ask the other projects if they're ok with changing the xmalloc rule... Is there an official list where all the users of libiberty can be contacted? -- Marc Glisse
[committed] OpenMP 4.0 affinity fixes and improvements (PR libgomp/58691)
Hi! I've committed these changes to trunk. The first two hunks hopefully fixes warnings (turned into -Werror) with older glibc headers, the rest are changes requested in PR58691 - OMP_PROC_BIND default value is implementation defined, and Tobias suggested a better implementation defined default - if OMP_PLACES or GOMP_CPU_AFFINITY exist in environment and are successfully parsed, then the default is omp_proc_bind_true, otherwise omp_proc_bind_false. Explicit OMP_PROC_BIND=false makes those two env vars just parsed for basic syntax errors, but doesn't actually build places lists from them. 2013-10-12 Jakub Jelinek PR libgomp/58691 * config/linux/proc.c (gomp_cpuset_popcount): Add unused attribute to check variable. (gomp_init_num_threads): Move i variable declaration into #ifdef CPU_ALLOC_SIZE block. * config/linux/affinity.c (gomp_affinity_init_level): Test gomp_places_list_len == 0 rather than gomp_places_list == 0 when checking for topology reading error. * team.c (gomp_team_start): Don't handle bind == omp_proc_bind_false. * env.c (parse_affinity): Add ignore argument, if true, don't populate gomp_places_list, only parse env var and always return false. (parse_places_var): Likewise. Don't check gomp_global_icv.bind_var. (initialize_env): Always parse OMP_PLACES and GOMP_CPU_AFFINITY env vars, default to OMP_PROC_BIND=true if OMP_PROC_BIND wasn't specified and either of these variables were parsed correctly into a places list. --- libgomp/config/linux/proc.c.jj 2013-10-11 11:23:59.0 +0200 +++ libgomp/config/linux/proc.c 2013-10-11 23:17:18.533108120 +0200 @@ -59,7 +59,7 @@ gomp_cpuset_popcount (unsigned long cpus size_t i; unsigned long ret = 0; extern int check[sizeof (cpusetp->__bits[0]) == sizeof (unsigned long int) - ? 1 : -1]; + ? 1 : -1] __attribute__((unused)); for (i = 0; i < cpusetsize / sizeof (cpusetp->__bits[0]); i++) { @@ -94,7 +94,6 @@ gomp_init_num_threads (void) gomp_cpusetp); if (ret == 0) { - unsigned long i; /* Count only the CPUs this process can use. */ gomp_global_icv.nthreads_var = gomp_cpuset_popcount (gomp_cpuset_size, gomp_cpusetp); @@ -102,6 +101,7 @@ gomp_init_num_threads (void) break; gomp_get_cpuset_size = gomp_cpuset_size; #ifdef CPU_ALLOC_SIZE + unsigned long i; for (i = gomp_cpuset_size * 8; i; i--) if (CPU_ISSET_S (i - 1, gomp_cpuset_size, gomp_cpusetp)) break; --- libgomp/config/linux/affinity.c.jj 2013-10-11 11:23:59.0 +0200 +++ libgomp/config/linux/affinity.c 2013-10-11 23:35:05.566620759 +0200 @@ -309,7 +309,7 @@ gomp_affinity_init_level (int level, uns fclose (f); } } - if (gomp_places_list == 0) + if (gomp_places_list_len == 0) { if (!quiet) gomp_error ("Error reading %s topology", --- libgomp/team.c.jj 2013-10-11 11:23:59.0 +0200 +++ libgomp/team.c 2013-10-11 23:27:52.928843235 +0200 @@ -339,8 +339,6 @@ gomp_team_start (void (*fn) (void *), vo if (__builtin_expect (gomp_places_list != NULL, 0)) { - if (bind == omp_proc_bind_false) - bind = omp_proc_bind_true; /* Depending on chosen proc_bind model, set subpartition for the master thread and initialize helper variables P and optionally S, K and/or REST used by later place @@ -348,9 +346,6 @@ gomp_team_start (void (*fn) (void *), vo p = thr->place - 1; switch (bind) { - case omp_proc_bind_false: - bind = omp_proc_bind_true; - /* FALLTHRU */ case omp_proc_bind_true: case omp_proc_bind_close: if (nthreads > thr->ts.place_partition_len) --- libgomp/env.c.jj2013-10-11 11:23:59.0 +0200 +++ libgomp/env.c 2013-10-12 00:10:02.670107433 +0200 @@ -548,7 +548,7 @@ parse_one_place (char **envp, bool *nega } static bool -parse_places_var (const char *name) +parse_places_var (const char *name, bool ignore) { char *env = getenv (name), *end; bool any_negate = false; @@ -604,6 +604,10 @@ parse_places_var (const char *name) if (*env != '\0') goto invalid; } + + if (ignore) + return false; + return gomp_affinity_init_level (level, count, false); } @@ -634,7 +638,7 @@ parse_places_var (const char *name) } while (1); - if (gomp_global_icv.bind_var == omp_proc_bind_false) + if (ignore) return false; gomp_places_list_len = 0; @@ -911,7 +915,7 @@ parse_wait_policy (void) present and it was successfully parsed. */ static bool -parse_affinity (void) +parse_affinity (bool ignore) { char *env, *end, *start; int pass; @@ -928,6 +932,9 @@ parse_aff
Re: [PATCH i386] Use ordered comparisons for rounding builtins when -mno-ieee-fp
Hello! > 2013-10-11 Alexander Monakov > > * config/i386/i386.c (ix86_expand_sse_compare_and_jump): Use mode > provided by ix86_fp_compare_mode instead of CCFPUmode. > > testsuite/: > * gcc.target/i386/builtin-ucmp.c: New test. OK for mainline. Thanks, Uros.
Re: [AArch64] Fix early-clobber operands to vtbx[1,3]
On Fri, Oct 11, 2013 at 07:52:48PM +0100, Marcus Shawcroft wrote: > > 2013-10-11 James Greenhalgh > > > > * config/aarch64/arm_neon.h > > (vtbx<1,3>_8): Fix register constriants. > > > > OK? > > OK, and back port to 4.8 please. > /Marcus > Hi Marcus, I've committed this as revision 203478, but 4.8 is currently frozen for release, so Jakub (+CC) will have to approve it. This patch is small, not very controversial and only affects the AArch64 tree. Otherwise, I'll backport this when 4.8 opens again. Thanks, James >From ba67f60eb238b71c55cc4363f5061b6e6810990a Mon Sep 17 00:00:00 2001 From: James Greenhalgh Date: Fri, 13 Sep 2013 17:18:23 +0100 Subject: [AArch64] Fix early-clobber operands to vtbx[1,3] MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="1.8.3-rc0" This is a multi-part message in MIME format. --1.8.3-rc0 Content-Type: text/plain; charset=UTF-8; format=fixed Content-Transfer-Encoding: 8bit Hi, The vtbx intrinsics are implemented in assembly without noting that their tmp1 operand is early-clobber. This can, when the wind blows the wrong way, result in us making a total mess of the state of registers. Fix by marking the required operands as early-clobber. Regression tested against aarch64.exp with no problems. OK? Thanks, James --- 2013-10-11 James Greenhalgh * config/aarch64/arm_neon.h (vtbx<1,3>_8): Fix register constriants. --1.8.3-rc0 Content-Type: text/x-patch; name="0001-AArch64-Fix-early-clobber-operands-to-vtbx-1-3.patch" Content-Transfer-Encoding: 8bit Content-Disposition: attachment; filename="0001-AArch64-Fix-early-clobber-operands-to-vtbx-1-3.patch" diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h index 482d7d0..f7c9db6 100644 --- a/gcc/config/aarch64/arm_neon.h +++ b/gcc/config/aarch64/arm_neon.h @@ -15636,7 +15636,7 @@ vtbx1_s8 (int8x8_t r, int8x8_t tab, int8x8_t idx) "cmhs %0.8b, %3.8b, %0.8b\n\t" "tbl %1.8b, {%2.16b}, %3.8b\n\t" "bsl %0.8b, %4.8b, %1.8b\n\t" - : "+w"(result), "=w"(tmp1) + : "+w"(result), "=&w"(tmp1) : "w"(temp), "w"(idx), "w"(r) : /* No clobbers */); return result; @@ -15652,7 +15652,7 @@ vtbx1_u8 (uint8x8_t r, uint8x8_t tab, uint8x8_t idx) "cmhs %0.8b, %3.8b, %0.8b\n\t" "tbl %1.8b, {%2.16b}, %3.8b\n\t" "bsl %0.8b, %4.8b, %1.8b\n\t" - : "+w"(result), "=w"(tmp1) + : "+w"(result), "=&w"(tmp1) : "w"(temp), "w"(idx), "w"(r) : /* No clobbers */); return result; @@ -15668,7 +15668,7 @@ vtbx1_p8 (poly8x8_t r, poly8x8_t tab, uint8x8_t idx) "cmhs %0.8b, %3.8b, %0.8b\n\t" "tbl %1.8b, {%2.16b}, %3.8b\n\t" "bsl %0.8b, %4.8b, %1.8b\n\t" - : "+w"(result), "=w"(tmp1) + : "+w"(result), "=&w"(tmp1) : "w"(temp), "w"(idx), "w"(r) : /* No clobbers */); return result; @@ -15723,7 +15723,7 @@ vtbx3_s8 (int8x8_t r, int8x8x3_t tab, int8x8_t idx) "cmhs %0.8b, %3.8b, %0.8b\n\t" "tbl %1.8b, {v16.16b - v17.16b}, %3.8b\n\t" "bsl %0.8b, %4.8b, %1.8b\n\t" - : "+w"(result), "=w"(tmp1) + : "+w"(result), "=&w"(tmp1) : "Q"(temp), "w"(idx), "w"(r) : "v16", "v17", "memory"); return result; @@ -15742,7 +15742,7 @@ vtbx3_u8 (uint8x8_t r, uint8x8x3_t tab, uint8x8_t idx) "cmhs %0.8b, %3.8b, %0.8b\n\t" "tbl %1.8b, {v16.16b - v17.16b}, %3.8b\n\t" "bsl %0.8b, %4.8b, %1.8b\n\t" - : "+w"(result), "=w"(tmp1) + : "+w"(result), "=&w"(tmp1) : "Q"(temp), "w"(idx), "w"(r) : "v16", "v17", "memory"); return result; @@ -15761,7 +15761,7 @@ vtbx3_p8 (poly8x8_t r, poly8x8x3_t tab, uint8x8_t idx) "cmhs %0.8b, %3.8b, %0.8b\n\t" "tbl %1.8b, {v16.16b - v17.16b}, %3.8b\n\t" "bsl %0.8b, %4.8b, %1.8b\n\t" - : "+w"(result), "=w"(tmp1) + : "+w"(result), "=&w"(tmp1) : "Q"(temp), "w"(idx), "w"(r) : "v16", "v17", "memory"); return result; --1.8.3-rc0--
Re: [AArch64] Fix early-clobber operands to vtbx[1,3]
On Sat, Oct 12, 2013 at 08:57:51AM +0100, James Greenhalgh wrote: > I've committed this as revision 203478, but 4.8 is currently > frozen for release, so Jakub (+CC) will have to approve it. This is ok for 4.8.2. Jakub
Re: PATCH: PR target/58690: internal compiler error: in copy_to_mode_reg, at explow.c:641
On Sat, Oct 12, 2013 at 12:21 AM, Uros Bizjak wrote: > On Fri, Oct 11, 2013 at 8:38 PM, H.J. Lu wrote: > >> In x32, when a TLS address is in DImode and Pmode is SImode, >> copy_addr_to_reg will fail. This patch adds ix86_copy_addr_to_reg >> to first copy DImode address into a DImode register and then to generate >> SImode SUBREG in this case. Tested on x32. OK for trunk? >> >> 2013-10-11 H.J. Lu >> >> PR target/58690 >> * config/i386/i386.c (ix86_copy_addr_to_reg): New function. >> (ix86_expand_movmem): Replace copy_addr_to_reg with >> ix86_copy_addr_to_reg. >> (ix86_expand_setmem): Likewise. >> >> gcc/testsuite/ >> >> 2013-10-11 H.J. Lu >> >> PR target/58690 >> * gcc.target/i386/pr58690.c: New test >> >> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c >> index 37c1bec..e39d63db 100644 >> --- a/gcc/config/i386/i386.c >> +++ b/gcc/config/i386/i386.c >> @@ -22076,6 +22076,21 @@ counter_mode (rtx count_exp) >>return SImode; >> } >> >> +/* Copy the address to a register in Pmode. Support x32 TLS address in >> + DImode and Pmode in SImode. */ > > Copy the address to a Pmode register. This is used for x32 to truncate > DImode TLS address to a SImode register. > >> +static rtx >> +ix86_copy_addr_to_reg (rtx addr) >> +{ >> + if (GET_MODE (addr) != Pmode) >> +{ >> + gcc_assert (GET_MODE (addr) == DImode && Pmode == SImode); >> + return gen_rtx_SUBREG (SImode, copy_to_mode_reg (DImode, addr), 0); >> +} >> + else >> +return copy_addr_to_reg (addr); >> +} > > No negative conditions please. Just switch arms of the if clause. > > OK with these changes. This is what I checked in. > Do we also need this patch on 4.8? Yes, it is also needed for 4.8. OK for 4.8 after a few days? > Thanks, > Uros. Thanks. -- H.J. -- gcc/ 2013-10-12 H.J. Lu PR target/58690 * config/i386/i386.c (ix86_copy_addr_to_reg): New function. (ix86_expand_movmem): Replace copy_addr_to_reg with ix86_copy_addr_to_reg. (ix86_expand_setmem): Likewise. gcc/testsuite/ 2013-10-12 H.J. Lu PR target/58690 * gcc.target/i386/pr58690.c: New test diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 37c1bec..aa15bc5 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -22076,6 +22076,21 @@ counter_mode (rtx count_exp) return SImode; } +/* Copy the address to a Pmode register. This is used for x32 to + truncate DImode TLS address to a SImode register. */ + +static rtx +ix86_copy_addr_to_reg (rtx addr) +{ + if (GET_MODE (addr) == Pmode) +return copy_addr_to_reg (addr); + else +{ + gcc_assert (GET_MODE (addr) == DImode && Pmode == SImode); + return gen_rtx_SUBREG (SImode, copy_to_mode_reg (DImode, addr), 0); +} +} + /* When SRCPTR is non-NULL, output simple loop to move memory pointer to SRCPTR to DESTPTR via chunks of MODE unrolled UNROLL times, overall size is COUNT specified in bytes. When SRCPTR is NULL, output the @@ -23032,8 +23047,8 @@ ix86_expand_movmem (rtx dst, rtx src, rtx count_exp, rtx align_exp, if (!count) count_exp = copy_to_mode_reg (GET_MODE (count_exp), count_exp); - destreg = copy_addr_to_reg (XEXP (dst, 0)); - srcreg = copy_addr_to_reg (XEXP (src, 0)); + destreg = ix86_copy_addr_to_reg (XEXP (dst, 0)); + srcreg = ix86_copy_addr_to_reg (XEXP (src, 0)); unroll_factor = 1; move_mode = word_mode; @@ -23436,7 +23451,7 @@ ix86_expand_setmem (rtx dst, rtx count_exp, rtx val_exp, rtx align_exp, if (!count) count_exp = copy_to_mode_reg (counter_mode (count_exp), count_exp); - destreg = copy_addr_to_reg (XEXP (dst, 0)); + destreg = ix86_copy_addr_to_reg (XEXP (dst, 0)); move_mode = word_mode; unroll_factor = 1; diff --git a/gcc/testsuite/gcc.target/i386/pr58690.c b/gcc/testsuite/gcc.target/ i386/pr58690.c new file mode 100644 index 000..87a87cc --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr58690.c @@ -0,0 +1,14 @@ +/* { dg-do compile { target { ! { ia32 } } } } */ +/* { dg-require-effective-target maybe_x32 } */ +/* { dg-options "-O2 -mx32 -maddress-mode=short" } */ + +struct gomp_thread +{ + char foo[41]; +}; +extern __thread struct gomp_thread gomp_tls_data; +void +foo (void) +{ + __builtin_memset (&gomp_tls_data, '\0', sizeof (gomp_tls_data)); +} gnu-6:pts/14[22]> cat /tmp/pr58690.patch /export/gnu/import/git/src gcc/ 2013-10-12 H.J. Lu PR target/58690 * config/i386/i386.c (ix86_copy_addr_to_reg): New function. (ix86_expand_movmem): Replace copy_addr_to_reg with ix86_copy_addr_to_reg. (ix86_expand_setmem): Likewise. gcc/testsuite/ 2013-10-12 H.J. Lu PR target/58690 * gcc.target/i386/pr58690.c: New test diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 37c1bec..aa15bc5 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -22076,6 +22076,21 @@ counter_mode (rtx count_exp) return SImode; } +/
Re: PATCH: PR target/58690: internal compiler error: in copy_to_mode_reg, at explow.c:641
On Sat, Oct 12, 2013 at 4:57 PM, H.J. Lu wrote: >>> In x32, when a TLS address is in DImode and Pmode is SImode, >>> copy_addr_to_reg will fail. This patch adds ix86_copy_addr_to_reg >>> to first copy DImode address into a DImode register and then to generate >>> SImode SUBREG in this case. Tested on x32. OK for trunk? >>> >>> 2013-10-11 H.J. Lu >>> >>> PR target/58690 >>> * config/i386/i386.c (ix86_copy_addr_to_reg): New function. >>> (ix86_expand_movmem): Replace copy_addr_to_reg with >>> ix86_copy_addr_to_reg. >>> (ix86_expand_setmem): Likewise. >>> >>> gcc/testsuite/ >>> >>> 2013-10-11 H.J. Lu >>> >>> PR target/58690 >>> * gcc.target/i386/pr58690.c: New test > >> Do we also need this patch on 4.8? > > Yes, it is also needed for 4.8. OK for 4.8 after a few days? OK after the tree opens for check-ins again. Thanks, Uros.
Re: [C++ Patch] PR 58466
On 10/11/2013 09:31 PM, Paolo Carlini wrote: On 10/11/2013 08:29 PM, Paolo Carlini wrote: Are we maybe failing to bump processing_template_decl somewhere while processing the specialization? ... I'm finishing testing this, already past g++.dg/dg.exp... In the meanwhile testing completed successfully. Paolo.
Re: [PATCH PING] Move edge_within_scc to ipa-utils.c
> Ping. OK, thanks! Honza > > Thanks, > > Martin > > On Wed, Sep 11, 2013 at 03:02:02PM +0200, Martin Jambor wrote: > > Hi, > > > > edge_within_scc should really be a part of API accompanying > > ipa_reduced_postorder just like ipa_get_nodes_in_cycle and so this > > patch moves it to ipa-utils.c and gives it the ipa_ prefix. > > > > Bootstrapped and tested on x86_64-linux. OK for trunk? > > > > Thanks, > > > > Martin > > > > > > > > 2013-09-10 Martin Jambor > > > > * ipa-utils.h (ipa_edge_within_scc): Declare. > > * ipa-cp.c (edge_within_scc): Moved... > > * ipa-utils.c (ipa_edge_within_scc): ...here. Updated all callers. > > > > Index: src/gcc/ipa-utils.c > > === > > --- src.orig/gcc/ipa-utils.c > > +++ src/gcc/ipa-utils.c > > @@ -253,6 +253,22 @@ ipa_get_nodes_in_cycle (struct cgraph_no > >return v; > > } > > > > +/* Return true iff the CS is an edge within a strongly connected component > > as > > + computed by ipa_reduced_postorder. */ > > + > > +bool > > +ipa_edge_within_scc (struct cgraph_edge *cs) > > +{ > > + struct ipa_dfs_info *caller_dfs = (struct ipa_dfs_info *) > > cs->caller->symbol.aux; > > + struct ipa_dfs_info *callee_dfs; > > + struct cgraph_node *callee = cgraph_function_node (cs->callee, NULL); > > + > > + callee_dfs = (struct ipa_dfs_info *) callee->symbol.aux; > > + return (caller_dfs > > + && callee_dfs > > + && caller_dfs->scc_no == callee_dfs->scc_no); > > +} > > + > > struct postorder_stack > > { > >struct cgraph_node *node; > > Index: src/gcc/ipa-utils.h > > === > > --- src.orig/gcc/ipa-utils.h > > +++ src/gcc/ipa-utils.h > > @@ -42,6 +42,7 @@ int ipa_reduced_postorder (struct cgraph > > bool (*ignore_edge) (struct cgraph_edge *)); > > void ipa_free_postorder_info (void); > > vec ipa_get_nodes_in_cycle (struct cgraph_node *); > > +bool ipa_edge_within_scc (struct cgraph_edge *); > > int ipa_reverse_postorder (struct cgraph_node **); > > tree get_base_var (tree); > > void ipa_merge_profiles (struct cgraph_node *dst, > > Index: src/gcc/ipa-cp.c > > === > > --- src.orig/gcc/ipa-cp.c > > +++ src/gcc/ipa-cp.c > > @@ -287,22 +287,6 @@ ipa_lat_is_single_const (struct ipcp_lat > > return true; > > } > > > > -/* Return true iff the CS is an edge within a strongly connected component > > as > > - computed by ipa_reduced_postorder. */ > > - > > -static inline bool > > -edge_within_scc (struct cgraph_edge *cs) > > -{ > > - struct ipa_dfs_info *caller_dfs = (struct ipa_dfs_info *) > > cs->caller->symbol.aux; > > - struct ipa_dfs_info *callee_dfs; > > - struct cgraph_node *callee = cgraph_function_node (cs->callee, NULL); > > - > > - callee_dfs = (struct ipa_dfs_info *) callee->symbol.aux; > > - return (caller_dfs > > - && callee_dfs > > - && caller_dfs->scc_no == callee_dfs->scc_no); > > -} > > - > > /* Print V which is extracted from a value in a lattice to F. */ > > > > static void > > @@ -957,7 +941,7 @@ add_value_to_lattice (struct ipcp_lattic > >for (val = lat->values; val; val = val->next) > > if (values_equal_for_ipcp_p (val->value, newval)) > >{ > > - if (edge_within_scc (cs)) > > + if (ipa_edge_within_scc (cs)) > > { > > struct ipcp_value_source *s; > > for (s = val->sources; s ; s = s->next) > > @@ -1030,7 +1014,7 @@ propagate_vals_accross_pass_through (str > > are arithmetic functions with circular dependencies, there is infinite > > number of them and we would just make lattices bottom. */ > >if ((ipa_get_jf_pass_through_operation (jfunc) != NOP_EXPR) > > - and edge_within_scc (cs)) > > + && ipa_edge_within_scc (cs)) > > ret = set_lattice_contains_variable (dest_lat); > >else > > for (src_val = src_lat->values; src_val; src_val = src_val->next) > > @@ -1061,7 +1045,7 @@ propagate_vals_accross_ancestor (struct > >struct ipcp_value *src_val; > >bool ret = false; > > > > - if (edge_within_scc (cs)) > > + if (ipa_edge_within_scc (cs)) > > return set_lattice_contains_variable (dest_lat); > > > >for (src_val = src_lat->values; src_val; src_val = src_val->next) > > @@ -2129,7 +2113,7 @@ propagate_constants_topo (struct topo_in > > struct cgraph_edge *cs; > > > > for (cs = v->callees; cs; cs = cs->next_callee) > > - if (edge_within_scc (cs) > > + if (ipa_edge_within_scc (cs) > > && propagate_constants_accross_call (cs)) > > push_node_to_stack (topo, cs->callee); > > v = pop_node_from_stack (topo); > > @@ -2146,7 +2130,7 @@ propagate_constants_topo (struct topo_in > > estimate_local_effects (v); > > add_all_node_vals_to_toposort (v); > > for (cs = v->callees; cs; cs = cs->next_callee) > > - if (!
[C++ Patch] PR 58700
Hi, this ICE on invalid, a 4.8/4.9 Regression, simply started when build_lang_decl_loc was introduced, which wants a location as first argument: when the declarator is null, we can't pass the location as declarator->id_loc. For now input_location can do, I think, restores the old behavior. Tested x86_64-linux. Thanks, Paolo. / /cp 2013-10-14 Paolo Carlini PR c++/58700 * decl.c (grokdeclarator): Don't try to pass declarator->id_loc to build_lang_decl_loc when declarator is null. /testsuite 2013-10-14 Paolo Carlini PR c++/58700 * g++.dg/parse/bitfield4.C: New. Index: cp/decl.c === --- cp/decl.c (revision 203495) +++ cp/decl.c (working copy) @@ -10638,7 +10638,9 @@ grokdeclarator (const cp_declarator *declarator, { /* C++ allows static class members. All other work for this is done by grokfield. */ - decl = build_lang_decl_loc (declarator->id_loc, + decl = build_lang_decl_loc (declarator + ? declarator->id_loc + : input_location, VAR_DECL, unqualified_id, type); set_linkage_for_static_data_member (decl); /* Even if there is an in-class initialization, DECL Index: testsuite/g++.dg/parse/bitfield4.C === --- testsuite/g++.dg/parse/bitfield4.C (revision 0) +++ testsuite/g++.dg/parse/bitfield4.C (working copy) @@ -0,0 +1,6 @@ +// PR c++/58700 + +struct A +{ + static int : 4; // { dg-error "bit-field" } +};
Go patch committed: Fix import of struct with embedded builtin type
This patch to the Go frontend fixes the handling of an imported struct with an embedded builtin type. We used to permit the importing package to refer to the field, which is wrong because builtin types are not exported. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Will commit to 4.8 branch when it opens. Ian diff -r f6abee26f176 go/import.h --- a/go/import.h Fri Oct 11 15:51:49 2013 -0700 +++ b/go/import.h Sat Oct 12 20:57:20 2013 -0700 @@ -149,6 +149,11 @@ location() const { return this->location_; } + // Return the package we are importing. + Package* + package() const + { return this->package_; } + // Return the next character. int peek_char() diff -r f6abee26f176 go/types.cc --- a/go/types.cc Fri Oct 11 15:51:49 2013 -0700 +++ b/go/types.cc Sat Oct 12 20:57:20 2013 -0700 @@ -5258,6 +5258,17 @@ } Type* ftype = imp->read_type(); + // We don't pack the names of builtin types. In + // Struct_field::is_field_name we cope with a hack. Now we + // need another hack so that we don't accidentally think + // that an embedded builtin type is accessible from another + // package (we know that all the builtin types are not + // exported). + if (ftype->named_type() != NULL + && ftype->named_type()->is_builtin()) + name = ('.' + imp->package()->pkgpath() + '.' + + ftype->named_type()->name()); + Struct_field sf(Typed_identifier(name, ftype, imp->location())); if (imp->peek_char() == ' ')
RE: [PATCH, PR 57748] Check for out of bounds access, Part 2
Hi, >> That would definitely be a good move. Maybe someone should approve it? > > Yes, but I agree that we ought to restrict it to trailing zero-sized arrays. > That would help to allay Jakub's concerns about possible ABI fallouts. > > -- > Eric Botcazou Other uses of zero-sized arrays in structures and unions are at least questionable, and should be rejected. While this construct produces an error: struct s { int a[]; float b; }; error: flexible array member not at end of struct This does not even produce a waning: struct s { int a[0]; float b; }; Would you agree that this "error: flexible array member" should also be emitted for a zero-sized array member, maybe as "error: zero-sized array member not at end of struct"? Regards Bernd.