Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Paolo Bonzini
On 10/13/2011 10:07 PM, H.J. Lu wrote: On Thu, Oct 13, 2011 at 11:15 AM, Richard Kenner wrote: The answer to H.J.'s "Why do we do it for MEM then?" is simply "because no one ever thought about not doing it" No, that's false. The same expand_compound_operation / make_compound_operation pair

Re: [Patch,AVR] Print no-return functions as JMP

2011-10-13 Thread Georg-Johann Lay
Richard Henderson wrote: On 10/13/2011 12:00 PM, Georg-Johann Lay wrote: What do you propose? o A command line option that is on per default like -mnoreturn-tail-calls or -mjmp-noreturn The command-line-option. I think I prefer -mjump-noreturn, as the inverse -mno-noreturn-tail-calls is

[PATCH] Add mulv4di3 expander

2011-10-13 Thread Jakub Jelinek
Hi! mulv4di3 can be expanded the same as mulv2di3. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2011-10-14 Jakub Jelinek * config/i386/sse.md (mulv2di3): Macroize using VI8_AVX2 iterator. (ashl<mode>3): Use VI248_AVX2 iterator instead of VI248_128.

[PATCH] 32-byte integer vec_interleave_{high,low}

2011-10-13 Thread Jakub Jelinek
Hi! This patch adds VI_256 vec_interleave_{high,low} as well as using it in the vector expander. While it needs 3 insns for each, the first two will be actually CSEd if both patterns are expanded (the usual case from the vectorizer, e.g. for vect-strided-store-u32-i2.c), so we end up with 2 vunpck

Re: [PATCH 0/6] Cleanups for generic vector permutation.

2011-10-13 Thread David Miller
From: r...@redhat.com Date: Thu, 13 Oct 2011 20:43:19 -0700 > These patches allow __builtin_shuffle to handle any vector permutation > via optabs. It allows for a not-uncommon fallback to byte permutation > at rtl expansion time, while leaving the tree/gimple-level permutation > as element-based.

Re: [google] support for building Linux kernel with FDO (issue4523061)

2011-10-13 Thread Xinliang David Li
This patch is for google/main which is 4.7 based, but the validated version is in google_46 branch (which is based on 4.6). By the way (given that you are from Intel), do you know if the Linux kernel can be built with icc with PGO turned on? Our intern Xiaotian has tried to use icc (12.0) to build ke

[PATCH] Merge sparc plus/minus vector operations using a code iterator.

2011-10-13 Thread David Miller
This is based upon suggestions from David Bremner. Committed to trunk. gcc/ * config/sparc/sparc.md (plusminus): New code iterator. (plusminus_insn): New code attr. (addv2si3, subv2si3, addv4hi3, subv4hi3, addv2hi3, subv2hi3): Merge using plusminus and plusminus_

[PATCH 6/6] Expand vector permutation with vec_perm and vec_perm_const.

2011-10-13 Thread rth
From: Richard Henderson --- gcc/doc/md.texi |6 ++ gcc/genopinit.c |1 + gcc/optabs.c| 216 --- gcc/optabs.h| 12 ++- gcc/tree-vect-generic.c |2 +- 5 files changed, 181 insertions(+), 56 deletions

[PATCH 4/6] Move lowering of vector shifts from v/s to v/v to rtl.

2011-10-13 Thread rth
From: Richard Henderson This allows other rtl expanders to rely on shifts of vector by scalar. This replaces the patch posted a couple of days ago that adds these scalar shifts to the rs6000 backend, following the info that Sparc needs this fallback as well. --- gcc/optabs.c

[PATCH 3/6] i386: Implement vec_perm_const.

2011-10-13 Thread rth
From: Richard Henderson --- gcc/config/i386/i386-protos.h |1 + gcc/config/i386/i386.c| 61 + gcc/config/i386/sse.md| 21 ++ 3 files changed, 83 insertions(+), 0 deletions(-) diff --git a/gcc/config/i386/i386-protos.h b

[PATCH 5/6] rs6000: Fix typo in rs6000_expand_vector_init

2011-10-13 Thread rth
From: Richard Henderson Of course we don't support vectors of size <= 4. We're supposed to be checking the vector element size. --- gcc/config/rs6000/rs6000.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 4

[PATCH 2/6] spu: Implement vec_permv16qi.

2011-10-13 Thread rth
From: Richard Henderson --- gcc/config/spu/spu.md | 12 1 files changed, 12 insertions(+), 0 deletions(-) diff --git a/gcc/config/spu/spu.md b/gcc/config/spu/spu.md index 676d54e..00cfaa4 100644 --- a/gcc/config/spu/spu.md +++ b/gcc/config/spu/spu.md @@ -4395,6 +4395,18 @@ selb\t

[PATCH 0/6] Cleanups for generic vector permutation.

2011-10-13 Thread rth
From: Richard Henderson These patches allow __builtin_shuffle to handle any vector permutation via optabs. It allows for a not-uncommon fallback to byte permutation at rtl expansion time, while leaving the tree/gimple-level permutation as element-based. All three targets which heretofore suppor

[PATCH 1/6] rs6000: Implement vec_permv16qi.

2011-10-13 Thread rth
From: Richard Henderson --- gcc/config/rs6000/altivec.md |9 + 1 files changed, 9 insertions(+), 0 deletions(-) diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md index 9e7437e..84c5444 100644 --- a/gcc/config/rs6000/altivec.md +++ b/gcc/config/rs6000/altivec.m

[C++ Patch / RFC] PR 38174

2011-10-13 Thread Paolo Carlini
Hi, so, assuming I understood correctly Jason's tips (thanks again for your patience ;) the fix for this pretty old issue seems even simpler than I guessed at triage time, because we already have available composite_pointer_type, doing all the real work. The below passes the testsuite on x86_

Re: [google] support for building Linux kernel with FDO (issue4523061)

2011-10-13 Thread vulcansh
Rong Xu wrote: > > That will be good. > But you never know, we internally have fixed some bugs that were filed to > us because people use kernel's old gcov code (many versions guarded by > ifdef) for their tests. > > -Rong > Has there been any progress on this patch? What version of gcc is this

[v3] libstdc++/50714

2011-10-13 Thread Paolo Carlini
Hi, tested x86_64-linux, committed to mainline. Thanks, Paolo. / 2011-10-13 Paolo Carlini PR libstdc++/50714 * include/bits/codecvt.h (codecvt<>::codecvt(size_t)): Initialize _M_c_locale_codecvt member. * testsuite/22_locale/codecvt_byname

RE: ObjC/ObjC++ Patch: rewrite objc/objc++ frontend hashtables

2011-10-13 Thread Nicola Pero
I actually forgot to post a tiny bit that is required to support the additional objc/objc-map.h and objc/objc-map.c files. It's part of the same patch. Apologies. Thanks Index: gengtype.c === --- gengtype.c (revision 179947) +++ g

ObjC/ObjC++ Patch: rewrite objc/objc++ frontend hashtables

2011-10-13 Thread Nicola Pero
This patch finally rewrites the hashtables used by the ObjC (and ObjC++) frontend. The new code speeds up the compiler by about 4% when compiling the standard GNUstep ObjC system headers with -fsyntax-only. That's quite good for a change that does nothing but swap a hashtable implementation wi

Re: [C++ Patch] PR 17212

2011-10-13 Thread Paolo Carlini
On 10/13/2011 04:24 PM, Jason Merrill wrote: On 10/13/2011 09:53 AM, Paolo Carlini wrote: Yes I briefly wondered that but I know *so* little about that front end... Do you think we can just add it? Probably yes ;) Definitely. Anything supported in C++ should also be in Obj-C++ by default. Ok,

Re: [PR50672, PATCH] Fix ice triggered by -ftree-tail-merge: verify_ssa failed: no immediate_use list

2011-10-13 Thread Tom de Vries
On 10/12/2011 02:19 PM, Richard Guenther wrote: > On Wed, Oct 12, 2011 at 8:35 AM, Tom de Vries wrote: >> Richard, >> >> I have a patch for PR50672. >> >> When compiling the testcase from the PR with -ftree-tail-merge, the scenario >> is >> as follows: >> >> We start out tail_merge_optimize with

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread H.J. Lu
On Thu, Oct 13, 2011 at 3:52 PM, Richard Kenner wrote: >> Like this? > > Yes, that's what I meant. Thanks. > > Again, I'd suggest doing some performance testing on this just to verify > that it doesn't pessimize things. > I will run SPEC CPU 2K/2006 on ia32, x86-64 and x32. -- H.J.

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Richard Kenner
> Like this? Yes, that's what I meant. Thanks. Again, I'd suggest doing some performance testing on this just to verify that it doesn't pessimize things.

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread H.J. Lu
On Thu, Oct 13, 2011 at 3:33 PM, Richard Kenner wrote: >> I am testing this patch. The difference is it checks nonzero >> bits of the first operand. > > I would suggest moving (and expanding) the comments from the existing block > into your new block. > Like this? -- H.J. --- diff --git a/gcc/c

Re: Ping shrink wrap patches

2011-10-13 Thread Alan Modra
On Thu, Oct 13, 2011 at 07:04:59PM +0200, Bernd Schmidt wrote: > On 10/13/11 18:50, Bernd Schmidt wrote: > > On 10/13/11 14:27, Alan Modra wrote: > >> Without the ifcvt > >> optimization for a function "int foo (int x)" we might have something > >> like > >> > >> r29 = r3; // save r3 in callee sav

Re: [Patch, Darwin] fix PR50699.

2011-10-13 Thread Iain Sandoe
On 13 Oct 2011, at 23:22, Mike Stump wrote: +/* Add $LDBL128 suffix to long double builtins for ppc darwin. */ static void -darwin_patch_builtin (int fncode) +darwin_patch_builtin (enum built_in_function fncode) This is a property of the target machine. DARWIN_PPC is a property of the tar

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Richard Kenner
> I am testing this patch. The difference is it checks nonzero > bits of the first operand. I would suggest moving (and expanding) the comments from the existing block into your new block.

Re: [Patch, Darwin] fix PR50699.

2011-10-13 Thread Mike Stump
On Oct 13, 2011, at 8:22 AM, Iain Sandoe wrote: > .. this looks like an (almost) obvious fix for the bootstrap breakage... No... > -/* Add $LDBL128 suffix to long double builtins. */ > +#if defined (__ppc__) || defined (__ppc64__) __ppc__ is a property of the host machine. > +/* Add $LDBL128 s

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Richard Kenner
> But the current code converts (and X 3) into a bit extraction > since ((i = exact_log2 (UINTVAL (XEXP (x, 1)) + 1)) >= 0) is true > when UINTVAL (XEXP (x, 1)) == 3. Should we do it or not? By adding the test for nonzero bits, you'd potentially be doing the conversion more often (which is the po

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread H.J. Lu
On Thu, Oct 13, 2011 at 2:45 PM, H.J. Lu wrote: > On Thu, Oct 13, 2011 at 2:30 PM, Richard Kenner > wrote: >>> It is because mask 0xffffffff is optimized to 0xfffffffc by keeping track >>> of non-zero bits in registers and the above code doesn't take that >>> into account. >> >> Then I'd suggest

Re: [pph] Make libcpp symbol validation a warning (issue5235061)

2011-10-13 Thread Gabriel Charette
Just looked at the line_table related sections, but see comments below: On Tue, Oct 11, 2011 at 4:26 PM, Diego Novillo wrote: > > Currently, the consistency check done on pre-processor symbols is > triggering on symbols that are not really problematic (e.g., symbols > used for double-include guar

Re: [PATCH] vec_unpack{s,u}_float_{hi,lo}_{v8hi,v4si} support

2011-10-13 Thread Richard Henderson
On 10/13/2011 02:35 PM, Jakub Jelinek wrote: > * config/i386/sse.md (*avx_cvtdq2pd256_2): Rename to... > (avx_cvtdq2pd256_2): ... this. > (sseunpackfltmode): New mode attr. > (vec_unpacks_float_hi_v8hi, vec_unpacks_float_lo_v8hi, > vec_unpacku_float_hi_v8hi, vec_unpack

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread H.J. Lu
On Thu, Oct 13, 2011 at 2:30 PM, Richard Kenner wrote: >> It is because mask 0xffffffff is optimized to 0xfffffffc by keeping track >> of non-zero bits in registers and the above code doesn't take that >> into account. > > Then I'd suggest modifying that code so that it does rather than > essentia

[pph] Triage test status. (issue5271044)

2011-10-13 Thread Lawrence Crowl
Mark test x3hardorder.cc as passing. Update many other tests to indicate their current failure reason. Fix the readme. Index: gcc/testsuite/ChangeLog.pph 2011-10-13 Lawrence Crowl * g++.dg/pph/README: Put z files in regular expression. * g++.dg/pph/x3hardorder.cc: Mark pas

[PATCH] vec_unpack{s,u}_float_{hi,lo}_{v8hi,v4si} support

2011-10-13 Thread Jakub Jelinek
Hi! This patch allows 32-byte vectorization of e.g. short a[512]; unsigned short b[512]; int c[512]; unsigned int d[512]; float e[512]; double f[512]; void f1 (void) { int i; for (i = 0; i < 512; ++i) e[i] = a[i]; } void f2 (void) { int i; for (i = 0; i < 512; ++i) e[i] = b[i]; }

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Richard Kenner
> It is because mask 0xffffffff is optimized to 0xfffffffc by keeping track > of non-zero bits in registers and the above code doesn't take that > into account. Then I'd suggest modifying that code so that it does rather than essentially duplicating it. But I'd recommend running some performance

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread H.J. Lu
On Thu, Oct 13, 2011 at 2:23 PM, Richard Kenner wrote: >> Does it look OK? > > No. > > If I understand your code correctly, there's essentially the same code > as you have a bit above that: > >      /* If the constant is one less than a power of two, this might be >         representable by an ext

Re: [Patch,AVR] Fix PR46278, Take #3

2011-10-13 Thread Georg-Johann Lay
Weddington, Eric wrote: Georg-Johann Lay wrote: This is yet another attempt to fix PR46278 (fake X addressing). After the previous clean-ups it is just a small change. caller-saves.c tries to eliminate call-clobbered hard-regs allocated to pseudos around function calls and that leads to

C++ PATCH for c++/50614 (ICE with NSDMI and -fcompare-debug)

2011-10-13 Thread Jason Merrill
The problem here was that with -fcompare-debug, execute_cleanup_cfg_post_optimizing wants to print out all the decls used in a function, which involves printing the DECL_INITIAL, and the instantiation of a FIELD_DECL with an NSDMI had an uninstantiated DECL_INITIAL, so the dumper got confused b

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Richard Kenner
> Does it look OK? No. If I understand your code correctly, there's essentially the same code as you have a bit above that: /* If the constant is one less than a power of two, this might be representable by an extraction even if no shift is present. If it doesn't end

[PATCH] Fix the RTL of some sparc VIS patterns.

2011-10-13 Thread David Miller
Based upon a review of the sparc VIS support by Richard Henderson. Committed to trunk. gcc/ * config/sparc/sparc.md (UNSPEC_FPMERGE): Delete. (UNSPEC_MUL16AU, UNSPEC_MUL8, UNSPEC_MUL8SU, UNSPEC_MULDSU): New unspecs. (fpmerge_vis): Remove inaccurate comment, repre

C++ PATCH for c++/50437 (ICE on auto with lambda in template)

2011-10-13 Thread Jason Merrill
The problem here was that auto deduced the closure type of the lambda in the template, and then instantiation tried to instantiate the closure outside of the context of the LAMBDA_EXPR, which doesn't work. So I've changed LAMBDA_EXPR to always have a TREE_TYPE of NULL_TREE, and put the closure

Re: [rs6000] Enable scalar shifts of vectors

2011-10-13 Thread David Edelsohn
On Wed, Oct 12, 2011 at 6:32 PM, Richard Henderson wrote: > I suppose technically the middle-end could be improved to implement > ashl as vashl by broadcasting the scalar, but Altivec > is the only extant SIMD ISA that would make use of this.  All of > the others can arrange for constant shifts to

Re: ifcvt cond_exec support rewrite

2011-10-13 Thread Bernd Schmidt
Ping. Better support for nested if-then-else structures: http://gcc.gnu.org/ml/gcc-patches/2011-09/msg01935.html Bernd

Re: [trans-mem] Add gl_wt TM method.

2011-10-13 Thread Torvald Riegel
On Tue, 2011-08-30 at 00:33 +0200, Torvald Riegel wrote: > The attached patches are several changes required for a new TM method, > gl_wt (global lock, write-through), which is added by the last patch > > patch1: Add TM-method-specific begin code. All time-based TMs need to > know at which point i

Re: Predication during scheduling

2011-10-13 Thread Bernd Schmidt
On 09/30/11 17:29, Bernd Schmidt wrote: > This patch allows a backend to set a new scheduler flag, DO_PREDICATION, > which will make the haifa scheduler try to move insns across jumps by > predicating them. On C6X, the primary benefit is to fill jump delay slots. Ping. http://gcc.gnu.org/ml/gcc-p

Re: [patch] dwarf2out: Drop the size + performance overhead of DW_AT_sibling

2011-10-13 Thread Jan Kratochvil
On Wed, 12 Oct 2011 16:18:07 +0200, Jan Kratochvil wrote: > On Wed, 12 Oct 2011 16:07:24 +0200, Tristan Gingold wrote: > > I fear that this may degrade performance of other debuggers. What about > > adding a command line option ? > > I can test idb, I do not find the difference measurable. Drop

Re: [PATCH] Add explicit VIS intrinsics for addition and subtraction.

2011-10-13 Thread David Miller
From: Eric Botcazou Date: Thu, 29 Sep 2011 00:38:49 +0200 > [Vlad, if you have a few minutes, would you mind having a look at the couple > of > questions at the end of the message? Thanks in advance]. Vlad, ping?

[lra] patch to improve elimination and inheritance

2011-10-13 Thread Vladimir Makarov
The following patch contains some of my work for the last 2 weeks. First of all, it improves register elimination to permit eliminating a register to itself. It resulted in fixing SPEC2000 code size degradation for ppc64. The patch also contains improving inheritance by assigning the same hard

[pph] Fix builtin merges (issue5276044)

2011-10-13 Thread Diego Novillo
Computing the assembler name of a builtin function prevents the middle-end from open coding the builtin. This was causing assembly differences between the non-pph and pph compiles. Tested on x86_64. Committed to branch. Diego. cp/ChangeLog.pph * pph-streamer.c (pph_merge_name): Do n

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread H.J. Lu
On Thu, Oct 13, 2011 at 11:15 AM, Richard Kenner wrote: >> The answer to H.J.'s "Why do we do it for MEM then?" is simply >> "because no one ever thought about not doing it" > > No, that's false.  The same expand_compound_operation / > make_compound_operation > pair is present in the MEM case as

RE: [Patch,AVR] Print no-return functions as JMP

2011-10-13 Thread Paul_Koning
>> You should have a way to turn this off. Otherwise this makes >> debugging the call to abort impossible. > >What do you propose? > >o A command line option that is on per default like > -mnoreturn-tail-calls or -mjmp-noreturn > >o Hard-coded factor out some function names like "abort", > "exi

Re: [Patch,AVR] Print no-return functions as JMP

2011-10-13 Thread Richard Henderson
On 10/13/2011 12:00 PM, Georg-Johann Lay wrote: > What do you propose? > > o A command line option that is on per default like > -mnoreturn-tail-calls or -mjmp-noreturn The command-line-option. I think I prefer -mjump-noreturn, as the inverse -mno-noreturn-tail-calls is too awkward. r~
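For readers skimming the thread, the following is a minimal C sketch of the kind of code the option discussed above would affect. It is an illustration only, not taken from the patch: `fatal` and `checked_div` are hypothetical names, and whether a given backend emits a jump rather than a call here depends on the target and its options.

```c
#include <stdlib.h>

/* Hypothetical error handler. The noreturn attribute tells GCC that
   control never comes back to the caller. */
__attribute__((noreturn)) void fatal(int code)
{
    exit(code);
}

/* Hypothetical caller. Because fatal() never returns, no return address
   is needed for that call, which is why a backend may choose to emit a
   jump instead of a call at this site. */
int checked_div(int a, int b)
{
    if (b == 0)
        fatal(1);
    return a / b;
}
```

With a jump, no return address is pushed for the no-return function, which is the source of the debugging concern raised elsewhere in this thread and the reason a switch to turn the behaviour off was requested.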

Re: [Patch,AVR] Print no-return functions as JMP

2011-10-13 Thread Georg-Johann Lay
Richard Henderson wrote: > On 10/13/2011 11:16 AM, Georg-Johann Lay wrote: >> This patch saves some ticks and bytes on stack by JUMPing to no-return >> functions instead of CALLing them. >> >> Passes without regression. >> >> Ok for trunk? >> >> Johann >> >> * config/avr/avr-protos.h (avr_ou

Re: [C++ Patch] PR 17212

2011-10-13 Thread Mike Stump
On Oct 13, 2011, at 6:53 AM, Paolo Carlini wrote: >> Why not support it in Obj-C++, too? > > Yes I briefly wondered that but I know *so* little about that front end... Do > you think we can just add it? Probably yes ;) The ground rule is, make ObjC behave just like C, unless an ObjC expert decid

RE: [Patch,AVR] Fix PR46278, Take #3

2011-10-13 Thread Weddington, Eric
> -Original Message- > From: Georg-Johann Lay [mailto:a...@gjlay.de] > Sent: Thursday, October 13, 2011 8:32 AM > To: gcc-patches@gcc.gnu.org > Cc: Anatoly Sokolov; Denis Chertykov; Weddington, Eric > Subject: [Patch,AVR] Fix PR46278, Take #3 > > This is yet another attempt to fix PR4627

Re: [rs6000] Enable scalar shifts of vectors

2011-10-13 Thread Richard Henderson
On 10/13/2011 11:36 AM, David Edelsohn wrote: > Are there testcases in the GCC testsuite that exercise these patterns? I thought the vectorizer would use them. E.g. gcc.dg/vect/vect-shift-3.c. I see that I should have added ppc to check_effective_target_vect_shift_scalar, though, to enable even

Re: [Patch,AVR] Print no-return functions as JMP

2011-10-13 Thread Richard Henderson
On 10/13/2011 11:16 AM, Georg-Johann Lay wrote: > This patch saves some ticks and bytes on stack by JUMPing to no-return > functions instead of CALLing them. > > Passes without regression. > > Ok for trunk? > > Johann > > * config/avr/avr-protos.h (avr_out_call): New prototype. > *

Re: [rs6000] Enable scalar shifts of vectors

2011-10-13 Thread David Edelsohn
On Wed, Oct 12, 2011 at 6:32 PM, Richard Henderson wrote: > I suppose technically the middle-end could be improved to implement > ashl as vashl by broadcasting the scalar, but Altivec > is the only extant SIMD ISA that would make use of this.  All of > the others can arrange for constant shifts to

[Patch,AVR] Print no-return functions as JMP

2011-10-13 Thread Georg-Johann Lay
This patch saves some ticks and bytes on stack by JUMPing to no-return functions instead of CALLing them. Passes without regression. Ok for trunk? Johann * config/avr/avr-protos.h (avr_out_call): New prototype. * config/avr/avr.md (adjust_len): Add alternative "call". (c

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Richard Kenner
> The answer to H.J.'s "Why do we do it for MEM then?" is simply > "because no one ever thought about not doing it" No, that's false. The same expand_compound_operation / make_compound_operation pair is present in the MEM case as in the SET case. It's just that there's some bug here that's noti

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Richard Kenner
> We first expand zero_extend:DI address to and:DI and then try > to restore zero_extend:DI. Why do we do this transformation > to begin with? Suppose there were an outer AND that duplicated what this one did. Then when you combine those two, you merge it to one AND. Then make_compound_operatio

C++ PATCH for c++/50618 (wrong-code with virtual bases)

2011-10-13 Thread Jason Merrill
When an object is value-initialized, if the type doesn't have a user-provided default constructor, the object is zero-initialized first, and then the synthesized constructor is called. The problem in this PR was that when value-initializing a base in a constructor we were zero-initializing vir

Re: [PATCH, ARM] Unaligned accesses for builtin memcpy [2/2]

2011-10-13 Thread Julian Brown
On Wed, 28 Sep 2011 14:33:17 +0100 Ramana Radhakrishnan wrote: > On 6 May 2011 14:13, Julian Brown wrote: > > Hi, > > > > This is the second of two patches to add unaligned-access support to > > the ARM backend. It builds on the first patch to provide support for > > unaligned accesses when expa

RE: Intrinsics for N2965: Type traits and base classes

2011-10-13 Thread Michael Spertus
Addressing Jason's comments: Index: libstdc++-v3/include/tr2/type_traits === --- libstdc++-v3/include/tr2/type_traits(revision 0) +++ libstdc++-v3/include/tr2/type_traits(revision 0) @@ -0,0 +1,96 @@ +// TR2 type_trait

Re: [RFA/ARM][Patch 02/05]: LDRD generation instead of POP in A15 Thumb2 epilogue.

2011-10-13 Thread Richard Henderson
On 10/11/2011 02:21 AM, Sameera Deshpande wrote: > +/* When saved-register index (i) is odd, RTXs for both the > registers > + to be loaded are generated in above given LDRD pattern, and > the > + pattern can be emitted now. */ > +par = emit_in

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Paolo Bonzini
On Thu, Oct 13, 2011 at 19:06, Richard Kenner wrote: >> An and:DI is cheaper than a zero_extend:DI of an and:SI. > > That depends strongly on the constants and whether the machine is 32-bit > or 64-bit. Yes, the rtx_costs take care of that. > But that's irrelevant in this case since the and:SI w

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread H.J. Lu
On Thu, Oct 13, 2011 at 10:21 AM, Paolo Bonzini wrote: > On Thu, Oct 13, 2011 at 19:19, H.J. Lu wrote: >> On Thu, Oct 13, 2011 at 10:01 AM, Paolo Bonzini wrote: >>> On 10/13/2011 06:35 PM, Richard Kenner wrote: > > It never calls make_extraction.  There are several cases handled > fo

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Paolo Bonzini
On Thu, Oct 13, 2011 at 19:19, H.J. Lu wrote: > On Thu, Oct 13, 2011 at 10:01 AM, Paolo Bonzini wrote: >> On 10/13/2011 06:35 PM, Richard Kenner wrote: It never calls make_extraction.  There are several cases handled for AND operation. But (and:DI (plus:DI (subreg:DI (mul

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread H.J. Lu
On Thu, Oct 13, 2011 at 10:01 AM, Paolo Bonzini wrote: > On 10/13/2011 06:35 PM, Richard Kenner wrote: >>> >>> It never calls make_extraction.  There are several cases handled >>> for AND operation. But >>> >>> (and:DI (plus:DI (subreg:DI (mult:SI (reg/v:SI 85 [ i ]) >>>                (const_int

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Richard Kenner
> An and:DI is cheaper than a zero_extend:DI of an and:SI. That depends strongly on the constants and whether the machine is 32-bit or 64-bit. But that's irrelevant in this case since the and:SI will be removed (it reflects what has already been done).

Re: Ping shrink wrap patches

2011-10-13 Thread Bernd Schmidt
On 10/13/11 18:50, Bernd Schmidt wrote: > On 10/13/11 14:27, Alan Modra wrote: >> Without the ifcvt >> optimization for a function "int foo (int x)" we might have something >> like >> >> r29 = r3; // save r3 in callee saved reg >> if (some test) goto exit_label >> // main body of foo, calling ot

Re: Vector alignment tracking

2011-10-13 Thread Jakub Jelinek
On Thu, Oct 13, 2011 at 06:57:47PM +0200, Andi Kleen wrote: > > Or I am missing something? > > I often see the x86 vectorizer with -mtune=generic generate a lot of > complicated code just to adjust for potential misalignment. > > My thought was just if the alias oracle knows what the original > de

Re: [PATCH, rs6000] Preserve link stack for 476 cpus

2011-10-13 Thread Richard Henderson
On 10/13/2011 08:49 AM, Peter Bergner wrote: > + if (TARGET_LINK_STACK) > + asm_fprintf (file, "\tbl 1f\n\tb 2f\n1:\n\tblr\n2:\n"); > + else > + asm_fprintf (file, "\tbcl 20,31,1f\n1:\n"); Wouldn't it be better to set up an out-of-line "blr" insn that could be shared by

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Paolo Bonzini
On 10/13/2011 06:35 PM, Richard Kenner wrote: It never calls make_extraction. There are several cases handled for AND operation. But (and:DI (plus:DI (subreg:DI (mult:SI (reg/v:SI 85 [ i ]) (const_int 4 [0x4])) 0) (subreg:DI (reg:SI 106) 0)) (const_int 4294967292 [0x

Re: Vector alignment tracking

2011-10-13 Thread Andi Kleen
> Or I am missing something? I often see the x86 vectorizer with -mtune=generic generate a lot of complicated code just to adjust for potential misalignment. My thought was just if the alias oracle knows what the original declaration is, and it's available for changes (e.g. LTO), it would be like

Re: Ping shrink wrap patches

2011-10-13 Thread Bernd Schmidt
On 10/13/11 14:27, Alan Modra wrote: > Without the ifcvt > optimization for a function "int foo (int x)" we might have something > like > > r29 = r3; // save r3 in callee saved reg > if (some test) goto exit_label > // main body of foo, calling other functions > r3 = 0; > return; > exit_label

Re: Ping shrink wrap patches

2011-10-13 Thread Richard Henderson
On 10/13/2011 05:27 AM, Alan Modra wrote: > Ping > http://gcc.gnu.org/ml/gcc-patches/2011-09/msg01002.html > http://gcc.gnu.org/ml/gcc-patches/2011-09/msg01003.html Ok. > http://gcc.gnu.org/ml/gcc-patches/2011-09/msg01596.html > > The last one needs a tweak. > s/FUNCTION_VALUE_REGNO_P/targetm.ca

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Richard Kenner
> It never calls make_extraction. There are several cases handled > for AND operation. But > > (and:DI (plus:DI (subreg:DI (mult:SI (reg/v:SI 85 [ i ]) >(const_int 4 [0x4])) 0) >(subreg:DI (reg:SI 106) 0)) >(const_int 4294967292 [0xfffffffc])) > > isn't one of them.
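As a reading aid for the RTL quoted above, here is a small C sketch of the arithmetic identity the thread keeps returning to: a 64-bit AND with the constant 4294967292 yields the same value as zero-extending the low 32 bits after masking them with the same constant. This is not GCC code; the function names are hypothetical and only model the two RTL shapes (and:DI versus zero_extend:DI of and:SI).

```c
#include <stdint.h>

/* Hypothetical model of the and:DI form: one 64-bit AND. */
uint64_t and_di(uint64_t x)
{
    return x & UINT64_C(4294967292);
}

/* Hypothetical model of zero_extend:DI applied to an and:SI:
   mask in 32 bits first, then widen with zero fill. */
uint64_t zext_and_si(uint64_t x)
{
    uint32_t low = (uint32_t)x & UINT32_C(4294967292);
    return (uint64_t)low;
}
```

Both functions clear the upper 32 bits and the two lowest bits, so they agree for every input; which form is cheaper is a cost question for the target, as discussed in the thread.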

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread H.J. Lu
On Thu, Oct 13, 2011 at 9:11 AM, Richard Kenner wrote: >> at the end.  make_compound_operation doesn't know how to >> restore ZERO_EXTEND. > > It does in general.  See make_extraction, which it calls.  The question is > why it doesn't in this case.  That's the bug. > It never calls make_extractio

Re: [PATCH] vec_set for 32-byte vectors

2011-10-13 Thread Richard Henderson
On 10/13/2011 09:21 AM, Jakub Jelinek wrote: > * config/i386/sse.md (vec_set): Change V_128 iterator mode to V. Ok. r~

[PATCH] vec_set for 32-byte vectors

2011-10-13 Thread Jakub Jelinek
Hi! As noted by Kirill Yukhin (and what led to the previous tree-ssa.c patch), vec_set wasn't wired for 32-byte vectors. Although ix86_expand_vector_set handles 32-byte vectors just fine (even for AVX and integer vectors), without the expander we'd force things into memory etc. Fixed thusly, boo

[committed] Drop TREE_ADDRESSABLE from BIT_FIELD_REF on lhs accessed vectors/complex

2011-10-13 Thread Jakub Jelinek
Hi! I've noticed that #define vector(elcount, type) \ __attribute__((vector_size((elcount)*sizeof(type)))) type vector (4, int) f1 (vector (4, int) a, int b) { ((int *)&a)[0] = b; return a; } as well as vector (4, int) f2 (vector (4, int) a, int b) { a[0] = b; return a; } don't result

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Richard Kenner
> at the end. make_compound_operation doesn't know how to > restore ZERO_EXTEND. It does in general. See make_extraction, which it calls. The question is why it doesn't in this case. That's the bug.

Re: Vector alignment tracking

2011-10-13 Thread Artem Shinkarov
On Thu, Oct 13, 2011 at 4:54 PM, Andi Kleen wrote: > Artem Shinkarov writes: >> >> 1) Currently in C we cannot provide information that an array is >> aligned to a certain number.  The problem is hidden in the fact, that > > Have you considered doing it the other way round: when an optimization >

Re: [PATCH (6/7)] More widening multiply-and-accumulate pattern matching

2011-10-13 Thread Matthew Gretton-Dann
This patch seems to have caused PR50717: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50717 Thanks, Matt On 19/08/11 15:49, Andrew Stubbs wrote: On 14/07/11 15:35, Richard Guenther wrote: Ok. I've just committed this updated patch. I found bugs with VOIDmode constants that have caused me

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread H.J. Lu
On Thu, Oct 13, 2011 at 7:14 AM, Richard Kenner wrote: >> Or being fooled by the 0xfffffffc masking, perhaps. > > No, I'm pretty sure that's NOT the case. The *whole point* of the > routine is to deal with that masking. > I got (gdb) step make_compound_operation (x=0x7139c4c8, in_code=MEM)

Re: Vector alignment tracking

2011-10-13 Thread Andi Kleen
Artem Shinkarov writes: > > 1) Currently in C we cannot provide information that an array is > aligned to a certain number. The problem is hidden in the fact, that Have you considered doing it the other way round: when an optimization needs something to be aligned, make the declaration aligned?
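To make the suggestion above concrete, here is a minimal C sketch of what "make the declaration aligned" looks like when written by hand with GCC's `aligned` attribute. It only illustrates the effect on a declaration; it says nothing about how an optimization pass would apply it. The array name, the 32-byte figure, and the helper are assumptions chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical array declared with an explicit 32-byte alignment
   (an assumed vector width), using the GCC aligned attribute. */
float samples[512] __attribute__((aligned(32)));

/* Hypothetical helper: returns 1 if p is a multiple of n bytes. */
int is_aligned(const void *p, size_t n)
{
    return ((uintptr_t)p % n) == 0;
}
```

Once the declaration carries the alignment, code that accesses the array from its start would not need a runtime misalignment fix-up, which is the kind of adjustment code the thread complains about.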

Re: [PATCH, rs6000] Preserve link stack for 476 cpus

2011-10-13 Thread Peter Bergner
On Mon, 2011-09-12 at 15:29 -0400, David Edelsohn wrote: > First, please choose a more informative option name. > -mpreserve-link-stack seems like something generally useful for all > processors and someone may randomly add the option. It always is > useful to preserve the link stack -- that's why

[pph] Make streamer hooks internal (issue5278043)

2011-10-13 Thread Diego Novillo
To avoid confusion, I moved the callbacks into pph-streamer.c so they can be internal to that file. They don't need to be called directly ever. Tested on x86_64. Committed to branch. Diego. * pph-streamer-in.c (pph_in_mergeable_tree): Fix comment. (pph_read_tree): Move to pph

[pph] shorten timeout on c1limits-externalid.cc and XFAIL (issue5278042)

2011-10-13 Thread Diego Novillo
I think this may be an infinite loop, but it may also just be taking a long time to do the merge operations. Tested on x86_64. Committed to branch. Diego. * g++.dg/pph/c1limits-externalid.cc: Add shorter timeout. Document failure mode. diff --git a/gcc/testsuite/g++.dg/pph/c1l

Vector alignment tracking

2011-10-13 Thread Artem Shinkarov
Hi I would like to share some plans about improving the situation with vector alignment tracking. First of all, I would like to start with a well-known bug: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50716. There are several aspects of the problem: 1) We would like to avoid the quiet segmentati

Re: [PATCH] Fix PR50712

2011-10-13 Thread H.J. Lu
On Thu, Oct 13, 2011 at 4:55 AM, Richard Guenther wrote: > > This fixes PR50712, an issue with IPA split uncovered by adding > verifier calls after it ... we need to also gimplify reads of > register typed memory when passing it as argument. > > Bootstrapped on x86_64-unknown-linux-gnu, testing in

[Patch, Darwin] fix PR50699.

2011-10-13 Thread Iain Sandoe
.. this looks like an (almost) obvious fix for the bootstrap breakage... OK for trunk? Iain Index: gcc/config/darwin.c === --- gcc/config/darwin.c (revision 179865) +++ gcc/config/darwin.c (working copy) @@ -2957,10 +2957,11 @@ darwi

Re: RFC: Add ADD_RESTRICT tree code

2011-10-13 Thread Joseph S. Myers
On Thu, 13 Oct 2011, Michael Matz wrote: > Yeah. But I continue to think that this reading is against the intent (or > should be). All the examples in the standard and rationale never say > anything about pointers to restricted objects and the problematic cases > one can construct with them,

Re: [pph] More DECL merging. (issue5268042)

2011-10-13 Thread Diego Novillo
I'm seeing an infinite loop in g++.dg/pph/c1limits-externalid.cc. The while() loop in pph_search_in_chain is not ending. Or maybe it's falling into the N^2 trap you mention in that routine? I've added a short timeout to this test and XFAIL'd it so you can debug it. Diego.

Re: [testsuite] require arm_little_endian in two tests

2011-10-13 Thread Richard Earnshaw
On 13/10/11 15:56, Joseph S. Myers wrote: > On Thu, 13 Oct 2011, Richard Earnshaw wrote: > >> 2) Change the compiler to make initializers of vectors assign elements >> of initializers to consecutive lanes in a vector, rather than the >> current behaviour of 'casting' an array of elements to a vect

[Patch]: fix typo in rs6000.c (AIX bootstrap broken)

2011-10-13 Thread Tristan Gingold
Hi, looks like an obvious typo. Ok for trunk ? Tristan. 2011-10-13 Tristan Gingold * config/rs6000/rs6000.c (rs6000_init_builtins): Fix typo. diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 4fd2192..3bfe33e 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gc

Re: RFC: Add ADD_RESTRICT tree code

2011-10-13 Thread Michael Matz
Hi, On Thu, 13 Oct 2011, Jakub Jelinek wrote: > On Thu, Oct 13, 2011 at 02:57:56PM +0200, Michael Matz wrote: > > struct S {int * restrict p;}; > > void foo (struct S *s, struct S *t) { > > s->p[0] = 0; > > t->p[0] = 1; // undefined if s->p == t->p; the caller was responsible > >

Re: [Patch, Fortran, committed] PR 50659: [4.4/4.5/4.6/4.7 Regression] ICE with PROCEDURE statement

2011-10-13 Thread Janus Weil
> Committed to the 4.6 branch as r179864: ... and to 4.5 as r179923. Cheers, Janus > 2011/10/9 Janus Weil : >> Hi all, >> >> I have just committed as obvious a patch for an ICE-on-valid problem >> with PROCEDURE statements: >> >> http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=179723 >> >> Th
