--- Comment #10 from suckfish at ihug dot co dot nz 2008-10-12 05:27 ---
Changelog for patch if accepted [will do full bootstrap & make test]:
2008-10-12 Ralph Loader <[EMAIL PROTECTED]>
PR 37807
* rtlanal.c (nonzero_bits1): Return early on vector types, avoiding
--- Comment #9 from suckfish at ihug dot co dot nz 2008-10-12 05:22 ---
Created an attachment (id=16486)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16486&action=view)
Possible fix for 37807
Patch above essentially stops nonzero_bits1 and num_sign_bit_copies1 processing
vector t
--- Comment #8 from suckfish at ihug dot co dot nz 2008-10-12 04:46 ---
Created an attachment (id=16484)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16484&action=view)
Test case modified to take exponential time on trunk too.
It turns out that it was fast on trunk because inlinin
--- Comment #7 from suckfish at ihug dot co dot nz 2008-10-12 02:39 ---
Bug 37809 opened for the issue in comment 6, as it is a different problem.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37807
--- Comment #6 from suckfish at ihug dot co dot nz 2008-10-11 23:24 ---
I think this function actually gets miscompiled:
typedef int v2si __attribute__ ((vector_size (8)));
v2si foo (v2si x)
{
x &= (v2si) 0xll;
x = __builtin_ia32_psrad (x, 1);
x &= (v2si) 0x800
--- Comment #5 from suckfish at ihug dot co dot nz 2008-10-11 23:02 ---
It looks like nonzero_bits1 in rtlanal.c is going into an exponential
recursion.
AFAICS, that function doesn't deal properly with vector arithmetic anyway?
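To illustrate the kind of blow-up described above, here is a minimal sketch. It is not GCC code: `struct node`, `bits_naive`, `bits_memo` and `count_visits` are hypothetical names, and the node type is only a stand-in for an RTL expression. It assumes an expression in which each level uses the previous level as both operands, so a recursive walk that descends into both operands visits 2^(d+1)-1 nodes at depth d, while a walk that remembers results stays linear.

```c
#include <stdio.h>

/* Hypothetical stand-in for an RTL expression; this is not GCC's rtx.  */
struct node
{
  struct node *op0, *op1;   /* both NULL for a leaf */
  unsigned long cached;     /* remembered result, valid if has_cache */
  int has_cache;
};

static unsigned long visits;

/* Naive walk: recurses into both operands even when they are shared.  */
static unsigned long
bits_naive (const struct node *n)
{
  visits++;
  if (!n->op0)
    return 0xfful;          /* arbitrary leaf value for the sketch */
  return bits_naive (n->op0) | bits_naive (n->op1);
}

/* Same walk, but each node's result is computed only once.  */
static unsigned long
bits_memo (struct node *n)
{
  unsigned long r;

  visits++;
  if (n->has_cache)
    return n->cached;
  r = n->op0 ? (bits_memo (n->op0) | bits_memo (n->op1)) : 0xfful;
  n->cached = r;
  n->has_cache = 1;
  return r;
}

/* Build a chain where level i uses level i-1 as both operands and
   return how many calls the chosen walk makes (depth must be < 64).  */
static unsigned long
count_visits (int depth, int memo)
{
  struct node nodes[64] = {{0}};
  int i;

  if (depth < 0 || depth >= 64)
    return 0;
  for (i = 1; i <= depth; i++)
    nodes[i].op0 = nodes[i].op1 = &nodes[i - 1];
  visits = 0;
  if (memo)
    bits_memo (&nodes[depth]);
  else
    bits_naive (&nodes[depth]);
  return visits;
}
```

Calling `count_visits (10, 0)` and `count_visits (10, 1)` shows the difference at a small depth; the naive count doubles with every extra level.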
--- Comment #4 from suckfish at ihug dot co dot nz 2008-10-11 22:23 ---
BTW, __builtin_ia32_psrld and __builtin_ia32_pslld are not documented on the
'X86 built-in functions' page of the manual.
--- Comment #3 from rguenth at gcc dot gnu dot org 2008-10-11 22:06 ---
On the trunk it's fast if you fix the testcase to do
static INLINE value_t ROTATE_LEFT (value_t a, unsigned count)
{
  return OR (LEFT (a, ((value_t){count, count})),
             RIGHT (a, ((value_t){32-count, 32-count})));
}
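For readers without the attachment, the shape of that rotate can be sketched as follows. This is an assumption-laden illustration: `value_t`, `OR`, `LEFT` and `RIGHT` are defined in the test case and not shown here, so the sketch uses GCC's generic vector extensions instead, with a hypothetical type `v2su` and function `rotate_left_sketch`. It is only meaningful for counts from 1 to 31.

```c
#include <stdio.h>

/* Hypothetical two-lane unsigned vector; the test case's value_t may differ.  */
typedef unsigned int v2su __attribute__ ((vector_size (8)));

/* Rotate each 32-bit lane left by COUNT bits (1 <= COUNT <= 31).
   The scalar count is broadcast into vector shift counts, mirroring
   the (value_t){count, count} compound literals in the fix above.  */
static inline v2su
rotate_left_sketch (v2su a, unsigned count)
{
  v2su l = { count, count };
  v2su r = { 32 - count, 32 - count };
  return (a << l) | (a >> r);
}
```

The lanes are unsigned so that the right shift is logical; with a signed element type the shifted-in sign bits would corrupt the rotate.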
--- Comment #2 from suckfish at ihug dot co dot nz 2008-10-11 21:35 ---
Using '-da' it looks like the 'combine' pass is the culprit:
$ pidof cc1
6410
$ ls -l /proc/6410/fd
... 4 -> ... slow.c.162r.combine
$ ls -s slow.c.162r.combine
0 slow.c.162r.combine
[is there an easier way to get
--- Comment #1 from suckfish at ihug dot co dot nz 2008-10-11 21:21 ---
Created an attachment (id=16482)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16482&action=view)
Code showing exponential compile time.