--- Comment #2 from siarhei dot siamashka at gmail dot com 2010-08-15
01:01 ---
Here is another test example, now with some performance numbers for gcc 4.5.1
on 64-bit Intel Atom:
$ cat fibbonachi.c
/***/
#include
int fib(int n)
{
int sum, previous = -1, result
--- Comment #13 from siarhei dot siamashka at gmail dot com 2010-08-14
16:28 ---
(In reply to comment #12)
> Any news? :)
http://gcc.gnu.org/ml/gcc-patches/2010-08/msg00894.html
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45070
--- Comment #7 from siarhei dot siamashka at gmail dot com 2010-08-06
19:36 ---
Do you have any packed structs? I wonder if the problem could be somehow
related to PR45070. But it's hard to say anything until you narrow down the
problem to a smaller testcase.
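For readers unfamiliar with the term, the following is a minimal sketch of what a packed struct looks like and how it can be returned by value. The type and function names are hypothetical and chosen only for illustration; this is not the testcase attached to PR45070, and it assumes GCC's `__attribute__((packed))` extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration only; not the PR45070 testcase. */
struct record_plain {
    uint8_t  tag;
    uint32_t value;   /* normally aligned, so padding usually follows tag */
};

struct record_packed {
    uint8_t  tag;
    uint32_t value;   /* stored directly after tag, without padding */
} __attribute__((packed));

/* Compile-time checks of the packed layout. */
static_assert(sizeof(struct record_packed) == 5,
              "packed layout has no padding");
static_assert(offsetof(struct record_packed, value) == 1,
              "value directly follows tag");
static_assert(sizeof(struct record_plain) >= sizeof(struct record_packed),
              "the unpacked layout is never smaller");

/* Returns a small packed struct by value. */
struct record_packed make_record(uint8_t tag, uint32_t value)
{
    struct record_packed r = { tag, value };
    return r;
}
```

Calling `make_record(7, 0x12345678u)` and comparing the returned fields against the arguments is one way to start narrowing down a suspected miscompilation.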
--
siarhe
--- Comment #4 from siarhei dot siamashka at gmail dot com 2010-08-05
13:40 ---
Looks like this missed optimization regression was introduced in gcc 4.5
Are any similar fixes possible in 4.5 branch?
--
siarhei dot siamashka at gmail dot com changed:
What|Removed
--- Comment #6 from siarhei dot siamashka at gmail dot com 2010-07-28
08:37 ---
'arm_size_return_regs()' returns 2 when generating epilogue for 'next' function
here. As a result, the return value is not registered in the mask, causing it to
be clobbered.
Would the fo
--- Comment #5 from siarhei dot siamashka at gmail dot com 2010-07-28
07:18 ---
The disassembly chunk from the comment above was from gcc 4.5.0, using '-Os
-march=armv5te' options.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45070
--- Comment #4 from siarhei dot siamashka at gmail dot com 2010-07-28
07:16 ---
Could not reproduce the problem with gcc 4.3.5
Disassembly of pr45070.o:
000c :
c: e92d401f push {r0, r1, r2, r3, r4, lr}
10: e89c ldm r0, {r2, r3}
14: e1a04000
--- Comment #2 from siarhei dot siamashka at gmail dot com 2010-07-27
20:07 ---
Created an attachment (id=21327)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21327&action=view)
simplified testcase
Confirmed with gcc 4.5.0 here. Also tried but could not reproduce the
--- Comment #1 from siarhei dot siamashka at gmail dot com 2010-07-25
23:25 ---
Created an attachment (id=21308)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21308&action=view)
packed-testcase.cpp
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45070
dot siamashka at gmail dot com
GCC target triplet: arm-unknown-linux-gnueabi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45070
--- Comment #14 from siarhei dot siamashka at gmail dot com 2010-07-22
20:54 ---
Thanks, this final variant of fix seems to work fine. Can this patch be
backported to 4.5 branch and released with gcc 4.5.1 too?
As I see it, the risk should be minimal because current gcc 4.5 branch is
--- Comment #12 from siarhei dot siamashka at gmail dot com 2010-07-19
13:54 ---
Updated the summary to better describe the problem (which is distro
independent).
The fact that this bug breaks pax-utils tool, which is a vital part of gentoo
packaging system, thus rendering the system
--- Comment #4 from siarhei dot siamashka at gmail dot com 2010-06-15
20:34 ---
(In reply to comment #3)
> Or use multiple alternatives feature for inline assembly constraints to emit
> either VMOV or VLD1?
Well, this kind of works :) But is very ugly and f
--- Comment #3 from siarhei dot siamashka at gmail dot com 2010-06-15
20:14 ---
The whole point of submitting this PR was to find an efficient way to use NEON
instructions to operate on any arbitrary scalar floating point values in order
to overcome Cortex-A8 VFP Lite inherent slowness
--- Comment #4 from siarhei dot siamashka at gmail dot com 2010-06-15
10:34 ---
Created an attachment (id=20913)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20913&action=view)
a fixed testcase
A fixed testcase attached.
The main problem here is that denormals are not
--- Comment #30 from siarhei dot siamashka at gmail dot com 2010-06-08
14:49 ---
(In reply to comment #29)
> Please file a new PR for that, with preprocessed source and all other relevant
> info for reproduction.
Thanks, filed PR44469
--
http://gcc.gnu.org/bugzilla/show_b
--- Comment #1 from siarhei dot siamashka at gmail dot com 2010-06-08
14:45 ---
Created an attachment (id=20868)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20868&action=view)
testcase.i
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44469
Severity: normal
Priority: P3
Component: bootstrap
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: siarhei dot siamashka at gmail dot com
GCC target triplet: armv7l-unknown-linux-gnueabi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44469
--- Comment #28 from siarhei dot siamashka at gmail dot com 2010-05-18
10:09 ---
Thanks, this patch fixes bootstrap for powerpc/powerpc64. But it still fails for
arm on the same gcc_assert() in another place. Should a new bug be filed
about this?
--
http://gcc.gnu.org/bugzilla
--- Comment #10 from siarhei dot siamashka at gmail dot com 2010-05-17
18:48 ---
Maybe I'm too impatient, but is there anything that prevents this patch from
getting committed to SVN?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43698
--- Comment #22 from siarhei dot siamashka at gmail dot com 2010-05-17
11:31 ---
(In reply to comment #20)
> Perhaps dup of PR44071 that got fixed recently?
The problem is still reproducible with SVN rev 159480 in
'branches/gcc-4_5-branch', so the fix from PR44071 doe
--- Comment #21 from siarhei dot siamashka at gmail dot com 2010-05-17
10:07 ---
(In reply to comment #18)
> Created an attachment (id=20676)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20676&action=view)
> powerpc64-broken-unreachable.i
>
> With the
--- Comment #19 from siarhei dot siamashka at gmail dot com 2010-05-17
09:06 ---
Can anybody knowledgeable verify whether it was commit r151790 (
http://repo.or.cz/w/official-gcc.git/commit/9dbb96fec5e08762f97dda771522283f1fe9710f
) that is causing troubles when __builtin_unreachable
--- Comment #18 from siarhei dot siamashka at gmail dot com 2010-05-17
07:53 ---
Created an attachment (id=20676)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20676&action=view)
powerpc64-broken-unreachable.i
With the attached file (and '-O2 -c' optio
--- Comment #16 from siarhei dot siamashka at gmail dot com 2010-05-04
07:04 ---
So basically what we have is that gcc miscompiles itself somewhere in the code
where one of those ~7000 gcc_assert is used. The next step is to identify which
one of them triggers this bad behaviour
--- Comment #15 from siarhei dot siamashka at gmail dot com 2010-05-03
23:45 ---
As found by Raúl, indeed this regression was introduced in r150091. Reverting
this change in gcc 4.5.0 release resolves the problem.
Apparently the use of __builtin_unreachable() in gcc_assert macro
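To illustrate why an assertion macro built on __builtin_unreachable() is risky, here is a minimal sketch. It assumes a hypothetical switch named SKETCH_ENABLE_CHECKING and a hypothetical macro named sketch_assert; it is not GCC's actual gcc_assert definition, only a rough approximation of the idea.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a configure-time checking switch. */
#ifndef SKETCH_ENABLE_CHECKING
#define SKETCH_ENABLE_CHECKING 0
#endif

#if SKETCH_ENABLE_CHECKING
/* Checking on: a false condition is reported and the program aborts. */
#define sketch_assert(expr) \
    ((void)((expr) ? 0 : (fprintf(stderr, "assert failed: %s\n", #expr), \
                          abort(), 0)))
#else
/* Checking off: a false condition is declared unreachable, so the
 * optimizer may assume the condition always holds. If it does not hold
 * at run time, or if the compiler mishandles the hint, the surrounding
 * code can be miscompiled silently instead of failing loudly. */
#define sketch_assert(expr) \
    ((void)((expr) ? 0 : (__builtin_unreachable(), 0)))
#endif

/* Hypothetical helper: the assertion doubles as an optimizer hint that
 * index is always within bounds. */
int table_lookup(const int *table, int size, int index)
{
    sketch_assert(index >= 0 && index < size);
    return table[index];
}
```

With checking disabled, the macro produces no runtime check at all, which would be consistent with a failure that shows up only in builds configured without checking.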
--- Comment #1 from siarhei dot siamashka at gmail dot com 2010-04-27
22:44 ---
"#pragma GCC " just does not seem to work with C++. Just
stumbled on it trying to narrow down something that looks like wrong-code
generation bug in gcc 4.5.0 when compiling qt4.
Prepending "
--- Comment #8 from siarhei dot siamashka at gmail dot com 2010-04-12
09:34 ---
(In reply to comment #7)
> Patch submitted here.
>
> http://gcc.gnu.org/ml/gcc-patches/2010-04/msg00401.html
Thank you. I have been testing it for two days already.
It really helps (in the sens
intrinsics
Product: gcc
Version: 4.5.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: target
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: siarhei dot siamashka at gmail dot com
GCC target
--- Comment #1 from siarhei dot siamashka at gmail dot com 2010-04-12
06:17 ---
Or just "vmov.i32 q8, #0" would be better to avoid any potential data
dependency.
--
siarhei dot siamashka at gmail dot com changed:
What|Removed
--- Comment #2 from siarhei dot siamashka at gmail dot com 2010-04-12
05:26 ---
(In reply to comment #1)
> mov r3, #0
> vdup.32 d16, r3
Also maybe "veor.32 d16, d16, d16" here?
Or drop this NEON register initialization completely because i
--- Comment #2 from siarhei dot siamashka at gmail dot com 2010-04-09
20:34 ---
(In reply to comment #1)
> This is exacted really. Denormals are a weird case in general.
Well, denormals may be weird. But what about nan's, inf's and the other IEEE
stuff, which is not supp
ting point precision loss due to ARM NEON
autovectorization
Product: gcc
Version: 4.5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
AssignedTo: unassigned at gcc dot gnu dot org
--- Comment #6 from siarhei dot siamashka at gmail dot com 2010-04-09
08:04 ---
(In reply to comment #1)
> 2. Does gcc-4.4.3 work?
Yes, gcc-4.4.3 works (it just does not use 'rev' instruction). So it is a
regression in 4.5. Thanks for a very fast response and analysis
mmary: Invalid code when building gentoo pax-utils-0.1.19 with
-Os optimizations
Product: gcc
Version: 4.5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu
--- Comment #10 from siarhei dot siamashka at gmail dot com 2010-04-06
14:44 ---
(In reply to comment #8)
> It would be really helpful if someone can explain how to reproduce this with a
> cross-compiler. I will analyze/fix this problem when this is reproducible with
> a cr
--- Comment #7 from siarhei dot siamashka at gmail dot com 2010-04-06
11:01 ---
Long story short. This bootstrap failure seems to be related to
--disable-checking configure option. Reproduced on powerpc-unknown-linux-gnu
and armv7l-unknown-linux-gnueabi. I'm re-running the tests n
--- Comment #6 from siarhei dot siamashka at gmail dot com 2010-04-03
21:53 ---
(In reply to comment #5)
> But preprocessed source feeded to gcc-4.5-20100401 crosscompiler does not
> result in ICE. I'm going to try bootstrapping again with the patch from
> PR42509
>
--- Comment #5 from siarhei dot siamashka at gmail dot com 2010-04-03
17:39 ---
Got exactly the same ICE on ARM, bootstrapping gcc:
/var/tmp/portage/sys-devel/gcc-4.5.0_alpha20100401/work/gcc-4.5-20100401/gcc/sched-deps.c:
In function get_dep_weak_1:
/var/tmp/portage/sys-devel/gcc
--- Comment #6 from siarhei dot siamashka at gmail dot com 2010-03-31
22:50 ---
(In reply to comment #4)
> Not exactly a primary or secondary target. CCing maintainer.
I have been trying to find a complete list of gcc primary and secondary targets
with no luck so far. But at le
--- Comment #2 from siarhei dot siamashka at gmail dot com 2010-03-21
19:07 ---
works fine with gcc 4.4.3
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43469
--- Comment #1 from siarhei dot siamashka at gmail dot com 2010-03-21
19:05 ---
Created an attachment (id=20152)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20152&action=view)
localealias.i
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43469
ompile glibc for ARM thumb2
Product: gcc
Version: 4.5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: siarhei dot siamashka at gmail dot com
GC
--- Comment #8 from siarhei dot siamashka at gmail dot com 2010-03-21
10:05 ---
What about just forbidding the use of "q" registers in the inline assembly clobber
list? Is it difficult to do?
As a nice bonus, the existing potentially unsafe inline assembly will fail to
compil
--- Comment #6 from siarhei dot siamashka at gmail dot com 2010-03-21
03:56 ---
(In reply to comment #4)
> IMO the reasons as described in my email is another motivation for Neon
> programmers to be using intrinsics rather than inline assembler and to improve
> in gen
--- Comment #5 from siarhei dot siamashka at gmail dot com 2010-03-21
03:33 ---
I don't quite understand what's the problem: "This patch has the unhappy side
effect of clobbering s0, s1 and s2 if s3 is used because that's the only way we
can indicate that q0 is cl
--- Comment #7 from siarhei dot siamashka at gmail dot com 2010-03-20
13:58 ---
Resolved, as now it WORKSFORME.
--
siarhei dot siamashka at gmail dot com changed:
What|Removed |Added
--- Comment #6 from siarhei dot siamashka at gmail dot com 2010-03-20
13:55 ---
The crash disappeared when recompiling libXft-2.1.13 library with gcc 4.4.3.
Either it was fixed, or something else changed and it is not getting triggered
anymore. I guess this bug can be closed
--- Comment #5 from siarhei dot siamashka at gmail dot com 2010-03-20
08:45 ---
(In reply to comment #4)
> Also, what's the configuration in this case i.e what architecture,
> mode / cpu / fpu ?
Tested on ARM Cortex-A8 hardware, the problematic package built either
NCONFIRMED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: siarhei dot siamashka at gmail dot com
GCC build triplet: arm-unknown-linux-gnueabi
GCC host triplet: arm-unknown-linux-gnueabi
GCC target triplet: a
--- Comment #3 from siarhei dot siamashka at gmail dot com 2010-03-14
12:44 ---
As of today, gcc seems to be clever enough to deduce whether to use a single
precision or double precision VFP register when given the "w" constraint (so P
modifier is not strictly needed). This behavio
--- Comment #5 from siarhei dot siamashka at gmail dot com 2010-03-14
12:23 ---
Do you want to force data into specific neon registers because of the
restriction on the neon registers which can be used as scalar operand for
multiplication?
It works for me
--- Comment #2 from siarhei dot siamashka at gmail dot com 2010-03-11
20:29 ---
When the documentation is missing the needed bits of information, these can
typically be extracted from the source code.
The only problem is that these constraints can be changed any time without
notice unless
--- Comment #7 from siarhei dot siamashka at gmail dot com 2009-12-21
08:53 ---
(In reply to comment #4)
> I would rather split the load out as a separate insn and allow it to be
> scheduled separately.
A question just to clarify the status of this issue. Are you waiting for
--- Comment #6 from siarhei dot siamashka at gmail dot com 2009-12-21
08:27 ---
Created an attachment (id=19356)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19356&action=view)
return-address-prediction-bench.c
This looks like a really serious performance issue. N
--- Comment #1 from siarhei dot siamashka at gmail dot com 2009-12-07
14:42 ---
Modifying the program to list q-registers in the clobber list provides even
more interesting results:
//
void f()
{
asm volatile("veor d8, d8, d8" : : : "
Component: inline-asm
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: siarhei dot siamashka at gmail dot com
GCC build triplet: armv4tl-softfloat-linux-gnueabi
GCC host triplet: armv4tl-softfloat-linux-gnueabi
GCC target triplet: armv4tl-softfloat-linux-gnueabi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42321
--- Comment #7 from siarhei dot siamashka at gmail dot com 2009-11-03
20:09 ---
Thanks a lot for checking this. And sorry about the confusion caused by
attributing slowness of the testcase to the microcoded stuff (which turned out
to be not the case) without properly checking this first
--- Comment #1 from siarhei dot siamashka at gmail dot com 2009-10-29
15:21 ---
-O2:
0010 <.x>:
10: 2c 23 00 00 cmpdi r3,0
14: 7c 08 02 a6 mflr r0
18: f8 01 00 10 std r0,16(r1)
1c: f8 21 ff 81 stdu r1,-128(r1)
20: 41 82
ance badly
Product: gcc
Version: 4.4.2
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: target
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: siarhei dot siamashka at gmail dot com
GCC bui
--- Comment #3 from siarhei dot siamashka at gmail dot com 2009-09-01
15:08 ---
It works fine if '-fno-omit-frame-pointer' is removed. I agree that this is
quite a large and convoluted function. Unfortunately I did not manage to reduce
it to something smaller that would still
: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: siarhei dot siamashka at gmail dot com
GCC build triplet: armv4tl-softfloat-linux-gnueabi
GCC host triplet: armv4tl-softfloat-linux-gnueabi
GCC target triplet: armv4tl-softfloat-linux-gnueabi
http://gcc.gnu.org/bugzi
--- Comment #1 from siarhei dot siamashka at gmail dot com 2009-08-14
22:48 ---
Created an attachment (id=18370)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18370&action=view)
xftrender.i
Preprocessed source. I did not manage to reduce it to a smaller testc
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: siarhei dot siamashka at gmail dot com
GCC target triplet: armv4tl-softfloat-linux-gnueabi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41074
--- Comment #7 from siarhei dot siamashka at gmail dot com 2009-02-10
15:11 ---
(In reply to comment #6)
> This is not a bug, but a problem with your source code.
>
> In order to understand why, you need to pre-process the code and look at the
> output:
>
> ...
>
--- Comment #1 from siarhei dot siamashka at gmail dot com 2008-10-04
02:48 ---
For -Os optimization, the generated code is much better:
:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 53 push %ebx
rity: enhancement
Priority: P3
Component: target
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: siarhei dot siamashka at gmail dot com
GCC build triplet: i686-pc-linux-gnu
GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu
http://gcc
--- Comment #5 from siarhei dot siamashka at gmail dot com 2008-09-03
09:52 ---
I'm sorry, is anybody investigating this quite serious bug? If nobody has
time/motivation to do this work, would it make sense for me to try fixing it
myself and submit a patch here?
--
--- Comment #1 from siarhei dot siamashka at gmail dot com 2008-09-02
15:50 ---
Well, looks like it is not a missing feature, but just incompleteness of
documentation :)
It is possible to use double precision floating point registers and NEON
128-bit registers in the following way
s in inline asm arguments (VFP)
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: inline-asm
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: siarhei dot siamashka at gmail dot com
GCC ho
--- Comment #4 from siarhei dot siamashka at gmail dot com 2008-05-13
12:32 ---
This bug is still present in gcc 4.3
--
siarhei dot siamashka at gmail dot com changed:
What|Removed |Added
--- Comment #2 from siarhei dot siamashka at gmail dot com 2007-07-11
07:06 ---
Tried this test with gcc 4.2.0, it also works correctly. So looks like the
problem only shows up in gcc 4.1.x
--
siarhei dot siamashka at gmail dot com changed:
What|Removed
dot org
ReportedBy: siarhei dot siamashka at gmail dot com
GCC build triplet: i686-pc-linux-gnu
GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32687
--- Comment #2 from siarhei dot siamashka at gmail dot com 2007-04-25
07:28 ---
This may be related to #31386
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31693
--- Comment #1 from siarhei dot siamashka at gmail dot com 2007-04-25
07:26 ---
Created an attachment (id=13436)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13436&action=view)
testcase for this bug
Testcase attached
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31693
ne-asm
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: siarhei dot siamashka at gmail dot com
GCC target triplet: arm-softfloat-linux-gnueabi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31693
76 matches