Assignee: unassigned at gcc dot gnu.org
Reporter: neleai at seznam dot cz
Target Milestone: ---
Hi,
Using __builtin_tolower can be much slower than using tolower. In this
example __builtin_tolower is just expanded to a call. If one uses tolower
instead, it gets expanded to
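
To make the comparison concrete, here is a minimal sketch of the two variants being contrasted. The function names are hypothetical and are not taken from the attached benchmark; the only difference between the loops is which spelling of tolower is called.

```cpp
#include <cctype>
#include <cstddef>

// Hypothetical helper: lowercase a buffer through the ordinary library name.
// Per the report, this form can be expanded inline by the headers/compiler.
void lower_with_tolower(char *buf, std::size_t len) {
    for (std::size_t i = 0; i < len; ++i)
        buf[i] = static_cast<char>(
            std::tolower(static_cast<unsigned char>(buf[i])));
}

// Hypothetical helper: the same loop through the GCC builtin spelling.
// Per the report, this form is emitted as a plain call to tolower.
void lower_with_builtin(char *buf, std::size_t len) {
    for (std::size_t i = 0; i < len; ++i)
        buf[i] = static_cast<char>(
            __builtin_tolower(static_cast<unsigned char>(buf[i])));
}
```

Both functions should produce the same bytes; comparing the output of g++ -O2 -S for each is one way to see whether the call is expanded on a given compiler and libc.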
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66989
--- Comment #2 from Ondrej Bilka ---
Created attachment 36050
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36050&action=edit
testing script
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66989
--- Comment #1 from Ondrej Bilka ---
Created attachment 36049
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36049&action=edit
benchmark
: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: neleai at seznam dot cz
Target Milestone: ---
This is another part of examining the performance of the floating-point
classification builtins. It starts to be more CPU dependent, as the benchmark
shows a large improvement for core2 but
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66986
Ondrej Bilka changed:
What|Removed |Added
Attachment #36047|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66986
--- Comment #4 from Ondrej Bilka ---
OK, I added an updated benchmark that adds -mtune=native and tests for core2,
haswell and fx10. The results stay pretty consistent.
don't inline
conditional add
branched
real 0m0.698s
user 0m0.698s
sys 0m0.000
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66986
--- Comment #1 from Ondrej Bilka ---
Created attachment 36047
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36047&action=edit
testing script
: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: neleai at seznam dot cz
Target Milestone: ---
Created attachment 36046
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36046&action=edit
benchmark.
Hi,
On x64, the floating-point builtins are considerably slow
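
The excerpt does not show which alternatives the attached benchmark compares, so the following is only a sketch under assumptions: it contrasts the GCC classification builtin with an integer-only test of the exponent field, assuming IEEE 754 binary64 doubles. The function names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Classification through the GCC builtin.
bool finite_builtin(double x) {
    return __builtin_isfinite(x);
}

// Assumed alternative: inspect the bit pattern directly. A binary64 value
// is finite exactly when its 11-bit exponent field is not all ones.
bool finite_bits(double x) {
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);  // well-defined type punning
    const std::uint64_t exp_mask = 0x7ff0000000000000ULL;
    return (bits & exp_mask) != exp_mask;
}
```

Which form is faster depends on the CPU and tuning flags, so it needs to be measured rather than assumed.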
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: neleai at seznam dot cz
Target Milestone: ---
The same problem as with strstr also applies here. As we know the length, we
could compare that to memrchr. Again, instead of simply calling that, an
implementation is
3.5 times
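
A minimal sketch of the memrchr comparison, assuming the report concerns a reverse search for a single character in a std::string (the excerpt does not name the member function). memrchr is a GNU extension, so this assumes glibc; the helper name is hypothetical.

```cpp
#include <cstddef>
#include <cstring>  // glibc declares memrchr here when _GNU_SOURCE is set
#include <string>

// Hypothetical helper: reverse character search delegated to memrchr,
// which is possible because the string already knows its length.
std::size_t rfind_with_memrchr(const std::string &s, char c) {
    const void *hit = memrchr(s.data(), c, s.size());
    if (hit == nullptr)
        return std::string::npos;
    return static_cast<std::size_t>(static_cast<const char *>(hit) - s.data());
}
```

For a given string s and character c, the result should match s.rfind(c), which makes the helper usable as a baseline in a timing comparison.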
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: neleai at seznam dot cz
Target Milestone: ---
Hi, as I have seen a bug with string::== being slower than using strcmp, I
decided to check other functions for regressions. Here string::find doesn't
simply call
opti
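
As an illustration of the baseline this kind of report measures against, here is a minimal sketch of a substring search that delegates to strstr. The helper name is hypothetical, and the sketch assumes neither string contains embedded NUL bytes, since strstr stops at the first one.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: substring search delegated to the C library's strstr.
// Only equivalent to std::string::find when there are no embedded NULs.
std::size_t find_with_strstr(const std::string &hay, const std::string &needle) {
    const char *base = hay.c_str();
    const char *hit = std::strstr(base, needle.c_str());
    return hit ? static_cast<std::size_t>(hit - base) : std::string::npos;
}
```

Timing this helper against hay.find(needle) on the same inputs is one way to reproduce the comparison described in the report.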
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59048
Ondrej Bilka changed:
What|Removed |Added
CC||neleai at seznam dot cz
--- Comment #13
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64247
Ondrej Bilka changed:
What|Removed |Added
CC||neleai at seznam dot cz
--- Comment #7
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60731
Ondrej Bilka changed:
What|Removed |Added
CC||neleai at seznam dot cz
--- Comment #9
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46936
--- Comment #3 from Ondrej Bilka ---
> As per http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html, the
> annotation on the example function there "causes the compiler to check that,
> in calls to my_memcpy, arguments dest and src are non
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46936
Ondrej Bilka changed:
What|Removed |Added
CC||neleai at seznam dot cz
--- Comment #1
Assignee: unassigned at gcc dot gnu.org
Reporter: neleai at seznam dot cz
Hi, in the following testcase gcc -O3 generates the following loop:
movq %rsi, %r9
subq %rdx, %r9
movq %r9, %rdi
movq %r9, %rsi
leaq 16(%r9), %r8
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58112
--- Comment #1 from Ondrej Bilka ---
Created attachment 30628
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30628&action=edit
testcase
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: neleai at seznam dot cz
Hi, the attached code generates an extra push/pop rbx pair although no GPR
register is assigned in the segment between them.
This was generated by head xgcc -O3. A gcc-4.7 has
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58110
--- Comment #1 from Ondrej Bilka ---
Created attachment 30627
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30627&action=edit
testcase
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29776
--- Comment #15 from Ondrej Bilka ---
On Thu, Jul 04, 2013 at 07:46:07PM +, glisse at gcc dot gnu.org wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29776
>
> --- Comment #14 from Marc Glisse ---
> (In reply to Jakub Jelinek from comme
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57668
--- Comment #1 from Ondrej Bilka ---
Created attachment 30333
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30333&action=edit
benchmark for memcpy
Assignee: unassigned at gcc dot gnu.org
Reporter: neleai at seznam dot cz
Hi,
When I ran the attached benchmark, which tests how well gcc can optimize a
byte-by-byte memcpy (attached memcpy_byte.c), I got a regression on the
nehalem and ivy_bridge architectures.
I ran it by commands ./run machine 2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54349
--- Comment #4 from Ondrej Bilka 2013-04-27 01:06:45
UTC ---
I found that the AMD Bulldozer optimization guide states that moves from an
xmm register to a GPR should be done directly: "
10.4 Moving Data Between General-Purpose and XMM/YMM Registe
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57056
Ondrej Bilka changed:
What|Removed |Added
Attachment #29930|0 |1
is obsolete|
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57056
Bug #: 57056
Summary: Missed optimization of finite finite builtin
Classification: Unclassified
Product: gcc
Version: 4.7.1
Status: UNCONFIRMED
Severity: normal
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56676
--- Comment #2 from Ondrej Bilka 2013-03-21 14:53:26
UTC ---
On Thu, Mar 21, 2013 at 01:30:42PM +, rguenth at gcc dot gnu.org wrote:
>
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56676
>
>
>
> --- Comment #1 from Richard
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56676
Bug #: 56676
Summary: unnecesary splitted load when using avx2
Classification: Unclassified
Product: gcc
Version: 4.7.1
Status: UNCONFIRMED
Severity: normal
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56631
--- Comment #1 from Ondrej Bilka 2013-03-16 11:36:04
UTC ---
Created attachment 29678
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29678
testcase
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56631
Bug #: 56631
Summary: duplicated sse code in switch
Classification: Unclassified
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priorit
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56338
--- Comment #1 from Ondrej Bilka 2013-02-15 07:42:10
UTC ---
Created attachment 29461
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29461
testcase
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56338
Bug #: 56338
Summary: register spill caused by loading constant
Classification: Unclassified
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56199
Ondrej Bilka changed:
What|Removed |Added
Status|RESOLVED|UNCONFIRMED
Resolution|IN
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56199
--- Comment #3 from Ondrej Bilka 2013-02-04 15:15:12
UTC ---
Created attachment 29349
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29349
icatche stressing benchmark
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56199
--- Comment #1 from Ondrej Bilka 2013-02-04 08:42:32
UTC ---
Created attachment 29344
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29344
benchmark
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56199
Bug #: 56199
Summary: strcpy/strcat builtins generates suboptimal code.
Classification: Unclassified
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: norma
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55945
Bug #: 55945
Summary: alloca aligns aligned pointers
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
Priority
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55600
--- Comment #3 from Ondrej Bilka 2012-12-26 22:05:37
UTC ---
Created attachment 29052
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29052
benchmark
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55600
--- Comment #2 from Ondrej Bilka 2012-12-26 22:03:59
UTC ---
Yes, when 128 is replaced by a smaller constant. The attached patch gives the
following on my i5:
size 32
vector
real 0m0.224s
user 0m0.220s
sys 0m0.000s
unroll
real0
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55600
Bug #: 55600
Summary: excessive size of vectorized code
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
Prior
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54491
Bug #: 54491
Summary: interval membership optimization
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54481
--- Comment #2 from Ondrej Bilka 2012-09-05 09:42:27
UTC ---
On Wed, Sep 05, 2012 at 09:30:04AM +, rguenth at gcc dot gnu.org wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54481
>
> Richard Guenther changed:
>
>What|
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54481
Bug #: 54481
Summary: missed optimization: unnecessary indirect call
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
Pri
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54349
Ondrej Bilka changed:
What|Removed |Added
Status|RESOLVED|UNCONFIRMED
Resolution|INVALID
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54360
Bug #: 54360
Summary: missed optimalization: unnecessary indirect call
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
P
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54349
Bug #: 54349
Summary: _mm_cvtsi128_si64 unnecessary stores value at stack
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54116
--- Comment #2 from Ondrej Bilka 2012-07-29 10:30:46
UTC ---
On Sun, Jul 29, 2012 at 10:13:41AM +, pinskia at gcc dot gnu.org wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54116
>
> --- Comment #1 from Andrew Pinski 2012-07-29
> 10:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54116
Bug #: 54116
Summary: suboptimal code for tight loops
Classification: Unclassified
Product: gcc
Version: 4.7.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54115
Bug #: 54115
Summary: Unnecessary sign extensions for __builtin_ctz et al.
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53907
Bug #: 53907
Summary: gcc uses unaligned load when aligned load was
requested
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
ned at gcc dot gnu dot org
ReportedBy: neleai at seznam dot cz
GCC target triplet: i486-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35527
ned at gcc dot gnu dot org
ReportedBy: neleai at seznam dot cz
GCC target triplet: i486-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35525
51 matches