On Mon, May 16, 2016 at 11:38:04AM +0100, Wilco Dijkstra wrote:
> GCC expands switch statements in a very simplistic way and tries to use a table
> expansion even when it is a bad idea for performance or codesize.
> GCC typically emits extremely sparse tables that contain mostly default
> entri
On 05/23/16 15:32, Evandro Menezes wrote:
I'm fine with this patch, as it achieves in part what I intended
before: going beyond the default_case_values_threshold, too
conservative for Exynos M1. My concern is particularly what happens
to in-order targets, like the ubiquitous A53.
I'll get
On 05/24/16 07:08, Wilco Dijkstra wrote:
Jim Wilson wrote:
It looks like a slight loss on qdf24xx on SPEC CPU2006 at -O3. I see
about a 0.37% loss on the integer benchmarks, and no significant
change on the FP benchmarks. The integer loss is mainly due to
458.sjeng which drops 2%. We had trie
Jim Wilson wrote:
> It looks like a slight loss on qdf24xx on SPEC CPU2006 at -O3. I see
> about a 0.37% loss on the integer benchmarks, and no significant
> change on the FP benchmarks. The integer loss is mainly due to
> 458.sjeng which drops 2%. We had tried various values for
> max_case_valu
On 05/18/16 20:03, Jim Wilson wrote:
Though I see that the original patch from Samsung that added the
max_case_values field has the -O3 check, so there was apparently some
reason why they wanted it to work that way. The value that the
exynos-m1 is using, 48, looks pretty large, so maybe they th
On Mon, May 16, 2016 at 4:30 AM, James Greenhalgh wrote:
> As this change will change code generation for all cores (except
> Exynos-M1), I'd like to hear from those with more detailed knowledge of
> ThunderX, X-Gene and qdf24xx before I take this patch.
It looks like a slight loss on qdf24xx on
James Greenhalgh wrote:
> As this change will change code generation for all cores (except
> Exynos-M1), I'd like to hear from those with more detailed knowledge of
> ThunderX, X-Gene and qdf24xx before I take this patch.
>
> Let's give it another week or so for comments, and expand the CC list.
N
On Mon, May 16, 2016 at 11:38:04AM +0100, Wilco Dijkstra wrote:
> ping
As this change will change code generation for all cores (except
Exynos-M1), I'd like to hear from those with more detailed knowledge of
ThunderX, X-Gene and qdf24xx before I take this patch.
Let's give it another week or so f
ping
From: Wilco Dijkstra
Sent: 22 April 2016 17:15
To: gcc-patches@gcc.gnu.org
Cc: nd
Subject: [PATCH][AArch64] Improve aarch64_case_values_threshold setting
GCC expands switch statements in a very simplistic way and tries to use a table
expansion even wh
Kyrill Tkachov wrote:
> On 25/04/16 20:21, Wilco Dijkstra wrote:
> > The GCC switch expansion is awful, so
> > even with a good indirect predictor it is better to use conditional
> > branches.
>
> In what way is it awful? If there's something we can do better at
> can you file a bug report with a
Hi Wilco,
On 25/04/16 20:21, Wilco Dijkstra wrote:
Evandro Menezes wrote:
I assume that you mean that such improvements are true for
-mcpu=generic, yes? On which target, A53 or A57 or other?
It's true for any CPU setting. The SPEC results are for Cortex-A57
however I wrote a microbenchmark th
On 04/26/16 11:14, Wilco Dijkstra wrote:
Evandro Menezes wrote:
True, but the results when running on A53 could be quite different.
GCC is ~1.2% faster on Cortex-A53 built for generic, but there is no
difference in perlbench.
Looks good, then. Fine by me.
Thanks for your patience,
--
Evand
Evandro Menezes wrote:
>
> True, but the results when running on A53 could be quite different.
GCC is ~1.2% faster on Cortex-A53 built for generic, but there is no
difference in perlbench.
Wilco
On 04/25/16 14:58, Wilco Dijkstra wrote:
Evandro Menezes wrote:
I agree with your assessment, but I'm more curious to understand how
this change affects code built with the default -mcpu=generic when run
on both A53 and A57, the typical configuration of big.LITTLE machines.
I wouldn't expect th
Evandro Menezes wrote:
> I agree with your assessment, but I'm more curious to understand how
> this change affects code built with the default -mcpu=generic when run
> on both A53 and A57, the typical configuration of big.LITTLE machines.
I wouldn't expect the result to be any different as the -m
On 04/25/16 14:21, Wilco Dijkstra wrote:
Evandro Menezes wrote:
I assume that you mean that such improvements are true for
-mcpu=generic, yes? On which target, A53 or A57 or other?
It's true for any CPU setting. The SPEC results are for Cortex-A57
however I wrote a microbenchmark that shows im
Evandro Menezes wrote:
> I assume that you mean that such improvements are true for
> -mcpu=generic, yes? On which target, A53 or A57 or other?
It's true for any CPU setting. The SPEC results are for Cortex-A57
however I wrote a microbenchmark that shows improvements on
all targets I have access
On 04/22/16 11:15, Wilco Dijkstra wrote:
This patch fixes that by setting the default aarch64_case_values_threshold to
16 when the per-CPU tuning is not set. On SPEC2006 this improves the switch
heavy benchmarks GCC and perlbench both in performance (1-2%) as well as size
(0.5-1% smaller).
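
For readers following the thread, the selection logic described above can be sketched roughly as follows. This is a minimal illustration and not the GCC source: the struct, the function name, and the parameter names are hypothetical, and the -O3 condition is taken from the remark elsewhere in the thread that the max_case_values field carries an -O3 check. The value 48 in the usage note is the exynos-m1 setting quoted in the thread.

```c
/* Hypothetical sketch of the threshold choice discussed in this thread.
   None of these names come from GCC itself.  */

/* Default proposed by the patch when no per-CPU value is set.  */
#define DEFAULT_CASE_VALUES_THRESHOLD 16u

/* Stand-in for a per-CPU tuning record; 0 means "not set".  */
struct cpu_tuning
{
  unsigned int max_case_values;
};

/* Return the case-values threshold used to decide between a jump table
   and conditional branches.  Assumption: the per-CPU value is only
   honoured at -O3 and above, otherwise the default applies.  */
static unsigned int
case_values_threshold (const struct cpu_tuning *tune, int optimize_level)
{
  if (optimize_level > 2 && tune->max_case_values != 0)
    return tune->max_case_values;

  return DEFAULT_CASE_VALUES_THRESHOLD;
}
```

Under these assumptions, a tuning record with max_case_values of 48 yields 48 at -O3 and 16 at -O2, while a record that leaves the field at 0 yields 16 at any level.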
I a