[Bug tree-optimization/90332] New test case gcc.dg/vect/slp-reduc-sad-2.c in r270847 fails

2020-03-11 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90332 Kewen Lin changed: What|Removed |Added CC||linkw at gcc dot gnu.org --- Comment #7

[Bug testsuite/94023] [9 regression] gcc.dg/vect/slp-perm-12.c fails starting with r9-5008

2020-03-16 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94023 Kewen Lin changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug testsuite/94019] [9 regression] gcc.dg/vect/vect-over-widen-17.c fails starting with g:370c2ebe8fa20e0812cd2d533d4ed38ee2d37c85, r9-1590

2020-03-16 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94019 Kewen Lin changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug tree-optimization/94043] [9/10 Regression] ICE in superloop_at_depth, at cfgloop.c:78

2020-03-17 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94043 Kewen Lin changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |linkw at gcc dot gnu.org

[Bug tree-optimization/94043] [9/10 Regression] ICE in superloop_at_depth, at cfgloop.c:78

2020-03-19 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94043 --- Comment #4 from Kewen Lin --- This was just exposed from my commit, it can also be reproduced without my commit but with -fno-vect-cost-model. Some loops we have for this case: ;; Loop 1 ;; header 3, latch 10 ;; depth 1, outer 0 ;; nodes:

[Bug tree-optimization/94043] [9/10 Regression] ICE in superloop_at_depth, at cfgloop.c:78

2020-03-20 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94043 --- Comment #6 from Kewen Lin --- (In reply to rguent...@suse.de from comment #5) > On Fri, 20 Mar 2020, linkw at gcc dot gnu.org wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94043 > > > > --- C

[Bug tree-optimization/94043] [9/10 Regression] ICE in superloop_at_depth, at cfgloop.c:78

2020-03-22 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94043 --- Comment #8 from Kewen Lin --- > It's most likely either SCEV or expand_simple_operations looking through > the single-arg PHI (which we should avoid for LC PHI nodes) Thanks Richi, I found the loop-closed PHI form was broken after we finishe

[Bug tree-optimization/94043] [9/10 Regression] ICE in superloop_at_depth, at cfgloop.c:78

2020-03-23 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94043 --- Comment #10 from Kewen Lin --- (In reply to Richard Biener from comment #9) > OK, so it's indeed vectorizable_live_operation not paying attention to > loop-closed SSA form. > > What it should do before building the lane extract is create a _

[Bug testsuite/93935] [9/10 regression] gcc.dg/vect/bb-slp-over-widen-2.c fails starting with r262371 (r10-6856)

2020-03-24 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93935 Kewen Lin changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug tree-optimization/94043] [9/10 Regression] ICE in superloop_at_depth, at cfgloop.c:78

2020-03-25 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94043 --- Comment #12 from Kewen Lin --- Created attachment 48122 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48122&action=edit ppc64le tested patch Thanks Richi! A patch draft attached to ensure on the right track, also bootstrapped/regress

[Bug tree-optimization/94043] [9/10 Regression] ICE in superloop_at_depth, at cfgloop.c:78

2020-03-26 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94043 --- Comment #14 from Kewen Lin --- (In reply to Richard Biener from comment #13) > > + /* Find all SSA NAMEs in stmts which is defined in current loop, > create > +PHIs for them, and replace them with phi results accordingly. */ >

[Bug tree-optimization/94043] [9/10 Regression] ICE in superloop_at_depth, at cfgloop.c:78

2020-03-26 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94043 Kewen Lin changed: What|Removed |Added Attachment #48122|0 |1 is obsolete|

[Bug tree-optimization/90332] New test case gcc.dg/vect/slp-reduc-sad-2.c in r270847 fails

2020-03-27 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90332 Kewen Lin changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug tree-optimization/94401] pr92420.c fails on aarch64 since r10-7415

2020-03-30 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94401 Kewen Lin changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |linkw at gcc dot gnu.org

[Bug tree-optimization/94401] [10 Regression] pr92420.c fails on aarch64 since r10-7415

2020-03-30 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94401 Kewen Lin changed: What|Removed |Added CC||segher at gcc dot gnu.org,

[Bug tree-optimization/94401] [10 Regression] pr92420.c fails on aarch64 since r10-7415

2020-03-30 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94401 --- Comment #5 from Kewen Lin --- Created attachment 48150 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48150&action=edit untested patch This can fix the REG failures on aarch64.

[Bug tree-optimization/94043] [9/10 Regression] ICE in superloop_at_depth, at cfgloop.c:78

2020-03-31 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94043 Kewen Lin changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug tree-optimization/94043] [9 Regression] ICE in superloop_at_depth, at cfgloop.c:78

2020-04-01 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94043 --- Comment #21 from Kewen Lin --- (In reply to Richard Biener from comment #20) > Re-open. It's marked as broken in GCC 9 so a backport is in order (if the > issue really reproduces there). Thanks for pointing it out. I'll backport it two week

[Bug tree-optimization/94443] [10 Regression] 510.parest_r and 526.blender_r ICE: verify_ssa failed since r10-7491-gbd0f22a8d5caea8905f38ff1fafce31c1b7d33ad

2020-04-01 Thread linkw at gcc dot gnu.org
gcc dot gnu.org |linkw at gcc dot gnu.org --- Comment #3 from Kewen Lin --- Thanks for reporting this, confirmed.

[Bug middle-end/94449] [10 Regression] FAIL: gcc.c-torture/execute/pr92904.c gcc.dg/torture/pr48731.c

2020-04-01 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94449 Kewen Lin changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |linkw at gcc dot gnu.org --- Comment

[Bug tree-optimization/94451] [10 Regression] April 1st 2020 GCC does not compile spec 2017 gcc_r benchmark with -O3

2020-04-01 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94451 Kewen Lin changed: What|Removed |Added CC||linkw at gcc dot gnu.org Status

[Bug tree-optimization/94443] [10 Regression] 510.parest_r and 526.blender_r ICE: verify_ssa failed since r10-7491-gbd0f22a8d5caea8905f38ff1fafce31c1b7d33ad

2020-04-01 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94443 --- Comment #4 from Kewen Lin --- This case has one conversion insn generated after bit_field_ref; the patch made the mistake of using gsi_insert_before instead of gsi_insert_seq_before, so the conversion insn is missed. The below

[Bug middle-end/94449] [10 Regression] FAIL: gcc.c-torture/execute/pr92904.c gcc.dg/torture/pr48731.c

2020-04-01 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94449 Kewen Lin changed: What|Removed |Added Status|NEW |ASSIGNED --- Comment #8 from Kewen Lin ---

[Bug middle-end/94449] [10 Regression] FAIL: gcc.c-torture/execute/pr92904.c gcc.dg/torture/pr48731.c

2020-04-01 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94449 --- Comment #10 from Kewen Lin --- (In reply to H.J. Lu from comment #9) > (In reply to Kewen Lin from comment #8) > > May I ask for the configuration option? > > > > I used x86_64 machine in CFarm with cpuinfo > > > > I used > > --prefix=/u

[Bug tree-optimization/94443] [10 Regression] 510.parest_r and 526.blender_r ICE: verify_ssa failed since r10-7491-gbd0f22a8d5caea8905f38ff1fafce31c1b7d33ad

2020-04-01 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94443 Kewen Lin changed: What|Removed |Added CC||hjl.tools at gmail dot com --- Comment #5 fr

[Bug middle-end/94449] [10 Regression] FAIL: gcc.c-torture/execute/pr92904.c gcc.dg/torture/pr48731.c

2020-04-01 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94449 Kewen Lin changed: What|Removed |Added Resolution|--- |DUPLICATE Status|ASSIGNED

[Bug middle-end/94449] [10 Regression] FAIL: gcc.c-torture/execute/pr92904.c gcc.dg/torture/pr48731.c

2020-04-01 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94449 --- Comment #12 from Kewen Lin --- Sorry, correction: corei7-avx is from system gcc. With my built gcc, it's sandybridge. But no difference for the pass/fail result.

[Bug tree-optimization/94443] [10 Regression] 510.parest_r and 526.blender_r ICE: verify_ssa failed since r10-7491-gbd0f22a8d5caea8905f38ff1fafce31c1b7d33ad

2020-04-02 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94443 --- Comment #7 from Kewen Lin --- Yes, thanks Richi! I had the same update locally but didn't update here. The latest whole patch is diff --git a/gcc/testsuite/gcc.dg/vect/pr94443.c b/gcc/testsuite/gcc.dg/vect/pr94443.c new file mode 100644 inde

[Bug tree-optimization/94443] [10 Regression] 510.parest_r and 526.blender_r ICE: verify_ssa failed since r10-7491-gbd0f22a8d5caea8905f38ff1fafce31c1b7d33ad

2020-04-02 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94443 --- Comment #8 from Kewen Lin --- > > > + remove_phi_node (&gsi, false); > > I prefer to have the PHI removed before you re-use its LHS. > Oops, missed this, will move it back when posting to email list.

[Bug tree-optimization/94451] [10 Regression] April 1st 2020 GCC does not compile spec 2017 gcc_r benchmark with -O3

2020-04-02 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94451 Kewen Lin changed: What|Removed |Added Resolution|DUPLICATE |FIXED --- Comment #6 from Kewen Lin --- Rep

[Bug tree-optimization/94443] [10 Regression] 510.parest_r and 526.blender_r ICE: verify_ssa failed since r10-7491-gbd0f22a8d5caea8905f38ff1fafce31c1b7d33ad

2020-04-02 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94443 Kewen Lin changed: What|Removed |Added CC||clyon at gcc dot gnu.org --- Comment #10 fro

[Bug tree-optimization/94456] ICE in aarch64/sve/pr87815.c since r10-7491

2020-04-02 Thread linkw at gcc dot gnu.org
||linkw at gcc dot gnu.org Resolution|--- |DUPLICATE --- Comment #1 from Kewen Lin --- Thanks for reporting. Judging by the symptom, this should be a duplicate. *** This bug has been marked as a duplicate of bug 94443 ***

[Bug tree-optimization/94443] [10 Regression] 510.parest_r and 526.blender_r ICE: verify_ssa failed since r10-7491-gbd0f22a8d5caea8905f38ff1fafce31c1b7d33ad

2020-04-02 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94443 --- Comment #13 from Kewen Lin --- (In reply to Khem Raj from comment #11) > this patch seems to be causing gcc ICE on ARM when compiling lz4 sources in > kernel, lz4, vlc almost identical ICE is seen > > attached is the test case please compile

[Bug tree-optimization/94401] [10 Regression] pr92420.c fails on aarch64 since r10-7415

2020-04-02 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94401 Kewen Lin changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug tree-optimization/94451] [10 Regression] April 1st 2020 GCC does not compile spec 2017 gcc_r benchmark with -O3

2020-04-03 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94451 Kewen Lin changed: What|Removed |Added Resolution|FIXED |DUPLICATE --- Comment #7 from Kewen Lin ---

[Bug tree-optimization/94443] [10 Regression] 510.parest_r and 526.blender_r ICE: verify_ssa failed since r10-7491-gbd0f22a8d5caea8905f38ff1fafce31c1b7d33ad

2020-04-03 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94443 --- Comment #15 from Kewen Lin --- *** Bug 94451 has been marked as a duplicate of this bug. ***

[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

2020-04-03 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163 Bug 26163 depends on bug 94443, which changed state. Bug 94443 Summary: [10 Regression] 510.parest_r and 526.blender_r ICE: verify_ssa failed since r10-7491-gbd0f22a8d5caea8905f38ff1fafce31c1b7d33ad https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94

[Bug tree-optimization/94443] [10 Regression] 510.parest_r and 526.blender_r ICE: verify_ssa failed since r10-7491-gbd0f22a8d5caea8905f38ff1fafce31c1b7d33ad

2020-04-03 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94443 Kewen Lin changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug testsuite/94079] gfortran.dg/vect/pr83232.f90 fails on power 7

2020-04-08 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94079 Kewen Lin changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED

[Bug tree-optimization/94043] [9 Regression] ICE in superloop_at_depth, at cfgloop.c:78

2020-04-17 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94043 Kewen Lin changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug tree-optimization/96451] [11 Regression] gcc.dg/pr68766.c ICE since r11-2453

2020-08-04 Thread linkw at gcc dot gnu.org
gnu.org |linkw at gcc dot gnu.org Last reconfirmed||2020-08-04 Status|UNCONFIRMED |ASSIGNED --- Comment #1 from Kewen Lin --- Thanks for reporting! I will have a look at it.

[Bug tree-optimization/96451] [11 Regression] gcc.dg/pr68766.c ICE since r11-2453

2020-08-04 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96451 --- Comment #3 from Kewen Lin --- (In reply to Richard Biener from comment #2) > possibly a latent issue since the patch is supposed to be cost-only Yes, this case will hit ICE too with -fno-vect-cost-model even without the culprit commit. With

[Bug tree-optimization/96451] [11 Regression] gcc.dg/pr68766.c ICE since r11-2453

2020-08-05 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96451 --- Comment #5 from Kewen Lin --- Created attachment 49000 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49000&action=edit untested patch Just noticed the dbgcnt supports several intervals, if we want to count epilogue loop, we probably n

[Bug tree-optimization/96451] [11 Regression] gcc.dg/pr68766.c ICE since r11-2453

2020-08-05 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96451 --- Comment #6 from Kewen Lin --- (In reply to rguent...@suse.de from comment #4) > On Wed, 5 Aug 2020, linkw at gcc dot gnu.org wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96451 > > > > --- Comment #3 fro

[Bug tree-optimization/96451] [11 Regression] gcc.dg/pr68766.c ICE since r11-2453

2020-08-05 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96451 Kewen Lin changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug target/94077] gcc.dg/gomp/pr82374.c fails on power 7

2020-08-11 Thread linkw at gcc dot gnu.org
|1 CC||linkw at gcc dot gnu.org Status|UNCONFIRMED |ASSIGNED --- Comment #1 from Kewen Lin --- This issue only exists on gcc8 and gcc9, it's gone with gcc10 and trunk. The main difference is listed

[Bug target/94077] gcc.dg/gomp/pr82374.c fails on power 7

2020-08-12 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94077 --- Comment #2 from Kewen Lin --- To be more specific, the reason causing the available alignment forcing is the default setting of -fcommon; -fno-common has been the default since GCC 10, which makes decl_binds_to_current_def_p return true. I can

[Bug target/94077] gcc.dg/gomp/pr82374.c fails on power 7

2020-08-12 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94077 --- Comment #3 from Kewen Lin --- > > I can observe this case fail if with explicit -fcommon. I mean even with gcc10 or trunk.

[Bug target/94077] gcc.dg/gomp/pr82374.c fails on power 7

2020-08-12 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94077 --- Comment #6 from Kewen Lin --- (In reply to Jakub Jelinek from comment #5) > I mean -fno-common, sorry. Good idea, that works! I'll send a patch by adding -fno-common into dg-options. Thanks for your suggestion!

[Bug testsuite/94077] gcc.dg/gomp/pr82374.c fails on power 7

2020-08-12 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94077 Kewen Lin changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug tree-optimization/96789] New: x264: sub4x4_dct() improves when vectorization is disabled

2020-08-25 Thread linkw at gcc dot gnu.org
Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: linkw at gcc dot gnu.org Target Milestone: --- One of my workmates found that if we disable vectorization for SPEC2017 525.x264_r function sub4x4_dct in source file x264_src

[Bug target/96789] x264: sub4x4_dct() improves when vectorization is disabled

2020-08-26 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789 --- Comment #2 from Kewen Lin --- Created attachment 49124 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49124&action=edit sub4x4_dct SLP dumping

[Bug target/96789] x264: sub4x4_dct() improves when vectorization is disabled

2020-08-26 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789 --- Comment #3 from Kewen Lin --- Bisection shows it started to fail from r11-205.

[Bug target/96789] x264: sub4x4_dct() improves when vectorization is disabled

2020-08-30 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789 Kewen Lin changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |linkw at gcc dot gnu.org --- Comment

[Bug target/96933] New: inefficient code for char/short vec CTOR

2020-09-04 Thread linkw at gcc dot gnu.org
Assignee: unassigned at gcc dot gnu.org Reporter: linkw at gcc dot gnu.org Target Milestone: --- While investigating the vectorization cost for vec_construct, I happened to find that the generated code for vector construction is inefficient with DIRECT_MOVE support. The test

[Bug target/96933] rs6000: inefficient code for char/short vec CTOR

2020-09-04 Thread linkw at gcc dot gnu.org
||bergner at gcc dot gnu.org, ||linkw at gcc dot gnu.org, ||segher at gcc dot gnu.org, ||wschmidt at gcc dot gnu.org Summary

[Bug target/96933] rs6000: inefficient code for char/short vec CTOR

2020-09-04 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96933 --- Comment #2 from Kewen Lin --- (In reply to Segher Boessenkool from comment #1) > Is that actually faster though? The original has shorter dependency > chains. Or is this to avoid some LHS/SHL? Yes, I tested it with one constructed case, th

[Bug target/96933] rs6000: inefficient code for char/short vec CTOR

2020-09-06 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96933 --- Comment #5 from Kewen Lin --- (In reply to Segher Boessenkool from comment #4) > Yes, timing suggests there is some SHL/LHS flush. > > On p9 and later we can use mtvsrdd instead of mtvsrd (moving two > bytes into place at one), which reduces

[Bug target/96933] rs6000: inefficient code for char/short vec CTOR

2020-09-07 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96933 --- Comment #6 from Kewen Lin --- (In reply to Kewen Lin from comment #5) > (In reply to Segher Boessenkool from comment #4) > > Yes, timing suggests there is some SHL/LHS flush. > > > > On p9 and later we can use mtvsrdd instead of mtvsrd (movi

[Bug target/96933] rs6000: inefficient code for char/short vec CTOR

2020-09-07 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96933 --- Comment #8 from Kewen Lin --- (In reply to Segher Boessenkool from comment #7) > There are vmrglb and vmrghb etc.? But these are only for low/high part separately, with mtvsrdd both low/high parts (doubleword) have the values, we don't have V

[Bug target/96933] rs6000: inefficient code for char/short vec CTOR

2020-09-08 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96933 --- Comment #10 from Kewen Lin --- (In reply to Segher Boessenkool from comment #9) > I'm not sure what you mean. > > vmrglb merges the vectors > abcdefghijklmnop > and > ABCDEFGHIJKLMNOP > to > iIjJkKlLmMnNoOpP > > ... ah, I see what you

[Bug target/97019] New: rs6000:redundant rldicr fed to lvx/stvx

2020-09-11 Thread linkw at gcc dot gnu.org
Assignee: unassigned at gcc dot gnu.org Reporter: linkw at gcc dot gnu.org Target Milestone: --- When we do the early expansion for altivec built-in function vec_ld/vec_st, we can probably leave some redundant rldicr x,y,0,59 which aims to AND (-16) for the vector access address

[Bug target/97019] rs6000:redundant rldicr fed to lvx/stvx

2020-09-11 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97019 Kewen Lin changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |linkw at gcc dot gnu.org

[Bug target/97019] rs6000:redundant rldicr fed to lvx/stvx

2020-09-15 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97019 Kewen Lin changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug target/96789] x264: sub4x4_dct() improves when vectorization is disabled

2020-09-16 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789 Kewen Lin changed: What|Removed |Added Last reconfirmed||2020-09-16 Status|UNCONFIRMED

[Bug target/96789] x264: sub4x4_dct() improves when vectorization is disabled

2020-09-16 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789 --- Comment #9 from Kewen Lin --- (In reply to Richard Biener from comment #8) > (In reply to Kewen Lin from comment #7) > > Two questions in mind, need to dig into it further: > > 1) from the assembly of scalar/vector code, I don't see any sto

[Bug tree-optimization/97075] [11 regression] powerpc64 vector tests fails after r11-3230

2020-09-16 Thread linkw at gcc dot gnu.org
|ASSIGNED Assignee|unassigned at gcc dot gnu.org |linkw at gcc dot gnu.org CC||linkw at gcc dot gnu.org Last reconfirmed||2020-09-17 --- Comment #1 from Kewen Lin --- I'll take a look at this.

[Bug target/96789] x264: sub4x4_dct() improves when vectorization is disabled

2020-09-16 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789 --- Comment #11 from Kewen Lin --- (In reply to Richard Biener from comment #10) > (In reply to Kewen Lin from comment #9) > > (In reply to Richard Biener from comment #8) > > > (In reply to Kewen Lin from comment #7) > > > > Two questions in min

[Bug target/96789] x264: sub4x4_dct() improves when vectorization is disabled

2020-09-16 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789 --- Comment #12 from Kewen Lin --- > Thanks for the explanation! I'll look at it after checking 2). IIUC, the > advantage to eliminate stores here looks able to get those things which is > fed to stores and stores' consumers bundled, then get mo

[Bug tree-optimization/97075] [11 regression] powerpc64 vector tests fails after r11-3230

2020-09-17 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97075 --- Comment #3 from Kewen Lin --- (In reply to akrl from comment #2) > Thanks Kewen, unfortunately I've no Power setup. Sorry for the > inconvenience. My pleasure! If you are interested in running on Power machines, you can apply for and use some Power

[Bug tree-optimization/97075] [11 regression] powerpc64 vector tests fails after r11-3230

2020-09-17 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97075 --- Comment #4 from Kewen Lin --- > gcc.target/powerpc/p9-vec-length-full-6.c This is a test case issue, 64bit/32bit pairs will use full vector instead of partial vector as Andrea's improvement. > gcc.target/powerpc/p9-vec-length-epil-7.c It e

[Bug target/96789] x264: sub4x4_dct() improves when vectorization is disabled

2020-09-18 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789 --- Comment #13 from Kewen Lin --- > 2) on Power, the conversion from unsigned char to unsigned short is nop > conversion, when we counting scalar cost, it's counted, then add costs 32 > totally onto scalar cost. Meanwhile, the conversion from

[Bug target/96789] x264: sub4x4_dct() improves when vectorization is disabled

2020-09-18 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789 --- Comment #15 from Kewen Lin --- (In reply to rguent...@suse.de from comment #14) > On Fri, 18 Sep 2020, linkw at gcc dot gnu.org wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789 > > > > --- Co

[Bug target/92132] new test case gcc.dg/vect/vect-cond-reduc-4.c fails with its introduction in r277067

2019-10-21 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92132 --- Comment #3 from Kewen Lin --- Powerpc already supports vcond where A and B are in the same mode or the same size mode. As Richard pointed out, this case requires some packs; it requires powerpc to support vec_cmpv2dfv2di and vcond_mask_v4siv4si,

[Bug tree-optimization/92185] New: ICE when perform condition reduction vectorization on uchar ind var

2019-10-23 Thread linkw at gcc dot gnu.org
Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: linkw at gcc dot gnu.org Target Milestone: --- TESTCASE: #include "tree-vect.h" extern void abort (void) __attribute__ ((noreturn)); #define N 27 uns

[Bug tree-optimization/92185] ICE when perform condition reduction vectorization on uchar ind var

2019-10-23 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92185 --- Comment #3 from Kewen Lin --- (In reply to Richard Biener from comment #2) > Hmm, I can't reproduce this, I tried ppc64le and x86_64. Sorry, my local codebase is on r277221, trying latest trunk.

[Bug tree-optimization/92162] [10 Regression] ICE in vect_create_epilog_for_reduction, at tree-vect-loop.c:4252

2019-10-23 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92162 Kewen Lin changed: What|Removed |Added CC||linkw at gcc dot gnu.org --- Comment #6

[Bug tree-optimization/92185] ICE when perform condition reduction vectorization on uchar ind var

2019-10-23 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92185 Kewen Lin changed: What|Removed |Added Status|RESOLVED|CLOSED Resolution|FIXED

[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

2019-10-23 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163 Bug 26163 depends on bug 92074, which changed state. Bug 92074 Summary: [10 regression] 26% performance regression on Spec2017 548.exchange2_r https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92074 What|Removed |Ad

[Bug ipa/92074] [10 regression] 26% performance regression on Spec2017 548.exchange2_r

2019-10-23 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92074 Kewen Lin changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---

[Bug testsuite/92127] [10 regression] gcc.dg/vect/costmodel/ppc/costmodel-fast-math-vect-pr29925.c fails after r276645 on power7

2019-10-30 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92127 Kewen Lin changed: What|Removed |Added CC||linkw at gcc dot gnu.org --- Comment #3

[Bug testsuite/92127] [10 regression] gcc.dg/vect/costmodel/ppc/costmodel-fast-math-vect-pr29925.c fails after r276645 on power7

2019-11-01 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92127 --- Comment #4 from Kewen Lin --- Author: linkw Date: Fri Nov 1 07:11:12 2019 New Revision: 277704 URL: https://gcc.gnu.org/viewcvs?rev=277704&root=gcc&view=rev Log: 2019-11-01 Kewen Lin PR testsuite/92127 * gcc.dg/vect/costmodel/ppc/co

[Bug testsuite/92127] [10 regression] gcc.dg/vect/costmodel/ppc/costmodel-fast-math-vect-pr29925.c fails after r276645 on power7

2019-11-01 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92127 Kewen Lin changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug ipa/92074] [10 regression] 26% performance regression on Spec2017 548.exchange2_r

2019-11-01 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92074 Kewen Lin changed: What|Removed |Added Status|RESOLVED|CLOSED --- Comment #8 from Kewen Lin --- Cl

[Bug testsuite/87306] test case gcc.dg/vect/bb-slp-pow-1.c fails with its introduction in r263290

2019-11-04 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87306 --- Comment #6 from Kewen Lin --- Author: linkw Revision: 268003 Modified property: svn:log Modified: svn:log at Tue Nov 5 02:26:38 2019 -- --- svn:log (original) +++ s

[Bug testsuite/92127] [10 regression] gcc.dg/vect/costmodel/ppc/costmodel-fast-math-vect-pr29925.c fails after r276645 on power7

2019-11-04 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92127 --- Comment #6 from Kewen Lin --- Author: linkw Revision: 277704 Modified property: svn:log Modified: svn:log at Tue Nov 5 02:36:58 2019 -- --- svn:log (original) +++ s

[Bug target/92132] new test case gcc.dg/vect/vect-cond-reduc-4.c fails with its introduction in r277067

2019-11-07 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92132 --- Comment #4 from Kewen Lin --- Author: linkw Date: Fri Nov 8 07:37:07 2019 New Revision: 277947 URL: https://gcc.gnu.org/viewcvs?rev=277947&root=gcc&view=rev Log: [rs6000]Fix PR92132 by adding vec_cmp and vcond_mask supports To support

[Bug target/92132] new test case gcc.dg/vect/vect-cond-reduc-4.c fails with its introduction in r277067

2019-11-07 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92132 Kewen Lin changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug testsuite/92464] [10 regression] r278033 breaks gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c

2019-11-12 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92464 Kewen Lin changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed|

[Bug testsuite/92464] [10 regression] r278033 breaks gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c

2019-11-12 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92464 --- Comment #3 from Kewen Lin --- (In reply to Segher Boessenkool from comment #2) > What is the testcase testing? Whether we can properly vectorize this > code, right? And for p7 we now do it correctly, but thought it was > too expensive befor

[Bug testsuite/92464] [10 regression] r278033 breaks gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c

2019-11-12 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92464 --- Comment #4 from Kewen Lin --- By the way, if I remove the check_vect and result verification code, the vectorized version performs very slightly better than the non-vectorized version. And yes, I think it was a bit off before.

[Bug testsuite/92464] [10 regression] r278033 breaks gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c

2019-11-13 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92464 --- Comment #5 from Kewen Lin --- Author: linkw Date: Thu Nov 14 05:57:12 2019 New Revision: 278195 URL: https://gcc.gnu.org/viewcvs?rev=278195&root=gcc&view=rev Log: [testsuite] Fix PR92464 by adjust test case loop bound The recent vectori

[Bug testsuite/92464] [10 regression] r278033 breaks gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c

2019-11-13 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92464 Kewen Lin changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug target/92566] rs6000_preferred_simd_mode isn't very good

2019-11-18 Thread linkw at gcc dot gnu.org
||2019-11-19 CC||linkw at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |linkw at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #1 from Kewen Lin --- Currently we guard V2DImode under

[Bug target/92566] rs6000_preferred_simd_mode isn't very good

2019-11-18 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92566 --- Comment #2 from Kewen Lin --- Created attachment 47295 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47295&action=edit Guard V2DImode and V1TImode under VSX and P8VECTOR

[Bug target/92534] [10 regression] gcc.dg/vect/bb-slp-42.c fails after r278262

2019-11-19 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92534 Kewen Lin changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |linkw at gcc dot gnu.org --- Comment

[Bug target/92566] rs6000_preferred_simd_mode isn't very good

2019-11-19 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92566 Kewen Lin changed: What|Removed |Added Attachment #47295|0 |1 is obsolete|

[Bug target/92534] [10 regression] gcc.dg/vect/bb-slp-42.c fails after r278262

2019-11-20 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92534 Kewen Lin changed: What|Removed |Added Status|NEW |ASSIGNED --- Comment #4 from Kewen Lin ---

[Bug target/92534] [10 regression] gcc.dg/vect/bb-slp-42.c fails after r278262

2019-11-20 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92534 Kewen Lin changed: What|Removed |Added CC||rguenth at gcc dot gnu.org --- Comment #5 fr

[Bug target/92534] [10 regression] gcc.dg/vect/bb-slp-42.c fails after r278262

2019-11-21 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92534 --- Comment #7 from Kewen Lin --- Thanks for your confirmation and notes! Yes, the realignment codes won't take effect from Power8 which supports unaligned vector load/store. I'll learn the code, follow your suggestion and cook some patches later
