[PATCH] Optimize multiplication for V8QI,V16QI,V32QI under TARGET_AVX512BW [target/95488]

2020-06-04 Thread Hongtao Liu via Gcc-patches
Hi:

+/* Optimize vector MUL generation for V8QI, V16QI and V32QI
+   under TARGET_AVX512BW. i.e. for v16qi a * b, it has
+
+   vpmovzxbw ymm2, xmm0
+   vpmovzxbw ymm3, xmm1
+   vpmullw   ymm4, ymm2, ymm3
+   vpmovwb   xmm0, ymm4
+
+   it would take less instructions than ix86_expand_vecop_qihi.
+
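The instruction sequence quoted in the comment widens each byte to a 16-bit word, multiplies the words, and truncates back to bytes. The following is a minimal scalar sketch of that idea in portable C, not the patch itself: the function names `mul_v16qi_model` and `count_mismatches` are hypothetical, and the loop only models what each instruction does per lane, assuming unsigned byte lanes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of the widen / multiply / truncate sequence
   for 16 byte lanes.  It does not emit the AVX-512 instructions; it only
   mirrors their per-lane arithmetic. */
static void mul_v16qi_model(const uint8_t a[16], const uint8_t b[16],
                            uint8_t out[16])
{
    for (size_t i = 0; i < 16; i++) {
        uint16_t wa = a[i];                   /* models vpmovzxbw: zero-extend byte to word */
        uint16_t wb = b[i];                   /* models vpmovzxbw on the second operand */
        uint16_t prod = (uint16_t)(wa * wb);  /* models vpmullw: low 16 bits of the product */
        out[i] = (uint8_t)prod;               /* models vpmovwb: truncate word to byte */
    }
}

/* Compare the model against a direct byte multiply for every pair of
   byte values and return the number of lanes that disagree. */
static int count_mismatches(void)
{
    int mismatches = 0;
    for (int a = 0; a < 256; a++) {
        for (int base = 0; base < 256; base += 16) {
            uint8_t va[16], vb[16], out[16];
            for (int i = 0; i < 16; i++) {
                va[i] = (uint8_t)a;
                vb[i] = (uint8_t)(base + i);
            }
            mul_v16qi_model(va, vb, out);
            for (int i = 0; i < 16; i++) {
                if (out[i] != (uint8_t)(a * (base + i)))
                    mismatches++;
            }
        }
    }
    return mismatches;
}
```

Because the low 8 bits of a product depend only on the low 8 bits of the operands, the truncated word product matches the byte product in every lane, which is what `count_mismatches` checks exhaustively.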

Re: [PATCH] Do not copy NULL string with memcpy.

2020-06-04 Thread Richard Biener via Gcc-patches
On June 4, 2020 10:22:55 PM GMT+02:00, Alexandre Oliva wrote:
>On Jun 3, 2020, Martin Liška wrote:
>
>> On 6/3/20 5:58 AM, Alexandre Oliva wrote:
>>> Please let me know if you'd prefer me to take this PR over.
>
>> Yes, please take a look.
>
>Here's what I've regstrapped on x86_64-linux-gnu. I

[PATCH] ix86: Improve __builtin_c[lt]z followed by extension [PR95535]

2020-06-04 Thread Jakub Jelinek via Gcc-patches
Hi! In January I've added patterns to optimize SImode -> DImode sign or zero extension of __builtin_popcount, this patch does the same for __builtin_c[lt]z. Like most other instructions, the [tl]zcntl instructions clear the upper 32 bits of the destination register and as the instructions only re
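As a rough illustration of the source pattern this message is about, the sketch below extends the result of `__builtin_clz` and `__builtin_ctz` from 32 to 64 bits, both signed and unsigned. It assumes the GCC or Clang builtins and a nonzero argument (the builtins are undefined for zero); the helper names are hypothetical and are not taken from the patch or its testcases.

```c
#include <stdint.h>

/* Hypothetical helpers: count leading zeros, then widen to 64 bits.
   The argument must be nonzero, since __builtin_clz(0) is undefined. */
static int64_t clz_sext(uint32_t x)
{
    return (int64_t)__builtin_clz(x);            /* int -> 64-bit sign extension */
}

static uint64_t clz_zext(uint32_t x)
{
    return (uint64_t)(uint32_t)__builtin_clz(x); /* int -> 64-bit zero extension */
}

/* Same pattern for count trailing zeros; the argument must be nonzero. */
static int64_t ctz_sext(uint32_t x)
{
    return (int64_t)__builtin_ctz(x);
}

static uint64_t ctz_zext(uint32_t x)
{
    return (uint64_t)(uint32_t)__builtin_ctz(x);
}
```

For a nonzero 32-bit argument both builtins return a value between 0 and 31, so the sign-extended and zero-extended results are always equal. That is why a separate extension instruction after the count is redundant when the count instruction already clears the upper half of the destination register.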

Re: [PATCH] ix86: Improve __builtin_c[lt]z followed by extension [PR95535]

2020-06-04 Thread Uros Bizjak via Gcc-patches
On Fri, Jun 5, 2020 at 8:45 AM Jakub Jelinek wrote:
>
> Hi!
>
> In January I've added patterns to optimize SImode -> DImode sign or zero
> extension of __builtin_popcount, this patch does the same for
> __builtin_c[lt]z. Like most other instructions, the [tl]zcntl instructions
> clear the upper 3
