on 2024/1/8 19:44, Richard Biener wrote:
> On Mon, Jan 8, 2024 at 3:35 AM Kewen.Lin wrote:
>>
>> Hi,
>>
>> As PR113100 shows, the unbiasing introduced by r14-6737 can
>> cause the scrubbing to overrun and screw some critical data
>> on stack like saved
Hi Mike,
on 2024/1/6 06:18, Michael Meissner wrote:
> In looking at support for load vector pair and store vector pair for the
> PowerPC in GCC, I noticed that we were missing a print_operand output modifier
> if you are dealing with vector pairs to print the 2nd register in the vector
> pair.
>
Hi Alexandre,
on 2024/1/11 17:05, Alexandre Oliva wrote:
> On Jan 7, 2024, "Kewen.Lin" wrote:
>
>> As PR113100 shows, the unbiasing introduced by r14-6737 can
>> cause the scrubbing to overrun and screw some critical data
>> on stack like saved toc base conse
on 2024/1/12 19:03, Alexandre Oliva wrote:
> On Jan 12, 2024, "Kewen.Lin" wrote:
>
>>>> By checking PR112917, IMHO we should keep this unbiasing
>>>> guarded under SPARC_STACK_BOUNDARY_HACK (TARGET_ARCH64 &&
>>>> TARGET_STACK_BIAS), sim
Hi Haochen,
on 2024/1/10 09:35, HAO CHEN GUI wrote:
> Hi,
> This patch refactors function expand_compare_loop and splits it into two
> functions. One is for fixed length and the other is for variable length.
> These two functions share some low-level common helper functions.
I'm expecting refactoring
Hi Haochen,
on 2024/1/12 14:48, HAO CHEN GUI wrote:
> Hi,
> On P9 "setb" is used to set the result of block compare. So it works
> with m32 and mpowerpc64. On P8, carry bit is used. So it can't work
> with m32 and mpowerpc64. This patch enables block compare expand for
> m32 and mpowerpc64 on P9
Hi Haochen,
on 2024/1/11 16:28, HAO CHEN GUI wrote:
> Hi,
> This patch eliminates unnecessary byte swaps for block clear on P8
> LE. For block clear, all the bytes are set to zero. The byte order
> doesn't make sense. So the alignment of destination could be set to
> the store mode size instead
Hi,
As pointed out by the discussion in PR109705, the current
vect_long_mult effective target check on Power is broken.
This patch is to fix it accordingly.
With additional change by adding a guard vect_long_mult
in gcc.dg/vect/pr25413a.c, it's tested well on Power{8,9}
LE & BE (also on Power10
on 2024/1/16 06:22, Ajit Agarwal wrote:
> Hello Richard:
>
> On 15/01/24 6:25 pm, Ajit Agarwal wrote:
>>
>>
>> On 15/01/24 6:14 pm, Ajit Agarwal wrote:
>>> Hello Richard:
>>>
>>> On 15/01/24 3:03 pm, Richard Biener wrote:
On Sun, Jan 14, 2024 at 4:29 PM Ajit Agarwal
wrote:
>
>
Hi,
As PR101169 comment #c4 shows, previously the addi count
update on fold-vec-extract-char.p7.c covered a sub-optimal
code gen issue. On trunk, pass fold-mem-offsets helps to
recover the best code sequence, so this patch is to
revert the count back to the original which matches the
optimal addi
aix is:
make check-gcc RUNTESTFLAGS="--target_board=unix'{-m64,-m32}'
dg.exp=strub-unsupported*.c"
BR,
Kewen
> Thanks, David
>
>
> On Wed, Jan 17, 2024 at 8:06 PM Alexandre Oliva <ol...@adacore.com> wrote:
>
> David,
>
>
Hi Mike,
on 2024/1/6 07:35, Michael Meissner wrote:
> This patch implements support for a potential future PowerPC cpu. Features
> added with -mcpu=future, may or may not be added to new PowerPC processors.
>
> This patch adds support for the -mcpu=future option. If you use -mcpu=future,
> the
on 2024/1/6 07:37, Michael Meissner wrote:
> This patch re-enables generating load and store vector pair instructions when
> doing certain memory copy operations when -mcpu=future is used.
>
> During power10 development, it was determined that using store vector pair
> instructions were problemati
Hi Mike,
on 2024/1/12 01:29, Michael Meissner wrote:
> This is version 2 of the patch. The only difference is I made the test case
> simpler to read.
>
> In looking at support for load vector pair and store vector pair for the
> PowerPC in GCC, I noticed that we were missing a print_operand outp
on 2024/1/24 11:11, Peter Bergner wrote:
> On 1/23/24 8:30 PM, Kewen.Lin wrote:
>>> - output_operand_lossage ("invalid %%x value");
>>> + output_operand_lossage ("invalid %%%c value", (code == 'S' ? 'S' :
>>> 'x
on 2024/1/24 23:51, Peter Bergner wrote:
> On 1/24/24 12:04 AM, Kewen.Lin wrote:
>> on 2024/1/24 11:11, Peter Bergner wrote:
>>> But not with this. The -mdejagnu-cpu=power10 option already enables -mvsx.
>>> If the user explicitly forces -mno-vsx via RUNTESTFLAGS, the
Hi,
Thanks for adjusting this.
on 2024/1/24 19:42, Xi Ruoyao wrote:
> On Wed, 2024-01-24 at 19:08 +0800, chenxiaolong wrote:
>> At 19:00 +0800 on Wednesday, 2024-01-24, Xi Ruoyao wrote:
>>> On Wed, 2024-01-24 at 18:32 +0800, chenxiaolong wrote:
On 20:09 +0800 on Tuesday, 2024-01-23, Xi Ruoya
Hi Mike,
on 2024/1/6 07:38, Michael Meissner wrote:
> The MMA subsystem added the notion of accumulator registers as an optional
> feature of ISA 3.1 (power10). In ISA 3.1, these accumulators overlapped with
> the traditional floating point registers 0..31, but logically the accumulator
> registe
on 2024/1/27 06:42, Andrew Pinski wrote:
> On Mon, Jan 15, 2024 at 6:43 PM Kewen.Lin wrote:
>>
>> Hi,
>>
>> As pointed out by the discussion in PR109705, the current
>> vect_long_mult effective target check on Power is broken.
>> This patch is to fix it ac
Hi,
PR112995 exposed one issue in current try_replace_dest_reg
that the result rtx insn after replace_dest_with_reg_in_expr
is probably unable to match any constraints. Although there
are some checks on the changes onto dest or src of orig_insn,
none is performed on the EXPR_INSN_RTX.
This patch
Hi Haochen,
on 2023/12/18 10:43, HAO CHEN GUI wrote:
> Hi,
> The patch corrects the definition of
> TARGET_EFFICIENT_OVERLAPPING_UNALIGNED and replaces it with a call to
> slow_unaligned_access.
>
> Compared with last version,
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640076.
Hi Haochen,
on 2023/12/18 10:44, HAO CHEN GUI wrote:
> Hi,
> This patch cleans up pre-checkings of expand_block_compare. It does
> 1. Assert that only P7 and above can enter this function, as it's already guarded
> by the expand.
> 2. Return false when optimizing for size.
> 3. Remove P7 processor test as o
Hi,
This patch follows Richi's suggestion "scheduling shouldn't
special case empty blocks as they usually do not appear" in
[1], it removes function no_real_insns_p and its uses
completely.
There is some case that one block previously has only one
INSN_P, but while scheduling some other blocks th
Hi Jeff,
on 2023/12/21 04:43, Jeff Law wrote:
>
>
> On 12/11/23 23:17, Kewen.Lin wrote:
>> Hi,
>>
>> Gentle ping this:
>>
>> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636597.html
>>
>> BR,
>> Kewen
>>
>> on 2
Hi Jeff,
on 2023/12/21 04:30, Jeff Law wrote:
>
>
> On 12/15/23 01:52, Kewen.Lin wrote:
>> Hi,
>>
>> PR112995 exposed one issue in current try_replace_dest_reg
>> that the result rtx insn after replace_dest_with_reg_in_expr
>> is probably unable to match
Hi Haochen,
on 2023/12/20 16:51, HAO CHEN GUI wrote:
> Hi,
> The patch corrects the definition of
> TARGET_EFFICIENT_OVERLAPPING_UNALIGNED and replaces it with a call to
> slow_unaligned_access.
>
> Compared with last version,
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640832.
Hi Haochen,
on 2023/12/20 16:56, HAO CHEN GUI wrote:
> Hi,
> This patch calls the library function for block memory compare when
> optimizing for size.
>
> Bootstrapped and tested on x86 and powerpc64-linux BE and LE with no
> regressions. Is this OK for trunk?
>
> Thanks
> Gui Haochen
>
> C
Hi,
on 2023/12/21 09:37, HAO CHEN GUI wrote:
> Hi,
> This patch cleans up pre-checkings of expand_block_compare. It does
> 1. Assert that only P7 and above can enter this function, as it's already guarded
> by the expand.
> 2. Remove P7 processor test as only P7 above can enter this function and
> P7 LE is
Hi,
As PR113100 shows, the unbiasing introduced by r14-6737 can
cause the scrubbing to overrun and screw some critical data
on stack like saved toc base consequently cause segfault on
Power.
By checking PR112917, IMHO we should keep this unbiasing
guarded under SPARC_STACK_BOUNDARY_HACK (TARGET_A
Hi,
As PR112751 shows, commit r14-5628 caused pcrel-sibcall-1.c
to fail as it enables ipa-vrp which makes return values of
functions {x,y,xx} as known and propagated. This patch is
to adjust it with noipa to make it not fragile.
Tested well on powerpc64-linux-gnu P8/P9 and
powerpc64le-linux-gnu
Hi,
As PR111480 shows, commit r14-4079 only optimizes the case
of vctzlsbb but not for the similar vclzlsbb. This patch
is to consider vclzlsbb as well and avoid the failure on
the reported test case. It also simplifies the patterns
with iterator and attribute.
Bootstrapped and regtested on pow
Hi,
I noticed that commit r14-6192 can't help PR112606 #c3 as
it only takes care of SF/DF but TF/KF can still suffer the
issue. Similar to commit r14-6192, this patch is to take
care of copysign3 with IEEE128 as well.
Bootstrapped and regtested on powerpc64-linux-gnu P8/P9
and powerpc64le-linux-
Hi Ajit,
on 2023/12/1 17:10, Ajit Agarwal wrote:
> Hello Kewen:
>
> On 24/11/23 3:01 pm, Kewen.Lin wrote:
>> Hi Ajit,
>>
>> Don't forget to CC David (CC-ed) :), some comments are inlined below.
>>
>> on 2023/10/8 03:04, Ajit Agarwal wrote:
>>
Hi Haochen,
on 2023/12/1 10:42, HAO CHEN GUI wrote:
> Hi,
> The "fctid" is supported on 64-bit Power processors and powerpc 476. It
> needs a guard to check it. The patch fixes the issue.
>
> Bootstrapped and tested on x86 and powerpc64-linux BE and LE with
> no regressions. Is this OK for tru
Hi Jakub,
on 2023/11/25 18:17, Jakub Jelinek wrote:
> Hi!
>
> The middle-end has been changed quite recently to canonicalize
> -abs (x) to copysign (x, -1) rather than the other way around.
> While I agree with that at GIMPLE level, since it matches the GIMPLE
> goal of as few operations as possi
Hi,
As PR112788 shows, on rs6000 with -mabi=ieeelongdouble type _Float128
has a different type precision (128) from that (127) of type long
double, but actually they have the same underlying mode, so they have
the same precision as the mode indicates the same real type format
ieee_quad_format.
I
Hi,
Gentle ping this series:
https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607146.html
BR,
Kewen
>
>> on 2022/11/24 17:15, Kewen Lin wrote:
>>> Hi,
>>>
>>> Following Segher's suggestion, this patch series is to rework
>>> function rs6000_emit_vector_compare for ve
Hi,
Gentle ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609993.html
BR,
Kewen
>
>>>> on 2023/1/16 17:08, Kewen.Lin via Gcc-patches wrote:
>>>>> Hi,
>>>>>
>>>>> As Honza pointed out in [1], the cur
Hi Haochen,
on 2023/12/1 10:41, HAO CHEN GUI wrote:
> Hi,
> SImode in float register is supported on P7 above. It causes "fctiw"
> can be generated on old 32-bit processors as the output operand of
typo? I guess you meant to say "can NOT"?
> fctiw insn is a SImode in float/double register. Th
on 2023/12/6 02:01, Ajit Agarwal wrote:
> Hello Kewen:
>
>
> On 05/12/23 7:13 pm, Ajit Agarwal wrote:
>> Hello Kewen:
>>
>> On 04/12/23 7:31 am, Kewen.Lin wrote:
>>> Hi Ajit,
>>>
>>> on 2023/12/1 17:10, Ajit Agarwal wrote:
>>
Hi Jeff,
on 2023/12/6 13:24, Jiufu Guo wrote:
> Hi,
>
> Trunk gcc supports more constants to be built via two instructions:
> e.g. "li/lis; xori/xoris/rldicl/rldicr/rldic".
> And then num_insns_constant should also be updated.
>
> Function "rs6000_emit_set_long_const" is used to build complicate
Hi Jeff,
on 2023/12/6 13:24, Jiufu Guo wrote:
> Hi,
>
> For constant building e.g. r120=0x, which does not fit 'li or lis',
> 'pli' is used to build this constant via 'emit_move_insn'.
>
> While for a complicated constant, e.g. 0xULL, when using
> 'rs6000_emit_set_long_co
Hi Haochen,
on 2023/12/6 16:13, HAO CHEN GUI wrote:
> Hi,
> The "fctid" is supported on 64-bit Power processors and powerpc 476. It
> needs a guard to check it. The patch fixes the issue.
>
> Compared with last version,
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/638859.html
> th
Hi,
on 2023/12/6 16:13, HAO CHEN GUI wrote:
> Hi,
> SImode in float register is supported on P7 above. It causes "fctiw"
> can't be generated on old 32-bit processors as the output operand of
> fctiw insn is an SImode in float/double register. This patch fixes the
> problem by adding one expand
on 2023/12/6 13:09, Michael Meissner wrote:
> On Wed, Dec 06, 2023 at 10:22:57AM +0800, Kewen.Lin wrote:
>> I'd expect you use UNSPEC_MMA_EXTRACT to extract V16QI from the result of
>> lxvp,
>> the current define_insn_and_split "*vsx_disassemble_pair" shou
Hi Haochen,
on 2023/12/8 09:58, HAO CHEN GUI wrote:
> Hi,
> The "fctid" is supported on 64-bit Power processors and PowerPC476. It
> needs a guard to check it. The patch fixes the issue.
>
> Compared with last version,
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639536.html
> the
Hi Ajit,
on 2023/12/8 16:01, Ajit Agarwal wrote:
> Hello Kewen:
>
> On 07/12/23 4:31 pm, Ajit Agarwal wrote:
>> Hello Kewen:
>>
>> On 06/12/23 7:52 am, Kewen.Lin wrote:
>>> on 2023/12/6 02:01, Ajit Agarwal wrote:
>>>> Hello Kewen:
>>
Hi Jeff,
on 2023/12/11 11:26, Jiufu Guo wrote:
> Hi,
>
> Trunk gcc supports more constants to be built via two instructions:
> e.g. "li/lis; xori/xoris/rldicl/rldicr/rldic".
> And then num_insns_constant should also be updated.
>
> Function "rs6000_emit_set_long_const" is used to build complicat
Hi,
on 2023/12/11 11:26, Jiufu Guo wrote:
> Hi,
>
> For constant building e.g. r120=0x, which does not fit 'li or lis',
> 'pli' is used to build this constant via 'emit_move_insn'.
>
> While for a complicated constant, e.g. 0xULL, when using
> 'rs6000_emit_set_long_const'
Hi,
on 2023/12/11 09:49, HAO CHEN GUI wrote:
> Hi,
> The patch corrects the definition of
> TARGET_EFFICIENT_OVERLAPPING_UNALIGNED and changes its name to a
> comprehensible name.
>
> Bootstrapped and tested on x86 and powerpc64-linux BE and LE with no
> regressions. Is this OK for trunk?
>
>
Hi,
on 2023/12/11 10:54, HAO CHEN GUI wrote:
> Hi,
> This patch cleans up pre-checking of expand_block_compare. It does
> 1. Assert that only P7 and above can enter this function, as it's already guarded
> by the expand.
> 2. Return false when optimizing for size.
> 3. Remove P7 CPU test as only P7 above ca
Hi,
Gentle ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639140.html
BR,
Kewen
on 2023/12/4 17:49, Kewen.Lin wrote:
> Hi,
>
> As PR112788 shows, on rs6000 with -mabi=ieeelongdouble type _Float128
> has a different type precision (128) from that (127)
Hi,
Gentle ping this series:
https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607146.html
BR,
Kewen
>>> on 2022/11/24 17:15, Kewen Lin wrote:
Hi,
Following Segher's suggestion, this patch series is to rework
function rs6000_emit_vector_compare for
Hi,
Gentle ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609993.html
BR,
Kewen
>>>>> on 2023/1/16 17:08, Kewen.Lin via Gcc-patches wrote:
>>>>>> Hi,
>>>>>>
>>>>>> As Honza pointed out in [1],
Hi,
Gentle ping:
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636599.html
BR,
Kewen
on 2023/11/15 17:16, Kewen.Lin wrote:
> Hi,
>
> As Power ISA defines, when loading a scalar single precision (SP)
> floating point from memory, we have the double precision (DP) format
Hi,
Gentle ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636597.html
BR,
Kewen
on 2023/11/15 17:01, Kewen.Lin wrote:
> Hi,
>
> on 2023/11/10 01:40, Alexander Monakov wrote:
>
>> I agree with the concern. I hoped that solving the problem by skipping
Hi Ajit,
on 2023/12/8 16:01, Ajit Agarwal wrote:
> Hello Kewen:
>
[snip...]
> With UNSPEC_MMA_EXTRACT I could generate the register pair but functionally
> here is the
> below code which is incorrect.
>
> llxvp %vs0,0(%r4)
> xxlor %vs32,%vs0,%vs0
> xvf32ger 0,%vs34,%vs32
Hi,
on 2023/11/22 17:30, Kewen.Lin wrote:
> on 2023/11/17 20:55, Alexander Monakov wrote:
>>
>> On Fri, 17 Nov 2023, Kewen.Lin wrote:
>>>> I don't think you can run cleanup_cfg after sched_init. I would suggest
>>>> to put it early in schedule_insns.
>
Hi Jakub & Andrew,
on 2023/12/12 22:42, Jakub Jelinek wrote:
> On Tue, Dec 12, 2023 at 09:33:38AM -0500, Andrew MacLeod wrote:
>> I leave this for the release managers, but I am not opposed to it for this
>> release... It would be nice to remove it for the next release
>
> I can live with it for
Hi,
on 2023/11/6 15:20, Jiufu Guo wrote:
> Hi,
>
> With the latest trunk, case pr106550_1.c fails on ppc under -m32.
> However, the case is testing 64-bit constant building. So, "has_arch_ppc64"
> is required.
Please also mention that it failed with ICE initially due to PR111971, now
tha
Hi Haochen,
on 2023/11/6 10:36, HAO CHEN GUI wrote:
> Hi,
> This patch enables vector mode for by pieces equality compare. It
> adds a new expand pattern - cbranchv16qi4 and sets MOVE_MAX_PIECES
> and COMPARE_MAX_PIECES to 16 bytes when P8 vector enabled. The compare
> relies on both move and compar
Hi,
on 2023/11/6 17:47, HAO CHEN GUI wrote:
> Hi,
> The patch 2 enables 16-byte by pieces move on rs6000. This patch fixes
> the regression cases caused by previous patch. For sra-17/18, the long
> array with 4 elements can be loaded by one 16-byte by pieces move on 32-bit
> platform. So the arr
Hi,
on 2023/11/7 11:24, HAO CHEN GUI wrote:
> Hi Kewen,
>
> Thanks for your review comments. Just one question on the following
> comment.
>
> On 2023/11/7 10:40, Kewen.Lin wrote:
>> Nit: has_arch_pwr8 would make it un-tested on Power7 default env, I'd prefer
>>
Hi,
Gentle ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/634201.html
BR,
Kewen
on 2023/10/25 10:45, Kewen.Lin wrote:
> Hi,
>
> This is almost a repost of v2, which was posted at [1] in March,
> except for:
> 1) rebased from r14-4810 which is relati
Hi,
Gentle ping this series:
https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607146.html
BR,
Kewen
> on 2022/11/24 17:15, Kewen Lin wrote:
>> Hi,
>>
>> Following Segher's suggestion, this patch series is to rework
>> function rs6000_emit_vector_compare for vector flo
Hi,
Gentle ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609993.html
BR,
Kewen
>>> on 2023/1/16 17:08, Kewen.Lin via Gcc-patches wrote:
>>>> Hi,
>>>>
>>>> As Honza pointed out in [1], the current uses of f
Hi Maxim and Alexander,
Thanks a lot for the review comments!
on 2023/11/10 01:40, Alexander Monakov wrote:
>
> On Thu, 9 Nov 2023, Maxim Kuvyrkov wrote:
>
>> Hi Kewen,
>>
>> Below are my comments. I don't want to override Alexander's review, and if
>> the patch looks good to him, it's fine to
Hi,
on 2023/11/9 09:31, HAO CHEN GUI wrote:
> Hi,
> This patch enables vector mode for by pieces equality compare. It
> adds a new expand pattern - cbranchv16qi4 and sets MOVE_MAX_PIECES
> and COMPARE_MAX_PIECES to 16 bytes when P8 vector enabled. The compare
> relies on both move and compare instru
Hi,
on 2023/11/10 17:22, HAO CHEN GUI wrote:
> Hi,
> Originally 16-byte memory to memory is expanded via pattern.
> expand_block_move does an optimization on P8 LE to leverage V2DI reversed
> load/store for memory to memory move. Now it's done by 16-byte by pieces
> move and the optimization is
Hi Peter,
on 2023/11/11 07:51, Peter Bergner wrote:
> On 8/27/23 9:06 PM, Kewen.Lin wrote:
>> Assuming we only have ELFv2_ABI_CHECK in PCREL_SUPPORTED_BY_OS, we
>> can have either TARGET_PCREL or !TARGET_PCREL after the checking.
>> For the latter, it's fine and don't
Hi,
on 2023/11/15 11:01, Peter Bergner wrote:
> PCREL data accesses are only officially supported on ELFv2. We currently
> incorrectly enable PCREL on all Power10 compiles in which prefix instructions
> are also enabled. Rework the option override code so we only enable PCREL
> for those ABIs th
Hi,
on 2023/11/15 10:26, HAO CHEN GUI wrote:
> Hi,
> This patch cleans up by_pieces_ninsns and does the following things.
> 1. Do the length and alignment adjustment for by pieces compare when
> overlap operation is enabled.
> 2. Remove unnecessary mov_optab checks.
>
> Bootstrapped and tested on
Hi,
on 2023/11/10 01:40, Alexander Monakov wrote:
> I agree with the concern. I hoped that solving the problem by skipping the BB
> like the (bit-rotted) debug code needs to would be a minor surgery. As things
> look now, it may be better to remove the non-working sched_block debug counter
> enti
Hi Alexander/Richard/Jeff,
Thanks for the insightful comments!
on 2023/11/10 22:41, Alexander Monakov wrote:
>
> On Fri, 10 Nov 2023, Richard Biener wrote:
>
>> On Fri, Nov 10, 2023 at 3:18 PM Alexander Monakov wrote:
>>>
>>>
>>> On Fri, 10 Nov 2023, Richard Biener wrote:
>>>
> I'm afraid
Hi,
As Power ISA defines, when loading a scalar single precision (SP)
floating point from memory, we have the double precision (DP) format
in the target register converted from SP; this is unlike some other
architectures which support SP and DP in registers with their
separated formats. The scalar SP i
on 2023/11/15 17:43, Alexander Monakov wrote:
>
> On Wed, 15 Nov 2023, Kewen.Lin wrote:
>
>>>> And I suppose it would be OK to do that. Empty BBs are usually removed by
>>>> CFG cleanup so the situation should only happen in rare corner cases where
>>
Hi,
on 2023/11/15 11:02, Jiufu Guo wrote:
> Hi,
>
> Trunk gcc supports more constants to be built via two instructions: e.g.
> "li/lis; xori/xoris/rldicl/rldicr/rldic".
> And then num_insns_constant should also be updated.
>
> Function "rs6000_emit_set_long_const" is used to build complicate
> c
Hi,
on 2023/11/15 11:02, Jiufu Guo wrote:
> Hi,
>
> For constants with 16bit values, 'li or lis' can be used to generate
> the value. For 34bit constant, 'pli' is ok to generate the value.
> For example: 0xULL, "pli 3,1717986918; rldimi 3,3,32,0"
> can be used.
Since now if emit
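As a rough illustration of the quoted two-instruction sequence "pli 3,1717986918; rldimi 3,3,32,0", the sketch below models the two instructions in plain C. It is not GCC code: sim_pli, sim_rldimi and build_const are hypothetical helper names, and the rldimi model only covers the non-wrapping mask case (MB <= 63 - SH). Under those assumptions, the low word loaded by pli is rotated into the high word and merged with the original low word.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 64-bit value left by sh bits (sh is taken modulo 64). */
static uint64_t rotl64(uint64_t x, unsigned sh)
{
    sh &= 63;
    return sh ? (x << sh) | (x >> (64 - sh)) : x;
}

/* Hypothetical model of "pli rT,SI": load a sign-extended immediate. */
static uint64_t sim_pli(int64_t si34)
{
    return (uint64_t)si34;
}

/* Hypothetical model of "rldimi rA,rS,SH,MB" for the non-wrapping case:
   rotate rS left by SH and insert it into rA under a mask that covers
   IBM-numbered bits MB..63-SH (LSB-numbered bits SH..63-MB).  */
static uint64_t sim_rldimi(uint64_t ra, uint64_t rs, unsigned sh, unsigned mb)
{
    assert(sh < 64 && mb <= 63 - sh);
    uint64_t mask = (~0ull >> mb) & (~0ull << sh);
    return (rotl64(rs, sh) & mask) | (ra & ~mask);
}

/* The quoted sequence: pli 3,1717986918 ; rldimi 3,3,32,0 */
static uint64_t build_const(void)
{
    uint64_t r3 = sim_pli(1717986918);   /* 1717986918 == 0x66666666 */
    r3 = sim_rldimi(r3, r3, 32, 0);      /* copy the low word into the high word */
    return r3;
}
```

Calling build_const() replicates the 32-bit pattern 0x66666666 into both halves of the register, which is the kind of repeated-word constant this two-instruction form can cover.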
on 2023/11/17 20:55, Alexander Monakov wrote:
>
> On Fri, 17 Nov 2023, Kewen.Lin wrote:
>>> I don't think you can run cleanup_cfg after sched_init. I would suggest
>>> to put it early in schedule_insns.
>>
>> Thanks for the suggestion, I placed it at the
on 2023/11/22 18:25, Richard Biener wrote:
> On Wed, Nov 22, 2023 at 10:31 AM Kewen.Lin wrote:
>>
>> on 2023/11/17 20:55, Alexander Monakov wrote:
>>>
>>> On Fri, 17 Nov 2023, Kewen.Lin wrote:
>>>>> I don't think you can run cleanup_cfg after
on 2023/11/23 16:20, Richard Biener wrote:
> On Thu, Nov 23, 2023 at 4:02 AM Kewen.Lin wrote:
>>
>> on 2023/11/22 18:25, Richard Biener wrote:
>>> On Wed, Nov 22, 2023 at 10:31 AM Kewen.Lin wrote:
>>>>
>>>> on 2023/11/17 20:55, Alexander Monakov w
Hi Peter,
on 2023/11/16 07:50, Peter Bergner wrote:
> PR109116 exposes an issue where using unspecs to access each vector component
> of an opaque mode variable leads to unneeded register copies, because our rtl
> optimizers cannot handle unspecs. Instead, use subregs to access each vector
> comp
Hi Ajit,
Don't forget to CC David (CC-ed) :), some comments are inlined below.
on 2023/10/8 03:04, Ajit Agarwal wrote:
> Hello All:
>
> This patch adds a new pass to replace contiguous addresses vector load lxv with
> mma instruction
> lxvp.
IMHO the current binding lxvp (and lxvpx, stxvp{x,}) to
on 2023/11/20 16:56, Michael Meissner wrote:
> On Mon, Nov 20, 2023 at 08:24:35AM +0100, Richard Biener wrote:
>> I wouldn't expose the "fake" larger modes to the vectorizer but rather
>> adjust m_suggested_unroll_factor (which you already do to some extent).
>
> Thanks. I figure I first need to
Hi Mike,
on 2023/11/28 12:34, Michael Meissner wrote:
> On Fri, Nov 24, 2023 at 05:31:20PM +0800, Kewen.Lin wrote:
>> Hi Ajit,
>>
>> Don't forget to CC David (CC-ed) :), some comments are inlined below.
>>
>> on 2023/10/8 03:04, Ajit Agarwal wrote:
>>
on 2023/11/28 15:05, Michael Meissner wrote:
> I tried using this patch to compare with the vector size attribute patch I
> posted. I could not build it as a cross compiler on my x86_64 because the
> assembler gives the following error:
>
> Error: operand out of domain (11 is not a multiple of 2)
Hi Haochen,
on 2023/11/28 15:43, HAO CHEN GUI wrote:
> Hi,
> This patch passes down the equality only flags from
> emit_block_cmp_hints to cmpmem optab so that the target specific expand
> can generate optimized insns for equality only compare. Targets
> (e.g. rs6000) can generate more efficient
Hi,
on 2023/9/18 16:53, Richard Biener wrote:
> On Mon, Sep 18, 2023 at 10:41 AM Richard Sandiford
> wrote:
>>
>> Kewen Lin writes:
>>> This costing adjustment patch series exposes one issue in
>>> aarch64 specific costing adjustment for STP sequence. It
>>> causes the below test cases to fail:
Hi,
on 2023/9/20 16:49, HAO CHEN GUI wrote:
> Hi,
> This patch enables vector compare for 16-byte memory equality compare.
> The 16-byte memory equality compare can be efficiently implemented by
> instruction "vcmpequb." It reduces one branch and one compare compared
> with two 8-byte compare se
Hi,
on 2023/9/25 09:57, HAO CHEN GUI wrote:
> Hi Kewen,
>
> On 2023/9/18 15:34, Kewen.Lin wrote:
>> Thanks for checking! So for P7, this patch looks neutral, but for P8 and
>> later, it may cause a few differences in code gen. I'm curious how
>> many total o
Hi,
on 2023/9/25 10:05, HAO CHEN GUI wrote:
> Hi,
> This patch implements 32bit inline lrint by "fctiw". It depends on
> the patch1 to do SImode move from FP registers on P7.
>
> Compared to last version, the main change is to add some test cases.
> https://gcc.gnu.org/pipermail/gcc-patches/2
Hi,
As PR111367 shows, with prefixed insns supported, some of the
checks consider it possible to leverage prefixed insns
for stack protect related load/store, but since we don't
actually change the emitted assembly for 32 bit, it can
cause the assembler error as exposed.
Mike's commit r10-4547-gce6a6c
Hi,
The uninitialized variable a in pr60510.f can cause some
random failures as exposed in PR111427, see the details
there. This patch is to make it initialized accordingly.
As verified, it can fix the reported -m32 failures on
P7 and P8 BE. It's also tested well on powerpc64-linux-gnu
P9 and p
Hi Jeff,
on 2023/8/30 15:43, Jiufu Guo wrote:
> Hi,
>
> Currently, we have the pattern "movsf_from_si2" which was trying
> to support moving high part DI to SF.
>
> The pattern looks like: XX:SF=bitcast:SF(subreg(YY:DI>>32),0)
> It only accepts the "ashiftrt" for ">>", but "lshiftrt" is also ok.
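To make the point about the two shift kinds concrete, here is a minimal C sketch (not GCC code; all helper names are hypothetical). After a 64-bit value is shifted right by 32, only the low 32 bits are reinterpreted as SF, and those bits are the same whether the shift is arithmetic or logical. The arithmetic variant assumes GCC's behavior for right-shifting negative signed values and for converting out-of-range values to int64_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret 32 bits as an IEEE single (the "bitcast:SF" step). */
static float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* High word via a logical right shift (lshiftrt). */
static uint32_t high_bits_logical(uint64_t di)
{
    return (uint32_t)(di >> 32);
}

/* High word via an arithmetic right shift (ashiftrt); assumes the
   compiler sign-fills when shifting negative values, as GCC does.  */
static uint32_t high_bits_arith(int64_t di)
{
    return (uint32_t)(di >> 32);
}

/* Both shifts must agree on the 32 bits that are actually bitcast. */
static void verify_high_part(void)
{
    const uint64_t samples[] = {
        0x3f80000000000000ull, 0xbf80000012345678ull,
        0xffffffffffffffffull, 0x0ull
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(high_bits_logical(samples[i])
               == high_bits_arith((int64_t)samples[i]));
    assert(bits_to_float(high_bits_logical(0x3f80000000000000ull)) == 1.0f);
}
```

The two variants differ only in the upper 32 bits of the shifted value, which the subreg discards before the bitcast.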
Hi Jeff,
on 2023/8/30 15:43, Jiufu Guo wrote:
> Hi,
>
> As mentioned in PR108338, on p9, we could use mtvsrws to implement
> the bitcast from SI to SF (or lowpart DI to SF).
>
> For code:
> *(long long*)buff = di;
> float f = *(float*)(buff);
>
> "sldi 9,3,32 ; mtvsrd 1,9 ; xscvspdpn 1,1" i
Hi,
As PR115659 shows, assuming c = x CMP y, there are some
folding chances for patterns r = c ? 0/z : z/-1:
- For r = c ? 0 : z, it can be folded into r = ~c & z.
- For r = c ? z : -1, it can be folded into r = ~c | z.
But BIT_AND/BIT_IOR applied on one BIT_NOT operand is a
compound operatio
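The two folds listed above can be checked with a small scalar C sketch. It assumes, as for an element of a vector comparison result, that c is a mask that is either all ones (-1) or zero; the function names are hypothetical and this is not the GCC implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* c = x CMP y as a mask: -1 (all ones) when true, 0 when false. */
static int32_t cmp_mask(int32_t x, int32_t y)
{
    return x < y ? -1 : 0;
}

/* r = c ? 0 : z and its folded form r = ~c & z. */
static int32_t select_zero_else_z(int32_t c, int32_t z) { return c ? 0 : z; }
static int32_t fold_and_not(int32_t c, int32_t z)       { return ~c & z; }

/* r = c ? z : -1 and its folded form r = ~c | z. */
static int32_t select_z_else_m1(int32_t c, int32_t z)   { return c ? z : -1; }
static int32_t fold_ior_not(int32_t c, int32_t z)       { return ~c | z; }

/* Check that each select agrees with its fold for both mask values. */
static void verify_folds(void)
{
    const int32_t zs[] = { 0, 1, -1, 42, INT32_MIN, INT32_MAX };
    const int32_t cs[] = { cmp_mask(1, 2), cmp_mask(2, 1) };  /* -1 and 0 */
    for (size_t i = 0; i < sizeof cs / sizeof cs[0]; i++)
        for (size_t j = 0; j < sizeof zs / sizeof zs[0]; j++) {
            assert(select_zero_else_z(cs[i], zs[j]) == fold_and_not(cs[i], zs[j]));
            assert(select_z_else_m1(cs[i], zs[j]) == fold_ior_not(cs[i], zs[j]));
        }
}
```

The identities only hold because c is restricted to 0 or -1; for an arbitrary nonzero c the bitwise forms would not match the selects.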
Hi,
As PR115659 shows, assuming c = x CMP y, there are some
folding chances for patterns r = c ? -1/z : z/0.
For r = c ? -1 : z, it can be folded into:
- r = c | z (with ior_optab supported)
- or r = c ? c : z
while for r = c ? z : 0, it can be folded into:
- r = c & z (with and_optab supp
Hi,
Commit r15-1594 removed the define of LONG_DOUBLE_TYPE_SIZE in
sparc.cc, based on the assumption that each OS has its
own define (see the comments in sparc.h), but it exposes an
issue on VxWorks, which lacks the define.
We can bring back the default SPARC_LONG_DOUBLE_TYPE_SIZE to
sparc.cc,
on 2024/7/1 22:28, Richard Biener wrote:
> On Mon, Jul 1, 2024 at 8:16 AM Kewen.Lin wrote:
>>
>> Hi,
>>
>> As PR115659 shows, assuming c = x CMP y, there are some
>> folding chances for patterns r = c ? -1/z : z/0.
>>
>> For r = c ? -1 : z, it can be
Hi!
on 2024/7/2 04:28, Segher Boessenkool wrote:
> On Mon, Jul 01, 2024 at 04:36:44PM +0200, Richard Biener wrote:
>> On Mon, Jul 1, 2024 at 8:17 AM Kewen.Lin wrote:
>>> As PR115659 shows, assuming c = x CMP y, there are some
>>> folding chances for patterns r = c ? 0/