gcc-patches
Subject: Re: Re: [PATCH] VECT: Support loop len control on EXTRACT_LAST
vectorization
On Wed, 9 Aug 2023, ??? wrote:
> Hi, Richard.
>
> >> I'm a bit behind on email, but why isn't BIT_FIELD_REF enough for
> >> the case that the patch is handling?
> >> k, vector, length and bias. But even then, I think there'll
> >> be a temptation to lower calls with all-1 masks to val[len - 1 - bias].
> >> So I think the function only makes sense if we have a use case where
> >> the mask might not be all-1s.
>
= VEC_EXTRACT (v, new_loop_len_22 - 1 - BIAS)
Feel free to correct me if I am wrong.
Thanks.
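
For illustration only, the scalar meaning of the length-controlled extraction discussed above can be sketched in C. The helper name extract_last_len and the plain array standing in for a vector are hypothetical. The index expression len - 1 - bias is copied as written in this thread and is an assumption about the bias sign convention, not something checked against the GCC sources.

```c
#include <assert.h>

/* Hypothetical scalar model: V stands in for a vector whose first LEN
   elements are active.  Returns the element selected by the index
   expression quoted in this thread (len - 1 - bias).  */
static int
extract_last_len (const int *v, int len, int bias)
{
  int idx = len - 1 - bias;
  assert (idx >= 0);	/* At least one element must be selected.  */
  return v[idx];
}
```

With bias equal to 0 this is simply v[len - 1], the last element processed by the final loop iteration.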
juzhe.zh...@rivai.ai
From: Richard Sandiford
Date: 2023-08-09 21:30
To: juzhe.zhong@rivai.ai
CC: rguenther; gcc-patches
Subject: Re: [PATCH] VECT: Support loop len control on EXTRACT_LAST
vectorization
"juzhe.zh...@rivai.ai" writes:
> Hi, Richi.
>
>>> that should be
>
>>> || (!LOOP_VINFO_FULLY_MASKED_P (loop_vinfo)
>>> && !LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo))
>
>>> I think. It seems to imply that SLP isn't supported with
>>> masking/lengthing.
>
> Oh, yes. At first glance, the
'i' SLP lanes enabled and AND
that to the mask would select the proper lane for EXTRACT_LAST.
Not sure how to handle this for 'len' - I guess since 'len'
covers all SLP lanes as well we could just subtract
SLP_TREE_LANES (node) - slp_index from it? I'll note we don't
handle ncopies > 1, which I think we could by using FOLD_EXTRACT_LAST?
Richard.
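
As a rough sketch of the 'len' adjustment suggested above, the lane arithmetic can be modelled in plain C. The function name slp_last_lane_index and its parameters are hypothetical; lanes plays the role of SLP_TREE_LANES (node), and the bias is left out for simplicity. This only illustrates the subtraction, it is not how the vectorizer implements it.

```c
#include <assert.h>

/* Hypothetical model: LEN active elements cover whole groups of LANES
   interleaved SLP lanes.  The last active element of lane SLP_INDEX
   sits (LANES - SLP_INDEX) elements before the end of the active
   range.  */
static int
slp_last_lane_index (int len, int lanes, int slp_index)
{
  assert (slp_index >= 0 && slp_index < lanes && len >= lanes);
  return len - (lanes - slp_index);
}
```

For example, with two lanes interleaved as {a0, b0, a1, b1, ...} and len equal to 8, lane 0 ends at index 6 and lane 1 at index 7. The last lane always reduces to len - 1, which matches the non-SLP case.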
> Thanks.
>
>
> juzhe.zh...@rivai.ai
>
> From: Richard Biener
> Date: 2023-08-09
juzhe.zh...@rivai.ai
From: Richard Biener
Date: 2023-08-09 19:00
To: Ju-Zhe Zhong
CC: gcc-patches; richard.sandiford
Subject: Re: [PATCH] VECT: Support loop len control on EXTRACT_LAST
vectorization
On Wed, 9 Aug 2023, juzhe.zh...@rivai.ai wrote:
> From: Ju-Zhe Zhong
>
> Hi, this patch adds loop len control to extract_last autovectorization.
>
> Consider the following case:
>
> #include
>
> #define EXTRACT_LAST(TYPE)\
> TYPE __attribute__ ((noinline, noclone)
From: Ju-Zhe Zhong
Hi, this patch adds loop len control to extract_last autovectorization.
Consider the following case:
#include
#define EXTRACT_LAST(TYPE) \
TYPE __attribute__ ((noinline, noclone)) \
test_##TYPE (TYPE *x, int n, TYPE value) \
{