Hi Segher,
on 2021/9/23 6:36 AM, Segher Boessenkool wrote:
> Hi!
>
> On Tue, Sep 21, 2021 at 11:24:08AM +0800, Kewen.Lin wrote:
>> on 2021/9/18 6:01 AM, Segher Boessenkool wrote:
>>> On Thu, Sep 16, 2021 at 09:14:15AM +0800, Kewen.Lin wrote:
The way with nunits * stmt_cost can get one much exa
Hi!
On Tue, Sep 21, 2021 at 11:24:08AM +0800, Kewen.Lin wrote:
> on 2021/9/18 6:01 AM, Segher Boessenkool wrote:
> > On Thu, Sep 16, 2021 at 09:14:15AM +0800, Kewen.Lin wrote:
> >> The way with nunits * stmt_cost can get one much exaggerated
> >> penalized cost, such as: for V16QI on P8, it's 16 *
Hi Segher,
Thanks for the review!
on 2021/9/18 6:01 AM, Segher Boessenkool wrote:
> Hi!
>
> On Thu, Sep 16, 2021 at 09:14:15AM +0800, Kewen.Lin wrote:
>> The way with nunits * stmt_cost can get one much exaggerated
>> penalized cost, such as: for V16QI on P8, it's 16 * 20 = 320,
>> that's why we
Hi Bill,
Thanks for the review!
on 2021/9/18 12:34 AM, Bill Schmidt wrote:
> Hi Kewen,
>
> On 9/15/21 8:14 PM, Kewen.Lin wrote:
>> Hi,
>>
>> This patch follows the discussion here[1], where Segher pointed
>> out the existing way to guard the extra penalized cost for
>> strided/elementwise loads w
Hi!
On Thu, Sep 16, 2021 at 09:14:15AM +0800, Kewen.Lin wrote:
> The way with nunits * stmt_cost can get one much exaggerated
> penalized cost, such as: for V16QI on P8, it's 16 * 20 = 320,
> that's why we need one bound. To make it scale, this patch
> doesn't use nunits * stmt_cost any more, but
Hi Kewen,
On 9/15/21 8:14 PM, Kewen.Lin wrote:
> Hi,
> This patch follows the discussion here[1], where Segher pointed
> out the existing way to guard the extra penalized cost for
> strided/elementwise loads with a magic bound doesn't scale.
> The way with nunits * stmt_cost can get one much exaggerated
Hi,
This patch follows the discussion here[1], where Segher pointed
out the existing way to guard the extra penalized cost for
strided/elementwise loads with a magic bound doesn't scale.
The way with nunits * stmt_cost can get one much exaggerated
penalized cost, such as: for V16QI on P8, it's 16