Thanks for the review.
I'm a bit concerned about using unsigned long.
Would it be OK if I change the type to uint64_t?
I could rename the function to gcc_jit_context_new_array_type_u64.
Regards.
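For comparison, the existing entry point takes a plain int element count; the u64 variant sketched below is only my reading of the proposal, not a committed API:

#include <stdint.h>
#include <libgccjit.h>

/* Existing entry point (as in libgccjit.h): the element count is a
   plain int.  */
extern gcc_jit_type *
gcc_jit_context_new_array_type (gcc_jit_context *ctxt,
                                gcc_jit_location *loc,
                                gcc_jit_type *element_type,
                                int num_elements);

/* Sketch of the proposed 64-bit variant.  */
extern gcc_jit_type *
gcc_jit_context_new_array_type_u64 (gcc_jit_context *ctxt,
                                    gcc_jit_location *loc,
                                    gcc_jit_type *element_type,
                                    uint64_t num_elements);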
On 2024-06-26 at 11:34, David Malcolm wrote:
On Fri, 2024-02-23 at 09:55 -0500, Antoni Boucher wrote:
This is really more of a question than a patch.
Looking at PR/115687 I managed to convince myself there's a general
class of problems here: splitting might produce constant subexpressions,
but as far as I can tell there's nothing to eliminate those constant
subexpressions. So I very quickly threw
On Thu, 27 Jun 2024 at 14:27, Maciej Cencora wrote:
>
> I think going the bit_cast way would be the best because it enables the
> optimization for many more classes including common wrappers like optional,
> variant, pair, tuple and std::array.
This isn't tested but seems to work on a simple case
Hi!
On 2024-06-27T23:20:18+0200, I wrote:
> On 2024-06-27T22:27:21+0200, I wrote:
>> On 2024-06-27T18:49:17+0200, I wrote:
>>> On 2023-10-24T19:49:10+0100, Richard Sandiford
>>> wrote:
This patch adds a combine pass that runs late in the pipeline.
>>
>> [After sending, I realized I replied
On Fri, Jun 28, 2024 at 01:01, Maciej W. Rozycki wrote:
>
> On Thu, 27 Jun 2024, YunQiang Su wrote:
>
> > > The missed optimisation in GAS, which used not to trigger pre-R6, is
> > > irrelevant from this change's point of view and just adds noise. I'm
> > > surprised that it worked even in the first place, as
BNEGI.W/D are used for test7_v2f64 and test7_v4f32 now. It is
an improvement since we can save an instruction.
ILVR.D is used for test43_v2i64 now, instead of INSVE.D.
gcc/testsuite
* gcc.target/mips/msa.c: Fix test7_v2f64, test7_v4f32 and
test43_v2i64.
---
gcc/testsuite/gcc.ta
On Thu, Jun 27, 2024 at 05:06:14PM -0400, Lewis Hyatt wrote:
> Hello-
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115312
>
> This fixes a 14.1 regression with PCH for MinGW and other platforms that don't
> use stdc-predef.h. Bootstrap + regtest all languages on x86-64 Linux;
> bootstrap + re
On 2024-06-26 at 18:01, David Malcolm wrote:
On Wed, 2024-02-21 at 14:16 -0500, Antoni Boucher wrote:
On Thu, 2023-12-07 at 19:57 -0500, David Malcolm wrote:
On Thu, 2023-12-07 at 17:26 -0500, Antoni Boucher wrote:
Hi.
This patch fixes getting the size of size_t (bug 112910).
There's o
This patch improves GCC’s vectorization of __builtin_popcount for the aarch64 target
by adding popcount patterns for vector modes besides QImode, i.e., HImode,
SImode and DImode.
With this patch, we now generate the following for V8HI:
cnt v1.16b, v0.16b
uaddlp v2.8h, v1.16b
For V4HI, we gen
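For reference, a minimal loop of the kind this patch lets the vectorizer handle (a hypothetical testcase, not taken from the patch); the V8HI case should compile to the cnt/uaddlp sequence above:

void
popcount_v8hi (unsigned short *dst, unsigned short *src)
{
  /* With the patch, this vectorizes as a V8HI popcount instead of
     eight scalar popcounts.  */
  for (int i = 0; i < 8; i++)
    dst[i] = __builtin_popcount (src[i]);
}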
Thanks, Richard! I've updated the patch accordingly.
https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655912.html
Please let me know if any other changes are needed.
Thanks,
Pengxuan
> Sorry for the slow reply.
>
> Pengxuan Zheng writes:
> > This patch improves GCC’s vectorization of __buil
On Thu, Jun 27, 2024 at 4:29 PM Roger Sayle wrote:
>
>
> This patch is another round of refinements to fine-tune the new ternlog
> infrastructure in i386's sse.md. This patch tweaks ix86_ternlog_idx
> to allow multiple MEM/CONST_VECTOR/VEC_DUPLICATE operands prior to
> splitting (before reload),
For the testcase in PR115406, here is part of the dump.
char D.4882;
vector(1) _1;
vector(1) signed char _2;
char _5;
:
_1 = { -1 };
When assigning { -1 } to a vector(1) <signed-boolean:8>,
since TYPE_PRECISION (itype) <= BITS_PER_UNIT, it sets each bit of dest
with each vector el
Currently, we support the cases that strictly fit the instructions.
For example, for V16QImode, we only support shuffles like
(0 <= N0, N1, N2, N3 <= 3 here)
N0,    N1,    N2,    N3
N0+4,  N1+4,  N2+4,  N3+4
N0+8,  N1+8,  N2+8,  N3+8
N0+12, N1+12, N2+12, N3+12
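For illustration, a V16QI shuffle whose mask follows that shape, with the hypothetical choice N0..N3 = 2, 0, 3, 1:

typedef char v16qi __attribute__ ((vector_size (16)));

v16qi
shuffle_per_group (v16qi x)
{
  /* The same 4-element permutation repeated in each group of 4.  */
  return __builtin_shufflevector (x, x,
                                  2, 0, 3, 1,       /* N0..N3 */
                                  6, 4, 7, 5,       /* N0+4..N3+4 */
                                  10, 8, 11, 9,     /* N0+8..N3+8 */
                                  14, 12, 15, 13);  /* N0+12..N3+12 */
}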
The scan-assembler-times rules only fit -mfp32 and -mfpxx.
They fail if we are configured with FP64 by default, as there is
one less sdc1/ldc1 pair.
gcc/testsuite
* gcc.target/mips/call-clobbered-1.c: Add -mfpxx.
---
gcc/testsuite/gcc.target/mips/call-clobbered-1.c | 2 +-
1 file changed,
From: Pan Li
This patch would like to support the form of unsigned scalar .SAT_ADD
when one of the operands is an IMM. For example, as below:
Form IMM:
#define DEF_SAT_U_ADD_IMM_FMT_1(T) \
T __attribute__((noinline)) \
sat_u_add_imm_##T##_fmt_1 (T x) \
{
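The excerpt cuts off inside the function body; a plausible completion, assuming the usual branchless unsigned-saturation idiom and an externally defined IMM (both assumptions, not the patch's actual text):

#include <stdint.h>

#define IMM 9   /* assumption: the immediate is supplied elsewhere */

#define DEF_SAT_U_ADD_IMM_FMT_1(T)                        \
T __attribute__ ((noinline))                              \
sat_u_add_imm_##T##_fmt_1 (T x)                           \
{                                                         \
  /* Clamp to the type maximum if x + IMM wraps.  */      \
  return (T) (x + IMM) >= x ? (T) (x + IMM) : (T) -1;     \
}

DEF_SAT_U_ADD_IMM_FMT_1 (uint64_t)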
The testcases are supposed to scan for vpopcnt{b,w,d,q} operations
with a k mask, but the mask is defined as an uninitialized local
variable, which will be set to 0 at the RTL expand phase.
It is then further simplified away by late_combine, which caused the
scan-assembly failure.
Move the definition of mask outside to
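A sketch of the failure shape (not the actual testcase; the intrinsic choice is illustrative and assumes -mavx512vpopcntdq):

#include <immintrin.h>

/* Uninitialized mask: expand treats it as 0, the masked popcount
   folds away, and the expected vpopcntd never appears.  */
__m512i
bad (__m512i src, __m512i a)
{
  __mmask16 mask;   /* uninitialized local, expanded as 0 */
  return _mm512_mask_popcnt_epi32 (src, mask, a);
}

/* Fix shape: take the mask from outside so the instruction
   survives simplification.  */
__m512i
good (__m512i src, __mmask16 mask, __m512i a)
{
  return _mm512_mask_popcnt_epi32 (src, mask, a);
}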
Because of the issue described in PR115610, late_combine is disabled by
default. The series tries to solve the regressions and enable late_combine.
There are 4 regressions observed.
1. The first one is related to pass_stv2, because late_combine will undo
the transformation done in the pass. Move the pas
Move pass_stv2 and pass_rpad after the pre_reload pass_late_combine, and
define target_insn_cost to prevent the post_reload pass_late_combine
from reverting the optimization done in pass_rpad.
Adjust testcases since pass_late_combine generates better code but
breaks scan-assembly.
I.e., under a 32-bit target, gcc
late_combine will combine lshiftrt + zero_extend into *lshifrtsi3_1_zext,
which causes an extra mov between gpr and kmask; add ?k to the pattern.
gcc/ChangeLog:
PR target/115610
* config/i386/i386.md (<*insnsi3_zext): Add alternative ?k,
enable it only for lshiftrt and under avx512bw.
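A hypothetical source shape that exercises this pattern: a 32-bit logical shift right whose result is zero-extended to 64 bits:

/* lshiftrt:SI followed by zero_extend:DI, the combination that
   late_combine folds into the zext shift pattern.  */
unsigned long long
shr_zext (unsigned int x, int n)
{
  return x >> n;
}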
On Thu, Jun 27, 2024 at 4:45 PM Li, Pan2 wrote:
>
> Hi Richard,
>
> As mentioned by Tamar previously, I would like to try even more
> optimization based on this patch.
> Assume we take the zip benchmark as an example; we may have gimple
> similar to the below:
>
> unsigned int _1, _2;
> unsigned short int _9;
>
On Thu, Jun 27, 2024 at 5:15 PM Feng Xue OS wrote:
>
> I added two test cases for the examples you mentioned.
OK, thanks.
> BTW: would you please look over the other 3 lane-reducing patches that
> have been updated? If OK, I would consider checking them in.
Sorry, I've been distracted by other
On Thu, Jun 27, 2024 at 9:40 PM Roger Sayle wrote:
>
>
> This patch generalizes some of the patterns in i386.md that recognize
> double word concatenation, so they handle sign_extend the same way that
> they handle zero_extend in appropriate contexts.
>
> As a motivating example consider the follo
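The quoted example is cut off above; a guess at its general shape (hypothetical, not Roger's actual example): a double word built from a sign-extended high half.

__int128
concat_sext (long long hi, unsigned long long lo)
{
  /* The high half is a sign_extend rather than a zero_extend.  */
  return ((__int128) hi << 64) | lo;
}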
On Fri, Jun 28, 2024 at 7:29 AM liuhongt wrote:
>
> late_combine will combine lshiftrt + zero_extend into *lshifrtsi3_1_zext,
> which causes an extra mov between gpr and kmask; add ?k to the pattern.
>
> gcc/ChangeLog:
>
> PR target/115610
> * config/i386/i386.md (<*insnsi3_zext): Add alternativ
On Fri, Jun 28, 2024 at 3:15 AM liuhongt wrote:
>
> For the testcase in PR115406, here is part of the dump.
>
> char D.4882;
> vector(1) _1;
> vector(1) signed char _2;
> char _5;
>
>:
> _1 = { -1 };
>
> When assigning { -1 } to a vector(1) <signed-boolean:8>,
> Since TYPE_PRECISION
On Fri, Jun 28, 2024 at 7:29 AM liuhongt wrote:
>
> Move pass_stv2 and pass_rpad after the pre_reload pass_late_combine, and
> define target_insn_cost to prevent the post_reload pass_late_combine
> from reverting the optimization done in pass_rpad.
>
> Adjust testcases since pass_late_combine generates better
Hi Thomas,
There are two things I think I can contribute to this discussion. The
first is that I have a patch (from a year or two ago) for adding
rtx_costs to the nvptx backend that I will respin, which will provide
more backend control over combine-like pass decisions.
The second is in res
On Fri, Jun 28, 2024 at 8:01 AM Richard Biener
wrote:
>
> On Fri, Jun 28, 2024 at 3:15 AM liuhongt wrote:
> >
> > For the testcase in PR115406, here is part of the dump.
> >
> > char D.4882;
> > vector(1) _1;
> > vector(1) signed char _2;
> > char _5;
> >
> >:
> > _1 = { -1 };
Using auto_vec rather than vec means the vectors are released
automatically upon return, stopping the leak. The problem seems to be
that auto_vec<T, N> is not really move-aware; only the <T, 0>
specialization is.
This is actually Jan's original suggestion:
https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655
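A sketch of the pattern being fixed, against GCC's internal vec.h API (function names hypothetical; this only compiles inside the GCC tree):

/* Leaky: a plain vec's heap storage must be released by hand.  */
vec<tree> collect_leaky ()
{
  vec<tree> v = vNULL;
  v.safe_push (NULL_TREE);
  return v;   /* every caller must remember v.release () */
}

/* Fixed: auto_vec releases its storage in its destructor, and the
   auto_vec<T, 0> specialization is movable, so returning by value
   works.  */
auto_vec<tree> collect_safe ()
{
  auto_vec<tree> v;
  v.safe_push (NULL_TREE);
  return v;
}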
On 6/28/24 6:18 AM, Pengxuan Zheng wrote:
This patch improves GCC’s vectorization of __builtin_popcount for the aarch64 target
by adding popcount patterns for vector modes besides QImode, i.e., HImode,
SImode and DImode.
With this patch, we now generate the following for V8HI:
cnt v1.16b, v0.
But the constexpr-ness of bit_cast has additional limitations; e.g.
providing a union as _Tp would be a hard error. So we have two options:
- before bit-casting, check whether the type can be bit-cast at compile time;
- change the 'if constexpr' to a regular 'if'.
If we go with the second solution then we
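For reference, a minimal standalone illustration of the limitation under discussion (my example, not the libstdc++ code):

#include <bit>

union U { int i; float f; };
struct S { int i; };

/* Fine during constant evaluation: S is an ordinary struct.  */
constexpr int s_bits = std::bit_cast<int> (S{42});

/* A bit_cast to or from a union type is not a constant expression,
   so the constexpr form below is a hard error, even though the same
   call is valid at run time.  */
/* constexpr int u_bits = std::bit_cast<int> (U{42});  // ill-formed */
int u_bits_runtime = std::bit_cast<int> (U{42});       /* OK */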