[PATCH v3 2/2] AArch64: Add SME LUTv2 intrinsics

2025-09-07 Thread Karl Meakin
Add intrinsic functions for the SME LUTv2 architecture extension (`svluti4_zt`, `svwrite_lane_zt` and `svwrite_zt`). gcc/ChangeLog: * config/aarch64/aarch64-sme.md (@aarch64_sme_write_zt): New insn. (aarch64_sme_lut_zt): Likewise. * config/aarch64/aarch64-sve-builtins-sha

[PATCH v3 0/2] aarch64: Add SME LUTv2 support

2025-09-07 Thread Karl Meakin
* More informative commit message * Document the feature flag in `gcc/doc/invoke.texi`. * Fix comments in `aarch64-option-extensions.def` * 2/2 * More informative commit message * V3: * 2/2 * Remove unused `(match_operand 0 "const0_operand")` from `@aarch64_sme_write_zt`.

[PATCH v3 1/2] AArch64: Add SME LUTv2 architecture extension

2025-09-06 Thread Karl Meakin
Add the SME LUTv2 architecture extension. Users can enable the extension by adding `+sme-lutv2` to `-march` or `-mcpu`, and test for its presence with the `__ARM_FEATURE_SME_LUTv2` macro. The intrinsics will be added in the next commit. gcc/ChangeLog: * config/aarch64/aarch64-c.cc (aarch6
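
A minimal sketch of how user code might test for the extension with the feature macro named above. The helper name `have_sme_lutv2` is hypothetical and not part of the patch; only the macro `__ARM_FEATURE_SME_LUTv2` comes from the message.

```c
#include <stdbool.h>

/* Hypothetical helper: reports whether this translation unit was compiled
   with SME LUTv2 enabled (for example with `+sme-lutv2` added to -march).
   The check is resolved at compile time by the preprocessor.  */
static bool have_sme_lutv2(void)
{
#ifdef __ARM_FEATURE_SME_LUTv2
    return true;
#else
    return false;
#endif
}
```

Code that uses the new intrinsics would typically sit behind the same `#ifdef`, with a portable fallback in the `#else` branch.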

Re: [PATCH v1 2/2] AArch64: Add LUTv2 intrinsics

2025-09-04 Thread Karl Meakin
On 03/09/2025 13:33, Kyrylo Tkachov wrote: Hi Karl, On 2 Sep 2025, at 16:16, Karl Meakin wrote: gcc/ChangeLog: * config/aarch64/aarch64-sme.md (@aarch64_sme_write_zt): New insn. (aarch64_sme_lut_zt): Likewise. * config/aarch64/aarch64-sve-builtins-shapes.cc (parse_type): New type format

[PATCH v2 1/2] AArch64: Add SME LUTv2 architecture extension

2025-09-03 Thread Karl Meakin
Add the SME LUTv2 architecture extension. Users can enable the extension by adding `+sme-lutv2` to `-march` or `-mcpu`, and test for its presence with the `__ARM_FEATURE_SME_LUTv2` macro. The intrinsics will be added in the next commit. gcc/ChangeLog: * config/aarch64/aarch64-c.cc (aarch6

[PATCH v2 2/2] AArch64: Add SME LUTv2 intrinsics

2025-09-03 Thread Karl Meakin
Add intrinsic functions for the SME LUTv2 architecture extension (`svluti4_zt`, `svwrite_lane_zt` and `svwrite_zt`). gcc/ChangeLog: * config/aarch64/aarch64-sme.md (@aarch64_sme_write_zt): New insn. (aarch64_sme_lut_zt): Likewise. * config/aarch64/aarch64-sve-builtins-sha

[PATCH v2 0/2] aarch64: Add SME LUTv2 support

2025-09-03 Thread Karl Meakin
This patch adds support for the new SME LUTv2 architecture extension as described in the ACLE. It adds the `+sme-lutv2` target flag, the `__ARM_FEATURE_SME_LUTv2` feature test macro and the `svluti4_zt`, `svwrite_lane_zt` and `svwrite_zt` intrinsics. Making use of the new instructions without the

Re: [PATCH v1 1/2] AArch64: Add `+sme-lutv2` flag

2025-09-03 Thread Karl Meakin
address these in the next revision in the series More comments inline. On 9/2/2025 3:16 PM, Karl Meakin wrote: gcc/ChangeLog: * config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Conditionally define `__ARM_FEATURE_SME_LUTv2` macro. * config/aarch64/aarc

Re: [PATCH v1 2/2] AArch64: Add LUTv2 intrinsics

2025-09-03 Thread Karl Meakin
inline. On 9/2/2025 3:16 PM, Karl Meakin wrote: gcc/ChangeLog: * config/aarch64/aarch64-sme.md (@aarch64_sme_write_zt): New insn. (aarch64_sme_lut_zt): Likewise. * config/aarch64/aarch64-sve-builtins-shapes.cc (parse_type): New type format "%T". (struct luti_la

[PATCH v1 1/2] AArch64: Add `+sme-lutv2` flag

2025-09-02 Thread Karl Meakin
gcc/ChangeLog: * config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Conditionally define `__ARM_FEATURE_SME_LUTv2` macro. * config/aarch64/aarch64-option-extensions.def (AARCH64_OPT_EXTENSION("sme-lutv2")): New optional architecture extension. * con

[PATCH v1 0/2] AArch64: Add SME LUTv2 support

2025-09-02 Thread Karl Meakin
This patch adds support for the new LUTv2 features as described in the ACLE. It adds the `+sme-lutv2` target flag, feature test macro and intrinsics. Making use of the new instructions without the intrinsics will be done in a follow-up patch. ChangeLog: * V1: Initial series. Karl Meakin (2

[PATCH v1 2/2] AArch64: Add LUTv2 intrinsics

2025-09-02 Thread Karl Meakin
gcc/ChangeLog: * config/aarch64/aarch64-sme.md (@aarch64_sme_write_zt): New insn. (aarch64_sme_lut_zt): Likewise. * config/aarch64/aarch64-sve-builtins-shapes.cc (parse_type): New type format "%T". (struct luti_lane_zt_base): New function shape. (SHAPE): L

[PATCH v9 9/9] Update `cmpbr.c` tests

2025-08-07 Thread Karl Meakin
I have updated the tests in `cmpbr.c` to reflect the fixes. There are a few regressions, but they can be fixed later; let's just make GCC crash-free first. --- gcc/testsuite/gcc.target/aarch64/cmpbr.c | 334 +-- 1 file changed, 185 insertions(+), 149 deletions(-) diff --git a

Re: [PATCH 6/8] aarch64: Add cc clobber to compare-and-branch patterns

2025-08-07 Thread Karl Meakin
On 06/08/2025 12:47, Richard Henderson wrote: On 8/6/25 00:45, Karl Meakin wrote: Now that the body of `cbranch` and `cbranch` are the same, could we merge them into one rule? No, the bodies are the same but the predicates are not. r~ Good point, I missed that

Re: [PATCH 7/8] aarch64: Consider TARGET_CMPBR in rtx costs

2025-08-05 Thread Karl Meakin
On 04/08/2025 22:18, Richard Henderson wrote: gcc: * config/aarch64/aarch64.cc (aarch64_if_then_else_costs): Use aarch64_cb_rhs to match CB insns. --- gcc/config/aarch64/aarch64.cc | 5 + 1 file changed, 5 insertions(+) diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/co

Re: [PATCH 8/8] aarch64: Use cc when CB/CBB/CBH is out-of-range

2025-08-05 Thread Karl Meakin
On 04/08/2025 22:18, Richard Henderson wrote: Middle distance branches between 1KiB and 1MiB may be implemented with cmp+branch instead of branch+branch. gcc: * config/aarch64/aarch64.cc (*aarch64_cb): Fall back to cmp/cmn + bcond if !far_branch. Adjust far_branch to 1M
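
The three-way choice described in this message can be sketched roughly as follows. This is an illustration only, not the code in `aarch64.cc` or `aarch64.md`: the enum, the function name and the exact byte thresholds are assumptions, and GCC's own length calculation may use slightly different bounds.

```c
#include <stdint.h>

/* Hypothetical names for the three ways to emit a compare-and-branch.  */
enum cb_lowering {
    CB_DIRECT,     /* a single CB/CBB/CBH instruction                   */
    CB_CMP_BCOND,  /* cmp/cmn followed by b.cond                        */
    CB_FAR_BRANCH  /* inverted branch over an unconditional b           */
};

/* Pick a lowering from the byte displacement to the target label.
   Assumed ranges: about +/-1KiB for CB, about +/-1MiB for b.cond.  */
static enum cb_lowering choose_cb_lowering(int64_t disp)
{
    if (disp >= -1024 && disp <= 1020)
        return CB_DIRECT;
    if (disp >= -1048576 && disp <= 1048572)
        return CB_CMP_BCOND;
    return CB_FAR_BRANCH;
}
```

The middle case is the one the patch adds: a target that is too far for CB but still within reach of `b.cond` no longer needs the branch-over-branch sequence.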

Re: [PATCH 4/8] aarch64: Disable TARGET_CMPBR with aarch64_track_speculation

2025-08-05 Thread Karl Meakin
On 04/08/2025 22:18, Richard Henderson wrote: With -mtrack-speculation, CC_REGNUM must be used at every conditional branch. gcc: * config/aarch64/aarch64.h (TARGET_CMPBR): False when aarch64_track_speculation is true. --- gcc/config/aarch64/aarch64.h | 5 +++-- 1 file change

Re: [PATCH 3/8] aarch64: Drop cbranch4 expander

2025-08-05 Thread Karl Meakin
On 04/08/2025 22:18, Richard Henderson wrote: If we implement bare QI/HImode cbranch, movcc will ask aarch64_gen_compare_reg for a QI/HImode compare, which we cannot provide without modification elsewhere. However, we can usually get the extensions for free from surrounding operations. So e.g

Re: [PATCH 2/8] aarch64: Fix spelling of BRANCH_LEN_N_1KiB

2025-08-05 Thread Karl Meakin
On 04/08/2025 22:18, Richard Henderson wrote: One kilobyte not one kilobit. gcc: * config/aarch64/aarch64.md (BRANCH_LEN_N_1KiB): Rename from BRANCH_LEN_N_1Kib. --- gcc/config/aarch64/aarch64.md | 20 ++-- 1 file changed, 10 insertions(+), 10 deletions(-) di

Re: [PATCH 6/8] aarch64: Add cc clobber to compare-and-branch patterns

2025-08-05 Thread Karl Meakin
On 04/08/2025 22:18, Richard Henderson wrote: Some of the compare-and-branch patterns rely on CC for scratch in some of the alternative expansions. This is fine, because when the combined compare-and-branch patterns are formed by combine, we will be eliminating a write to CC, so CC is dead any

Re: [PATCH 1/8] aarch64: Drop label format argument from aarch64_gen_far_branch

2025-08-05 Thread Karl Meakin
On 04/08/2025 22:18, Richard Henderson wrote: There's no need for each branch-over-branch to choose its own label format. gcc: * config/aarch64/aarch64.cc (aarch64_gen_far_branch): Drop dest argument; always use "L". * config/aarch64/aarch64.md: Update to match.

Re: [PATCH 0/8] aarch64: CMPBR fixes

2025-08-05 Thread Karl Meakin
Thanks Richard. I had a WIP patch to fix some of these issues but you beat me to it, and I'll defer to your patch since you have more experience with GCC. I would ask, have you checked that the generated assembly in `gcc/testsuite/gcc.target/aarch64/cmpbr.c` hasn't changed? Th

[PATCH v3 0/2] middle-end: Enable masked load with non-constant offset

2025-07-15 Thread Karl Meakin
3: No changes. * v2: Make assertions in `vect_check_gather_scatter` into checking assertions. * v1: Initial patch Karl Meakin (2): AArch64: precommit test for masked load vectorisation. middle-end: Enable masked load with non-constant offset .../gcc.target/aarch64/sve/mask_load_2.c

[PATCH v3 2/2] middle-end: Enable masked load with non-constant offset

2025-07-15 Thread Karl Meakin
The function `vect_check_gather_scatter` requires the `base` of the load to be loop-invariant and the `off`set not to be loop-invariant. When `base` is not loop-invariant, instead of giving up immediately we can try swapping `base` and `off`, if `off` is actually loo

[PATCH v3 1/2] AArch64: precommit test for masked load vectorisation.

2025-07-15 Thread Karl Meakin
Commit the test file `mask_load_2.c` before the vectorisation analysis is changed, so that the changes in codegen are more obvious in the next commit. gcc/testsuite/ChangeLog: * gcc.target/aarch64/sve/mask_load_2.c: New test. --- .../gcc.target/aarch64/sve/mask_load_2.c | 23

Re: [PATCH v8 0/9] AArch64: CMPBR support

2025-07-03 Thread Karl Meakin
On 02/07/2025 18:45, Karl Meakin wrote: This patch series adds support for the CMPBR extension. It includes the new `+cmpbr` option and rules to generate the new instructions when lowering conditional branches. Changelog: * v8: - Support far branches for the `CBB` and `CBH` instructions

[PATCH v9 0/9] AArch64: CMPBR support

2025-07-02 Thread Karl Meakin
s for immediate RHSes in aarch64.cc: CBGE, CBHS, CBLE and CBLS have different ranges of allowed immediates than the other comparisons. Karl Meakin (9): AArch64: place branch instruction rules together AArch64: reformat branch instruction rules AArch64: rename branch instruction
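
One way to see why the four comparisons get different immediate ranges, as a hedged sketch. It assumes that the base comparisons (EQ, NE, GT, LT, HI, LO) take an immediate in 0..63 and that GE/HS and LE/LS are expressed as GT/HI with `imm - 1` and LT/LO with `imm + 1`. Both assumptions and all names below are illustrative; this is not the `aarch64_cb_rhs` function from the series.

```c
#include <stdbool.h>

/* Hypothetical comparison codes for the CB (immediate) forms.  */
enum cb_cmp { CB_EQ, CB_NE, CB_GT, CB_LT, CB_HI, CB_LO,
              CB_GE, CB_LE, CB_HS, CB_LS };

/* Returns true if `imm` is usable with comparison `cmp`, under the
   assumptions stated above.  */
static bool cb_imm_ok(enum cb_cmp cmp, long imm)
{
    switch (cmp) {
    case CB_GE:
    case CB_HS:
        imm -= 1;   /* x >= imm  is rewritten as  x > imm - 1 */
        break;
    case CB_LE:
    case CB_LS:
        imm += 1;   /* x <= imm  is rewritten as  x < imm + 1 */
        break;
    default:
        break;
    }
    /* After adjustment the value must fit the assumed 0..63 field.  */
    return imm >= 0 && imm <= 63;
}
```

Under these assumptions GE and HS accept 1..64, LE and LS accept -1..62, and every other comparison accepts 0..63.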

[PATCH v9 2/9] AArch64: reformat branch instruction rules

2025-07-02 Thread Karl Meakin
Make the formatting of the RTL templates in the rules for branch instructions more consistent with each other. gcc/ChangeLog: * config/aarch64/aarch64.md (cbranch4): Reformat. (cbranchcc4): Likewise. (condjump): Likewise. (*compare_condjump): Likewise. (aar

[PATCH v9 6/9] AArch64: recognize `+cmpbr` option

2025-07-02 Thread Karl Meakin
Add the `+cmpbr` option to enable the FEAT_CMPBR architectural extension. gcc/ChangeLog: * config/aarch64/aarch64-option-extensions.def (cmpbr): New option. * config/aarch64/aarch64.h (TARGET_CMPBR): New macro. * doc/invoke.texi (cmpbr): New option. --- gcc/config

[PATCH v9 1/9] AArch64: place branch instruction rules together

2025-07-02 Thread Karl Meakin
The rules for conditional branches were spread throughout `aarch64.md`. Group them together so it is easier to understand how `cbranch4` is lowered to RTL. gcc/ChangeLog: * config/aarch64/aarch64.md (condjump): Move. (*compare_condjump): Likewise. (aarch64_cb1): Likewise.

[PATCH v9 9/9] AArch64: make rules for CBZ/TBZ higher priority

2025-07-02 Thread Karl Meakin
Move the rules for CBZ/TBZ to be above the rules for CBB/CBH/CB. We want them to have higher priority because they can express larger displacements. gcc/ChangeLog: * config/aarch64/aarch64.md (aarch64_cbz1): Move above rules for CBB/CBH/CB. (*aarch64_tbz1): Likewise. gcc/

[PATCH v9 3/9] AArch64: rename branch instruction rules

2025-07-02 Thread Karl Meakin
Give the `define_insn` rules used in lowering `cbranch4` to RTL more descriptive and consistent names: from now on, each rule is named after the AArch64 instruction that it generates. Also add comments to document each rule. gcc/ChangeLog: * config/aarch64/aarch64.md (condjump): Rename to

[PATCH v9 5/9] AArch64: make `far_branch` attribute a boolean

2025-07-02 Thread Karl Meakin
The `far_branch` attribute only ever takes the values 0 or 1, so make it a `no/yes` valued string attribute instead. gcc/ChangeLog: * config/aarch64/aarch64.md (far_branch): Replace 0/1 with no/yes. (aarch64_bcond): Handle rename. (aarch64_cbz1): Likewise.

[PATCH v9 8/9] AArch64: rules for CMPBR instructions

2025-07-02 Thread Karl Meakin
Add rules for lowering `cbranch4` to CBB/CBH/CB when CMPBR extension is enabled. gcc/ChangeLog: * config/aarch64/aarch64-protos.h (aarch64_cb_rhs): New function. * config/aarch64/aarch64.cc (aarch64_cb_rhs): Likewise. * config/aarch64/aarch64.md (cbranch4): Rename to ...

[PATCH v9 7/9] AArch64: precommit test for CMPBR instructions

2025-07-02 Thread Karl Meakin
Commit the test file `cmpbr.c` before rules for generating the new instructions are added, so that the changes in codegen are more obvious in the next commit. gcc/testsuite/ChangeLog: * lib/target-supports.exp: Add `cmpbr` to the list of extensions. * gcc.target/aarch64/cmpbr.c: N

[PATCH v9 4/9] AArch64: add constants for branch displacements

2025-07-02 Thread Karl Meakin
Extract the hardcoded values for the minimum PC-relative displacements into named constants and document them. gcc/ChangeLog: * config/aarch64/aarch64.md (BRANCH_LEN_P_128MiB): New constant. (BRANCH_LEN_N_128MiB): Likewise. (BRANCH_LEN_P_1MiB): Likewise. (BRANCH_LE
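
For orientation, the displacement limits behind names such as `BRANCH_LEN_P_1MiB` and `BRANCH_LEN_N_128MiB` can be derived from the width of the branch offset field. The sketch below assumes a signed field that counts 4-byte instructions (26 bits for `B`, 19 bits for `B.cond`); the helper names are hypothetical, and the constants in `aarch64.md` may be defined differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Most negative byte displacement for a signed `bits`-wide word offset.  */
static int64_t branch_min_disp(unsigned bits)
{
    return -((int64_t)1 << (bits - 1)) * 4;
}

/* Most positive byte displacement for a signed `bits`-wide word offset.  */
static int64_t branch_max_disp(unsigned bits)
{
    return (((int64_t)1 << (bits - 1)) - 1) * 4;
}

/* True if `disp` is reachable with a `bits`-wide offset field.  */
static bool branch_in_range(int64_t disp, unsigned bits)
{
    return disp >= branch_min_disp(bits) && disp <= branch_max_disp(bits);
}
```

The asymmetry between the positive and negative limit (for example 1048572 versus -1048576 for a 19-bit field) is why separate `_P_` and `_N_` constants are useful.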

[PATCH v8 8/9] AArch64: rules for CMPBR instructions

2025-07-02 Thread Karl Meakin
Add rules for lowering `cbranch4` to CBB/CBH/CB when CMPBR extension is enabled. gcc/ChangeLog: * config/aarch64/aarch64-protos.h (aarch64_cb_rhs): New function. * config/aarch64/aarch64.cc (aarch64_cb_rhs): Likewise. * config/aarch64/aarch64.md (cbranch4): Rename to ...

[PATCH v8 1/9] AArch64: place branch instruction rules together

2025-07-02 Thread Karl Meakin
The rules for conditional branches were spread throughout `aarch64.md`. Group them together so it is easier to understand how `cbranch4` is lowered to RTL. gcc/ChangeLog: * config/aarch64/aarch64.md (condjump): Move. (*compare_condjump): Likewise. (aarch64_cb1): Likewise.

[PATCH v8 9/9] AArch64: make rules for CBZ/TBZ higher priority

2025-07-02 Thread Karl Meakin
Move the rules for CBZ/TBZ to be above the rules for CBB/CBH/CB. We want them to have higher priority because they can express larger displacements. gcc/ChangeLog: * config/aarch64/aarch64.md (aarch64_cbz1): Move above rules for CBB/CBH/CB. (*aarch64_tbz1): Likewise. gcc/

[PATCH v8 4/9] AArch64: add constants for branch displacements

2025-07-02 Thread Karl Meakin
Extract the hardcoded values for the minimum PC-relative displacements into named constants and document them. gcc/ChangeLog: * config/aarch64/aarch64.md (BRANCH_LEN_P_128MiB): New constant. (BRANCH_LEN_N_128MiB): Likewise. (BRANCH_LEN_P_1MiB): Likewise. (BRANCH_LE

[PATCH v8 3/9] AArch64: rename branch instruction rules

2025-07-02 Thread Karl Meakin
Give the `define_insn` rules used in lowering `cbranch4` to RTL more descriptive and consistent names: from now on, each rule is named after the AArch64 instruction that it generates. Also add comments to document each rule. gcc/ChangeLog: * config/aarch64/aarch64.md (condjump): Rename to

[PATCH v8 5/9] AArch64: make `far_branch` attribute a boolean

2025-07-02 Thread Karl Meakin
The `far_branch` attribute only ever takes the values 0 or 1, so make it a `no/yes` valued string attribute instead. gcc/ChangeLog: * config/aarch64/aarch64.md (far_branch): Replace 0/1 with no/yes. (aarch64_bcond): Handle rename. (aarch64_cbz1): Likewise.

[PATCH v8 6/9] AArch64: recognize `+cmpbr` option

2025-07-02 Thread Karl Meakin
Add the `+cmpbr` option to enable the FEAT_CMPBR architectural extension. gcc/ChangeLog: * config/aarch64/aarch64-option-extensions.def (cmpbr): New option. * config/aarch64/aarch64.h (TARGET_CMPBR): New macro. * doc/invoke.texi (cmpbr): New option. --- gcc/config

[PATCH v8 7/9] AArch64: precommit test for CMPBR instructions

2025-07-02 Thread Karl Meakin
Commit the test file `cmpbr.c` before rules for generating the new instructions are added, so that the changes in codegen are more obvious in the next commit. gcc/testsuite/ChangeLog: * lib/target-supports.exp: Add `cmpbr` to the list of extensions. * gcc.target/aarch64/cmpbr.c: N

[PATCH v8 2/9] AArch64: reformat branch instruction rules

2025-07-02 Thread Karl Meakin
Make the formatting of the RTL templates in the rules for branch instructions more consistent with each other. gcc/ChangeLog: * config/aarch64/aarch64.md (cbranch4): Reformat. (cbranchcc4): Likewise. (condjump): Likewise. (*compare_condjump): Likewise. (aar

[PATCH v8 0/9] AArch64: CMPBR support

2025-07-02 Thread Karl Meakin
ormat-patch`. * v4: - Added a commit to use HS/LO instead of CS/CC mnemonics. - Rewrite the range checks for immediate RHSes in aarch64.cc: CBGE, CBHS, CBLE and CBLS have different ranges of allowed immediates than the other comparisons. Karl Meakin (9): AArch64: place branch instru

Re: [PATCH v7 8/9] AArch64: rules for CMPBR instructions

2025-07-02 Thread Karl Meakin
On 01/07/2025 11:02, Richard Sandiford wrote: Karl Meakin writes: @@ -763,6 +784,68 @@ (define_expand "cbranchcc4" "" ) +;; Emit a `CB (register)` or `CB (immediate)` instruction. +;; The immediate range depends on the comparison code. +;; Comparisons agains

[PATCH v7 9/9] AArch64: make rules for CBZ/TBZ higher priority

2025-06-25 Thread Karl Meakin
Move the rules for CBZ/TBZ to be above the rules for CBB/CBH/CB. We want them to have higher priority because they can express larger displacements. gcc/ChangeLog: * config/aarch64/aarch64.md (aarch64_cbz1): Move above rules for CBB/CBH/CB. (*aarch64_tbz1): Likewise. gcc/

[PATCH v2 2/2] middle-end: Enable masked load with non-constant offset

2025-06-25 Thread Karl Meakin
The function `vect_check_gather_scatter` requires the `base` of the load to be loop-invariant and the `off`set not to be loop-invariant. When `base` is not loop-invariant, instead of giving up immediately we can try swapping `base` and `off`, if `off` is actually loo

[PATCH v7 3/9] AArch64: rename branch instruction rules

2025-06-25 Thread Karl Meakin
Give the `define_insn` rules used in lowering `cbranch4` to RTL more descriptive and consistent names: from now on, each rule is named after the AArch64 instruction that it generates. Also add comments to document each rule. gcc/ChangeLog: * config/aarch64/aarch64.md (condjump): Rename to

[PATCH v6 6/9] AArch64: recognize `+cmpbr` option

2025-06-25 Thread Karl Meakin
Add the `+cmpbr` option to enable the FEAT_CMPBR architectural extension. gcc/ChangeLog: * config/aarch64/aarch64-option-extensions.def (cmpbr): New option. * config/aarch64/aarch64.h (TARGET_CMPBR): New macro. * doc/invoke.texi (cmpbr): New option. --- gcc/config

[PATCH v6 7/9] AArch64: precommit test for CMPBR instructions

2025-06-25 Thread Karl Meakin
Commit the test file `cmpbr.c` before rules for generating the new instructions are added, so that the changes in codegen are more obvious in the next commit. gcc/testsuite/ChangeLog: * lib/target-supports.exp: Add `cmpbr` to the list of extensions. * gcc.target/aarch64/cmpbr.c: N

[PATCH v7 2/9] AArch64: reformat branch instruction rules

2025-06-25 Thread Karl Meakin
Make the formatting of the RTL templates in the rules for branch instructions more consistent with each other. gcc/ChangeLog: * config/aarch64/aarch64.md (cbranch4): Reformat. (cbranchcc4): Likewise. (condjump): Likewise. (*compare_condjump): Likewise. (aar

[PATCH v1 2/2] middle-end: Enable masked load with non-constant offset

2025-06-25 Thread Karl Meakin
The function `vect_check_gather_scatter` requires the `base` of the load to be loop-invariant and the `off`set not to be loop-invariant. When `base` is not loop-invariant, instead of giving up immediately we can try swapping `base` and `off`, if `off` is actually loo

[PATCH v7 6/9] AArch64: recognize `+cmpbr` option

2025-06-25 Thread Karl Meakin
Add the `+cmpbr` option to enable the FEAT_CMPBR architectural extension. gcc/ChangeLog: * config/aarch64/aarch64-option-extensions.def (cmpbr): New option. * config/aarch64/aarch64.h (TARGET_CMPBR): New macro. * doc/invoke.texi (cmpbr): New option. --- gcc/config

[PATCH v6 4/9] AArch64: add constants for branch displacements

2025-06-25 Thread Karl Meakin
Extract the hardcoded values for the minimum PC-relative displacements into named constants and document them. gcc/ChangeLog: * config/aarch64/aarch64.md (BRANCH_LEN_P_128MiB): New constant. (BRANCH_LEN_N_128MiB): Likewise. (BRANCH_LEN_P_1MiB): Likewise. (BRANCH_LE

[PATCH v7 1/9] AArch64: place branch instruction rules together

2025-06-25 Thread Karl Meakin
The rules for conditional branches were spread throughout `aarch64.md`. Group them together so it is easier to understand how `cbranch4` is lowered to RTL. gcc/ChangeLog: * config/aarch64/aarch64.md (condjump): Move. (*compare_condjump): Likewise. (aarch64_cb1): Likewise.

[PATCH v7 8/9] AArch64: rules for CMPBR instructions

2025-06-25 Thread Karl Meakin
Add rules for lowering `cbranch4` to CBB/CBH/CB when CMPBR extension is enabled. gcc/ChangeLog: * config/aarch64/aarch64-protos.h (aarch64_cb_rhs): New function. * config/aarch64/aarch64.cc (aarch64_cb_rhs): Likewise. * config/aarch64/aarch64.md (cbranch4): Rename to ...

[PATCH v7 0/9] AArch64: CMPBR support

2025-06-25 Thread Karl Meakin
* v4: - Added a commit to use HS/LO instead of CS/CC mnemonics. - Rewrite the range checks for immediate RHSes in aarch64.cc: CBGE, CBHS, CBLE and CBLS have different ranges of allowed immediates than the other comparisons. Karl Meakin (9): AArch64: place branch instruction rules tog

[PATCH v7 7/9] AArch64: precommit test for CMPBR instructions

2025-06-25 Thread Karl Meakin
Commit the test file `cmpbr.c` before rules for generating the new instructions are added, so that the changes in codegen are more obvious in the next commit. gcc/testsuite/ChangeLog: * lib/target-supports.exp: Add `cmpbr` to the list of extensions. * gcc.target/aarch64/cmpbr.c: N

[PATCH v7 4/9] AArch64: add constants for branch displacements

2025-06-25 Thread Karl Meakin
Extract the hardcoded values for the minimum PC-relative displacements into named constants and document them. gcc/ChangeLog: * config/aarch64/aarch64.md (BRANCH_LEN_P_128MiB): New constant. (BRANCH_LEN_N_128MiB): Likewise. (BRANCH_LEN_P_1MiB): Likewise. (BRANCH_LE

[PATCH v7 5/9] AArch64: make `far_branch` attribute a boolean

2025-06-25 Thread Karl Meakin
The `far_branch` attribute only ever takes the values 0 or 1, so make it a `no/yes` valued string attribute instead. gcc/ChangeLog: * config/aarch64/aarch64.md (far_branch): Replace 0/1 with no/yes. (aarch64_bcond): Handle rename. (aarch64_cbz1): Likewise.

[PATCH v6 5/9] AArch64: make `far_branch` attribute a boolean

2025-06-24 Thread Karl Meakin
The `far_branch` attribute only ever takes the values 0 or 1, so make it a `no/yes` valued string attribute instead. gcc/ChangeLog: * config/aarch64/aarch64.md (far_branch): Replace 0/1 with no/yes. (aarch64_bcond): Handle rename. (aarch64_cbz1): Likewise.

[PATCH v1 1/2] AArch64: precommit test for masked load vectorisation.

2025-06-24 Thread Karl Meakin
Commit the test file `mask_load_2.c` before the vectorisation analysis is changed, so that the changes in codegen are more obvious in the next commit. gcc/testsuite/ChangeLog: * gcc.target/aarch64/sve/mask_load_2.c: New test. --- .../gcc.target/aarch64/sve/mask_load_2.c | 23

[PATCH v6 0/9] AArch64: CMPBR support

2025-06-24 Thread Karl Meakin
. Testing done: `make bootstrap; make check` Karl Meakin (9): AArch64: place branch instruction rules together AArch64: reformat branch instruction rules AArch64: rename branch instruction rules AArch64: add constants for branch displacements AArch64: make `far_branch` attribute a boolean

[PATCH v6 9/9] AArch64: make rules for CBZ/TBZ higher priority

2025-06-24 Thread Karl Meakin
Move the rules for CBZ/TBZ to be above the rules for CBB/CBH/CB. We want them to have higher priority because they can express larger displacements. gcc/ChangeLog: * config/aarch64/aarch64.md (aarch64_cbz1): Move above rules for CBB/CBH/CB. (*aarch64_tbz1): Likewise. gcc/

[PATCH v6 3/9] AArch64: rename branch instruction rules

2025-06-24 Thread Karl Meakin
Give the `define_insn` rules used in lowering `cbranch4` to RTL more descriptive and consistent names: from now on, each rule is named after the AArch64 instruction that it generates. Also add comments to document each rule. gcc/ChangeLog: * config/aarch64/aarch64.md (condjump): Rename to

[PATCH v6 8/9] AArch64: rules for CMPBR instructions

2025-06-24 Thread Karl Meakin
Add rules for lowering `cbranch4` to CBB/CBH/CB when CMPBR extension is enabled. gcc/ChangeLog: * config/aarch64/aarch64-protos.h (aarch64_cb_rhs): New function. * config/aarch64/aarch64.cc (aarch64_cb_rhs): Likewise. * config/aarch64/aarch64.md (cbranch4): Rename to ...

[PATCH v6 1/9] AArch64: place branch instruction rules together

2025-06-24 Thread Karl Meakin
The rules for conditional branches were spread throughout `aarch64.md`. Group them together so it is easier to understand how `cbranch4` is lowered to RTL. gcc/ChangeLog: * config/aarch64/aarch64.md (condjump): Move. (*compare_condjump): Likewise. (aarch64_cb1): Likewise.

[PATCH v6 2/9] AArch64: reformat branch instruction rules

2025-06-24 Thread Karl Meakin
Make the formatting of the RTL templates in the rules for branch instructions more consistent with each other. gcc/ChangeLog: * config/aarch64/aarch64.md (cbranch4): Reformat. (cbranchcc4): Likewise. (condjump): Likewise. (*compare_condjump): Likewise. (aar

[PATCH v1 0/2] middle-end: Enable masked load with non-constant offset

2025-06-24 Thread Karl Meakin
= 0; i < len; i++) { Array *p = pp[i]; if (p) { nRet += p->elems[idx]; } } return nRet; } ``` Changelog: - v1: Initial patch Karl Meakin (2): AArch64: precommit test for masked load vectorisation. middle-end: Enable masked load with non-co
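
The start of the example in this cover letter is cut off in the archive snippet. A self-contained loop of the same shape might look like the sketch below. The `Array` layout, the function name `sum_at` and the parameter types are assumptions made for illustration; only `pp`, `p`, `len`, `idx`, `elems` and `nRet` appear in the fragment.

```c
#include <stddef.h>

/* Hypothetical element container; the real definition is not shown.  */
typedef struct Array {
    int elems[8];
} Array;

/* Sum element `idx` of every non-null array in `pp`.
   The load `p->elems[idx]` is conditional on `p`, so vectorising it needs
   a masked load.  Its pointer `p` changes on every iteration while the
   index `idx` is loop-invariant.  */
int sum_at(Array **pp, size_t len, size_t idx)
{
    int nRet = 0;
    for (size_t i = 0; i < len; i++) {
        Array *p = pp[i];
        if (p) {
            nRet += p->elems[idx];
        }
    }
    return nRet;
}
```

Here the varying part of the address is the pointer and the fixed part is the index, which is the situation the series describes as swapping `base` and `off`.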

[PATCH v5 03/10] AArch64: rename branch instruction rules

2025-06-19 Thread Karl Meakin
Give the `define_insn` rules used in lowering `cbranch4` to RTL more descriptive and consistent names: from now on, each rule is named after the AArch64 instruction that it generates. Also add comments to document each rule. gcc/ChangeLog: * config/aarch64/aarch64.md (condjump): Rename to

[PATCH v5 07/10] AArch64: add `%j` and `%J` format specifiers

2025-06-19 Thread Karl Meakin
The CB family of instructions does not support using the CS or CC condition codes; instead the synonyms HS and LO must be used. GCC has traditionally used the CS and CC names. To work around this while avoiding test churn, add new `j` and `J` format specifiers; they will be used in the next commit
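
The renaming this message describes amounts to a small mapping from the traditional condition names to their synonyms. The sketch below shows only that mapping; the function name is hypothetical and it does not reproduce how the `%j` and `%J` specifiers are implemented in GCC.

```c
#include <string.h>

/* Hypothetical helper: return the condition name to print in a CB
   instruction.  CS and CC are replaced by their synonyms HS and LO;
   every other condition name is returned unchanged.  */
static const char *cb_cond_name(const char *cond)
{
    if (strcmp(cond, "cs") == 0)
        return "hs";
    if (strcmp(cond, "cc") == 0)
        return "lo";
    return cond;
}
```

Keeping the old names everywhere else and translating only when a CB instruction is printed is what avoids churn in existing tests.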

[PATCH v5 01/10] AArch64: place branch instruction rules together

2025-06-19 Thread Karl Meakin
The rules for conditional branches were spread throughout `aarch64.md`. Group them together so it is easier to understand how `cbranch4` is lowered to RTL. gcc/ChangeLog: * config/aarch64/aarch64.md (condjump): Move. (*compare_condjump): Likewise. (aarch64_cb1): Likewise.

[PATCH v5 08/10] AArch64: precommit test for CMPBR instructions

2025-06-19 Thread Karl Meakin
Commit the test file `cmpbr.c` before rules for generating the new instructions are added, so that the changes in codegen are more obvious in the next commit. gcc/testsuite/ChangeLog: * lib/target-supports.exp: Add `cmpbr` to the list of extensions. * gcc.target/aarch64/cmpbr.c: N

[PATCH v5 02/10] AArch64: reformat branch instruction rules

2025-06-19 Thread Karl Meakin
Make the formatting of the RTL templates in the rules for branch instructions more consistent with each other. gcc/ChangeLog: * config/aarch64/aarch64.md (cbranch4): Reformat. (cbranchcc4): Likewise. (condjump): Likewise. (*compare_condjump): Likewise. (aar

[PATCH v5 10/10] AArch64: make rules for CBZ/TBZ higher priority

2025-06-19 Thread Karl Meakin
Move the rules for CBZ/TBZ to be above the rules for CBB/CBH/CB. We want them to have higher priority because they can express larger displacements. gcc/ChangeLog: * config/aarch64/aarch64.md (aarch64_cbz1): Move above rules for CBB/CBH/CB. (*aarch64_tbz1): Likewise. gcc/

[PATCH v5 09/10] AArch64: rules for CMPBR instructions

2025-06-19 Thread Karl Meakin
Add rules for lowering `cbranch4` to CBB/CBH/CB when CMPBR extension is enabled. gcc/ChangeLog: * config/aarch64/aarch64.md (BRANCH_LEN_P_1Kib): New constant. (BRANCH_LEN_N_1Kib): Likewise. (cbranch4): Emit CMPBR instructions if possible. (cbranch4): New expand rul

[PATCH v5 04/10] AArch64: add constants for branch displacements

2025-06-19 Thread Karl Meakin
Extract the hardcoded values for the minimum PC-relative displacements into named constants and document them. gcc/ChangeLog: * config/aarch64/aarch64.md (BRANCH_LEN_P_128MiB): New constant. (BRANCH_LEN_N_128MiB): Likewise. (BRANCH_LEN_P_1MiB): Likewise. (BRANCH_LE

[PATCH v5 05/10] AArch64: make `far_branch` attribute a boolean

2025-06-19 Thread Karl Meakin
The `far_branch` attribute only ever takes the values 0 or 1, so make it a `no/yes` valued string attribute instead. gcc/ChangeLog: * config/aarch64/aarch64.md (far_branch): Replace 0/1 with no/yes. (aarch64_bcond): Handle rename. (aarch64_cbz1): Likewise.

[PATCH v5 06/10] AArch64: recognize `+cmpbr` option

2025-06-19 Thread Karl Meakin
Add the `+cmpbr` option to enable the FEAT_CMPBR architectural extension. gcc/ChangeLog: * config/aarch64/aarch64-option-extensions.def (cmpbr): New option. * config/aarch64/aarch64.h (TARGET_CMPBR): New macro. * doc/invoke.texi (cmpbr): New option. --- gcc/config

[PATCH v5 00/10] AArch64: CMPBR support

2025-06-19 Thread Karl Meakin
`--function-context` to * `git format-patch`. Testing done: `make bootstrap; make check` Karl Meakin (10): AArch64: place branch instruction rules together AArch64: reformat branch instruction rules AArch64: rename branch instruction rules AArch64: add constants for branch displacements

[PATCH v1] doc: Replace "fixed-point" with "integer"

2025-05-29 Thread Karl Meakin
In some places the documentation refers to "fixed-point" types or values when talking about plain integer types. Although this is meant to mean "the opposite of floating-point", it is misleading and can be confused with the fractional types that are also known as "fixed-point". For the avoidance of

[PATCH v4 03/10] AArch64: rename branch instruction rules

2025-05-28 Thread Karl Meakin
Give the `define_insn` rules used in lowering `cbranch4` to RTL more descriptive and consistent names: from now on, each rule is named after the AArch64 instruction that it generates. Also add comments to document each rule. gcc/ChangeLog: * config/aarch64/aarch64.md (condjump): Rename to

[PATCH v4 04/10] AArch64: add constants for branch displacements

2025-05-28 Thread Karl Meakin
Extract the hardcoded values for the minimum PC-relative displacements into named constants and document them. gcc/ChangeLog: * config/aarch64/aarch64.md (BRANCH_LEN_P_128MiB): New constant. (BRANCH_LEN_N_128MiB): Likewise. (BRANCH_LEN_P_1MiB): Likewise. (BRANCH_LE

[PATCH v4 10/10] AArch64: Use HS/LO instead of CS/CC

2025-05-28 Thread Karl Meakin
The CB family of instructions does not support using the CS or CC condition codes; instead the synonyms HS and LO must be used. GCC has traditionally used the CS and CC names. To work around this while avoiding test churn, add new `j` and `J` format specifiers and use them when generating CB instru

[PATCH v4 09/10] AArch64: make rules for CBZ/TBZ higher priority

2025-05-28 Thread Karl Meakin
Move the rules for CBZ/TBZ to be above the rules for CBB/CBH/CB. We want them to have higher priority because they can express larger displacements. gcc/ChangeLog: * config/aarch64/aarch64.md (aarch64_cbz1): Move above rules for CBB/CBH/CB. (*aarch64_tbz1): Likewise. gcc/

[PATCH v4 01/10] AArch64: place branch instruction rules together

2025-05-28 Thread Karl Meakin
The rules for conditional branches were spread throughout `aarch64.md`. Group them together so it is easier to understand how `cbranch4` is lowered to RTL. gcc/ChangeLog: * config/aarch64/aarch64.md (condjump): Move. (*compare_condjump): Likewise. (aarch64_cb1): Likewise.

[PATCH v4 05/10] AArch64: make `far_branch` attribute a boolean

2025-05-28 Thread Karl Meakin
The `far_branch` attribute only ever takes the values 0 or 1, so make it a `no/yes` valued string attribute instead. gcc/ChangeLog: * config/aarch64/aarch64.md (far_branch): Replace 0/1 with no/yes. (aarch64_bcond): Handle rename. (aarch64_cbz1): Likewise.

[PATCH v4 08/10] AArch64: rules for CMPBR instructions

2025-05-28 Thread Karl Meakin
Add rules for lowering `cbranch4` to CBB/CBH/CB when the CMPBR extension is enabled. gcc/ChangeLog: * config/aarch64/aarch64.md (BRANCH_LEN_P_1Kib): New constant. (BRANCH_LEN_N_1Kib): Likewise. (cbranch4): Emit CMPBR instructions if possible. (cbranch4): New expand rul

[PATCH v4 06/10] AArch64: recognize `+cmpbr` option

2025-05-28 Thread Karl Meakin
Add the `+cmpbr` option to enable the FEAT_CMPBR architectural extension. gcc/ChangeLog: * config/aarch64/aarch64-option-extensions.def (cmpbr): New option. * config/aarch64/aarch64.h (TARGET_CMPBR): New macro. * doc/invoke.texi (cmpbr): New option. --- gcc/config

[PATCH v4 07/10] AArch64: precommit test for CMPBR instructions

2025-05-28 Thread Karl Meakin
Commit the test file `cmpbr.c` before rules for generating the new instructions are added, so that the changes in codegen are more obvious in the next commit. gcc/testsuite/ChangeLog: * lib/target-supports.exp: Add `cmpbr` to the list of extensions. * gcc.target/aarch64/cmpbr.c: N

[PATCH v4 02/10] AArch64: reformat branch instruction rules

2025-05-28 Thread Karl Meakin
Make the formatting of the RTL templates in the rules for branch instructions more consistent with each other. gcc/ChangeLog: * config/aarch64/aarch64.md (cbranch4): Reformat. (cbranchcc4): Likewise. (condjump): Likewise. (*compare_condjump): Likewise. (aar

[PATCH v4 00/10] AArch64: CMPBR support

2025-05-28 Thread Karl Meakin
* Added a commit to use HS/LO instead of CS/CC mnemonics. * Rewrite the range checks for immediate RHSes in aarch64.cc: CBGE, CBHS, CBLE and CBLS have different ranges of allowed immediates than the other comparisons. Karl Meakin (10): AArch64: place branch instruction rules together

[PATCH 01/10] AArch64: place branch instruction rules together

2025-05-16 Thread Karl Meakin
The rules for conditional branches were spread throughout `aarch64.md`. Group them together so it is easier to understand how `cbranch4` is lowered to RTL. gcc/ChangeLog: * config/aarch64/aarch64.md (condjump): Move. (*compare_condjump): Likewise. (aarch64_cb1): Likewise.

[PATCH 10/10] Use HS/LO instead of CS/CC

2025-05-16 Thread Karl Meakin
The CB family of instructions does not support using the CS or CC condition codes; instead the synonyms HS and LO must be used. GCC has traditionally used the CS and CC names. To work around this while avoiding test churn, add new `j` and `J` format specifiers and use them when generating CB instru

[PATCH 02/10] AArch64: reformat branch instruction rules

2025-05-16 Thread Karl Meakin
Make the formatting of the RTL templates in the rules for branch instructions more consistent with each other. gcc/ChangeLog: * config/aarch64/aarch64.md (cbranch4): Reformat. (cbranchcc4): Likewise. (condjump): Likewise. (*compare_condjump): Likewise. (aar

[PATCH 08/10] AArch64: rules for CMPBR instructions

2025-05-16 Thread Karl Meakin
Add rules for lowering `cbranch4` to CBB/CBH/CB when the CMPBR extension is enabled. gcc/ChangeLog: * config/aarch64/aarch64.md (BRANCH_LEN_P_1Kib): New constant. (BRANCH_LEN_N_1Kib): Likewise. (cbranch4): Emit CMPBR instructions if possible. (cbranch4): New expand rul

[PATCH 09/10] AArch64: make rules for CBZ/TBZ higher priority

2025-05-16 Thread Karl Meakin
Move the rules for CBZ/TBZ to be above the rules for CBB/CBH/CB. We want them to have higher priority because they can express larger displacements. gcc/ChangeLog: * config/aarch64/aarch64.md (aarch64_cbz1): Move above rules for CBB/CBH/CB. (*aarch64_tbz1): Likewise. gcc/

[PATCH 03/10] AArch64: rename branch instruction rules

2025-05-16 Thread Karl Meakin
Give the `define_insn` rules used in lowering `cbranch4` to RTL more descriptive and consistent names: from now on, each rule is named after the AArch64 instruction that it generates. Also add comments to document each rule. gcc/ChangeLog: * config/aarch64/aarch64.md (condjump): Rename to
