On Mon, 19 Feb 2024, Andre Vieira (lists) wrote: > Hi all, > > OK to backport this to gcc-12 and gcc-13? Patch applies cleanly, bootstrapped > and regression tested on aarch64-unknown-linux-gnu. Only change is in the > testcase as I had to use -march=armv9-a because -march=armv8-a+sve conflicts > with -mcpu=neoverse-n2 in previous gcc versions.
Yes. Thanks, Richard. > Kind Regards, > Andre > > On 20/12/2023 14:30, Richard Biener wrote: > > On Wed, 20 Dec 2023, Andre Vieira (lists) wrote: > > > >> Thanks, fully agree with all comments. > >> > >> gcc/ChangeLog: > >> > >> PR target/112787 > >> * tree-vect-generic (type_for_widest_vector_mode): Change function > >> to use original vector type and check widest vector mode has at > >> most > >> the same number of elements. > >> (get_compute_type): Pass original vector type rather than the element > >> type to type_for_widest_vector_mode and remove now obsolete check > >> for the number of elements. > > > > OK. > > > > Richard. > > > >> On 07/12/2023 07:45, Richard Biener wrote: > >>> On Wed, 6 Dec 2023, Andre Vieira (lists) wrote: > >>> > >>>> Hi, > >>>> > >>>> This patch addresses the issue reported in PR target/112787 by improving > >>>> the > >>>> compute type selection. We do this by not considering types with more > >>>> elements > >>>> than the type we are lowering since we'd reject such types anyway. > >>>> > >>>> gcc/ChangeLog: > >>>> > >>>> PR target/112787 > >>>> * tree-vect-generic (type_for_widest_vector_mode): Add a parameter to > >>>> control maximum amount of elements in resulting vector mode. > >>>> (get_compute_type): Restrict vector_compute_type to a mode no wider > >>>> than the original compute type. > >>>> > >>>> gcc/testsuite/ChangeLog: > >>>> > >>>> * gcc.target/aarch64/pr112787.c: New test. > >>>> > >>>> Bootstrapped and regression tested on aarch64-unknown-linux-gnu and > >>>> x86_64-pc-linux-gnu. > >>>> > >>>> Is this OK for trunk? > >>> > >>> @@ -1347,7 +1347,7 @@ optimize_vector_constructor (gimple_stmt_iterator > >>> *gsi) > >>> TYPE, or NULL_TREE if none is found. */ > >>> > >>> Can you improve the function comment? It also doesn't mention OP ... > >>> > >>> static tree > >>> -type_for_widest_vector_mode (tree type, optab op) > >>> +type_for_widest_vector_mode (tree type, optab op, poly_int64 max_nunits = > >>> 0) > >>> { > >>> machine_mode inner_mode = TYPE_MODE (type); > >>> machine_mode best_mode = VOIDmode, mode; > >>> @@ -1371,7 +1371,9 @@ type_for_widest_vector_mode (tree type, optab op) > >>> FOR_EACH_MODE_FROM (mode, mode) > >>> if (GET_MODE_INNER (mode) == inner_mode > >>> && maybe_gt (GET_MODE_NUNITS (mode), best_nunits) > >>> - && optab_handler (op, mode) != CODE_FOR_nothing) > >>> + && optab_handler (op, mode) != CODE_FOR_nothing > >>> + && (known_eq (max_nunits, 0) > >>> + || known_lt (GET_MODE_NUNITS (mode), max_nunits))) > >>> > >>> max_nunits suggests that known_le would be appropriate instead. > >>> > >>> I see the only other caller with similar "problems": > >>> > >>> } > >>> /* Can't use get_compute_type here, as > >>> supportable_convert_operation > >>> doesn't necessarily use an optab and needs two arguments. */ > >>> tree vec_compute_type > >>> = type_for_widest_vector_mode (TREE_TYPE (arg_type), mov_optab); > >>> if (vec_compute_type > >>> && VECTOR_MODE_P (TYPE_MODE (vec_compute_type)) > >>> && subparts_gt (arg_type, vec_compute_type)) > >>> > >>> so please do not default to 0 but adjust this one as well. It also > >>> seems you then can remove the subparts_gt guards on both > >>> vec_compute_type uses. > >>> > >>> I think the API would be cleaner if we'd pass the original vector type > >>> we can then extract TYPE_VECTOR_SUBPARTS from, avoiding the extra arg. > >>> > >>> No? > >>> > >>> Thanks, > >>> Richard. > >> > > > > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)