how to check if target supports andnot instruction ?
Hi, I was having a look at PR71636 and added the following pattern to match.pd: x & ((1U << b) - 1) -> x & ~(~0U << b) However the transform is useful only if the target supports "andnot" instruction. As pointed out by Marc in PR for -march=core2, lhs generates worse code than rhs, so we shouldn't do the transform if target doesn't support andnot insn. (perhaps we could do the reverse transform for target not supporting andnot?) However it seems andnot isn't a standard pattern name, so am not sure how to check if target supports andnot insn ? Thanks, Prathamesh
Re: how to check if target supports andnot instruction ?
On 12 October 2016 at 13:09, Prathamesh Kulkarni wrote: > Hi, > I was having a look at PR71636 and added the following pattern to match.pd: > x & ((1U << b) - 1) -> x & ~(~0U << b) > However the transform is useful only if the target supports "andnot" > instruction. > As pointed out by Marc in PR for -march=core2, lhs generates worse > code than rhs, oops, I meant to say rhs generates worse code than lhs for target not supporting andnot. > so we shouldn't do the transform if target doesn't support andnot insn. > (perhaps we could do the reverse transform for target not supporting andnot?) > > However it seems andnot isn't a standard pattern name, so am not sure > how to check > if target supports andnot insn ? > > Thanks, > Prathamesh
Re: how to check if target supports andnot instruction ?
On Wed, 12 Oct 2016, Prathamesh Kulkarni wrote: > Hi, > I was having a look at PR71636 and added the following pattern to match.pd: > x & ((1U << b) - 1) -> x & ~(~0U << b) > However the transform is useful only if the target supports "andnot" > instruction. > As pointed out by Marc in PR for -march=core2, lhs generates worse > code than rhs, > so we shouldn't do the transform if target doesn't support andnot insn. > (perhaps we could do the reverse transform for target not supporting andnot?) > > However it seems andnot isn't a standard pattern name, so am not sure > how to check > if target supports andnot insn ? There is no easy way - you can build RTL and run it through recog ... But generally on GIMPLE we do not want to apply such early target specific "optimization". Rather on GIMPLE we apply canonicalization and the above transform would be done by RTL expansion (or by a GIMPLE pass right before that). If you wanted to use match.pd for that we need to start thinking about how to handle target specific transforms that should only apply late -- for example by generating an alternate API from a separate .pd file or by adding sth similar to && reload_completed to patterns, say via doing at the end of match.pd (if (now_expanding_to_rtl) #include "target.pd" ) and have targets supply their own .pd file (where each pattern gets (if (now_expanding_to_rtl) ...) added). You'd have to supply empty target.pd or include tm.h and a new target macro specifying whether target.pd exists or not. Richard.
Re: how to check if target supports andnot instruction ?
On Wed, 12 Oct 2016, Prathamesh Kulkarni wrote: I was having a look at PR71636 and added the following pattern to match.pd: x & ((1U << b) - 1) -> x & ~(~0U << b) However the transform is useful only if the target supports "andnot" instruction. rth was selling the transformation as a canonicalization, which is beneficial when there is an andnot instruction, and neutral otherwise, so it could be done always. As pointed out by Marc in PR for -march=core2, lhs generates worse code than rhs, so we shouldn't do the transform if target doesn't support andnot insn. (perhaps we could do the reverse transform for target not supporting andnot?) Rereading my comment in the PR, I pointed out that instead of being neutral, the transformation was very slightly detrimental in one case (one extra mov) because of a RA issue. That doesn't mean we should avoid the transformation, just that we should fix the RA issue (by the way, if you have time to file a separate PR for the RA issue, that would be great, otherwise I'll try to do it at some point...). However it seems andnot isn't a standard pattern name, so am not sure how to check if target supports andnot insn ? -- Marc Glisse
Re: how to check if target supports andnot instruction ?
On Wed, 12 Oct 2016, Marc Glisse wrote: > On Wed, 12 Oct 2016, Prathamesh Kulkarni wrote: > > > I was having a look at PR71636 and added the following pattern to match.pd: > > x & ((1U << b) - 1) -> x & ~(~0U << b) > > However the transform is useful only if the target supports "andnot" > > instruction. > > rth was selling the transformation as a canonicalization, which is beneficial > when there is an andnot instruction, and neutral otherwise, so it could be > done always. Well, its three instructions to three instructions and a more expensive constant(?). ~0U might not be available as immediate for the shift instruction and 1U << b might be available as a bit-set instruction ... (vs. the andnot). So yes, we might decide to canonicalize to andnot (and decide that three binary to two binary and one unary op is "better"). So no excuse to explore the target specific .pd fragment idea ... :/ Richard. > > As pointed out by Marc in PR for -march=core2, lhs generates worse > > code than rhs, > > so we shouldn't do the transform if target doesn't support andnot insn. > > (perhaps we could do the reverse transform for target not supporting > > andnot?) > > Rereading my comment in the PR, I pointed out that instead of being neutral, > the transformation was very slightly detrimental in one case (one extra mov) > because of a RA issue. That doesn't mean we should avoid the transformation, > just that we should fix the RA issue (by the way, if you have time to file a > separate PR for the RA issue, that would be great, otherwise I'll try to do it > at some point...). > > > However it seems andnot isn't a standard pattern name, so am not sure how to > > check if target supports andnot insn ?
Potential bug with wide_int_storage::set_len
Hello, During the development of a patch I encountered some strange behavior and decided to investigate. The result of which is I think I found a bug with 'wide_int_storage::set_len' in gcc/wide-int.h. The function reads: inline void wide_int_storage::set_len (unsigned int l, bool is_sign_extended) { len = l; if (!is_sign_extended && len * HOST_BITS_PER_WIDE_INT > precision) val[len - 1] = sext_hwi (val[len - 1], precision % HOST_BITS_PER_WIDE_INT); } Now assume you have a wide_int_storage, lets call it 'result' with the following values: val = [ 0x, ...]; len = 1; precision = 32; Say you are running it on a 64-bit host: #define HOST_BITS_PER_WIDE_INT 64 and you call 'result.set_len (1, false);' Then this will sign extend the first element of val, to 0x, and I don't think this is what you want. Due to this, 'expand_expr' will expand a constant tree with unsigned integer type and value MAX_UINT to a rtx node (const_int -1). Am I missing something here? Cheers, Andre PS: I will be running tests with a patch to remove the negation in front of 'is_sign_extended' and post the patch in gcc-patches. If anyone thinks this is wrong and wants to spare me the effort please reply!
Potential bug with wide_int_storage::set_len
Hello, During the development of a patch I encountered some strange behavior and decided to investigate. The result of which is I think I found a bug with 'wide_int_storage::set_len' in gcc/wide-int.h. The function reads: inline void wide_int_storage::set_len (unsigned int l, bool is_sign_extended) { len = l; if (!is_sign_extended && len * HOST_BITS_PER_WIDE_INT > precision) val[len - 1] = sext_hwi (val[len - 1], precision % HOST_BITS_PER_WIDE_INT); } Now assume you have a wide_int_storage, lets call it 'result' with the following values: val = [ 0x, ...]; len = 1; precision = 32; Say you are running it on a 64-bit host: #define HOST_BITS_PER_WIDE_INT 64 and you call 'result.set_len (1, false);' Then this will sign extend the first element of val, to 0x, and I don't think this is what you want. Due to this, 'expand_expr' will expand a constant tree with unsigned integer type and value MAX_UINT to a rtx node (const_int -1). Am I missing something here? Cheers, Andre PS: I will be running tests with a patch to remove the negation in front of 'is_sign_extended' and post the patch in gcc-patches. If anyone thinks this is wrong and wants to spare me the effort please reply!
Re: Potential bug with wide_int_storage::set_len
On Wed, Oct 12, 2016 at 05:02:46PM +0100, Andre Vieira (lists) wrote: > Say you are running it on a 64-bit host: > #define HOST_BITS_PER_WIDE_INT 64 > > and you call 'result.set_len (1, false);' > > Then this will sign extend the first element of val, to > 0x, and I don't think this is what you want. > > Due to this, 'expand_expr' will expand a constant tree with unsigned > integer type and value MAX_UINT to a rtx node (const_int -1). That is correct. In RTL constants are always sign-extended from their precision to HOST_BITS_PER_WIDE_INT, regardless if it is "signed" or "unsigned" constant. Whether you treat the low precision bits of the constant as signed or unsigned is something encoded in the operation on it. Jakub
Re: Potential bug with wide_int_storage::set_len
On Wed, Oct 12, 2016 at 06:45:39PM +0200, Jakub Jelinek wrote: > On Wed, Oct 12, 2016 at 05:02:46PM +0100, Andre Vieira (lists) wrote: > > Say you are running it on a 64-bit host: > > #define HOST_BITS_PER_WIDE_INT 64 > > > > and you call 'result.set_len (1, false);' > > > > Then this will sign extend the first element of val, to > > 0x, and I don't think this is what you want. > > > > Due to this, 'expand_expr' will expand a constant tree with unsigned > > integer type and value MAX_UINT to a rtx node (const_int -1). > > That is correct. In RTL constants are always sign-extended from their > precision to HOST_BITS_PER_WIDE_INT, regardless if it is "signed" or > "unsigned" constant. Whether you treat the low precision bits of the > constant as signed or unsigned is something encoded in the operation on it. Here is what rtl.texi says on it: @item (const_int @var{i}) This type of expression represents the integer value @var{i}. @var{i} is customarily accessed with the macro @code{INTVAL} as in @code{INTVAL (@var{exp})}, which is equivalent to @code{XWINT (@var{exp}, 0)}. Constants generated for modes with fewer bits than in @code{HOST_WIDE_INT} must be sign extended to full width (e.g., with @code{gen_int_mode}). For constants for modes with more bits than in @code{HOST_WIDE_INT} the implied high order bits of that constant are copies of the top bit. Note however that values are neither inherently signed nor inherently unsigned; where necessary, signedness is determined by the rtl operation instead. Jakub
Re: Potential bug with wide_int_storage::set_len
> During the development of a patch I encountered some strange behavior > and decided to investigate. The result of which is I think I found a bug > with 'wide_int_storage::set_len' in gcc/wide-int.h. > > The function reads: > inline void > wide_int_storage::set_len (unsigned int l, bool is_sign_extended) > { > len = l; > if (!is_sign_extended && len * HOST_BITS_PER_WIDE_INT > precision) > val[len - 1] = sext_hwi (val[len - 1], > precision % HOST_BITS_PER_WIDE_INT); > } The code certainly lacks a comment explaining the apparent discrepancy. > Due to this, 'expand_expr' will expand a constant tree with unsigned > integer type and value MAX_UINT to a rtx node (const_int -1). As Jakub explained, that is as expected, even if a little surprising. -- Eric Botcazou