how to check if target supports andnot instruction ?

2016-10-12 Thread Prathamesh Kulkarni
Hi,
I was having a look at PR71636 and added the following pattern to match.pd:
x & ((1U << b) - 1) -> x & ~(~0U << b)
However the transform is useful only if the target supports "andnot"
instruction.
As pointed out by Marc in PR for -march=core2, lhs generates worse
code than rhs,
so we shouldn't do the transform if target doesn't support andnot insn.
(perhaps we could do the reverse transform for target not supporting andnot?)

However it seems andnot isn't a standard pattern name, so am not sure
how to check
if target supports andnot insn ?

Thanks,
Prathamesh


Re: how to check if target supports andnot instruction ?

2016-10-12 Thread Prathamesh Kulkarni
On 12 October 2016 at 13:09, Prathamesh Kulkarni
 wrote:
> Hi,
> I was having a look at PR71636 and added the following pattern to match.pd:
> x & ((1U << b) - 1) -> x & ~(~0U << b)
> However the transform is useful only if the target supports "andnot"
> instruction.
> As pointed out by Marc in PR for -march=core2, lhs generates worse
> code than rhs,
oops, I meant to say rhs generates worse code than lhs for  target not
supporting andnot.
> so we shouldn't do the transform if target doesn't support andnot insn.
> (perhaps we could do the reverse transform for target not supporting andnot?)
>
> However it seems andnot isn't a standard pattern name, so am not sure
> how to check
> if target supports andnot insn ?
>
> Thanks,
> Prathamesh


Re: how to check if target supports andnot instruction ?

2016-10-12 Thread Richard Biener
On Wed, 12 Oct 2016, Prathamesh Kulkarni wrote:

> Hi,
> I was having a look at PR71636 and added the following pattern to match.pd:
> x & ((1U << b) - 1) -> x & ~(~0U << b)
> However the transform is useful only if the target supports "andnot"
> instruction.
> As pointed out by Marc in PR for -march=core2, lhs generates worse
> code than rhs,
> so we shouldn't do the transform if target doesn't support andnot insn.
> (perhaps we could do the reverse transform for target not supporting andnot?)
> 
> However it seems andnot isn't a standard pattern name, so am not sure
> how to check
> if target supports andnot insn ?

There is no easy way - you can build RTL and run it through recog ...

But generally on GIMPLE we do not want to apply such early target specific
"optimization".  Rather on GIMPLE we apply canonicalization and the
above transform would be done by RTL expansion (or by a GIMPLE pass
right before that).  If you wanted to use match.pd for that we need
to start thinking about how to handle target specific transforms that
should only apply late -- for example by generating an alternate API
from a separate .pd file or by adding sth similar to && reload_completed
to patterns, say via doing at the end of match.pd

(if (now_expanding_to_rtl)

#include "target.pd"

)

and have targets supply their own .pd file (where each pattern
gets (if (now_expanding_to_rtl) ...) added).

You'd have to supply empty target.pd or include tm.h and a
new target macro specifying whether target.pd exists or not.

Richard.


Re: how to check if target supports andnot instruction ?

2016-10-12 Thread Marc Glisse

On Wed, 12 Oct 2016, Prathamesh Kulkarni wrote:


I was having a look at PR71636 and added the following pattern to match.pd:
x & ((1U << b) - 1) -> x & ~(~0U << b)
However the transform is useful only if the target supports "andnot"
instruction.


rth was selling the transformation as a canonicalization, which is 
beneficial when there is an andnot instruction, and neutral otherwise, so 
it could be done always.



As pointed out by Marc in PR for -march=core2, lhs generates worse
code than rhs,
so we shouldn't do the transform if target doesn't support andnot insn.
(perhaps we could do the reverse transform for target not supporting andnot?)


Rereading my comment in the PR, I pointed out that instead of being 
neutral, the transformation was very slightly detrimental in one case (one 
extra mov) because of a RA issue. That doesn't mean we should avoid the 
transformation, just that we should fix the RA issue (by the way, if you 
have time to file a separate PR for the RA issue, that would be great, 
otherwise I'll try to do it at some point...).


However it seems andnot isn't a standard pattern name, so am not sure 
how to check if target supports andnot insn ?


--
Marc Glisse


Re: how to check if target supports andnot instruction ?

2016-10-12 Thread Richard Biener
On Wed, 12 Oct 2016, Marc Glisse wrote:

> On Wed, 12 Oct 2016, Prathamesh Kulkarni wrote:
> 
> > I was having a look at PR71636 and added the following pattern to match.pd:
> > x & ((1U << b) - 1) -> x & ~(~0U << b)
> > However the transform is useful only if the target supports "andnot"
> > instruction.
> 
> rth was selling the transformation as a canonicalization, which is beneficial
> when there is an andnot instruction, and neutral otherwise, so it could be
> done always.

Well, its three instructions to three instructions and a more expensive
constant(?).  ~0U might not be available as immediate for the shift
instruction and 1U << b might be available as a bit-set instruction ...
(vs. the andnot).

So yes, we might decide to canonicalize to andnot (and decide that
three binary to two binary and one unary op is "better").

So no excuse to explore the target specific .pd fragment idea ... :/

Richard.

> > As pointed out by Marc in PR for -march=core2, lhs generates worse
> > code than rhs,
> > so we shouldn't do the transform if target doesn't support andnot insn.
> > (perhaps we could do the reverse transform for target not supporting
> > andnot?)
> 
> Rereading my comment in the PR, I pointed out that instead of being neutral,
> the transformation was very slightly detrimental in one case (one extra mov)
> because of a RA issue. That doesn't mean we should avoid the transformation,
> just that we should fix the RA issue (by the way, if you have time to file a
> separate PR for the RA issue, that would be great, otherwise I'll try to do it
> at some point...).
> 
> > However it seems andnot isn't a standard pattern name, so am not sure how to
> > check if target supports andnot insn ?



Potential bug with wide_int_storage::set_len

2016-10-12 Thread Andre Vieira (lists)
Hello,

During the development of a patch I encountered some strange behavior
and decided to investigate. The result of which is I think I found a bug
with 'wide_int_storage::set_len' in gcc/wide-int.h.

The function reads:
inline void
wide_int_storage::set_len (unsigned int l, bool is_sign_extended)
{
  len = l;
  if (!is_sign_extended && len * HOST_BITS_PER_WIDE_INT > precision)
val[len - 1] = sext_hwi (val[len - 1],
 precision % HOST_BITS_PER_WIDE_INT);
}

Now assume you have a wide_int_storage, lets call it 'result' with the
following values:
val = [ 0x, ...];
len = 1;
precision = 32;

Say you are running it on a 64-bit host:
#define HOST_BITS_PER_WIDE_INT 64

and you call 'result.set_len (1, false);'

Then this will sign extend the first element of val, to
0x, and I don't think this is what you want.

Due to this, 'expand_expr' will expand a constant tree with unsigned
integer type and value MAX_UINT to a rtx node (const_int -1).

Am I missing something here?

Cheers,
Andre

PS: I will be running tests with a patch to remove the negation in front
of 'is_sign_extended' and post the patch in gcc-patches. If anyone
thinks this is wrong and wants to spare me the effort please reply!


Potential bug with wide_int_storage::set_len

2016-10-12 Thread Andre Vieira (lists)
Hello,

During the development of a patch I encountered some strange behavior
and decided to investigate. The result of which is I think I found a bug
with 'wide_int_storage::set_len' in gcc/wide-int.h.

The function reads:
inline void
wide_int_storage::set_len (unsigned int l, bool is_sign_extended)
{
  len = l;
  if (!is_sign_extended && len * HOST_BITS_PER_WIDE_INT > precision)
val[len - 1] = sext_hwi (val[len - 1],
 precision % HOST_BITS_PER_WIDE_INT);
}

Now assume you have a wide_int_storage, lets call it 'result' with the
following values:
val = [ 0x, ...];
len = 1;
precision = 32;

Say you are running it on a 64-bit host:
#define HOST_BITS_PER_WIDE_INT 64

and you call 'result.set_len (1, false);'

Then this will sign extend the first element of val, to
0x, and I don't think this is what you want.

Due to this, 'expand_expr' will expand a constant tree with unsigned
integer type and value MAX_UINT to a rtx node (const_int -1).

Am I missing something here?

Cheers,
Andre

PS: I will be running tests with a patch to remove the negation in front
of 'is_sign_extended' and post the patch in gcc-patches. If anyone
thinks this is wrong and wants to spare me the effort please reply!


Re: Potential bug with wide_int_storage::set_len

2016-10-12 Thread Jakub Jelinek
On Wed, Oct 12, 2016 at 05:02:46PM +0100, Andre Vieira (lists) wrote:
> Say you are running it on a 64-bit host:
> #define HOST_BITS_PER_WIDE_INT 64
> 
> and you call 'result.set_len (1, false);'
> 
> Then this will sign extend the first element of val, to
> 0x, and I don't think this is what you want.
> 
> Due to this, 'expand_expr' will expand a constant tree with unsigned
> integer type and value MAX_UINT to a rtx node (const_int -1).

That is correct.  In RTL constants are always sign-extended from their
precision to HOST_BITS_PER_WIDE_INT, regardless if it is "signed" or
"unsigned" constant.  Whether you treat the low precision bits of the
constant as signed or unsigned is something encoded in the operation on it.

Jakub


Re: Potential bug with wide_int_storage::set_len

2016-10-12 Thread Jakub Jelinek
On Wed, Oct 12, 2016 at 06:45:39PM +0200, Jakub Jelinek wrote:
> On Wed, Oct 12, 2016 at 05:02:46PM +0100, Andre Vieira (lists) wrote:
> > Say you are running it on a 64-bit host:
> > #define HOST_BITS_PER_WIDE_INT 64
> > 
> > and you call 'result.set_len (1, false);'
> > 
> > Then this will sign extend the first element of val, to
> > 0x, and I don't think this is what you want.
> > 
> > Due to this, 'expand_expr' will expand a constant tree with unsigned
> > integer type and value MAX_UINT to a rtx node (const_int -1).
> 
> That is correct.  In RTL constants are always sign-extended from their
> precision to HOST_BITS_PER_WIDE_INT, regardless if it is "signed" or
> "unsigned" constant.  Whether you treat the low precision bits of the
> constant as signed or unsigned is something encoded in the operation on it.

Here is what rtl.texi says on it:
@item (const_int @var{i})
This type of expression represents the integer value @var{i}.  @var{i}
is customarily accessed with the macro @code{INTVAL} as in
@code{INTVAL (@var{exp})}, which is equivalent to @code{XWINT (@var{exp}, 0)}.

Constants generated for modes with fewer bits than in
@code{HOST_WIDE_INT} must be sign extended to full width (e.g., with
@code{gen_int_mode}).  For constants for modes with more bits than in
@code{HOST_WIDE_INT} the implied high order bits of that constant are
copies of the top bit.  Note however that values are neither
inherently signed nor inherently unsigned; where necessary, signedness
is determined by the rtl operation instead.

Jakub


Re: Potential bug with wide_int_storage::set_len

2016-10-12 Thread Eric Botcazou
> During the development of a patch I encountered some strange behavior
> and decided to investigate. The result of which is I think I found a bug
> with 'wide_int_storage::set_len' in gcc/wide-int.h.
> 
> The function reads:
> inline void
> wide_int_storage::set_len (unsigned int l, bool is_sign_extended)
> {
>   len = l;
>   if (!is_sign_extended && len * HOST_BITS_PER_WIDE_INT > precision)
> val[len - 1] = sext_hwi (val[len - 1],
>  precision % HOST_BITS_PER_WIDE_INT);
> }

The code certainly lacks a comment explaining the apparent discrepancy.

> Due to this, 'expand_expr' will expand a constant tree with unsigned
> integer type and value MAX_UINT to a rtx node (const_int -1).

As Jakub explained, that is as expected, even if a little surprising.

-- 
Eric Botcazou