Re: pa indirect_jump instruction

2015-07-05 Thread Richard Sandiford
Trevor Saunders  writes:
> On Tue, Jun 30, 2015 at 09:53:31PM +0100, Richard Sandiford wrote:
>> I have a series of patches to convert all non-optab instructions to
>> the target-insns.def interface.  config-list.mk showed up one problem
>> though.  The pa indirect_jump pattern is:
>> 
>> ;;; Hope this is only within a function...
>> (define_insn "indirect_jump"
>>   [(set (pc) (match_operand 0 "register_operand" "r"))]
>>   "GET_MODE (operands[0]) == word_mode"
>>   "bv%* %%r0(%0)"
>>   [(set_attr "type" "branch")
>>    (set_attr "length" "4")])
>> 
>> so the C condition depends on operands[], which isn't usually allowed
>> for named patterns.  We get away with it at the moment because we only
>> test for the existence of HAVE_indirect_jump, not its value:
>
> yeah, I hit this a while ago and filed bug 66114.  It looks like I had
> trouble with fr30 too, is that fixed now?

Hmm, seems not.  The fr30 build stopped earlier for me due to a warning
turned error.  I suppose I should really have fixed all the warnings shown
by config-list.mk before doing this stuff...

Thanks,
Richard


Re: pa indirect_jump instruction

2015-07-05 Thread Trevor Saunders
On Sun, Jul 05, 2015 at 09:11:23AM +0100, Richard Sandiford wrote:
> Trevor Saunders  writes:
> > On Tue, Jun 30, 2015 at 09:53:31PM +0100, Richard Sandiford wrote:
> >> I have a series of patches to convert all non-optab instructions to
> >> the target-insns.def interface.  config-list.mk showed up one problem
> >> though.  The pa indirect_jump pattern is:
> >> 
> >> ;;; Hope this is only within a function...
> >> (define_insn "indirect_jump"
> >>   [(set (pc) (match_operand 0 "register_operand" "r"))]
> >>   "GET_MODE (operands[0]) == word_mode"
> >>   "bv%* %%r0(%0)"
> >>   [(set_attr "type" "branch")
> >>    (set_attr "length" "4")])
> >> 
> >> so the C condition depends on operands[], which isn't usually allowed
> >> for named patterns.  We get away with it at the moment because we only
> >> test for the existence of HAVE_indirect_jump, not its value:
> >
> > yeah, I hit this a while ago and filed bug 66114.  It looks like I had
> > trouble with fr30 too, is that fixed now?
> 
> Hmm, seems not.  The fr30 build stopped earlier for me due to a warning
> turned error.  I suppose I should really have fixed all the warnings shown
> by config-list.mk before doing this stuff...

yeah, that's certainly a problem worth working on, but there's also
something to be said for not going too far down the yak-shaving rabbit
hole.

Trev

> 
> Thanks,
> Richard


Allocation of hotness of data structure with respect to the top of stack.

2015-07-05 Thread Ajit Kumar Agarwal
All:

I am wondering whether allocating hot data structures closer to the top of the
stack would improve the performance of the application.
The data structures would be classified as hot or cold and sorted in decreasing
order of hotness, and the hottest data structures would be allocated closest to
the top of the stack.

My expectation is that loads and stores to the hot data structures will be
faster when those data structures are allocated closer to the top of the stack.

Based on the above, the loads and stores would be generated with the stack
offsets that result from allocating in decreasing order of hotness.
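
To make the proposal concrete, here is a minimal sketch of the ordering step only. The `stack_obj` descriptor, the `hotness` counter (imagined as a profile-derived access count) and `assign_offsets` are all hypothetical names for illustration; nothing here corresponds to existing GCC code, and alignment is ignored.

```c
#include <stdlib.h>

/* Hypothetical descriptor for one stack-allocated object. */
struct stack_obj {
    const char *name;
    size_t size;      /* size in bytes */
    unsigned hotness; /* assumed: higher means accessed more often */
    size_t offset;    /* assigned distance from the top of the stack */
};

/* qsort comparator: hotter objects sort first. */
static int by_hotness_desc(const void *l, const void *r)
{
    const struct stack_obj *a = l, *b = r;
    return (a->hotness < b->hotness) - (a->hotness > b->hotness);
}

/* Sort by decreasing hotness, then hand out offsets in that order,
   so the hottest object ends up at offset 0. */
void assign_offsets(struct stack_obj *objs, size_t n)
{
    size_t next = 0;
    qsort(objs, n, sizeof *objs, by_hotness_desc);
    for (size_t i = 0; i < n; i++) {
        objs[i].offset = next;
        next += objs[i].size;
    }
}
```

Whether smaller offsets actually make the loads and stores faster depends on the target, so this only shows the layout, not the benefit.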

Thoughts?

Thanks & Regards
Ajit


Reduction Pattern ( Vectorization or Parallelization)

2015-07-05 Thread Ajit Kumar Agarwal
All:

Scalar and array reduction patterns can be identified when commutative updates
are applied to the same scalar or array variable on the LHS with +, *, min or
max. Identifying the reduction pattern through these commutative updates helps
with vectorization or parallelization.

For the following code
for (j = 0; j <= N; j++)
{
    y = d[j];
    for (i = 0; i < 8; i++)
        X[a[i]] = X[a[i]] + c[i] * y;
}

Fig(1).

In the above code, X forms a reduction pattern with respect to the outer
loop: the updates are commutative on +, so it can be identified in GCC as a
reduction pattern with respect to the outer loop. I am wondering whether this
can be identified as a reduction pattern that can be reduced to vectorized
code, because X is indexed through another array and thus the access to X is
not an affine expression.

Can the above code be identified as a reduction pattern and transformed into
vectorized or parallelized code?
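
As a sketch of what treating the outer loop as a reduction could mean at source level (not what GCC does today), the two functions below compute the same result. The function names, the unsigned element type and the `INNER` macro are assumptions for illustration. The rewrite relies on unsigned (wrap-around) arithmetic and on X not aliasing a, c or d; with floating point it would change rounding.

```c
#include <stddef.h>

#define INNER 8  /* inner trip count taken from Fig(1) */

/* Fig(1) as written: X is updated through the index array a[]. */
void update_direct(unsigned *X, const size_t *a, const unsigned *c,
                   const unsigned *d, size_t n)
{
    for (size_t j = 0; j <= n; j++) {
        unsigned y = d[j];
        for (size_t i = 0; i < INNER; i++)
            X[a[i]] = X[a[i]] + c[i] * y;
    }
}

/* Same result with the outer-loop reduction hoisted into a scalar sum. */
void update_hoisted(unsigned *X, const size_t *a, const unsigned *c,
                    const unsigned *d, size_t n)
{
    unsigned sum = 0;
    for (size_t j = 0; j <= n; j++)
        sum += d[j];            /* plain scalar reduction over d[] */
    for (size_t i = 0; i < INNER; i++)
        X[a[i]] += c[i] * sum;  /* one indirect pass; repeated a[i] still accumulate */
}
```

The scalar sum over d[] has an affine access and is the easy part; the remaining indirect update on X is where the non-affine access still matters.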

Thoughts?

Thanks & Regards
Ajit




Re: Live on Exit renaming.

2015-07-05 Thread Steven Bosscher
On Sat, Jul 4, 2015 at 3:45 PM, Ajit Kumar Agarwal wrote:
> I am not sure why the above optimization is not implemented in GCC.

-fsplit-ivs-in-unroller

Ciao!
Steven


gcc-6-20150705 is now available

2015-07-05 Thread gccadmin
Snapshot gcc-6-20150705 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/6-20150705/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 6 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/trunk revision 225437

You'll find:

 gcc-6-20150705.tar.bz2   Complete GCC

  MD5=c2ac14a399dc81a20e649d6064d9dd2f
  SHA1=a22d5325a5c0d615a52b526bdd3141a772c4522c

Diffs from 6-20150628 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-6
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: Live on Exit renaming.

2015-07-05 Thread Bin.Cheng
On Mon, Jul 6, 2015 at 6:02 AM, Steven Bosscher  wrote:
> On Sat, Jul 4, 2015 at 3:45 PM, Ajit Kumar Agarwal wrote:
>> I am not sure why the above optimization is not implemented in GCC.
>
> -fsplit-ivs-in-unroller

And things might have changed.  Given that GCC does IVO on gimple and
unrolling on RTL, there is an inconsistency between the two optimizers,
since IVO takes the register pressure of IVs into consideration and
assumes each IV will take a single register.  At least in some cases,
splitting the live ranges of IVs results in bad code.  See PR29256 for
more information.  As described in the comment there, I am actually going
to do some experiments disabling this transformation to see what happens.
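
For readers unfamiliar with the transformation, here is a source-level sketch of the idea. The real transformation works on RTL inside the unroller; the function names and the assumption that n is a multiple of 4 are only for illustration.

```c
#include <stddef.h>

/* Unrolled by 4 with one chained induction variable:
   each increment depends on the previous one. */
long sum_chained(const int *p, size_t n)  /* n assumed to be a multiple of 4 */
{
    long s = 0;
    for (size_t i = 0; i < n; ) {
        s += p[i]; i++;
        s += p[i]; i++;
        s += p[i]; i++;
        s += p[i]; i++;
    }
    return s;
}

/* Same loop with the IV split: each copy is expressed from the
   value of i at the start of the unrolled iteration. */
long sum_split(const int *p, size_t n)    /* n assumed to be a multiple of 4 */
{
    long s = 0;
    for (size_t i = 0; i < n; i += 4) {
        size_t i1 = i + 1, i2 = i + 2, i3 = i + 3;
        s += p[i];
        s += p[i1];
        s += p[i2];
        s += p[i3];
    }
    return s;
}
```

The split form breaks the dependency chain on i, but it also introduces extra values (i1, i2, i3) that may each want a register, which is the kind of mismatch with IVO's single-register assumption described above.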

Thanks,
bin
>
> Ciao!
> Steven


RE: Live on Exit renaming.

2015-07-05 Thread Ajit Kumar Agarwal


-Original Message-
From: Bin.Cheng [mailto:amker.ch...@gmail.com] 
Sent: Monday, July 06, 2015 7:04 AM
To: Steven Bosscher
Cc: Ajit Kumar Agarwal; l...@redhat.com; Richard Biener; gcc@gcc.gnu.org; Vinod 
Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: Live on Exit renaming.

On Mon, Jul 6, 2015 at 6:02 AM, Steven Bosscher  wrote:
> On Sat, Jul 4, 2015 at 3:45 PM, Ajit Kumar Agarwal wrote:
>> I am not sure why the above optimization is not implemented in GCC.
>
> -fsplit-ivs-in-unroller

>>And thing might have changed.  Given the condition GCC does IVO on gimple,
>>unrolling on RTL, there is inconsistency between the two optimizer since IVO
>>takes register pressure of IVs into consideration and assumes IVs will take
>>single registers.  At least for some cases, splitting live range of IVs
>>results in bad code.  See PR29256 for more information.  As described in
>>the comment, actually I am going to do some experiments disabling such
>>transformation to see what happens.

The above optimization is implemented as part of the unroller in gimple. There
is an unroller pass in RTL which does not have support for this optimization.
Shouldn't the fsplit-ivs-in-unroller optimization be implemented in the RTL
unroller pass? I am looking at this from the implementation perspective, i.e.
implementing the fsplit-ivs-in-unroller optimization in the RTL unroller pass.

Thanks & Regards
Ajit

Thanks,
bin
>
> Ciao!
> Steven


Re: Live on Exit renaming.

2015-07-05 Thread Bin.Cheng
On Mon, Jul 6, 2015 at 12:02 PM, Ajit Kumar Agarwal
 wrote:
>
>
> -Original Message-
> From: Bin.Cheng [mailto:amker.ch...@gmail.com]
> Sent: Monday, July 06, 2015 7:04 AM
> To: Steven Bosscher
> Cc: Ajit Kumar Agarwal; l...@redhat.com; Richard Biener; gcc@gcc.gnu.org; 
> Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
> Subject: Re: Live on Exit renaming.
>
> On Mon, Jul 6, 2015 at 6:02 AM, Steven Bosscher  wrote:
>> On Sat, Jul 4, 2015 at 3:45 PM, Ajit Kumar Agarwal wrote:
>>> I am not sure why the above optimization is not implemented in GCC.
>>
>> -fsplit-ivs-in-unroller
>
>>>And thing might have changed.  Given the condition GCC does IVO on gimple,
>>>unrolling on RTL, there is inconsistency between the two optimizer since IVO
>>>takes register pressure of IVs into consideration and assumes IVs will
>>>take single registers.  At least for some cases, splitting live range of IVs
>>>results in bad code.  See PR29256 for more information.  As described in
>>>the comment, actually I am going to do some experiments disabling such
>>>transformation to see what happens.
>
> The above optimization is implemented as a part of unroller in gimple. There 
> is an unroller pass in rtl which does not have support for this
As far as I understand, fsplit-ivs-in-unroller is a transformation in
RTL unroller.

Thanks,
bin
> optimization.  Shouldn't be the fsplit-ivs-in-unroller optimization 
> implemented in the unroller pass of rtl. I am looking at the implementation
> perspective for implementing the fsplit-ivs-in-unroller optimizations in the 
> unroller rtl pass.
>
> Thanks & Regards
> Ajit
>
> Thanks,
> bin
>>
>> Ciao!
>> Steven


RE: Live on Exit renaming.

2015-07-05 Thread Ajit Kumar Agarwal


-Original Message-
From: Bin.Cheng [mailto:amker.ch...@gmail.com] 
Sent: Monday, July 06, 2015 10:26 AM
To: Ajit Kumar Agarwal
Cc: Steven Bosscher; l...@redhat.com; Richard Biener; gcc@gcc.gnu.org; Vinod 
Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: Live on Exit renaming.

On Mon, Jul 6, 2015 at 12:02 PM, Ajit Kumar Agarwal 
 wrote:
>
>
> -Original Message-
> From: Bin.Cheng [mailto:amker.ch...@gmail.com]
> Sent: Monday, July 06, 2015 7:04 AM
> To: Steven Bosscher
> Cc: Ajit Kumar Agarwal; l...@redhat.com; Richard Biener; 
> gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli 
> Hunsigida; Nagaraju Mekala
> Subject: Re: Live on Exit renaming.
>
> On Mon, Jul 6, 2015 at 6:02 AM, Steven Bosscher  wrote:
>> On Sat, Jul 4, 2015 at 3:45 PM, Ajit Kumar Agarwal wrote:
>>> I am not sure why the above optimization is not implemented in GCC.
>>
>> -fsplit-ivs-in-unroller
>
>>>And thing might have changed.  Given the condition GCC does IVO on gimple,
>>>unrolling on RTL, there is inconsistency between the two optimizer since IVO
>>>takes register pressure of IVs into consideration and assumes IVs will
>>>take single registers.  At least for some cases, splitting live range of IVs
>>>results in bad code.  See PR29256 for more information.  As described in
>>>the comment, actually I am going to do some experiments disabling such
>>>transformation to see what happens.
>
> The above optimization is implemented as a part of unroller in gimple. 
> There is an unroller pass in rtl which does not have support for this
>>As far as I understand, fsplit-ivs-in-unroller is a transformation in RTL 
>>unroller.

My mistake. Yes, you are right: fsplit-ivs-in-unroller is a transformation
in the RTL unroller.
IVO on gimple doesn't take unrolling into consideration and assumes a single
register is assigned to each IV candidate. My thinking is that splitting IVs
at RTL in the unroller removes the long dependency chains, which lets
iterations overlap and helps the register allocator, and there is a chance of
moving independent code that gets exposed by split-ivs-in-unroller.

You mentioned that splitting IV candidates results in bad code.  I could
see only the positive side of this optimization.
Could you please elaborate on the negative side of the fsplit-ivs-in-unroller
optimization, since you mentioned that it results in bad code in some cases?

Thanks & Regards 
Ajit

Thanks,
bin
> optimization.  Shouldn't be the fsplit-ivs-in-unroller optimization 
> implemented in the unroller pass of rtl. I am looking at the implementation 
> perspective for implementing the fsplit-ivs-in-unroller optimizations in the 
> unroller rtl pass.
>
> Thanks & Regards
> Ajit
>
> Thanks,
> bin
>>
>> Ciao!
>> Steven


Re: Possible issue with ARC gcc 4.8

2015-07-05 Thread Vineet Gupta
On Friday 03 July 2015 07:15 PM, Richard Biener wrote:
> On Fri, Jul 3, 2015 at 3:10 PM, Vineet Gupta  
> wrote:
>> Hi,
>>
>> I have the following test case (reduced from Linux kernel sources) and it 
>> seems
>> gcc is optimizing away the first loop iteration.
>>
>> arc-linux-gcc -c -O2 star-9000857057.c -fno-branch-count-reg --save-temps 
>> -mA7
>>
>> --->8-
>> static inline int __test_bit(unsigned int nr, const volatile unsigned long 
>> *addr)
>> {
>>  unsigned long mask;
>>
>>  addr += nr >> 5;
>> #if 0
>> nr &= 0x1f;
>> #endif
>>  mask = 1UL << nr;
>>  return ((mask & *addr) != 0);
>> }
>>
>> int foo (int a, unsigned long *p)
>> {
>>   int i;
>>   for (i = 63; i>=0; i--)
>>   {
>>   if (!(__test_bit(i, p)))
>>continue;
>>   a += i;
>>   }
>>   return a;
>> }
>> --->8-
>>
>> gcc generates following
>>
>> --->8-
>> .global foo
>> .type   foo, @function
>> foo:
>> ld_s r2,[r1,4]  < dead code
>> mov_s r2,63
>> .align 4
>> .L2:
>> sub r2,r2,1<-SUB first
>> cmp r2,-1
>> jeq.d [blink]
>> lsr r3,r2,5   <- BUG: first @mask is (1 << 62) NOT (1 << 63)
>> .align 2
>> .L4:
>> ld.as r3,[r1,r3]
>> bbit0.nd r3,r2,@.L2
>> add_s r0,r0,r2
>> sub r2,r2,1
>> cmp r2,-1
>> bne.d @.L4
>> lsr r3,r2,5
>> j_s [blink]
>> .size   foo, .-foo
>> .ident  "GCC: (ARCv2 ISA Linux uClibc toolchain 
>> arc-2015.06-rc1-21-g21b2c4b83dfa)
>> 4.8.4"
>> --->8-
>>
>> For initial 32 loop operations, this test is effectively doing 64 bit 
>> operation,
>> e.g. (1 << 63) in 32 bit regime. Is this supposed to be undefined, truncated 
>> to
>> zero or port specific.
>>
>> If it is truncate to zero then generated code below is not correct as it 
>> needs to
>> elide not just the first iteration (corresponding to i = 63) but 63..32
>>
>> Further ARCompact ISA provides that instructions involving bitpos operands 
>> BSET,
>> BCLR, LSL can any number whatsoever, but core will only use the lower 5 bits 
>> (so
>> clamping the bitpos to 0..31 w/o need for doing that in code.
>>
>> So is this a gcc bug, or some spec misinterpretation,.
> It is the C language standard that says that shifts like this invoke
> undefined behavior.

Right, but the compiler is a program nevertheless and it knows what to do when it
sees 1 << 62.
It's not like there is an uninitialized variable or something which will provide
unexpected behaviour.
More importantly, the question is can ports define a specific behaviour for such
cases and whether that would be sufficient to guarantee the semantics.

The point being ARC ISA provides a neat feature where core only considers lower 5
bits of bitpos operands. Thus we can make such behaviour not only deterministic in
the context of ARC, but also optimal, eliding the need for doing specific
masking/clamping to 5 bits.

-Vineet


Re: Live on Exit renaming.

2015-07-05 Thread Bin.Cheng
On Mon, Jul 6, 2015 at 1:16 PM, Ajit Kumar Agarwal
 wrote:
>
>
> -Original Message-
> From: Bin.Cheng [mailto:amker.ch...@gmail.com]
> Sent: Monday, July 06, 2015 10:26 AM
> To: Ajit Kumar Agarwal
> Cc: Steven Bosscher; l...@redhat.com; Richard Biener; gcc@gcc.gnu.org; Vinod 
> Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
> Subject: Re: Live on Exit renaming.
>
> On Mon, Jul 6, 2015 at 12:02 PM, Ajit Kumar Agarwal 
>  wrote:
>>
>>
>> -Original Message-
>> From: Bin.Cheng [mailto:amker.ch...@gmail.com]
>> Sent: Monday, July 06, 2015 7:04 AM
>> To: Steven Bosscher
>> Cc: Ajit Kumar Agarwal; l...@redhat.com; Richard Biener;
>> gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli
>> Hunsigida; Nagaraju Mekala
>> Subject: Re: Live on Exit renaming.
>>
>> On Mon, Jul 6, 2015 at 6:02 AM, Steven Bosscher  
>> wrote:
>>> On Sat, Jul 4, 2015 at 3:45 PM, Ajit Kumar Agarwal wrote:
>>>> I am not sure why the above optimization is not implemented in GCC.
>>>
>>> -fsplit-ivs-in-unroller
>>
>>>>And thing might have changed.  Given the condition GCC does IVO on gimple,
>>>>unrolling on RTL, there is inconsistency between the two optimizer since
>>>>IVO takes register pressure of IVs into consideration and assumes IVs
>>>>will take single registers.  At least for some cases, splitting live range
>>>>of IVs results in bad code.  See PR29256 for more information.  As
>>>>described in the comment, actually I am going to do some experiments
>>>>disabling such transformation to see what happens.
>>
>> The above optimization is implemented as a part of unroller in gimple.
>> There is an unroller pass in rtl which does not have support for this
>>>As far as I understand, fsplit-ivs-in-unroller is a transformation in RTL 
>>>unroller.
>
> My mistake. Yes you are right. The fsplit-ivs-in-unroller is a transformation 
> in RTL unroller.
> IVO on gimple doesn't take unrolling into consideration and assume to assign 
> single register for IV candidates. My thinking is that
> Splitting IVs at RTL with the unroller removes the long dependent chains and 
> thus makes the overlapping iterations and better
> Register allocators and there is a chance of movement of independent code 
> that got exposes with split-ivs-in-unroller.
>
> You have mentioned that splitting of IV candidate reults in bad code.  I 
> could see only the positive end of this optimizations.
> Could you please elaborate on the negative end of the fsplit-ivs-in-unroller 
> optimizations as you have mentioned that it results
> In bad code in some cases.
I pointed to PR29256 in my previous message.  I have also seen such
examples in different benchmarks, and the situation is even worse on
targets supporting auto-increment addressing modes.

Thanks,
bin
>
> Thanks & Regards
> Ajit
>
> Thanks,
> bin
>> optimization.  Shouldn't be the fsplit-ivs-in-unroller optimization
>> implemented in the unroller pass of rtl. I am looking at the implementation 
>> perspective for implementing the fsplit-ivs-in-unroller optimizations in the 
>> unroller rtl pass.
>>
>> Thanks & Regards
>> Ajit
>>
>> Thanks,
>> bin
>>>
>>> Ciao!
>>> Steven


Re: Possible issue with ARC gcc 4.8

2015-07-05 Thread Marc Glisse

On Mon, 6 Jul 2015, Vineet Gupta wrote:


>> It is the C language standard that says that shifts like this invoke
>> undefined behavior.
>
> Right, but the compiler is a program nevertheless and it knows what to do when it
> sees 1 << 62
> It's not like there is an uninitialized variable or something which will provide
> unexpected behaviour.
> More importantly, the question is can ports define a specific behaviour for such
> cases and whether that would be sufficient to guarantee the semantics.
>
> The point being ARC ISA provides a neat feature where core only considers lower 5
> bits of bitpos operands. Thus we can make such behaviour not only deterministic in
> the context of ARC, but also optimal, eliding the need for doing specific
> masking/clamping to 5 bits.


IMO, writing a << (b & 31) instead of a << b has only advantages. It 
documents the behavior you are expecting. It makes the code 
standard-conformant and portable. And the back-ends can provide patterns 
for exactly this so they generate a single insn (the same as for a << b).


When I see x << 1024, 0 is the only value that makes sense to me, and I'd 
much rather get undefined behavior (detected by sanitizers) than silently 
get 'x' back.
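
As a minimal sketch of the masked form, assuming a 32-bit operand (the helper name `shl_masked` is made up for illustration and is not part of any port):

```c
#include <stdint.h>

/* Hypothetical helper: left shift with the count explicitly reduced
   to 0..31, so the shift is defined for every value of b. */
static inline uint32_t shl_masked(uint32_t a, unsigned int b)
{
    return a << (b & 31u);
}
```

With the mask in place, a count of 35 behaves like a count of 3 on every target, and the intent is visible in the source rather than left to the hardware.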


--
Marc Glisse


Re: Possible issue with ARC gcc 4.8

2015-07-05 Thread Richard Biener
On Mon, Jul 6, 2015 at 7:30 AM, Vineet Gupta  wrote:
> On Friday 03 July 2015 07:15 PM, Richard Biener wrote:
>> On Fri, Jul 3, 2015 at 3:10 PM, Vineet Gupta  
>> wrote:
>>> Hi,
>>>
>>> I have the following test case (reduced from Linux kernel sources) and it 
>>> seems
>>> gcc is optimizing away the first loop iteration.
>>>
>>> arc-linux-gcc -c -O2 star-9000857057.c -fno-branch-count-reg --save-temps 
>>> -mA7
>>>
>>> --->8-
>>> static inline int __test_bit(unsigned int nr, const volatile unsigned long 
>>> *addr)
>>> {
>>>  unsigned long mask;
>>>
>>>  addr += nr >> 5;
>>> #if 0
>>> nr &= 0x1f;
>>> #endif
>>>  mask = 1UL << nr;
>>>  return ((mask & *addr) != 0);
>>> }
>>>
>>> int foo (int a, unsigned long *p)
>>> {
>>>   int i;
>>>   for (i = 63; i>=0; i--)
>>>   {
>>>   if (!(__test_bit(i, p)))
>>>continue;
>>>   a += i;
>>>   }
>>>   return a;
>>> }
>>> --->8-
>>>
>>> gcc generates following
>>>
>>> --->8-
>>> .global foo
>>> .type   foo, @function
>>> foo:
>>> ld_s r2,[r1,4]  < dead code
>>> mov_s r2,63
>>> .align 4
>>> .L2:
>>> sub r2,r2,1<-SUB first
>>> cmp r2,-1
>>> jeq.d [blink]
>>> lsr r3,r2,5   <- BUG: first @mask is (1 << 62) NOT (1 << 63)
>>> .align 2
>>> .L4:
>>> ld.as r3,[r1,r3]
>>> bbit0.nd r3,r2,@.L2
>>> add_s r0,r0,r2
>>> sub r2,r2,1
>>> cmp r2,-1
>>> bne.d @.L4
>>> lsr r3,r2,5
>>> j_s [blink]
>>> .size   foo, .-foo
>>> .ident  "GCC: (ARCv2 ISA Linux uClibc toolchain 
>>> arc-2015.06-rc1-21-g21b2c4b83dfa)
>>> 4.8.4"
>>> --->8-
>>>
>>> For initial 32 loop operations, this test is effectively doing 64 bit 
>>> operation,
>>> e.g. (1 << 63) in 32 bit regime. Is this supposed to be undefined, 
>>> truncated to
>>> zero or port specific.
>>>
>>> If it is truncate to zero then generated code below is not correct as it 
>>> needs to
>>> elide not just the first iteration (corresponding to i = 63) but 63..32
>>>
>>> Further ARCompact ISA provides that instructions involving bitpos operands 
>>> BSET,
>>> BCLR, LSL can any number whatsoever, but core will only use the lower 5 
>>> bits (so
>>> clamping the bitpos to 0..31 w/o need for doing that in code.
>>>
>>> So is this a gcc bug, or some spec misinterpretation,.
>> It is the C language standard that says that shifts like this invoke
>> undefined behavior.
>
> Right, but the compiler is a program nevertheless and it knows what to do 
> when it
> sees 1 << 62
> It's not like there is an uninitialized variable or something which will 
> provide
> unexpected behaviour.
> More importantly, the question is can ports define a specific behaviour for 
> such
> cases and whether that would be sufficient to guarantee the semantics.
>
> The point being ARC ISA provides a neat feature where core only considers 
> lower 5
> bits of bitpos operands. Thus we can make such behaviour not only 
> deterministic in
> the context of ARC, but also optimal, eliding the need for doing specific
> masking/clamping to 5 bits.

There is SHIFT_COUNT_TRUNCATED which allows you to combine
b & 31 with the shift value if you instead write a << (b & 31).

Of course a << 63 is still undefined behavior regardless of target behavior.
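
For illustration, here is a sketch of the reduced test case with the mask enabled, i.e. with the `#if 0` branch turned on. It uses uint32_t instead of unsigned long so the word size is fixed at 32 bits; that substitution and the name `test_bit32` are assumptions of this sketch, not part of the original test case.

```c
#include <stdint.h>

/* Test bit nr in an array of 32-bit words. */
static inline int test_bit32(unsigned int nr, const volatile uint32_t *addr)
{
    addr += nr >> 5;   /* select the word */
    nr &= 0x1f;        /* keep the shift count in 0..31 */
    return ((UINT32_C(1) << nr) & *addr) != 0;
}

/* Add the index of every set bit among bits 63..0 to a. */
int foo(int a, const uint32_t *p)
{
    for (int i = 63; i >= 0; i--) {
        if (!test_bit32((unsigned int)i, p))
            continue;
        a += i;
    }
    return a;
}
```

Written this way the shift count never reaches the width of the type, so no iteration relies on undefined behavior.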

Richard.

> -Vineet