[ACTIVITY] 12-15 March 2013

2013-03-17 Thread Kugan

== Progress ==
- Monday was a public holiday
- Studied benchmarks and analysed the rtl/asm for redundant sign extensions
- Came up with simple test cases based on this (have 6 test cases now)
- Looked at the gcc tree documentation and helper macros; tried them out by
adding more debug dumps for types at tree level


== Plan ==
- Start with the algorithm for sign extension elimination
- Skipping manual changes at source level/asm level to see the 
performance improvement for now.


___
linaro-toolchain mailing list
linaro-toolchain@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-toolchain


[ACTIVITY] 18-22 March 2013

2013-03-24 Thread Kugan

== Progress ==
- Worked on VRP based zero/sign extension elimination at tree level
- Some regression test cases are failing; investigating them


== Plan ==
- Analyse the reason for test case failure and fix them
- Extend the code to handle all the possible cases (currently 
implemented for subset to get the framework working)




[ACTIVITY] 25-28 March 2013

2013-03-30 Thread Kugan

== Progress ==
Working on http://cards.linaro.org/browse/TCWG-14
- All but 2 test cases are passing, and extensions are eliminated for the
simple cases
- One of the test failures was due to a bug in the value range propagation
pass. Have to investigate this further.
- In one instance, even though the extensions are removed, 1 more instruction
is generated (for crc). Need to investigate it.

- Got the Chromebook and set up Ubuntu; bootstrapping gcc now


== Plan ==
- Start with the benchmarking
- investigate the performance and VRP issue
- Also start with http://cards.linaro.org/browse/TCWG-13 in parallel



[ACTIVITY] 2 - 5 Apr 2013

2013-04-07 Thread Kugan

== Summary ==
- Benchmarked coremark with VRP based extension elimination
 * Extension elimination in some cases affects other optimizations
 * With this, improvements are marginal (details below)

== Plan ==
- study crc  where extension elimination is resulting in bad code
- Find a solution

== Details ==
If the RHS expression of a gimple assignment statement has a value that fits
in the LHS type, the truncation is redundant. The zero/sign extension is also
redundant in this case, and the rtl statement can be replaced as follows:


from:
 (insn 12 11 0 (set (reg:SI 110 [ D.4128 ])
 (zero_extend:SI (subreg:HI (reg:SI 117) 0))) c5.c:8 -1
  (nil))
to:
 (insn 12 11 0 (set (subreg/s/u:HI (reg:SI 110 [ D.4128 ]) 0)
 (subreg:HI (reg:SI 117) 0)) c5.c:8 -1
  (nil))

With this change, for the following case:

 short unPack( unsigned char c )
 {
     /* Only want lower four bit nibble */
     c = c & (unsigned char)0x0F ;

     if( c > 7 ) {
         /* Negative nibble */
         return( ( short )( c - 16 ) ) ;
     }
     else
     {
         /* positive nibble */
         return( ( short )c ) ;
     }
 }


asm without elimination
unPack:
 @ args = 0, pretend = 0, frame = 0
 @ frame_needed = 0, uses_anonymous_args = 0
 @ link register save eliminated.
 and r0, r0, #15
 cmp r0, #7
 subhi   r0, r0, #16
 uxthhi  r0, r0
 sxth    r0, r0
 bx  lr
 .size

asm with elimination
unPack:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
and r0, r0, #15
cmp r0, #7
subhi   r0, r0, #16
sxth    r0, r0
bx  lr


In some cases, the changed rtl statement is not eliminated by later passes
and is emitted as a mov instruction. Worse, it also seems to affect other
optimization passes, resulting in worse code for crc. I have not found the
cause yet.
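
To make the unPack case concrete, here is a small C sketch that emulates the
two instruction sequences for every possible input byte. It is only an
illustration of why the uxth can be dropped: the helper names (uxth, sxth,
unpack_with_uxth, unpack_without_uxth, count_extension_mismatches) are made up
for this sketch and are not gcc code. After the subtraction the value is in
[-8, -1], which already fits in a short, so zero-extending it first does not
change what the final sxth produces.

```c
#include <assert.h>
#include <stdint.h>

/* Emulates ARM uxth: zero-extend the low 16 bits of a register. */
uint32_t uxth(uint32_t r)
{
    return r & 0xFFFFu;
}

/* Emulates ARM sxth: sign-extend the low 16 bits of a register. */
uint32_t sxth(uint32_t r)
{
    uint32_t lo = r & 0xFFFFu;
    return (lo & 0x8000u) ? (lo | 0xFFFF0000u) : lo;
}

/* Register r0 after the "without elimination" sequence. */
uint32_t unpack_with_uxth(uint32_t r0)
{
    r0 &= 15u;                          /* and    r0, r0, #15 */
    if (r0 > 7u) {                      /* cmp    r0, #7      */
        r0 -= 16u;                      /* subhi  r0, r0, #16 */
        /* Two's complement encodings of -8..-1: the range VRP would see. */
        assert(r0 >= (uint32_t)-8);
        r0 = uxth(r0);                  /* uxthhi r0, r0      */
    }
    return sxth(r0);                    /* sxth   r0, r0      */
}

/* Register r0 after the "with elimination" sequence (no uxthhi). */
uint32_t unpack_without_uxth(uint32_t r0)
{
    r0 &= 15u;                          /* and    r0, r0, #15 */
    if (r0 > 7u)                        /* cmp    r0, #7      */
        r0 -= 16u;                      /* subhi  r0, r0, #16 */
    return sxth(r0);                    /* sxth   r0, r0      */
}

/* Counts input bytes for which the two sequences disagree. */
int count_extension_mismatches(void)
{
    int mismatches = 0;
    for (uint32_t c = 0; c <= 255u; c++) {
        if (unpack_with_uxth(c) != unpack_without_uxth(c))
            mismatches++;
    }
    return mismatches;
}
```

Calling count_extension_mismatches() from a test program is enough to check
all 256 inputs. The sketch says nothing about the extra mov or the crc
regression described above.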




[ACTIVITY] 8-12 April 2013

2013-04-14 Thread Kugan

== Summary ==
- http://cards.linaro.org/browse/TCWG-14
Coremark in ARM mode gives about 2% performance improvement with about 1%
code size reduction. Thumb2 mode, however, has a performance regression even
though code size reduces by about 0.6%. The performance regression here is
like what we are seeing with the EPILOGUE_USES changes
(http://cards.linaro.org/browse/TCWG-13). Spawned spec in CBUILD
to see the impact with spec2000.


- http://cards.linaro.org/browse/TCWG-13
Thumb2 mode performance regression is due to the percentage of time 
spent in core_state_transition. Looks like an alignment issue; same asm 
is generated for this function with the patch. Investigating it.


== Plan ==
- Plan to resolve http://cards.linaro.org/browse/TCWG-13 this week.
- Get spec2000 results for http://cards.linaro.org/browse/TCWG-14 to 
decide on the next step




[ACTIVITY] 15-19 April 2013

2013-04-21 Thread Kugan

== Summary ==
- http://cards.linaro.org/browse/TCWG-13
  - performance regression is due to alignment of function.
  - there is an increase in runtime of core_state_transition even 
though there is no difference  in the code generated with the patch
  - Adding nops seems to improve the locality and performance; it gets 
better than without the patch for THUMB2.


- Get spec2000 results for http://cards.linaro.org/browse/TCWG-14 to 
decide on the next step

   - Couldn’t get SPEC2000 results with CBUILD
   - set-up spec benchmarks in chromebook and now running locally.
- https://blueprints.launchpad.net/gcc-linaro/+spec/better-end-of-loop-counter-optimisation
   - Initial investigation shows that the code generated is the same as
expected.


== Plan ==
- http://cards.linaro.org/browse/TCWG-13 - follow it up
- Get spec2000 results for http://cards.linaro.org/browse/TCWG-14 to 
decide on the next step

- Look for improvement in VRP for zero/sign extension



[ACTIVITY] 22-26 April 2013

2013-04-28 Thread Kugan

== Summary ==

- http://cards.linaro.org/browse/TCWG-14
   - ran into space issues with chromebook and issues running spec2000 
locally due to that. Finally reinstalled Ubuntu on 32GB card and set-up 
everything.

   - There is a potential issue with zero/sign extension based VRP.
   - 254.gap goes into infinite loop. Investigating it.
   - checked VRP for improvement of zero/sign extension (missing case 
in CRC).


== Plan ==
- http://cards.linaro.org/browse/TCWG-14
   - Find the cause for  254.gap infinite loop and fix it.
   - Find a solution to missing case in CRC



[ACTIVITY] 29 APR - 3 May 2013

2013-05-05 Thread Kugan

http://cards.linaro.org/browse/TCWG-14
   - Fixed 254.gap infinite loop.
   - Got spec2k results locally.
   - LSHIFT and RSHIFT introduce a performance regression for some
benchmarks due to a missing md pattern. Disabling them improves performance.


http://cards.linaro.org/browse/TCWG-55
  - Looked at the commits for specfp performance regression

== Plan ==
http://cards.linaro.org/browse/TCWG-14
   - Add the missing patterns to md to improve performance
   - Submit the patch upstream

http://cards.linaro.org/browse/TCWG-55
   - Find the reason for spec2k fp performance regression.



[ACTIVITY] 6 - 10 May 2013

2013-05-12 Thread Kugan

== Done ==
http://cards.linaro.org/browse/TCWG-14
   - Posted patch upstream for review

http://cards.linaro.org/browse/TCWG-55
   - benchmarking ongoing with git bisect

http://cards.linaro.org/browse/TCWG-14
  - Discussed the results with ARM maintainers and pinged the original 
patch.


== Plan ==
http://cards.linaro.org/browse/TCWG-55
   - Find the reason for spec2k fp performance regression.

http://cards.linaro.org/browse/TCWG-14
   - plan action based on review.



[ACTIVITY] 13-17 May 2013

2013-05-19 Thread Kugan

== Progress ==
http://cards.linaro.org/browse/TCWG-13
   - Resolved

http://cards.linaro.org/browse/TCWG-14
   - Posted modified patch upstream for review

== Plan ==

http://cards.linaro.org/browse/TCWG-14
   - Address review comments
http://cards.linaro.org/browse/TCWG-55
   - Reproduce it and find the reason for specfp regression



[ACTIVITY] 21-24 May 2013

2013-05-26 Thread Kugan


== Progress ==
VRP based zero/sign extension
- Got some review comments for the patch and started addressing them
- split the patch into two; 1. propagating value range and 2. do rtl 
expansion

- testing in progress
specfp regression
- Benchmarked spec2k for A15 with trunk and couldn't reproduce it.
- benchmarked spec2k for A9 with trunk and couldn't reproduce between 
24th and 28th


== Leave ==
- Monday off Sick

== Plan ==
VRP based zero/sign extension
- Send patch for review
specfp regression
- benchmark for trunk version on 23rd
- Resolve regression



[ACTIVITY] 27-31 May 2013

2013-06-02 Thread Kugan


== Progress ==
 * VRP based zero/sign extension
- Tested and posted the latest patch

 * Better end of loop counter optimisation
- Tree level optimizations are already handled in mainline
- Christophe noted a slight change in asm generated from earlier
version
- Tracked down the patch causing this and communicated it.

 * Generate a single call to divmod
- Looked at expand_divmod to understand how __aeabi_idiv and
 __aeabi_idivmod are generated.


 == Plan ==

 * Better end of loop counter optimisation
- Change the pattern to remove this additional instruction if 
necessary.


 * Generate a single call to divmod
- Come up with a solution






Failure to optimise (a/b) and (a%b) into single __aeabi_idivmod call

2013-06-05 Thread Kugan

Hi,

I am looking at the best approach for
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43721 - Failure to optimise
(a/b) and (a%b) into single __aeabi_idivmod call in ARM architecture


In summary, the following C code results in one __aeabi_idivmod() call and
one __aeabi_idiv() call even though the former already calculates the
quotient.

int q = a / b;
int r = a % b;
return q + r;
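
As a rough illustration of the intended result (this is not gcc code), the
standard C div() function already returns quotient and remainder from one
call, which is the shape a single __aeabi_idivmod() call would give. The
function names quot_plus_rem_separate and quot_plus_rem_combined are made up
for this sketch, and div() is only a stand-in for the runtime library call.

```c
#include <assert.h>
#include <stdlib.h>

/* Two separate operations, as in the snippet above. */
int quot_plus_rem_separate(int a, int b)
{
    assert(b != 0);
    int q = a / b;
    int r = a % b;
    return q + r;
}

/* One combined operation: div() computes quotient and remainder together. */
int quot_plus_rem_combined(int a, int b)
{
    assert(b != 0);
    div_t qr = div(a, b);
    return qr.quot + qr.rem;
}
```

Both functions return the same value for any valid input; the optimisation
discussed here is about getting the compiler to produce the second shape from
the first.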

My question is what would be the best way to handle it. As I see it, there
are a few options, each with some issues.


1. Handling at gimple level: try to reduce the operations to the equivalent
of this. We should do this for the targets without integer divide.

   {q, r} = a % b;

Gimple assign stmts have only one lhs operand (?). Therefore, the lhs has
to be made 64bit to signify the return values of R0 and R1 returned
together. I am not too sure of any implications on other architectures here.


2. Handling in expand_divmod. Here, when we see a div or mod operation, 
we will have to do a linear search to see if there is a valid equivalent 
operation to combine. If we find one, we can generate __aeabi_idivmod() 
and cache the result for the equivalent operation. As I see, this can 
get messy and might not be acceptable.


3. An RTL pass to process and combine these library calls. Possibly 
using cse. I am still looking at this.


4. Ramana tried a prototype to do the same using target patterns. He has
ruled this out. (If you want more info, please refer to
https://code.launchpad.net/~ramana/gcc-linaro/divmodsi4-experiments)
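
For option 1, the idea of a single 64bit lhs carrying both results can be
sketched in plain C. This is only a model under stated assumptions: the
quotient is taken to sit in the low 32 bits (R0) and the remainder in the high
32 bits (R1), and the names divmod_packed, packed_quot, packed_rem and
as_signed32 are hypothetical helpers, not anything in gcc or libgcc.

```c
#include <assert.h>
#include <stdint.h>

/* Reinterprets a 32-bit pattern as a signed value without relying on
   implementation-defined conversions. */
int32_t as_signed32(uint32_t u)
{
    return (u & 0x80000000u) ? -(int32_t)(~u) - 1 : (int32_t)u;
}

/* Hypothetical combined operation: quotient in bits 0..31 (modelling R0),
   remainder in bits 32..63 (modelling R1). */
uint64_t divmod_packed(int32_t a, int32_t b)
{
    assert(b != 0);
    uint32_t q = (uint32_t)(a / b);
    uint32_t r = (uint32_t)(a % b);
    return ((uint64_t)r << 32) | q;
}

/* Extracts the quotient half of the packed result. */
int32_t packed_quot(uint64_t qr)
{
    return as_signed32((uint32_t)(qr & 0xFFFFFFFFu));
}

/* Extracts the remainder half of the packed result. */
int32_t packed_rem(uint64_t qr)
{
    return as_signed32((uint32_t)(qr >> 32));
}
```

With this shape, "q = a / b; r = a % b;" becomes one call followed by two
cheap extractions, which is what a 64bit lhs in gimple would have to express.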


Any suggestions for the best way to handle this?

Thanks,
Kugan



Re: Failure to optimise (a/b) and (a%b) into single __aeabi_idivmod call

2013-06-06 Thread Kugan

On 05/06/13 21:27, Matthew Gretton-Dann wrote:
> Kugan,
>
> I don't have the source code to hand but how are the
> sin()/cos()->sincos() optimizations handled?

Thanks Matt. There is a tree level pass to combine sin()/cos() into
sincos(). The commit that added this is:
http://gcc.gnu.org/viewcvs/gcc?view=revision&revision=121052. We can try
doing something similar here.

Is there any way we can know at the tree level that the target does not
define integer divide?

Thanks,
Kugan

> Thanks,
>
> Matt



Re: Failure to optimise (a/b) and (a%b) into single __aeabi_idivmod call

2013-06-06 Thread Kugan

On 06/06/13 22:03, Mans Rullgard wrote:
> On 6 June 2013 12:09, Kugan wrote:
>> On 05/06/13 21:27, Matthew Gretton-Dann wrote:
>>> Kugan,
>>>
>>> I don't have the source code to hand but how are the
>>> sin()/cos()->sincos() optimizations handled?
>>
>> Thanks Matt. There is a tree level pass to combine sin()/cos() into
>> sincos(). The commit that added this is:
>> http://gcc.gnu.org/viewcvs/gcc?view=revision&revision=121052. We can try
>> doing something similar here.
>>
>> Is there any way we can know at the tree level that the target does not
>> define integer divide?
>
> Some targets, e.g. MIPS, have a combined div/mod instruction.  Those could
> benefit from this as well, unless they already achieve that optimisation
> differently.

Thanks Mans. I will have a look at the MIPS port.

The availability of sincos in the target is specified with the define
TARGET_HAS_SINCOS, and it is used in tree level optimizations. We
can do something similar (?) to get the information about the availability
of ops/run time library calls to consider optimizing.


Thanks,
Kugan






[ACTIVITY] 2 - 7 June

2013-06-08 Thread Kugan

== Progress ==
 * Better end of loop counter optimisation
- Experimented with fixing the extra instruction.
- Found a possible way to fix it. Discussing it with Christophe.

 * Generate a single call to divmod
- Looked at the code, including how sin()/cos() -> sincos() is handled
in gcc.

- Implemented a prototype and experimented.


 == Plan ==
 * VRP based zero/sign extension
- Ping the patch.

 * Generate a single call to divmod.
- Finish the prototype implementation and get the regression tests passing
- Discuss in gcc mailing list for a good way to implement and get 
consensus with the results from prototype.





[ACTIVITY] 11-14 June 2013

2013-06-16 Thread Kugan

== Public Holiday ==
 * Monday, 10 June (4 day week)

== Progress ==
 * VRP based zero/sign extension
- Pinged the patch.

 * Generate a single call to divmod
- Builtin based implementation bootstraps and passes regression tests.
- posted patch and initiated discussion 
(http://gcc.gnu.org/ml/gcc/2013-06/msg00100.html)

- long long is not handled now

 * Better end of loop counter optimisation
- Dropped md patch
- Verified that Bin Cheng's RTL change noop_move_p fixes this.

== Plan ==
 * Respond to patch review
 * Set-up X86 benchmarking locally for Spec2k



[ACTIVITY] 17-21 June 2013

2013-06-23 Thread Kugan

== Progress ==

 * Generate single call to divmod
- Addressed review comments
- Proposed a new patch for discussion

 * spec2k comparison between ARM and x86
- Obtained Profile results for both architectures
- Analysing gzip and found some register allocation issues.

 * VRP based zero/sign extension
- Still no review for rtl changes
- Pinged again

== Plan ==

 * Continue with spec2k comparison between ARM and x86



[ACTIVITY] Week 26

2013-06-30 Thread Kugan

== Progress ==
 * spec2k comparison between ARM and x86
- Specfp on x86 has some failures in trunk
- Trying 4.8 now
- Analysed some of the benchmarks and documenting them
- continuing with the rest.

== Plan ==
 * Continue with spec2k comparison between ARM and x86



[ACTIVITY] Week 31

2013-08-04 Thread Kugan

== Progress ==
 - Addressed VRP based zero/sign extension elimination review comments
 - Make check passes but spec2k gcc fails in results comparison;
investigating it
 - Started again with spec2k comparison for ARM to see potential
optimization opportunities
 - Installed libacovea on the Chromebook with some tweaks to study the effects
of optimization options


== Plan ==
 - post the VRP based zero/sign extension elimination patch
 - continue with spec2k comparison for ARM

== Misc ==
 - Was on holiday until 29th (4 day week)



[ACTIVITY] Week 32

2013-08-11 Thread Kugan

== Progress ==
 * spec2k comparison between ARM and x86
- Read papers/documents that list gcc optimization/problem for arm
- Acovea runs fail on the Chromebook; looking into it
- continuing with analysing individual results

 * VRP based zero/sign extension elimination
- Final checks ongoing before posting the patch

== Plan ==
 * Continue with spec2k comparison between ARM and x86
 * post VRP based zero/sign extension elimination patch

== Misc ==
 * Was on Leave Thursday/Friday (3 Day week)
 * Watched Cauldron 2013 archives



[ACTIVITY] Aug 12-16

2013-08-18 Thread Kugan

== Progress ==
 * spec2k comparison between ARM and x86
- Trying to reproduce some of the earlier optimization studies
- Set up Acovea Trying milepost gcc

 * VRP based zero/sign extension elimination
- Posted the modified patch

== Plan ==
 * Continue with spec2k comparison between ARM and x86
 * Wiki Cleanup



[ACTIVITY] 19 - 23 August 2013

2013-08-25 Thread Kugan

== Progress ==
 * spec2k comparison between ARM and x86
- Analysis ongoing
- Preparing to post the results for internal discussions

 * Back-porting
- Back-ported all the assigned ones
- There are some make check failures for thread related test cases 
with qemu; investigating it


== Plan ==
 * Continue with spec2k comparison between ARM and x86
 * Finish the back-ports



Re: [ACTIVITY] 19 - 23 August 2013

2013-08-26 Thread Kugan

On 26/08/13 17:21, Yvan Roux wrote:
> Kugan,
>
>>  * Back-porting
>> - Back ported all the assigned one
>> - There are some make check failures for thread related test cases with
>> qemu; investigating it
>
> These are known qemu threading issues, don't spend time on it ;)

Thanks Yvan. I had a suspicion and now you have confirmed it.

Kugan.




[ACTIVITY] Week 35

2013-09-01 Thread Kugan

== Progress ==
 * spec2k comparison between ARM and x86
- Initiated discussion and addressing the feedback
- Looking at register allocation (IRA + Reload) for reduced test 
case and comparing with x86

- Analysed few more spec2k benchmarks

 * Back-porting
- All the back ports are checked and reviewed. Thanks Yvan.

== Plan ==
 * Continue with spec2k comparison between ARM and x86
 * Wiki Cleanup



[ACTIVITY] 2 - 6 September 2013

2013-09-08 Thread Kugan

== Progress ==
 * spec2k comparison between ARM and x86
- More benchmarks analysed.
- Plan to complete first iteration for spec2k-int benchmarks this week

 * Addressing patch review
- Working on vrp based zero/sign extension elimination based on feedback
- Started addressing comments for divmod optimization

== Plan ==
 * Continue with spec2k comparison between ARM and x86
 * Wiki Cleanup




[ACTIVITY] Week 37

2013-09-15 Thread Kugan

== Progress ==
 * Addressing patch review
- Working on vrp based zero/sign extension elimination based on feedback
- Reworked the patch and posted the first patch
- Spec2k benchmarking in progress for second patch; will post
once finished.


 * spec2k comparison between ARM and x86
- More benchmarks analysed.
- Started with the report. Plan to send in next couple of days.
- Created more reduced test cases.

== Plan ==
 * Continue with spec2k comparison between ARM and x86
 * Start divmod work



[ACTIVITY] 16 -20 September 2013

2013-09-22 Thread Kugan

== Progress ==
 * Addressing patch review
- zero/sign extension preparation patch accepted.
- Christophe is helping to commit.

 * spec2k comparison between ARM and x86
- Looked at conditional code generation related code in gcc
- looked at ifcvt and code in arm.md and arm.c to understand the 
heuristics that dictate how much to conditionalize and what is causing 
some of the problems I am seeing.


== Plan ==
 * Continue with spec2k comparison between ARM and x86
 * Start looking at 64bit division



[ACTIVITY] 23 - 27 September 2013

2013-09-29 Thread Kugan
== Progress ==
 * MPFR and GMP Build error with Latest 4.8 Release
- Reproduced MPFR error with a simplified testcase
- Found the buggy code sequence and the patch that introduced it.
- Looking at solution for this
- Reproduced GMP error with -march=armv7-a and -mthumb
- continuing with the investigation

 * 64bit division
- Looked at current libgcc implementation and studied the algorithm.

 * Addressing patch review
- Posted second patch for zero/sign extension

 * spec2k comparison between ARM and x86
- Ran spec2k with various options to validate code analysis

== Plan ==
 * Continue with spec2k comparison between ARM and x86
 * Continue 64bit division
 * Fix GMP and MPFR build error



MPFR build issue

2013-09-30 Thread Kugan
Hi,

With respect to the MPFR build error
(http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58578), is movs appropriate
as shown below? When _err1 is negative, the condition evaluates to
true.


/* C code */
if (_err1 > 0)
  {

/* working code */
.loc 1 67 0
cmp r3, #0
ble .L6

/* not working code*/
.loc 1 67 0
movs    r3, r3, asl #1
ble .L6



When I looked at the ARM documents, I found the following

Condition flags
---
If S is specified, for MOV instructions:

1. update the N and Z flags according to the result

2. can update the C flag during the calculation of Operand2 (see
Flexible second operand)

3. do not affect the V flag.


And also
le  Signed less than or equal.  (Z==1) || (N!=V)


Does this mean we can't use movs for the comparison?

http://gcc.gnu.org/ml/gcc-patches/2013-02/msg00861.html added this and I
guess it is intentional.

Am I missing anything here?

Thanks,
Kugan
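
The flag rules quoted in the message above can be modelled in a few lines of C
to see how the two sequences can disagree. This is only a sketch built from
those quoted rules: struct flags and the function names are made up for
illustration, and the "stale" V value stands for whatever an earlier
instruction happened to leave behind, which is an assumption since the
surrounding code is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* The three condition flags that matter for the "le" condition. */
struct flags { int n, z, v; };

/* cmp r, #0: flags come from r - 0; subtracting zero cannot overflow,
   so V is always cleared. */
struct flags flags_after_cmp_zero(int32_t r)
{
    struct flags f = { r < 0, r == 0, 0 };
    return f;
}

/* movs: N and Z come from the result, V keeps its previous (stale) value. */
struct flags flags_after_movs(int32_t result, int stale_v)
{
    assert(stale_v == 0 || stale_v == 1);
    struct flags f = { result < 0, result == 0, stale_v };
    return f;
}

/* le: signed less than or equal, (Z == 1) || (N != V). */
int cond_le(struct flags f)
{
    return f.z == 1 || f.n != f.v;
}
```

With a negative result and a stale V of 1, cond_le() is false after movs, so
"ble .L6" is not taken and the body of "if (_err1 > 0)" runs, while after
"cmp r3, #0" the same value makes cond_le() true. When the stale V happens to
be 0 the two sequences agree.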



[ACTIVITY] Sept 29 - Oct 4

2013-10-05 Thread Kugan
== Progress ==
 * Linaro 4.8 Bugfix
   - Fixed Bugs LP1232017 and LP1234060
   - gmp and mpfr make check now clean

 * spec2k comparison between ARM and x86
- Looked at register allocation issues
- Started working on slides for connect

== Plan ==
 * Continue with spec2k comparison between ARM and x86
 * Start again with 64bit division

== Misc ==
 * October 7th Public Holiday.



[ACTIVITY] 8 - 11 October 2013

2013-10-13 Thread Kugan
== Progress ==
 * 4 Day week (October 7th Public Holiday)
 * 64bit division
- Looked at libgcc division implementations
- Implemented in C "Align divisor shift dividend method" for
processors without div instruction
- gcc regression tests now pass with the above

 * spec2k comparison between ARM and x86
- Continue working on slides for connect
- ran benchmarks to gather required results
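
For readers unfamiliar with the division method mentioned under 64bit division
above, here is a minimal shift-and-subtract sketch in C. It is one common
textbook formulation (align the divisor with the dividend, then walk back down
one bit at a time) and is an assumption about the general technique, not the
code that was implemented for libgcc. The names udivmod64, udiv64 and umod64
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unsigned 64-bit division without a hardware divide instruction.
   Returns the quotient; stores the remainder in *rem if rem is not NULL. */
uint64_t udivmod64(uint64_t n, uint64_t d, uint64_t *rem)
{
    assert(d != 0);

    uint64_t q = 0;
    uint64_t bit = 1;

    /* Align: shift the divisor left until it reaches the dividend's
       magnitude, stopping before the top bit would be shifted out. */
    while (d < n && !(d >> 63)) {
        d <<= 1;
        bit <<= 1;
    }

    /* Subtract the aligned divisor wherever it fits, one bit position
       at a time, building the quotient from the top down. */
    while (bit != 0) {
        if (n >= d) {
            n -= d;
            q |= bit;
        }
        d >>= 1;
        bit >>= 1;
    }

    if (rem != NULL)
        *rem = n;
    return q;
}

/* Convenience wrapper returning only the quotient. */
uint64_t udiv64(uint64_t n, uint64_t d)
{
    return udivmod64(n, d, NULL);
}

/* Convenience wrapper returning only the remainder. */
uint64_t umod64(uint64_t n, uint64_t d)
{
    uint64_t r;
    udivmod64(n, d, &r);
    return r;
}
```

The loop count depends on how far the divisor had to be shifted, so small
quotients finish quickly; a production version would typically use a
count-leading-zeros step for the alignment instead of a loop.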

== Plan ==
 * Continue with spec2k comparison between ARM and x86
 * Continue with 64bit division




[ACTIVITY] Oct 14-18

2013-10-20 Thread Kugan
== Progress ==
 * 64bit division for targets without 32bit div instructions
   - Finished the implementation
   - Benchmarked it
   - Getting ready to post RFC patch

 * Connect Slides
- Worked on slides for connect

 * Vector regression in 4.8-2013.09
- Started bisecting to find the issue

== Plan ==
 * post RFC patch for 64bit division
 * Vector regression in 4.8-2013.09
 * Connect preparation



[ACTIVITY] 4 November - 8 November 2013

2013-11-08 Thread Kugan
== Progress ==
- Looked into Cbuild2 and current benchmarking scripts
- Talked to Rob to understand the requirements and plan for next 2 weeks
- Backported gcc testcase fix that causes regression in Linaro 4.8 for
armv5te.

== Plan ==
 - Taking leave next week.
 - Start with benchmarking scripts after that



[ACTIVITY] 18-22 Nov 2013

2013-11-24 Thread Kugan
== Progress ==
- cbuild2 benchmarking
   - Done the initial framework (just with one benchmark) and sent out
for review
- 64bit division for ARMv7-A
   - Posted the patch and added target hook documentation required

== Plan ==
- Address zero/sign extension review comments
- post 64bit divide patch with the required documentation
- investigate bootstrapping issues
   - still waiting for h/w access





[ACTIVITY] 25-29 Nov 2013

2013-12-01 Thread Kugan
== Progress ==
- cbuild2 benchmarking
   - Resolved access issues and imported into repository
   - continuing with the missing implementations

- 64bit division for ARMv7-A
   - Algorithm is committed to trunk
   - enablement patch is posted for review

- trunk bootstrapping issue
- Setup foundation model and resolved space and kernel panic issues
- There is an assembler compatibility issue with older assembler
(-mabi=lp64); trying with the latest binutils
- AARCH64 bootstrapping in progress with this
- AARCH32 bootstraps with glibc but ran into issues with eglibc.

== Plan ==
- Address zero/sign extension review comments
- investigate bootstrapping issues
- continue with cbuild2 benchmarking



[ACTIVITY] 02-06 December

2013-12-08 Thread Kugan

== Progress ==
  - aarch64 bootstrapping (TCWG-348 4/10)
 - Worked on this and posted a patch

  - 64bit divide for Armv7-a (TCWG-26)
 - Patch committed and card closed
 - http://gcc.gnu.org/viewcvs/gcc?view=revision&revision=205444
 - http://gcc.gnu.org/viewcvs/gcc?view=revision&revision=205666

  - Integrate benchmarking into Cbuildv2 (TCWG-360 2/10)
 - Added missing functionality

  - Zero/sign extension elimination (TCWG-15  4/10)
 - Sent a reworked patch for comment

== Plan ==
   - address aarch64 bootstrapping review
   - Continue with Integrate benchmarking into Cbuildv2



[Weekly] 09-13 DEC 2013

2013-12-15 Thread Kugan

== Progress ==
  - aarch64 bootstrapping (TCWG-348 4/10)
 - patch committed and card closed
 - http://gcc.gnu.org/viewcvs/gcc?view=revision&revision=205891

  - Integrate benchmarking into Cbuildv2 (TCWG-360 6/10)
 - Adding missing functionality

== Plan ==
   - Continue with Integrate benchmarking into Cbuildv2



Re: segfault using __thread variable

2013-12-17 Thread Kugan
On 17/12/13 20:38, Matthew Gretton-Dann wrote:
> +Ryan, +Kugan,
> 
> On 17 December 2013 08:45, Michael Hudson-Doyle
>  wrote:
>> Will Newton  writes:
>>
>>> On 17 December 2013 07:53, Michael Hudson-Doyle
>>>  wrote:
>>>> Ah... found it!  This is the code that determines the offset to patch
>>>> into the code (elfnn-aarch64.c line 3845):
>>>>
>>>>   value = (symbol_got_offset (input_bfd, h, r_symndx)
>>>>+ globals->root.sgot->output_section->vma
>>>>+ globals->root.sgot->output_section->output_offset);
>>>>
>>>> and this is the code that determines the offset as written into the
>>>> relocation (elfnn-aarch64.c line 4248):
>>>>
>>>>   off = symbol_got_offset (input_bfd, h, r_symndx);
>>>>   ...
>>>>   rela.r_offset = globals->root.sgot->output_section->vma +
>>>> globals->root.sgot->output_offset + off;
>>>>
>>>> Can you see the difference?  The former is
>>>> "root.sgot->output_section->output_offset", the latter is
>>>> "root.sgot->output_offset".
>>>
>>> Yes, that does look a bit odd.
>>
>> Yes.  And one is the difference between the reloc and the code value and
>> the other is zero...
>>
>>>> This suggests the rather obvious attached patch.  I haven't tested this
>>>> exact patch, but its an obvious translation from a patch to
>>>> 692e2b8bcdd8325ebfbe1daace87100d53d15ad6^ which does work.  I also
>>>> haven't tested the second hunk at all, but it seems plausible...
>>>
>>> Thanks for your analysis, the fix does look plausible indeed. ;-)
>>>
>>> Have you verified it fixes the problem you were seeing?
>>
>> To be super correct, I have not verified that the patch I sent you, when
>> applied to binutils tip, fixes the problem.  But a patch that's
>> basically the same when applied to a slightly random commit from June
>> results in working binaries (and the unpatched version does not).
>>
>>> I'm about to disappear to sunnier climes
>>
>> One advantage of the southern hemisphere: my climes are already sunny...
>>
>>> for three weeks but I'll definitely look at it when I get back. I've
>>> added Marcus to CC in case he isn't reading this list.
>>
>> Cool.  Would it be useful to report the bug in
>> https://sourceware.org/bugzilla/ as well?
> 
> Yes please.
> 
> Ryan or Kugan can you look at fixing this please?

OK, I will look at it.

Thanks,
Kugan
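
For readers following the thread, the mismatch Michael describes can be
reproduced in isolation. The sketch below is only an illustration: the
struct is a hypothetical stand-in for the BFD section type (not the real
API), and the function names are made up. It shows that the two quoted
expressions disagree whenever the GOT section's own output_offset differs
from the output_offset of its output section.

```c
#include <stdint.h>

/* Hypothetical stand-in for a BFD section; not the real BFD type. */
struct section {
    uint64_t vma;                    /* start address of the section */
    uint64_t output_offset;          /* offset inside its output section */
    struct section *output_section;  /* enclosing output section */
};

/* Value patched into the code (first expression quoted in the thread). */
uint64_t got_addr_for_code(const struct section *sgot, uint64_t off)
{
    return off + sgot->output_section->vma
               + sgot->output_section->output_offset;
}

/* Value written into the relocation (second expression quoted). */
uint64_t got_addr_for_reloc(const struct section *sgot, uint64_t off)
{
    return sgot->output_section->vma + sgot->output_offset + off;
}
```

With an output section at 0x10000 and a GOT placed 0x20 bytes into it, the
first function yields 0x10008 for offset 8 and the second 0x10028; they only
agree when the GOT's output_offset is zero.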




[ACTIVITY] 16-20 December 2013

2013-12-22 Thread Kugan
== Progress ==
  - Integrate benchmarking into Cbuildv2 (TCWG-360 7/10)
 - Implementation mostly complete
 - Started testing to ensure compatibility with cbuild1
 - Code available for comments at
https://git.linaro.org/toolchain/cbuild2.git/shortlog/refs/heads/benchmarking


  - Binutils Bug 16340 (1/10)
 - Posted the patch after regression testing and analysing the results

  - Misc (2/10)
 - Read up on relocation handling of TLS and its implementation for aarch64

== Plan ==
   - Complete Integrate benchmarking into Cbuildv2
   - Address comments for Binutils Bug 16340 and look to come up with a
simple testcase



[ACTIVITY] 23 - 27 December 2013

2013-12-29 Thread Kugan
== Progress ==
  - Short week (25th and 26th public holidays)

  - Integrate benchmarking into Cbuildv2 (TCWG-360 4/10)
 - First cut implementation complete and ready for review

  - Binutils Bug 16340 (2/10)
 - Posted modified patch after regression testing and analysing the
results


== Plan ==
   - ping modified zero/sign extension patch with benchmarking
   - set-up trunk daily regression testing
   - start TCWG-238



[ACTIVITY] 30 December 2013 - 3rd January 2014

2014-01-05 Thread Kugan
== Progress ==
  - Short week (4 Days)
  - TCWG-291 zero/sign extension elimination 4/10
 - refactored the patch and ran regression tests
 - benchmarking in progress
  - TCWG-134 divmod 4/10
 - converted into latest pass structure and re-based the patch
 - found some regression failures and fixed them
 - regression testing ongoing

== Plan ==
  - post divmod and vrp patches



[ACTIVITY] 6-10 January 2014

2014-01-12 Thread Kugan
== Progress ==
  - Zero/sign extension elimination
 - experimented and discussed upstream. Still no confirmation (2/10)
  - Bug investigation LP1267122 and LP1263576 (6/10)
  - CBUILD2 Benchmarking (2/10)
 - More testing on Chromebook

== Plan ==
   - Finish CBUILD2 Benchmarking
   - set-up trunk daily regression



[ACTIVITY] 13 - 17 January 2014

2014-01-19 Thread Kugan
== Progress ==
  - TLS GOT binutils patch and gcc fix for aarch64_build_constant are
committed upstream. (1/10)
  - TCWG-360 (closed) (1/10)
  - TCWG-349 (closed) (4/10)
  - set-up trunk daily regression (4/10)
 - Resolved issues running address sanitiser with qemu
 - Started running daily cross regression testing with qemu and
native regression testing on chromebook.
 - Started analysing failures.

== Plan ==
  - Continue with trunk daily regression
  - Look at pending optimization cards



[ACTIVITY] 20 - 24 January 2014

2014-01-26 Thread Kugan
== Misc ==
  - Monday 27/01/2014 - Public holiday

== Progress ==
  - LP#1191909: gold and -flto always fails with an internal error on
arm-linux-gnueabi* (2/10)
 - Reproduced it; looking into the implementation for the best fix.
  - c11-atomic-exec-5.c (4/10)
 - Issue is due to a missing target hook; looking at implementing it
  - LP#1270789: gcc 4.8: "invalid expression as operand" in aarch64
inline asm (2/10)
 - Came up with a patch; still in the process of testing it. The APM1
disk seems to be corrupt, which is holding this up.
  - Benchmarking (1/10)
 - added -pgo for coremark
  - Set-up a Linux desktop (1/10)

== Plan ==
  - Continue with trunk daily regression
  - Fix assigned bugs



TARGET_ATOMIC_ASSIGN_EXPAND_FENV

2014-01-29 Thread Kugan
I had a look at the missing target hook TARGET_ATOMIC_ASSIGN_EXPAND_FENV
to fix the C11 memory model testcase in regressions in trunk.

I looked at the x86 implementation of this target hook and x86 has
instructions (FNSTENV,FLDENV,FNSTSW,FNCLEX) for feholdexcept,
feclearexcept and feupdateenv. Does ARM have something similar? Are
there any pointers/links I can refer to?

Please see the gcc internal documentation for the target hook below.

— Target Hook: void TARGET_ATOMIC_ASSIGN_EXPAND_FENV (tree *hold, tree
*clear, tree *update)

ISO C11 requires atomic compound assignments that may raise
floating-point exceptions to raise exceptions corresponding to the
arithmetic operation whose result was successfully stored in a
compare-and-exchange sequence. This requires code equivalent to calls to
feholdexcept, feclearexcept and feupdateenv to be generated at
appropriate points in the compare-and-exchange sequence. This hook
should set *hold to an expression equivalent to the call to
feholdexcept, *clear to an expression equivalent to the call to
feclearexcept and *update to an expression equivalent to the call to
feupdateenv. The three expressions are NULL_TREE on entry to the hook
and may be left as NULL_TREE if no code is required in a particular
place. The default implementation leaves all three expressions as
NULL_TREE. The __atomic_feraiseexcept function from libatomic may be of
use as part of the code generated in *update.


Thanks,
Kugan



[ACTIVITY] 27 - 31 January 2014

2014-02-02 Thread Kugan
== Progress ==
* 4 Day week (Monday Public Holiday) (6/10)
* Posted patches for bugs assigned
  https://sourceware.org/ml/binutils/2014-01/msg00332.html
  http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60034
  http://gcc.gnu.org/ml/gcc-patches/2014-02/msg00048.html

* TARGET_ATOMIC_ASSIGN_EXPAND_FENV implementation (2/10)
  Looked at glibc implementations of required functionality

== Plan ==
* Finish TARGET_ATOMIC_ASSIGN_EXPAND_FENV implementation
* Follow up on patches if required



[ACTIVITY] 3 - 7 February 2014

2014-02-09 Thread Kugan
== Progress ==
 - Started implementing TARGET_ATOMIC_ASSIGN_EXPAND_FENV (5/10)
 - Regression testing with the implementation; found some issues and
discussed with Matt
 - Working on fixing them
 - Patch for "Vectorizer generates unaligned access when
-mno-unaligned-access" committed upstream (2/10)
 - This also triggered some regressions with ARMv5; looking into them
(2/10)
 - Set-up qemu aarch64 for gcc testing (1/10)

== Plan ==
 - Check ARMv5 regression for unaligned access
 - Look into vectorizer cost model/benchmarking



"invalid expression as operand" in aarch64 inline asm

2014-02-11 Thread Kugan
Hi,

I finally got around to checking the attached patch for
https://bugs.launchpad.net/ubuntu/+source/gcc-4.8/+bug/1270789

I noticed that the attached patch causes a regression for pr38151.c in the
gcc test-suite.

A reduced test-case that triggers this is:
static unsigned long global_max_fast;
int __libc_mallopt (int param_number, int value)
{
 __asm__ __volatile__ ("# %[_SDT_A2]" :: [_SDT_A2] "nor"
((global_max_fast)));
 global_max_fast = 1;
}

In this regard I have a couple of questions:

1. Is the in-line asm valid? It looks OK to me.
2. For the pr38151.c regression, asm diff is as shown below.

<   add x0, x0, :lo12:.LANCHOR0
<   ldr x0, [x0]
---
>   ldr x0, [x0,#:lo12:.LANCHOR0]

This causes:
pr38151.c:(.text+0x10c): relocation truncated to fit:
R_AARCH64_LDST64_ABS_LO12_NC against `.rodata'
collect2: error: ld returned 1 exit status.

If I however increase the alignment of .rodata where .LANCHOR0 is
defined, this passes.  Is alignment of BITS_PER_UNIT valid for
SYMBOL_REF? If I change it as I do in the attached patch, is there
anything else I need to do?



Thanks,
Kugan
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 57b6645..3d15d54 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -3193,6 +3193,8 @@ aarch64_classify_address (struct aarch64_address_info 
*info,
}
  else if (SYMBOL_REF_DECL (sym))
align = DECL_ALIGN (SYMBOL_REF_DECL (sym));
+ else if (GET_CODE(sym) == SYMBOL_REF)
+   align = GET_MODE_ALIGNMENT (GET_MODE (sym));
  else
align = BITS_PER_UNIT;
 


[ACTIVITY] 10-14 FEB 2014

2014-02-16 Thread Kugan
== Progress ==
 - vectorizer (7/10)
   - Looked into vectorizer target-hooks
   - enablement of half-float for vectorising; dropped the patch after
discussing
   - Started analysing code generated with unlimited model
 - gcc 4.8 regression (1/10)
   - Built November and December releases (native with system libc)
   - Ran spec2k fp. Can see regression for ammp with -O3 (with
-march=armv7-a and -mthumb)
   - Started bisecting

 - Chromebook reset-up (2/10)
   - SD card died; set it up again
   - Using hdd for gcc testing

== Plan ==
 - Check ARMv5 regression for unaligned access
 - Look into vectorizer cost model/benchmarking



Re: [ACTIVITY] 10-14 FEB 2014

2014-02-19 Thread Kugan
Hi Ryan,

I don’t have any JIRA tickets created for these issues/bugs (there are
launchpad bug entries). Do you want me to create JIRA tickets when there
aren’t any?

Thanks,
Kugan

On 20/02/14 15:48, Ryan Arnold wrote:
> Hi Kugan, in the future it'd help me immensely if you could reference
> the Jira Issue that's related to your investigations/work.  If there
> isn't one, then we should be creating them.
> 
> Thanks!  Ryan
> 
> On Sun, Feb 16, 2014 at 5:31 PM, Kugan
>  wrote:
>> == Progress ==
>>  - vectorizer (7/10)
>>- Looked into vectorizer target-hooks
>>- enablement of half-float for vectorising; dropped the patch after
>> discussing
>>- Started analysing code generated with unlimited model
>>  - gcc 4.8 regression (1/10)
>>- Built Novemnebr and December release (native with system libc)
>>- Ran spec2k fp. Can see regression for ammp with -O3 (with
>> -march=armv7-a and -mthumb)
>>   -Started bisecting
>>
>>  - Chromebook reset-up (2/10)
>>- SD card died and setup again
>>- Using hdd for gcc testing
>>
>> == Plan ==
>>  - Check ARMv5 regression for unaligned access
>>  - Look into vectorizer cost model/benchmarking
>>
> 
> 
> 



Re: "invalid expression as operand" in aarch64 inline asm

2014-02-20 Thread Kugan


On 20/02/14 20:18, Matthew Gretton-Dann wrote:
> On 12 February 2014 02:29, Kugan  wrote:
>> Hi,
>>
>> I finally got around to checking the attached patch for the
>> https://bugs.launchpad.net/ubuntu/+source/gcc-4.8/+bug/1270789
>>
>> I noticed attached patch causes regression for pr38151.c in gcc test-suite.
>>
>> A reduced test-case that triggers this is:
>> static unsigned long global_max_fast;
>> int __libc_mallopt (int param_number, int value)
>> {
>>  __asm__ __volatile__ ("# %[_SDT_A2]" :: [_SDT_A2] "nor"
>> ((global_max_fast)));
>>  global_max_fast = 1;
>> }
>>
>> In this regard I have couple of questions:
>>
>> 1. Is the in-line asm valid? Look ok to me.
>> 2. For the pr38151.c regression, asm diff is as shown below.
>>
>> <   add x0, x0, :lo12:.LANCHOR0
>> <   ldr x0, [x0]
>> ---
>>>   ldr x0, [x0,#:lo12:.LANCHOR0]
>>
>> This causes:
>> pr38151.c:(.text+0x10c): relocation truncated to fit:
>> R_AARCH64_LDST64_ABS_LO12_NC against `.rodata'
>> collect2: error: ld returned 1 exit status.
>>
>> If I however increase the alignment of .rodata where .LANCHOR0 is
>> defined, this passes.  Is alignment of BITS_PER_UNIT valid for
>> SYMBOL_REF? If I change it as I am doing this attached patch, is there
>> anything else I need to do.
> 
> From the ARMARM:
> http://infocenter.arm.com/help/topic/com.arm.doc.ddi0487a.b/index.html
> 
> The range on an LDR is:
>   Is the signed immediate byte offset, in the range -256 to 255,
> encoded in the "imm9" field.
>   For the 32-bit variant: is the optional positive immediate
> byte offset, a multiple of 4 in the range 0 to 16380, defaulting to 0
> and encoded in the "imm12" field as /4.
>   For the 64-bit variant: is the optional positive immediate
> byte offset, a multiple of 8 in the range 0 to 32760, defaulting to 0
> and encoded in the "imm12" field as /8.
> 
> So in this case where we're taking the low 12-bits of ANCHOR0 we
> should be ensuring it is aligned to 8-bytes (or less than 256 - but we
> can't necessarily tell that at compile time).
> 
> So I think your patch is correct - the symbol needs to be aligned to
> the size of the thing the symbol points to.

Sorry for not being clear with my query. With my earlier patch, I was
accessing the mentioned label assuming that it is aligned.  But the
label was emitted without the required alignment.  Therefore, I wanted to
know what I should do to force alignment on the label so that it is
consistent.

I think it is a latent problem with the alignment requirement for complex
types that is exposed by my patch. That is, we are not handling alignments
for COMPLEX_TYPE. I fixed that as well and posted the patch as
http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01282.html.

Thanks,
Kugan
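
The encoding constraint Matthew quotes from the ARM ARM can be checked
mechanically. The sketch below is an illustration only; the helper names
fits_scaled_imm12 and lo12_is_encodable are made up. It tests whether a
byte offset fits the scaled unsigned "imm12" form of LDR for a given
access size, and shows why the low 12 bits of a symbol address are only
usable when the symbol is aligned to the access size.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if `offset` fits the scaled, unsigned 12-bit immediate form for an
   access of `size` bytes (4 for the 32-bit variant, 8 for the 64-bit one).
   Hypothetical helper, not a GCC or binutils function. */
bool fits_scaled_imm12(uint64_t offset, unsigned size)
{
    if (size == 0 || offset % size != 0)
        return false;              /* must be a multiple of the access size */
    return offset / size <= 4095;  /* imm12 holds offset / size */
}

/* True if the low 12 bits of a symbol address (the :lo12: part) can be
   folded into such a load. The range check always passes here, so the
   outcome depends only on the symbol's alignment. */
bool lo12_is_encodable(uint64_t symbol_addr, unsigned size)
{
    return fits_scaled_imm12(symbol_addr & 0xfff, size);
}
```

For example, a symbol at an address ending in 0x108 is usable with an
8-byte load, while one ending in 0x10c is not, which matches the kind of
"relocation truncated to fit" failure reported in this thread.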





[ACTIVITY] Feb 17-21

2014-02-23 Thread Kugan
== Progress ==
- TCWG-394 (5/10)
  - Found one more issue, regression tested and posted the patch
upstream for review
  - http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01282.html.

- TCWG-395 (1/10)
  - Started with a google doc for wiki update

- SPEC regression analysis with 4.7, 4.8 Nov and Dec releases (2/10)
  - Native build of these versions and ran spec2k benchmarking for all
of them
  - Not able to reproduce it on a15 as reported

== Misc ==
 1 Day off sick

== Plan ==
- Get all the information required for wiki - TCWG-395
- Start working on TCWG-396



[ACTIVITY] Week 11

2014-03-16 Thread Kugan

== Progress ==
4 Day week (10/03/2014 - Public holiday)

https://cards.linaro.org/browse/CARD-1210 (6/10)
  - Implemented patch to fix it and bootstrapped gcc
  - Regression testing pending

https://support.linaro.org/tickets/727 (2/10)
  - Looks like a tool-chain bug
  - Looking into it

https://bugs.launchpad.net/gcc-linaro/+bug/1270789
  - Addressing review comments

== Plan ==
  - Benchmark CRC with the patches
  - Set-up AARCH64 qemu system
  - Start with VRP patches




[Weekly] 17-21 MAR 2014

2014-03-23 Thread Kugan
== Progress ==

CARD-1210 - optimizing prologue/epilogue with -fomit-frame-pointer (4/10)
 - regression tested the patch and fixed issues
 - Dropping the patch and closing the card as it has been worked on at ARM.

TCWG-15 - zero/sign extension elimination for crc (3/10)
  - Looked at crc and at the other optimizations listed in card 440.
  - Improved the local patch to cover the missing optimization

TCWG-413 - Running spec2006 with aarch64 (3/10)
 - Built and set-up spec2006.
 - Ran into some compile-time and run-time errors

== Plan ==
Continue with  zero/sign extension elimination
Start looking at literal pool merging



[ACTIVITY] 24 - 28 March 2014

2014-03-30 Thread Kugan
== Progress ==
- TCWG-413 Spec2006 (7/10)
  - Investigated compiler error for 481.wrf with FSF 4.8.2. Issue is
due to aarch64_cm pattern (fcmle and fcmlt support only #0
as the third value). This is already fixed in trunk and Linaro 4.8.

  - Ran profiling to analyse 4.9 regressions. Started looking into
P7Viterbi, which is one of the functions that performs badly.

- TCWG-291 CRC (3/10)
  Came up with a patch for improving vrp for test-case. Some c++ test
cases are failing in regression testing with this patch. Looking into it.

- TCWG-394 / PR60034
  Patch committed and card closed.
  http://gcc.gnu.org/viewcvs?rev=208949&root=gcc&view=rev

== Plan ==
Continue with Spec2006 and crc



[ACTIVITY] 1 - 4 April 2014

2014-04-07 Thread Kugan
- TCWG-413 Spec2006 (5/10)
  - Analysed 456.hmmer
  - In the process of opening performance bug reports
  - Started looking at 453.povray

- TCWG-291 CRC (2/10)
  - Not seeing performance improvement with redundant "and" instruction gone
  - Analysing with perf to see the reason

- LP1301335 (3/10)
  - SLP vectorizer ICEs for QT5 Webkit for Linaro 4.7
  - Doesn’t occur in trunk/4.8/4.7 FSF
  - Patch proposed for merge request which fixes
  - I also see some FAIL -> PASS in the regression with this patch
  - This patch is only relevant for Linaro 4.7 so we can’t/don’t need to
upstream it (?)

== Plan ==
Continue with Spec2006 and crc



Re: issue in Compiling GCC for ARMv7-a | Cortex-A9 | Hard Float | vpfv3-d16

2014-04-09 Thread Kugan


On 10/04/14 16:22, Anwej Alam wrote:
> Dear Yvan,
> 
> Thanks for your reply.
> We are trying to build native gcc compiler for CPU: nViDia Tegra 2 which
> has ARMv7-a, Cortex-A9 core. We are using host machine as i686 and OS:
> ubuntu 12.04. 

Since your host machine is i686, it is a cross compiler. Is there any
specific reason why you don’t want to use toolchain binary releases for
this? If you want to build a toolchain yourself,  you could consider
using a tool like crosstool-ng.

Thanks,
Kugan



[ACTIVITY] 7 - 11 April 2014

2014-04-13 Thread Kugan
== Progress ==
* TCWG-413 Spec2006 (6/10)
  - Setup chroot for aarch64
  - Created rootfs with 4.8/trunk and spec2006
  - booted created rootfs on foundation model with ubuntu kernel

* TCWG-291 CRC (3/10)
  - posted vrp patch upstream
  - with that seeing expected performance improvement
  - analysis of crc complete; upstreaming activities pending

* LP1301335 and PR59695 back-porting (1/10)

== Plan ==
Benchmarking and FENV support



[ACTIVITY] 14 - 18 April 2014

2014-04-21 Thread Kugan
== Progress ==

* TCWG-238 (4/10)
  - Created scripts and spectools for spec2006 to work with cbuild2

* TCWG-413 Spec2006 (4/10)
  - Worked out Spec2006 config and src.alt for v1.1 src we have to work
correctly
  - Got it to work natively with Maxim's scripts
  - Did a trial run in APM OpenEmbedded; needs ccrypt, tar with xz
compression support and system binutils with -mabi=ilp64 support

* 18th is Public Holiday (2/10)

== Plan ==

Benchmarking and FENV support



Re: [ACTIVITY] Week 17

2014-04-25 Thread Kugan


On 26/04/14 03:08, Renato Golin wrote:
> == Progress ==
> 
> * Holidays (3 days)
>  - Clearing emails/tasks backlog
>  - Some post-trip illness
> 
> * AArch64 vs. ARM64
>  - Comparing performance of both back-ends
What is AArch64 vs. ARM64? Is that something specific to LLVM?

Thanks,
Kugan

> 
> * Named Register
>  - Re-implementing after code review
>  - http://reviews.llvm.org/D3261
> 
> * Time
>  - CARD-1246 4/10
>  - Others   6/10
> 
> == Plan ==
> 
> * Finish named register in LLVM, check Clang
> * Continue testing and benchmarking ARM64 back-end
> * Have a try at CBuildv2
> 
> 



[ACTIVITY] 21-25 April 2014

2014-04-27 Thread Kugan
== Progress ==
* Short week (21st and 25th are public holidays) (4/10)

* TCWG-447 (5/10)
  * Implemented and tested fenv target hooks, necessary built-ins and md
patterns
  * Posted RFC patches for review for both arm and aarch64
  * http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01743.html
  * http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01744.html

* TCWG-413 Spec2006 (1/10)
  * Finished the set-up
  * On hold for now

== Plan ==
  * upstream zero/sign extension elimination activities
  * start with literal pool merging



[ACTIVITY] 28 April - 2 May 2014

2014-05-05 Thread Kugan
== Progress ==
* TCWG-447 (5/10)
  * Re-spun a few versions of the patches based on reviews and posted
them after testing.
  * http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01743.html
  * http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01744.html

* TCWG-413 Spec2006 (5/10)
  * Updated the scripts to deploy libraries and to run cross spec
benchmarking with them.
  * Experimented with open embedded image generation for benchmarking;
still finding some issues even with a trusty chroot.
  * Started benchmarking and variance analysis.

== Plan ==
  * Benchmarking.
  * Upstream zero/sign extension elimination activities.
  * Start with literal pool merging.



[ACTIVITY] 5 - 9 May 2014

2014-05-12 Thread Kugan
== Progress ==

* TCWG-413 (5/10)
 - Rebuilt FSF 4.8, Linaro 4.8 and Linaro 4.9 releases for aarch64 with
crosstool-ng (kept all the dependencies the same and used a different gcc).
 - Lost all the config for running benchmarks on the test machine and set
it up again.
 - Re-ran spec2k benchmarking and collected results.

* TCWG-468 (5/10)
 - Looked in detail at IRA dumps and cost models.
 - Also looked at IRA and LRA code to get a better understanding of the
algorithms.
 - The dumped costs seem odd; looking further.

== Plan ==
  * Benchmarking.
  * Upstream zero/sign extension elimination activities.
  * sha1 performance.



[RFC][AArch64] Remove CORE_REGS from reg_class

2014-05-14 Thread Kugan
Hi All,

The AArch64 back-end defines GENERAL_REGS and CORE_REGS with the same set
of registers. Is there any reason why we need this?

Target hooks like aarch64_register_move_cost don’t handle CORE_REGS.
In addition, the IRA cost calculation has logic such as making the common
class the biggest of the best and alternate classes; this might get
confused by the duplicate class.

The attached RFC patch removes it. Regression tested for
aarch64-none-linux-gnu on qemu-aarch64 with no new regressions. Is this OK?

Thanks,
Kugan
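
To illustrate the concern about hooks that do not know about CORE_REGS,
here is a deliberately simplified model. It is not the real
aarch64_register_move_cost: the enum, the function name move_cost_sketch
and the cost numbers are all made up. It only shows how a class that
duplicates GENERAL_REGS can fall through to the FP2FP cost when the hook
tests for GENERAL_REGS alone.

```c
/* Simplified model; class names mirror the thread, everything else is
   an assumption made for illustration. */
enum reg_class_sketch { NO_REGS, CORE_REGS, GENERAL_REGS, FP_REGS };

/* Made-up move costs. */
enum { GP2GP = 1, GP2FP = 5, FP2GP = 5, FP2FP = 4 };

/* Only GENERAL_REGS is recognised as general purpose. Any other class,
   including CORE_REGS with the same register set, ends up in the FP
   cases. */
int move_cost_sketch(enum reg_class_sketch from, enum reg_class_sketch to)
{
    if (from == GENERAL_REGS && to == GENERAL_REGS)
        return GP2GP;
    if (from == GENERAL_REGS)
        return GP2FP;
    if (to == GENERAL_REGS)
        return FP2GP;
    return FP2FP;
}
```

In this model a move within GENERAL_REGS costs 1, while the same move
expressed as CORE_REGS to CORE_REGS is priced as 4, even though both
classes contain identical registers.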

gcc/

2014-05-14  Kugan Vivekanandarajah  

* config/aarch64/aarch64.c (aarch64_regno_regclass) : Change CORE_REGS
to GENERAL_REGS.
(aarch64_secondary_reload) : Likewise.
(aarch64_class_max_nregs) : Remove CORE_REGS.
* config/aarch64/aarch64.h (enum reg_class) : Remove CORE_REGS.
(REG_CLASS_NAMES) : Likewise.
(REG_CLASS_CONTENTS) : Likewise.
(INDEX_REG_CLASS) : Change CORE_REGS to GENERAL_REGS.


diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index a3147ee..eee36ba 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -3951,7 +3951,7 @@ enum reg_class
 aarch64_regno_regclass (unsigned regno)
 {
   if (GP_REGNUM_P (regno))
-return CORE_REGS;
+return GENERAL_REGS;
 
   if (regno == SP_REGNUM)
 return STACK_REG;
@@ -4102,12 +4102,12 @@ aarch64_secondary_reload (bool in_p ATTRIBUTE_UNUSED, 
rtx x,
   /* A TFmode or TImode memory access should be handled via an FP_REGS
  because AArch64 has richer addressing modes for LDR/STR instructions
  than LDP/STP instructions.  */
-  if (!TARGET_GENERAL_REGS_ONLY && rclass == CORE_REGS
+  if (!TARGET_GENERAL_REGS_ONLY && rclass == GENERAL_REGS
   && GET_MODE_SIZE (mode) == 16 && MEM_P (x))
 return FP_REGS;
 
   if (rclass == FP_REGS && (mode == TImode || mode == TFmode) && CONSTANT_P(x))
-  return CORE_REGS;
+  return GENERAL_REGS;
 
   return NO_REGS;
 }
@@ -4239,7 +4239,6 @@ aarch64_class_max_nregs (reg_class_t regclass, enum 
machine_mode mode)
 {
   switch (regclass)
 {
-case CORE_REGS:
 case POINTER_REGS:
 case GENERAL_REGS:
 case ALL_REGS:
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 7962aa4..3455ecc 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -409,7 +409,6 @@ extern unsigned long aarch64_tune_flags;
 enum reg_class
 {
   NO_REGS,
-  CORE_REGS,
   GENERAL_REGS,
   STACK_REG,
   POINTER_REGS,
@@ -424,7 +423,6 @@ enum reg_class
 #define REG_CLASS_NAMES\
 {  \
   "NO_REGS",   \
-  "CORE_REGS", \
   "GENERAL_REGS",  \
   "STACK_REG", \
   "POINTER_REGS",  \
@@ -436,7 +434,6 @@ enum reg_class
 #define REG_CLASS_CONTENTS \
 {  \
   { 0x, 0x, 0x },  /* NO_REGS */   \
-  { 0x7fff, 0x, 0x0003 },  /* CORE_REGS */ \
   { 0x7fff, 0x, 0x0003 },  /* GENERAL_REGS */  \
   { 0x8000, 0x, 0x },  /* STACK_REG */ \
   { 0x, 0x, 0x0003 },  /* POINTER_REGS */  \
@@ -447,7 +444,7 @@ enum reg_class
 
 #define REGNO_REG_CLASS(REGNO) aarch64_regno_regclass (REGNO)
 
-#define INDEX_REG_CLASSCORE_REGS
+#define INDEX_REG_CLASSGENERAL_REGS
 #define BASE_REG_CLASS  POINTER_REGS
 
 /* Register pairs used to eliminate unneeded registers that point into


[ACTIVITY] 12 - 16 May 2014

2014-05-18 Thread Kugan
== Progress ==
* TCWG-413 (8/10) sha1 performance
 - Looked at IRA dumps and aarch64 target hooks.
 - GCC now uses FP registers as register class and this results in lots
of fmovs for the test-case.
 - Discussed in list and tried spill_class hook for aarch64. This helps
sha1.
 - Regression tested the change.
 - Ran Spec2000 with the changes; 168.wupwise and 187.facerec are failing.
 - Investigation continues.

* TCWG-468 (1/10)
 - Continuing with benchmarking.

* Set-up NX and started using it (1/10)

== Plan ==
  * Benchmarking.
  * Upstream zero/sign extension elimination activities.
  * sha1 performance.



Re: Using binfmt_misc for cross testing

2014-05-20 Thread Kugan


On 21/05/14 16:08, Maxim Kuvyrkov wrote:
> I have been thinking how to simplify cross-testing our toolchain for both 
> automated and development/debugging builds, and among various options the 
> most universal I came up with is ARM hardware + ssh + binfmt_misc + sshfs.  I 
> wonder if anyone has already tried this or can suggest alternatives which are 
> as universal.
> 
> Given:
> - host x86_64 development machine
> - cross-compiler
> - target hardware with fast network to the host
> - host and target have ssh
> - testsuite (gcc/glibc/gdb/etc)
> 
> Here is how it is going to work
> 
> 1. On host we create a simple wrapper script that will pass through its 
> arguments as command to execute on target via ssh:
> ===
> #!/bin/sh
> ssh -p 22NN $TARGET_BOARD "$@"
> ===
> 
> 2. We register this script in binfmt_misc to be used as interpreter for 
> target binaries.  Value of $TARGET_BOARD will be picked up from the 
> environment and can be set to different boards for different testsuite runs.
> 
> 3. The target board needs to be prepared for a particular testsuite run:
>   -- Runtime libraries need to be either copied or mounted via sshfs from the 
> host.  It is an open question how best to install several sets of libraries 
> (for parallel runs) so that each set appears to be main system libraries.  My 
> current thinking is a separate ssh server inside chroot per each test run.
>   -- Test directory needs to be sshfs mounted on target from host so that the 
> target could see test executables.
>   -- Preparation/finalization of the board can either be done explicitly 
> before/after testing.  Or it can be done on demand by the aforementioned 
> script: the script checks whether a multiplexed ssh socket exists, and, if 
> not, it prepares the board and starts a multiplexed ssh connection.
> 
Is this set-up for NX? The issue is how we share the target board between
different users. We can of course initiate a lava hacking session but
the amount of time we will have to wait to get the session active might
be too long depending on the availability of the target class.

Thanks
Kugan

> 4. Testing is fired up as if it is normal "native" testing.  Whenever kernel 
> is given an ARM binary to execute -- it passes it off to wrapper, which 
> passes it off to the target board via ssh.  The board sees same filesystem as 
> host and happily executes binaries against toolchain runtime libraries.
> 
> Comments or rotten tomatoes?
> 
> Thank you,
> 
> --
> Maxim Kuvyrkov
> www.linaro.org
> 
> 
> 
> 
> 
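
Step 2 of Maxim's proposal relies on a binfmt_misc registration string,
whose documented layout is :name:type:offset:magic:mask:interpreter:flags.
The sketch below builds such a string for 32-bit little-endian ARM ELF
binaries. It is an illustration under assumptions: the entry name and the
wrapper path used in the usage note are made up, and the mask is modelled
on the one commonly used for qemu-arm rather than taken from this thread.

```c
#include <stddef.h>
#include <stdio.h>

/* First 20 bytes of the ELF header of a 32-bit little-endian ARM binary:
   "\x7fELF", ELFCLASS32, ELFDATA2LSB, EV_CURRENT, padding,
   e_type = 2 (ET_EXEC), e_machine = 0x28 (EM_ARM). */
static const unsigned char arm_magic[20] = {
    0x7f, 'E', 'L', 'F', 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0x02, 0x00, 0x28, 0x00
};

/* Assumed mask: ignore the OS ABI byte and the low bit of e_type, so
   that both ET_EXEC (2) and ET_DYN (3) binaries match. */
static const unsigned char arm_mask[20] = {
    0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x00,
    0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
    0xfe, 0xff, 0xff, 0xff
};

/* Write each byte as a literal "\xNN" escape; dst needs 4 * n + 1 bytes. */
static void escape_bytes(char *dst, const unsigned char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        sprintf(dst + 4 * i, "\\x%02x", src[i]);
}

/* Build ":name:M::magic:mask:interpreter:" into out.
   Returns the string length, or 0 if it does not fit in cap bytes. */
size_t build_binfmt_rule(char *out, size_t cap,
                         const char *name, const char *interp)
{
    char magic[sizeof arm_magic * 4 + 1];
    char mask[sizeof arm_mask * 4 + 1];
    int n;

    escape_bytes(magic, arm_magic, sizeof arm_magic);
    escape_bytes(mask, arm_mask, sizeof arm_mask);
    n = snprintf(out, cap, ":%s:M::%s:%s:%s:", name, magic, mask, interp);
    return (n < 0 || (size_t)n >= cap) ? 0 : (size_t)n;
}
```

Calling build_binfmt_rule(buf, sizeof buf, "arm-ssh",
"/usr/local/bin/run-on-target") and writing the result to
/proc/sys/fs/binfmt_misc/register as root would register the ssh wrapper
as the interpreter for ARM binaries.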



[ACTIVITY] 19 - 23 May 2014

2014-05-26 Thread Kugan
== Progress ==
* Zero/sign extension elimination (TCWG-15) (9/10)
  - Initiated discussion and started working on an alternate approach to
handle this.
  - Went through a couple of prototypes to understand the approach.
  - Finished first implementation; needs cleanup, and some regression
failures need fixing.
* benchmarking (TCWG-468) (1/10)
  - Changes to Maxim's script are complete; doing final testing before
committing.
* SHA1 performance (TCWG-413)
 - Committed a clean-up patch upstream.
 - Posted spill_class patch.
* FENV for C11 TCWG-447
 - Committed AArch64 part.

== Plan ==
  * Benchmarking.
  * Upstream zero/sign extension elimination activities.
  * sha1 performance.



[ACTIVITY] 26 - 30 May 2014

2014-06-01 Thread Kugan
== Progress ==

* Zero/sign extension elimination (TCWG-15) (7/10)
  - regression tested and fixed all the issues
  - final bootstrap and regression testing for arm and x86_64 are ongoing
  - will post the patch for comment after checking the results

* benchmarking (TCWG-468) (1/10)
  - Set-up chrome-book for a15 release benchmarking

* SHA1 performance (TCWG-413) (2/10)
 - Christophe noted a regression for aarch64_be due to the clean-up patch.
 - The register_move_cost hook in aarch64 does not handle all the cases
(CORE_REGS and POINTER_REGS) and, due to this, it calculates the FP2FP cost
for these classes. With CORE_REGS gone, costs for register classes are
now different. The cost table needs adjustment.

* FENV for C11 TCWG-447
 - Committed ARM part.

== Plan ==
  * Benchmarking.
  * Upstream zero/sign extension elimination activities.
  * sha1 performance.



[ACTIVITY] 2 - 6 June 2014

2014-06-09 Thread Kugan
== Progress ==

* Zero/sign extension elimination (TCWG-15) (2/10)
 - Posted patch for comment

* benchmarking (TCWG-468) (1/10)
  - Ran a53 benchmarks

* regressions (7/10)
  - THUMB1 regression for ARM fenv
    * The issue is due to Thumb-1 not supporting mrc/mcr. A patch to fix
this is posted for review.
  - Regression when allocating a 128-bit integer to a VFP register
    * When LRA assigns a DImode value to a TImode register, it does not
place it in the right part of the TImode register. Due to this, one of the
moves becomes dead. The patterns need to be checked.
    * VFP registers store big-endian values in little-endian format.
Hence, a subreg for a mode wider than a word has to be aware of this.
As it is, aarch64_cannot_change_mode_class will need a fix like the one
done for ARM.

== Plan ==
  * Benchmarking.
  * Upstream zero/sign extension elimination activities.



[ACTIVITY] 9 - 13 June 2014

2014-06-15 Thread Kugan
== Progress ==
* Public holiday (2/10)
* Benchmarking and regression testing (TCWG-468) (8/10)
  - Ran release benchmarking on the Chromebook
  - Set up the package build environment and ran ubutest
  - spec2k benchmarking on aarch64 of patches that were included in the
2014.06 release

== Plan ==
  * Benchmarking.
  * Upstream zero/sign extension elimination activities.



[ACTIVITY] 16-20 June 2014

2014-06-22 Thread Kugan
== Progress ==
* Benchmarking  and regression testing (TCWG-468) (3/10)
  - More aarch64 spec2k benchmarking of patches and against fsf 4.9 release
  - Spec2k a15 benchmarking on chromebook

* Zero/sign extension elimination (TCWG-15) (5/10)
  - Fixed regressions found and broke the patch into two
  - Looking at improving zero/sign extension elimination as it is not
happening for one test-case now
  - Plan to benchmark the changes and post for review once this is done.

* LP spec2k regression bugs (2/10)
  - #1320965: Can't reproduce the ICE, but it goes into an infinite loop
with -floop-interchange at -O3; looking into it
  - #1330725: Invalid, as -fno-strict-overflow is needed for 254.gap.

== Plan ==
  * Benchmarking and spec2k bugs.
  * Upstream zero/sign extension elimination activities.



Re: [Weekly] 23-27 JUNE 2014

2014-06-29 Thread Kugan
>  - Provide ldp/stp peephole optimization for Aarch64 [TCWG-446] [2/10]
In case you are not aware, there was an earlier attempt at this, and
the patch was posted here:
https://gcc.gnu.org/ml/gcc-patches/2013-03/msg01051.html

Thanks,
Kugan



[ACTIVITY] 23 - 27 June 2014

2014-06-29 Thread Kugan
== Progress ==

* Zero/sign extension elimination (TCWG-15) (10/10)
 - Posted two patches for review and went through a few iterations
 - Looked at flag_wrapv and !flag_strict_overflow regressions
   * ARM (and possibly some other targets) truncates negative values and
this makes them incompatible with the value range in SSA. One solution
is to ignore any gimple statements that load negative constants when
eliminating zero/sign extensions.
   * We also lose the OVF(INF) information in the tree when values are
converted to wide_int and propagated to SSA.
 - Testing on a target that supports PTR_EXTEND
   * Trying to set up x86_64-linux with -mx32. Still not able to compile,
as I am getting various errors in glibc. Looking into it.
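
The range-based reasoning behind the elimination can be sketched roughly
as follows. This is a toy Python model, not GCC code: the function names,
the 8-bit width and the register contents are assumptions chosen for
illustration. It shows why an extension is a no-op when the value range
fits the narrow type, and how a negative constant held in truncated form
in a wider register would break that reasoning.

```python
def range_fits(lo, hi, bits, signed):
    """True if every value in [lo, hi] is representable in a bits-wide type."""
    if signed:
        return -(1 << (bits - 1)) <= lo and hi <= (1 << (bits - 1)) - 1
    return 0 <= lo and hi <= (1 << bits) - 1


def sign_extend(reg, bits):
    """Interpret the low `bits` bits of reg as a two's-complement value."""
    low = reg & ((1 << bits) - 1)
    sign_bit = 1 << (bits - 1)
    return (low ^ sign_bit) - sign_bit


# A value known to lie in [0, 100] fits an unsigned 8-bit type, so
# masking it to 8 bits (a zero extension) changes nothing.
assert range_fits(0, 100, 8, signed=False)
assert all((v & 0xFF) == v for v in range(0, 101))

# Hypothetical problem case: the range of a signed 8-bit constant is
# [-1, -1], which fits, but suppose the register holds the truncated bit
# pattern 0xFF. Dropping the sign extension would leave 255, not -1.
assert range_fits(-1, -1, 8, signed=True)
truncated_reg = 0xFF
assert truncated_reg != -1
assert sign_extend(truncated_reg, 8) == -1
```

In this model the range check alone says "redundant", while the register
contents say otherwise, which is the incompatibility noted above.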

== Plan ==
  * Upstream zero/sign extension elimination activities



Re: [ACTIVITY] 23 - 27 June 2014

2014-06-29 Thread Kugan
On 30/06/14 13:11, Pinski, Andrew wrote:
>> Testing on a target that support PTR_EXTEND
> 
> AARCH64 ILP32 is also a target which does PTR_EXTEND.
Thanks for that. I actually wanted to test on a target which will return
-1, 1 or 0 for SUBREG_PROMOTED_UNSIGNED_P. Looking at the GCC code, this
happens only with #define POINTERS_EXTEND_UNSIGNED -1. That means ia64
and s390 seem to be the only targets to test. But Jakub suggested
ia64-hpux or x86_64-linux -mx32. I am not sure how x86_64-linux -mx32
fits into this.
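
For reference, the three cases can be modelled as below. This is an
illustrative Python sketch rather than GCC source; the function name is
made up, and the mapping follows my understanding of the documented
meaning of POINTERS_EXTEND_UNSIGNED (positive: zero-extend, zero:
sign-extend, negative: the target's ptr_extend instruction).

```python
def pointer_extension_kind(pointers_extend_unsigned):
    """Classify how a pointer is widened, given the macro's value."""
    if pointers_extend_unsigned > 0:
        return "zero_extend"
    if pointers_extend_unsigned == 0:
        return "sign_extend"
    # Negative: neither plain extension applies; the target supplies
    # its own ptr_extend pattern.
    return "ptr_extend"


for value in (1, 0, -1):
    print(value, pointer_extension_kind(value))
```

Only the negative case exercises the third state, which is why a target
defining the macro as -1 is needed for testing it.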

Thanks,
kugan



[ACTIVITY] 30 June - 4 July 2014

2014-07-06 Thread Kugan

== Progress ==
Zero/sign extension elimination (TCWG-291) 10/10
- Patch updated based on review comments.
- Regression tested with standard set-up
- Created test-cases.
- Set-up additional architectures for validation
   * aarch64-none-elf --with-abi=ilp32 (Foundation model)
   * aarch64-none-linux-gnu --with-abi=ilp32 seems to be broken.
   * Set-up qemu based s390x-ibm-linux
   * Tried x86_64-linux -mx32 but ran into many issues.
- Patch now waiting for s390x-ibm-linux. All others are OK.
- will post once the results are available

== Plan ==
 - Spec2k regressions
 - sha1 regressions
 - 8th and 9th on holiday.



[ACTIVITY] 7-11 July 2014

2014-07-13 Thread Kugan

== Progress ==

- 8th and 9th on holiday (4/10)

- Zero/sign extension elimination (TCWG-291) 3/10
  * Patch one is accepted
  * Posted the modified patch 2 after some discussions
  * Started analysing the code generated for coremark and spec2k
  * It looks like there are more places that can be improved. I will post
additional patches after completing this

- Launchpad bugs (3/10)
  * https://bugs.launchpad.net/gcc-linaro/+bug/1320965
  * https://bugs.launchpad.net/gcc-linaro/+bug/1331112

== Plan ==

 - sha1 regressions
 - 15th and 16th on holiday



Re: [ACTIVITY] Week 30

2014-07-28 Thread Kugan


On 28/07/14 23:35, Renato Golin wrote:
> On 28 July 2014 14:27, Christopher Covington  wrote:
>>>  - Work around the lack of perf on v8?
>>
>> Out of curiosity, what exactly is missing?
> 
> Hi Christopher,
> 
> As far as I know, everything. But maybe this is just for the hardware
> we have in the lab...


It could be the kernel version we use in the lab. AFAIK, Charles had some
luck profiling with perf on a newer kernel. I am not sure about the
version though.

Thanks,
Kugan



[ACTIVITY] 28 July - 1 August 2014

2014-08-04 Thread Kugan

== Progress ==

- Travelling from TCWG sprint (2/10)

- Zero/sign extension elimination (TCWG-291) 2/10
  * Posted the modified patch and some discussions. Further testing.

- SHA1 regression (TCWG-468) 4/10
  * Looked at IRA's use of the back-end cost model. It might be a
limitation (see the notes below). Looking at the test-case from sha1,
which also has inline asm whose constraints are causing further issues.

- Misc (2/10)
  * Looked at bugs assigned (https://bugs.linaro.org/show_bug.cgi?id=85)
  * Set-up LLVM

== Plan ==
 - Sha1 regressions
 - Fixing assigned Bugs



-
In AArch64, some of the integer operations support the “w” constraint
(FP_REGS). For example, the *addsi3_aarch64 pattern supports it. However,
not all of the integer operations support it. In the cases where it is
supported, all the operands have to be in FP_REGS; it will not work
if we have one operand in FP_REGS and the other in GENERAL_REGS.

If there is an allocno whose pseudo register is used only in
*addsi3_aarch64 insns, it will have a low cost for register class FP_REGS
(as in the case of a28 below, extracted from an example). If the other
pseudo register used in *addsi3_aarch64 (a27 in the example below) is
also used in instructions (rorsi3_insn in the example below) that do
not support the “w” constraint, there is going to be a cost involved in
moving it from FP_REGS to GENERAL_REGS (or the other way).

Currently IRA does not seem to consider this interdependency in its
cost calculation.
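
The interdependency can be illustrated with a toy cost model. Everything
here is hypothetical: the cost number, the helper names and the
simplified rules are assumptions for illustration only and do not reflect
IRA's actual data structures or the real AArch64 cost tables.

```python
from itertools import product

GENERAL, FP = "GENERAL_REGS", "FP_REGS"
MOVE_COST = 4  # assumed cost of one move between the two classes

# Each insn lists its operands and the classes its pattern accepts.
# addsi3 accepts either class, but all operands must share one class;
# rorsi3 only accepts GENERAL_REGS.
INSNS = [
    ("addsi3", ("a27", "a28"), (GENERAL, FP)),
    ("rorsi3", ("a27",), (GENERAL,)),
]


def total_cost(assignment):
    """Sum the moves needed so each insn sees operands in one accepted class."""
    cost = 0
    for _name, operands, accepted in INSNS:
        # Pick the accepted class that needs the fewest operand moves.
        cost += min(
            sum(MOVE_COST for op in operands if assignment[op] != cls)
            for cls in accepted
        )
    return cost


costs = {
    (c27, c28): total_cost({"a27": c27, "a28": c28})
    for c27, c28 in product((GENERAL, FP), repeat=2)
}
for classes, cost in costs.items():
    print(classes, cost)
```

Looked at in isolation, a28 is equally happy in FP_REGS, but in this
model any assignment that puts it there costs at least one move because
a27 is pinned to GENERAL_REGS by rorsi3. A per-allocno cost that ignores
the other operand does not see that.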




Re: [ACTIVITY] 28 July - 1 August 2014

2014-08-04 Thread Kugan
> 
> -
> In AArch64, some of the integer  operations support “w” constraint (FP_REGS). 
> For example *addsi3_aarch64 pattern supports it. However, not all of the 
> integer operations supports it. In the cases where it is supported,  all the 
> operands have to be  in FP_REGS and it will not work if we have one operand 
> in FP_REGS and other in GENERAL_REGS.
> 
> If there is an allocno  whose pseudo register is used only in
> *addsi3_aarch64 insns, it will have low cost for register class FP_REGS (as 
> in the case of a28 below exacted from an example).  If  the other pseudo 
> register used in *addsi3_aarch64 (a27 in the example below) is also used in 
> instructions (rorsi3_insn in the exaple below) that does not support “w” 
> constraint, there is going to be a cost involved in moving it from  FP_REGS 
> to  GENERAL_REGS (or other way)
> 
> [Andrew] I bet if *aarch64_ashl_sisd_or_int_3 had ! on those 
> constraints it would be much better and work correctly.

Thanks for that. That is precisely what I am planning on investigating.

> Currently IRA dosent seems to be considering  this dependency in considering 
> this inter dependency in cost calculation.
> 
> 
> [Andrew] Yes because the back-end does not tell IRA anything about the 
> constraints.  It is not a limitation in IRA/LRA but rather how 
> *aarch64_ashl_sisd_or_int_3 says it can pick any of those alternatives 
> equally.  See 
> https://gcc.gnu.org/onlinedocs/gccint/Multi-Alternative.html#Multi-Alternative
>  
> 

If this is not a limitation on the part of IRA and can be managed by
adjusting the cost for constraints, it will definitely save a lot of time.
I saw one of your old patches for enabling "?" in IRA (I can't find the
link now). Is that problem fixed now?

Thanks,
Kugan



[ACTIVITY] 4 - 8 August 2014

2014-08-10 Thread Kugan
== Progress ==

TCWG-445 - AArch64 does not generate post-decrement stores.
 * Issue resolved with Jiong's prologue/epilogue patch committed to trunk.
 * Closed the card.

TCWG-291 - Zero/sign extensions (5/10)
 * After more review, posted an updated patch, which was accepted
 * Ran full set of validation (including s390x, aarch64 be, x86 and arm)
 * Committed two outstanding patches
 * Closed the card.

TCWG-413 - Release benchmarking (2/10)
 * Benchmarking of release for a15 and a57

TCWG-468: Sha1 regression (2/10)
 * Experimented with back-end patterns. Not much improvement for the
test-case.

- Misc (1/10)
 * Looked at gcc bugs and closed old ones.
 * Posted test-case patch.


== Plan ==
 - Sha1 regressions
 - Fixing assigned Bugs



[ACTIVITY] 11 - 15 August 2014

2014-08-18 Thread Kugan
== Progress ==

* Looked at implementation options for backlog cards and
closed/postponed cards that do not bring a benefit or that require
excessive re-architecture that will not be possible now. (5/10)
- TCWG-468 - Postponed after detailed study and discussion.
- TCWG-412 (Support literal/constant pool sharing) (won't-fix): As it is,
the intra-procedural literal pool fix happens in
TARGET_MACHINE_DEPENDENT_REORG with arm_reorg. It is quite complex and
somewhat hacky. It would become even messier if we were to record
this information and reuse it for the whole program as a way to share
the literal pool. Additionally, A-profile (which is Linaro's focus) doesn't
benefit from this, as we use movw/movt instead. Therefore decided to
close this as won't-fix.
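
As a reminder of why literal pools matter less when movw/movt are
available: a 32-bit constant can be built from two 16-bit immediates
instead of being loaded from memory. The Python sketch below models that
split; the function names are made up for illustration, and only the
basic behaviour of the two instructions is modelled.

```python
def split_for_movw_movt(value):
    """Split a 32-bit constant into (low16, high16) immediates."""
    value &= 0xFFFFFFFF
    return value & 0xFFFF, value >> 16


def movw(imm16):
    """movw writes the low half of the register and clears the high half."""
    return imm16 & 0xFFFF


def movt(reg, imm16):
    """movt replaces the high half and keeps the low half."""
    return ((imm16 & 0xFFFF) << 16) | (reg & 0xFFFF)


low, high = split_for_movw_movt(0xDEADBEEF)
reg = movt(movw(low), high)
assert reg == 0xDEADBEEF
```

Two instructions and no memory access are needed per constant, so there
is nothing left in a pool to share.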

* TCWG-413 - Release benchmarking (2/10)
- Benchmarking of release for a15 and a57

* Analysed coremark and spec2k for uxt/sxt optimizations (3/10)
- Studied coremark and have additional patches in development for
missing cases.

* Misc
- Looked at bug database and monitored patches relevant to arm/aarch64

== Plan ==
* Look at open uxt/sxt bugs in gcc bugzilla
* Study coremark and then spec2k for uxt/sxt



[ACTIVITY] 18-22 August 2014

2014-08-24 Thread Kugan
== Progress ==

* TCWG-413 - Release benchmarking (2/10)
- Analysed all the patches that went into the release for performance impact

* TCWG-521 - Analysed Coremark and Spec2k for uxt/sxt optimizations (3/10)
- Studied Coremark and have additional patches in development for
missing cases.

* bswap PR43550 (No card yet) (3/10)
- Looked at the PR; it is not fixable with VRP
- A patch based on an alternate approach is in testing to fix this.

* 1 day off sick (2/10)

== Plan ==
* Continue with TCWG-521 and bswap pr43550



[ACTIVITY] 25 - 29 August 2014

2014-08-31 Thread Kugan
== Progress ==

* Regression on alphaev68-linux-gnu due to uxt/sxt commit (7/10)
 - Built a cross compiler for alphaev68-linux-gnu and reproduced it with qemu.
 - The issue is due to the PROMOTE_MODE definition and VRP truncating values.
 - Discussed upstream and, after a failed patch, it was suggested that
the value ranges have to be calculated in promoted_mode precision.
 - A patch to do VRP in promoted_mode is in testing. There are a few
failures to fix.
 - An easier fix is probably to check the promoted mode against
word_mode and disable uxt/sxt elimination.
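
A minimal sketch of the precision mismatch, assuming an 8-bit unsigned
variable that lives in a 32-bit promoted register (the widths, the helper
name and the input range are chosen only for illustration): the same
addition gives a different range depending on the precision in which it
is evaluated.

```python
def add_constant_to_range(lo, hi, constant, bits):
    """Range of x + constant for x in [lo, hi], evaluated in `bits` precision."""
    mask = (1 << bits) - 1
    values = [(x + constant) & mask for x in range(lo, hi + 1)]
    return min(values), max(values)


def fits_unsigned_8(value_range):
    return 0 <= value_range[0] and value_range[1] <= 0xFF


# x is an unsigned 8-bit value in [200, 255]; we look at x + 100.
narrow = add_constant_to_range(200, 255, 100, 8)     # type precision
promoted = add_constant_to_range(200, 255, 100, 32)  # register precision

print(narrow, fits_unsigned_8(narrow))
print(promoted, fits_unsigned_8(promoted))
```

The narrow range looks as if no zero extension is needed, but the
promoted register really holds values above 255 until it is truncated,
so only the range computed in promoted precision gives a safe answer.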

* TCWG-521 - Analysed Coremark and Spec2k for uxt/sxt optimizations (2/10)
 - Continued with Coremark and have additional patches in development for
missing cases.

* bswap pr43550 (No card yet) (1/10)
 - Finished regression testing. Will post the patch after resolving
regression with alpha.


== Plan ==
* Continue with VRP in promoted mode, TCWG-521 and bswap pr43550.



[ACTIVITY] 1 - 5 September 2014

2014-09-07 Thread Kugan
== Progress ==
* Regression on alphaev68-linux-gnu due to uxt/sxt commit (6/10)
 - Posted a patch for promoted-type-based VRP after fixing issues found in
bootstrapping and regression testing, but had to drop it as it might
have performance implications.
https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00288.html

 - Proposed setting a flag to indicate overflow/wrap so that we can
remove uxt/sxt safely.

* Preparation for connect hacking session (2/10)
  - Started with materials for the discussions.

* misc (2/10)
  - Visa
  - Meetings

== Issues ==
 - Value range data generated by VRP is not reliable for uxt/sxt removal,
as wrapping/overflow is not propagated.
 - The patch for calculating the range in the promoted type was rejected.
 - Proposed propagating an additional flag (with some tweaking to preserve
size) so that this information is available at the time we decide on
uxt/sxt redundancy.
 - If this cannot be agreed, this patch has to be reverted.
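
In spirit, the proposed flag could work something like the toy model
below. This is a Python sketch with made-up names, not the posted patch:
range addition records whether the exact result may have left the narrow
type, and the extension is only treated as removable when no wrap was
recorded.

```python
def add_ranges(a, b, bits):
    """Add two unsigned ranges; return (lo, hi, may_wrap)."""
    max_value = (1 << bits) - 1
    lo, hi = a[0] + b[0], a[1] + b[1]
    if hi > max_value:
        # Some result wraps: fall back to the full range and flag it.
        return 0, max_value, True
    return lo, hi, False


def can_drop_zero_extension(result_range):
    """Removal is only safe when the range was computed without wrapping."""
    _lo, _hi, may_wrap = result_range
    return not may_wrap


safe = add_ranges((0, 100), (0, 100), 8)
wrapped = add_ranges((200, 255), (100, 100), 8)
print(safe, can_drop_zero_extension(safe))
print(wrapped, can_drop_zero_extension(wrapped))
```

Note that the wrapped result still reports a range of [0, 255], which
fits 8 bits; without the extra flag the two cases would be
indistinguishable at the point where redundancy is decided.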

== Plan ==
* finalize overflow/wrap propagation
* prepare for connect



[ACTIVITY] 22-26 Sept 2014

2014-09-28 Thread Kugan
== Progress ==
* LR register not used in leaf functions (TCWG-539) (2/10)
Posted the patch after regression testing
https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01833.html

* AArch64 Spec2006 int regression (3/10)
 - After struggling to boot Juno, found a combination that works
 - Ran spec2006 int benchmarks to try to reproduce it
 - Waiting for more information to continue

* Launchpad bugs (LP1331112, LP1332640, LP1331126 and LP1320965) (3/10)
 - Set up cbuild2 and spec2k on the aarch64 board
 - Ran aarch32 and aarch64 on the aarch64 board with 09.2014
 - Updated the bug entries with the results

* Misc (2/10)
 - Connect recovery (2/10)

== Plan ==
* 29/09/2014 - Public holiday
* Get back to zero/sign extension with pass to promote operations



[ACTIVITY] 29 September - 3 October 2014

2014-10-06 Thread Kugan
== Progress ==
* LR register not used in leaf functions (TCWG-539) (1/10)
Reviewed Jiong's changes.

* bug #412 (2/10)
- Seems to have been fixed, but since there is no specific test-case
(other than that it happens with spec2k gcc), more work is needed to be
entirely sure.

* AArch64 Spec2006 int regression (1/10)
 - Looked at bug databases and mailing list archive for more information

* armv3 (bug #85 and bug #410) (2/10)
 - Looked at both outstanding bugs to find more information
 - proposed using -mno-lra for armv3 after discussion with the team

* Misc (2/10)
 - Public holiday (2/10)
 - Annual Leave (2/10)

== Plan ==
* zero/sign extension elimination with widening types

== Holidays ==
* 06/10/2014 - Public holiday
* 07/10/2014 and 07/10/2014 - Annual leave



[ACTIVITY] 6-10 October 2014

2014-10-12 Thread Kugan

== Holiday ==
* Public Holiday (2/10)
* Leave (4/10)

== Progress ==
* Zero/sign extension elimination with widening types (4/10)
 - Started experimenting with a pass for widening types.
 - Verified for one simple test-case.
 - Bootstrapping is failing; looking into it.

== Plan ==
* Continue with Zero/sign extension pass.



[ACTIVITY] 13 - 17 October 2014

2014-10-19 Thread Kugan
== Progress ==

* Zero/sign extension elimination with widening types (TCWG-546 - 9/10)
 - Fixed ICEs; can now build the cross compiler and do the regression
testing with qemu
 - Some test-cases are failing due to conditions that rely on overflow;
this needs fixing.
 - Bootstrapping on AArch64 still fails (but much later than previously).
 - Verified CRC is optimized
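
An example of the kind of condition that can break when arithmetic is
widened without truncation. This is a toy Python model; the 8-bit and
32-bit widths and the function names are assumptions for illustration.
The common unsigned overflow check "sum < a" relies on the narrow type
wrapping.

```python
def wrap(value, bits):
    """Keep only the low `bits` bits, as unsigned arithmetic would."""
    return value & ((1 << bits) - 1)


def overflow_check_narrow(a, b):
    """The check evaluated in the original 8-bit unsigned type."""
    return wrap(a + b, 8) < a


def overflow_check_widened(a, b):
    """The same check after naive widening to 32 bits, with no truncation."""
    return wrap(a + b, 32) < a


def overflow_check_widened_fixed(a, b):
    """Widened arithmetic, but the sum is truncated before the compare."""
    return wrap(wrap(a + b, 32), 8) < a


print(overflow_check_narrow(200, 100))
print(overflow_check_widened(200, 100))
print(overflow_check_widened_fixed(200, 100))
```

The naive widened form no longer detects the overflow, so a widening
pass has to keep (or re-insert) the truncation wherever the result
feeds a comparison like this.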

* Improve block memory operations by GCC (TCWG-142 - 1/10)
 - Looked at the changes since the card was drafted

== Plan ==

* Continue with Zero/sign extension pass.



[ACTIVITY] 20 - 24 October 2014

2014-10-26 Thread Kugan
== Progress ==

* Zero/sign extension elimination with widening types (TCWG-546 - 10/10)
 - Re-wrote the pass from the results of experiments so far
 - Fixed most of the regression failures
 - 5 tests are still failing from C/C++/Fortran regression suite.


== Plan ==

* Continue with Zero/sign extension pass.
 - Get bootstrapping for ARM and AArch64 working
 - Fix remaining regression failures
 - Add detail dumps
 - Remove unnecessary copies (they are currently removed by the dead code
elimination pass)
 - Get patch ready for upstream discussion



[ACTIVITY] 27-31 October 2014

2014-10-31 Thread Kugan
== Progress ==
* Zero/sign extension elimination with widening types (TCWG-546 - 10/10)
 - Fixed regression failures
 - Fixed bootstrapping issues for ARM and AArch64
 - Re-factored and added some comments
 - Bootstrapped and regression tested x86-64 for all languages with
forced promotion. There are 6 differences in scanning for certain
instructions. All the execution tests are passing. Needs further
investigation.

== Plan ==
* Continue with Zero/sign extension pass.
 - Benchmarking
 - Get patch ready for upstream discussion

* Improve block memory operations by GCC (TCWG-142)
  - Start work on this



[ACTIVITY] 3-7 November 2014

2014-11-09 Thread Kugan
== Progress ==
* Zero/sign extension elimination with widening types (TCWG-546 - 9/10)
 - benchmarked Spec2k and the improvements are very small
 - Coremark fared worse. Looked into the cases and relaxed some of the
constraints.
 - Subsequent passes are also not optimizing some of the expected cases
 - Re-factored and posted an RFC patch for comments at
https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00756.html

* Improve block memory operations by GCC (TCWG-142 - 1/10)
  - Looked at the test-cases and gcc dumps

== Plan ==
* Continue with improve block memory operations by GCC.
* Continue with Zero/sign extension pass based on feedback.



[ACTIVITY] 10 - 14 November 2014

2014-11-16 Thread Kugan
== Progress ==
* Zero/sign extension elimination with widening types (5/10)
 - Addressing comments from the review

* Improve block memory operations by GCC (TCWG-142 - 5/10)
  - Looked at gcc/glibc implementations
  - Experimented with x86_64 vs ARM and found different implementation
decisions
  - Discussed work items

== Plan ==
* Continue with improve block memory operations by GCC.
* Continue with Zero/sign extension.



[ACTIVITY] 17 - 21 November 2014

2014-11-23 Thread Kugan
== Progress ==
* Zero/sign extension elimination with widening types (2/10)
 - Addressing comments from the review

* Improve block memory operations by GCC (TCWG-142 - 3/10)
  - Looked at ARM vs AArch64

* BUG #880 (3/10)
  - Analysed tree dumps.
  - Updated bug report with the findings.

* MISC (2/10)
 - Looked at git and stg documents

== Plan ==
* Continue with Zero/sign extension.

== Planned Leave == 
* 27/11/2014 to 28/11/2014 - 2 days
* 11/12/2014 to 24/12/2014 - 10 days



[ACTIVITY] 24-28 November 2014

2014-11-30 Thread Kugan
== Progress ==
* Zero/sign extension elimination with widening types (1/10)
 - Addressing comments from the review

* BUG #398 #412 (5/10)
  - Built the kernel revision with the provided config and toolchain binary
release to reproduce the gcc segfault. Couldn't reproduce it. Since there
are no more details to reproduce it, closed it as can't-reproduce.
  - The Spec2k gcc optimization issue was reproduced and a reduced test-case
was created.
  - Dumps show that splitting constants early during expand might be
the root cause.

* Holiday (4/10)

== Plan ==
* Continue with Zero/sign extension.
* BUG #412

== Planned Leave == 
* 11/12/2014 to 24/12/2014 - 10 days



[ACTIVITY] 1-5 December 2014

2014-12-07 Thread Kugan
== Progress ==
* Compiler-pass to widen computation to back-end promoted mode (6/10)
  - Addressed most of the comments; testing the result.
  - Re-based the patch onto the latest type-safe changes
  - Changed the design to remove CONVERT_EXPRs as much as possible and
generate AND_EXPR and SEXT_EXPR instead
  - Looking at generated code with benchmarks
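
The two replacement forms can be sketched as below. This is illustrative
Python only: SEXT_EXPR is modelled here as "sign-extend from bit N",
which is an assumption about the semantics intended in the patch, and
the function names and the 32-bit register width are made up.

```python
def zext_via_and(value, bits):
    """Zero extension of the low `bits` bits, expressed as a mask (AND)."""
    return value & ((1 << bits) - 1)


def sext_from_bit(value, bits, width=32):
    """Sign extension from `bits` to `width`: shift left, then arithmetic shift right."""
    shift = width - bits
    shifted = (value << shift) & ((1 << width) - 1)
    # Reinterpret the width-bit pattern as signed before shifting back.
    if shifted >= 1 << (width - 1):
        shifted -= 1 << width
    return shifted >> shift


print(zext_via_and(0x1FF, 8))
print(sext_from_bit(0xFF, 8))
print(sext_from_bit(0x7F, 8))
```

Expressing the extension as an explicit operation on the wide value,
rather than as a pair of conversions through the narrow type, keeps the
whole computation in the promoted mode.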

Bugs (4/10)
* https://bugs.linaro.org/show_bug.cgi?id=575
* https://bugs.linaro.org/show_bug.cgi?id=398
* https://bugs.linaro.org/show_bug.cgi?id=933
* https://bugs.linaro.org/show_bug.cgi?id=412

== Plan ==
* Continue with Zero/sign extension.
* BUGS

== Planned Leave ==
* 11/12/2014 to 24/12/2014 - 10 days



[ACTIVITY] 8 - 12 December 2014

2014-12-14 Thread Kugan
== Progress ==
* Compiler-pass to widen computation to back-end promoted mode (2/10)
  - Looking at generated code with benchmarks
  - some benchmark analysis with coremark

* https://cards.linaro.org/browse/TCWG-486 (3/10)
  - Reviewed Zhenqiang's and Dmitry's patches and experimented
  - I have at least one case which is not working with Zhenqiang's patch

* MISC (1/10)
  - Scanned gcc-patches and bug database

* Holiday (4/10)

== Planned Leave ==
* 11/12/2014 to 24/12/2014



[ACTIVITY] 29 December 2014 - 2 January 2015

2015-01-04 Thread Kugan
== Progress ==

* Back from holiday
* Public holiday Thursday (2/10)
* TCWG-486 (4/10)
  - Simplified existing patch
  - Discussed with Zhenqiang
* TCWG-555 (4/10)
  - propagate wrap/overflow information to ssa

== Plan ==
TCWG-486 and TCWG-555



[ACTIVITY] 5 - 9 January 2015

2015-01-11 Thread Kugan
== Progress ==

* TCWG-486 (4/10)
  - Ready to start benchmarking
  - Discussed with Bernie on new benchmarking set-up
  - Waiting for Ryan on permission

* https://bugs.linaro.org/show_bug.cgi?id=412 (4/10)
  - Created a reduced test-case and filed upstream bug
  - looked at GCSE code and ifcvt code/dumps

* gcc bugzilla and gcc-patches list  (2/10)
 - Review patches
 - Looked at arm related bugs

== Plan ==
TCWG-486 and TCWG-555



[ACTIVITY] 12-16 January 2015

2015-01-18 Thread Kugan
== Progress ==
* TCWG-554 (1/10)
 - Analysing coremark with widen type pass
* TCWG-547 (6/10)
 - Posted patches for vrp based extension elimination for review
* Connect Slides (3/10)
  - Started working

== Plan ==
* Finish connect slides
* TCWG-554


