Re: Live range shrinkage in pre-reload scheduling

2014-05-16 Thread Kyrill Tkachov

On 15/05/14 09:52, Ramana Radhakrishnan wrote:

On Thu, May 15, 2014 at 8:36 AM, Maxim Kuvyrkov wrote:

On May 15, 2014, at 6:46 PM, Ramana Radhakrishnan wrote:

I'm not claiming it's a great heuristic or anything.  There's bound to
be room for improvement.  But it was based on "reality" and real results.

Of course, if it turns out not to be a win for ARM or s390x any more then it
should be disabled.

Kyrill is currently investigating a case where the first scheduler pass
is a bit too aggressive in creating ILP opportunities with the A15
scheduler, which causes a performance difference between turning the
first scheduler pass off and using the defaults.

Charles has a work-in-progress patch that fixes a bug in SCHED_PRESSURE_MODEL
that causes the above symptoms.  The bug causes the first scheduler pass to
unnecessarily increase the live ranges of pseudo registers when there are a
lot of instructions in the ready list.

Is this something that you've seen show up in general integer code as
well? Do you or Charles have an example for us to look at? I'm not
sure what "lot of instructions in the ready list" really means here.
The specific case Kyrill's been looking into is Dhrystone Proc_8 when
tuned for a Cortex-A15 with neon and float-abi=hard, but I am not sure
if that has "too many instructions" :).

Kyrill, could you also look into the other cases we have from SPEC2k
where we see this as well and come back with any specific testcases
that Charles / Richard could also take a look into.

Hi all,

From what I can see, the most significant regression from this pre-regalloc
scheduling on SPEC2k is in 171.swim. It seems to suffer from symptoms similar
to Proc_8 (lots of extra spills on the stack).


Looking forward to the patch :). Let me know if I can help with any 
testing/validation.


Kyrill

Charles, can you finish your patch in the next several days and post it for 
review?

I think we'll await this but we'll go look into some of the benchmarks.

Ramana


Thank you,

--
Maxim Kuvyrkov
www.linaro.org







Using particular register class (like floating point registers) as spill register class

2014-05-16 Thread Kugan
I would like to know if there is any way we can use registers from a
particular register class just as spill registers (in places where the
register allocator would normally spill to the stack, and nothing more),
when it can be useful.

In AArch64, in some cases, compiling with -mgeneral-regs-only produces
better performance compared to not using it. The difference here is that
when -mgeneral-regs-only is not used, floating point registers are also
used in register allocation. Then IRA/LRA has to move them to core
registers before performing operations, as shown below.

        ...
        fmov    s1, w8          <--
        mov     w21, 49622
        movk    w21, 0xca62, lsl 16
        add     w21, w16, w21
        add     w21, w21, w2
        eor     w10, w0, w10
        add     w10, w21, w10
        ror     w8, w7, 27
        add     w7, w10, w8
        ror     w7, w7, 27
        fmov    w0, s1          <--
        add     w7, w0, w7
        add     w13, w13, w7
        fmov    w0, s4          <--
        add     w0, w0, w20
        fmov    s4, w0          <--
        ror     w18, w18, 2
        fmov    w0, s2          <--
        add     w0, w0, w18
        fmov    s2, w0          <--
        add     w12, w12, w27
        add     w14, w14, w15
        mov     w15, w24
        fmov    x0, d3          <--
        subs    x0, x0, #1
        fmov    d3, x0          <--
        bne     .L2
        fmov    x0, d0          <--
        ...

In this case, the costs for allocnos calculated by IRA, based on the cost
model supplied by the back-end, look like:
a0(r667,l0) costs: GENERAL_REGS:0,0 FP_LO_REGS:3960,3960
FP_REGS:3960,3960 ALL_REGS:3960,3960 MEM:3960,3960

Thus, changing the cost of the floating point register class is not going to
help. If I increase it further, the register allocator will just spill these
live ranges to memory and will ignore the floating point registers in this case.

Is there any other back-end in gcc that does anything to improve cases
like this, that I can refer to?

Thanks in advance,
Kugan


Re: Using particular register class (like floating point registers) as spill register class

2014-05-16 Thread pinskia


> On May 16, 2014, at 3:23 AM, Kugan  wrote:
> 
> I would like to know if there is anyway we can use registers from
> particular register class just as spill registers (in places where
> register allocator would normally spill to stack and nothing more), when
> it can be useful.
> 
> In AArch64, in some cases, compiling with -mgeneral-regs-only produces
> better performance compared not using it. The difference here is that
> when -mgeneral-regs-only is not used, floating point register are also
> used in register allocation. Then IRA/LRA has to move them to core
> registers before performing operations as shown below.

Can you show the code with fp registers disabled?  Does it use the stack to
spill?  Normally this is due to register-to-register class move costs compared
to the register-to-memory move cost.  Also I think it depends on the processor
rather than the target.  For thunder, using the fp registers might actually be
better than using the stack, depending on whether the stack was in L1.

Thanks,
Andrew



Re: Using particular register class (like floating point registers) as spill register class

2014-05-16 Thread Kugan


On 16/05/14 20:40, pins...@gmail.com wrote:
> 
> 
>> On May 16, 2014, at 3:23 AM, Kugan  wrote:
>>
>> I would like to know if there is anyway we can use registers from
>> particular register class just as spill registers (in places where
>> register allocator would normally spill to stack and nothing more), when
>> it can be useful.
>>
>> In AArch64, in some cases, compiling with -mgeneral-regs-only produces
>> better performance compared not using it. The difference here is that
>> when -mgeneral-regs-only is not used, floating point register are also
>> used in register allocation. Then IRA/LRA has to move them to core
>> registers before performing operations as shown below.
> 
> Can you show the code with fp register disabled?  Does it use the stack to 
> spill?  Normally this is due to register to register class costs compared to 
> register to memory move cost.  Also I think it depends on the processor 
> rather the target.  For thunder, using the fp registers might actually be 
> better than using the stack depending if the stack was in L1. 
Not all the LDR/STR combinations map to fmov. In the testcase I have,

aarch64-none-linux-gnu-gcc sha_dgst.c -O2  -S  -mgeneral-regs-only
grep -c "ldr" sha_dgst.s
50
grep -c "str" sha_dgst.s
42
grep -c "fmov" sha_dgst.s
0

aarch64-none-linux-gnu-gcc sha_dgst.c -O2  -S
grep -c "ldr" sha_dgst.s
42
grep -c "str" sha_dgst.s
31
grep -c "fmov" sha_dgst.s
105

I am not saying that we shouldn't use floating point registers here. But
from the above, it seems like the register allocator is using them more
like core registers (even though the cost model gives them a higher cost)
and then moving the values to core registers before operations. If that is
the case, my question is: how do we make this just a spill register class,
so that we replace ldr/str with an equal number of fmov when that is
possible?

Thanks,
Kugan


Offload Library

2014-05-16 Thread Kirill Yukhin
Dear steering committee,

To support the offloading features for Intel's Xeon Phi cards
we need to add a foreign library (liboffload) into the gcc repository.
README with build instructions is attached.

I am also copy-pasting the header comment from one of the liboffload files.
The header shown below will be in all the source files in liboffload.

Sources can be downloaded from [1].

In addition to those sources we are going to add a few headers (released
under the GPL v2.1 license) and a couple of new sources (license at the
bottom of the message).

Does this look OK?

[1] - https://www.openmprtl.org/sites/default/files/liboffload_oss.tgz

--
Thanks, K

/*
Copyright (c) 2014 Intel Corporation.  All Rights Reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

  * Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
  * Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
  * Neither the name of Intel Corporation nor the names of its
contributors may be used to endorse or promote products derived
from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/

   README for Intel(R) Offload Runtime Library
   ===========================================

How to Build Documentation
==========================

The main documentation is in Doxygen* format, and this distribution
should come with pre-built PDF documentation in doc/Reference.pdf.
However, an HTML version can be built by executing:

% doxygen doc/doxygen/config

in this directory.

That will produce HTML documentation in the doc/doxygen/generated
directory, which can be accessed by pointing a web browser at the
index.html file there.

If you don't have Doxygen installed, you can download it from
www.doxygen.org.


How to Build the Intel(R) Offload Runtime Library
=================================================

The Makefile at the top-level will attempt to detect what it needs to
build the Intel(R) Offload Runtime Library.  To see the default settings,
type:

make info

You can change the Makefile's behavior with the following options:

root_dir:         The path to the top-level directory containing the
                  top-level Makefile.  By default, this will take on the
                  value of the current working directory.

build_dir:        The path to the build directory.  By default, this will
                  take on the value [root_dir]/build.

mpss_dir:         The path to the Intel(R) Manycore Platform Software
                  Stack install directory.  By default, this will take on
                  the value of the operating system's root directory.

compiler_host:    Which compiler to use for the build of the host part.
                  Defaults to "gcc"*.  Also supports "icc" and "clang"*.
                  You should provide the full path to the compiler or it
                  should be in the user's path.

compiler_target:  Which compiler to use for the build of the target part.
                  Defaults to "gcc"*.  Also supports "icc" and "clang"*.
                  You should provide the full path to the compiler or it
                  should be in the user's path.

options_host:     Additional options for the host compiler.

options_target:   Additional options for the target compiler.

To use any of the options above, simply add <option>=<value>.  For
example, if you want to build with icc instead of gcc, type:

make compiler_host=icc compiler_target=icc


Supported RTL Build Configurations
==================================

Supported Architectures: Intel(R) 64, and Intel(R) Many Integrated
Core Architecture

                  -------------------------------------------
                  |   icc/icl   |     gcc     |    clang    |
------------------|-------------|-------------|-------------|

Re: Using particular register class (like floating point registers) as spill register class

2014-05-16 Thread Andrew Haley
On 05/16/2014 12:05 PM, Kugan wrote:
> 
> 
> On 16/05/14 20:40, pins...@gmail.com wrote:
>>
>>
>>> On May 16, 2014, at 3:23 AM, Kugan  
>>> wrote:
>>>
>>> I would like to know if there is anyway we can use registers from
>>> particular register class just as spill registers (in places where
>>> register allocator would normally spill to stack and nothing more), when
>>> it can be useful.
>>>
>>> In AArch64, in some cases, compiling with -mgeneral-regs-only produces
>>> better performance compared not using it. The difference here is that
>>> when -mgeneral-regs-only is not used, floating point register are also
>>> used in register allocation. Then IRA/LRA has to move them to core
>>> registers before performing operations as shown below.
>>
>> Can you show the code with fp register disabled?  Does it use the stack to 
>> spill?  Normally this is due to register to register class costs compared to 
>> register to memory move cost.  Also I think it depends on the processor 
>> rather the target.  For thunder, using the fp registers might actually be 
>> better than using the stack depending if the stack was in L1. 
> Not all the LDR/STR combination match to fmov. In the testcase I have,
> 
> aarch64-none-linux-gnu-gcc sha_dgst.c -O2  -S  -mgeneral-regs-only
> grep -c "ldr" sha_dgst.s
> 50
> grep -c "str" sha_dgst.s
> 42
> grep -c "fmov" sha_dgst.s
> 0
> 
> aarch64-none-linux-gnu-gcc sha_dgst.c -O2  -S
> grep -c "ldr" sha_dgst.s
> 42
> grep -c "str" sha_dgst.s
> 31
> grep -c "fmov" sha_dgst.s
> 105
> 
> I  am not saying that we shouldn’t use floating point register here. But
> from the above, it seems like register allocator is using it as more
> like core register (even though the cost mode has higher cost) and then
> moving the values to core registers before operations. if that is the
> case, my question is, how do we just make this as spill register class
> so that we will replace ldr/str with equal number of fmov when it is
> possible.

I'm also seeing stuff like this:

=> 0x7fb72a0928:  add   x21, x4, x21, lsl #3
=> 0x7fb72a092c:  fmov  w2, s8
=> 0x7fb72a0930:  str   w2, [x21,#88]

I guess GCC doesn't know how to store an SImode value in an FP register into
memory?  This is  4.8.1.

Andrew.



soft-fp functions support without using libgcc

2014-05-16 Thread Sheheryar Zahoor Qazi
Hi all,
I am trying to provide soft-fp support for an 18-bit soft-core
processor architecture at my university. The problem is that
libgcc has not been cross-compiled for my target architecture and some
functions are missing, so I cannot build libgcc. I believe soft-fp is
compiled into libgcc, so I am unable to invoke the soft-fp functions from
libgcc.
Is it possible for me to provide soft-fp support without using libgcc?
How should I proceed in defining the functions? Any ideas? And does any
architecture provide floating-point support without using libgcc?

Regards
Sheheryar


Re: Offload Library

2014-05-16 Thread Ian Lance Taylor
On Fri, May 16, 2014 at 4:47 AM, Kirill Yukhin  wrote:
>
> To support the offloading features for Intel's Xeon Phi cards
> we need to add a foreign library (liboffload) into the gcc repository.
> README with build instructions is attached.

Can you explain why this library should be part of GCC, and how GCC
would use it?  I'm sure it's obvious to you but it's not obvious to
me.

Ian


Re: soft-fp functions support without using libgcc

2014-05-16 Thread Ian Lance Taylor
On Fri, May 16, 2014 at 6:34 AM, Sheheryar Zahoor Qazi wrote:
>
> I am trying to provide soft-fp support to a an 18-bit soft-core
> processor architecture at my university. But the problem is that
> libgcc has not been cross-compiled for my target architecture and some
> functions are missing so i cannot build libgcc.I believe soft-fp is
> compiled in libgcc so i am usable to invoke soft-fp functions from
> libgcc.
> It is possible for me to provide soft-fp support without using libgcc.
> How should i proceed in defining the functions? Any idea? And does any
> archoitecture provide floating point support withoput using libgcc?

I'm sorry, I don't understand the premise of your question.  It is not
necessary to build libgcc before building libgcc.  That would not make
sense.  If you have a working compiler that is missing some functions
provided by libgcc, that should be sufficient to build libgcc.

Ian


RE: Using particular register class (like floating point registers) as spill register class

2014-05-16 Thread Ian Bolton
> On 05/16/2014 12:05 PM, Kugan wrote:
> >
> >
> > On 16/05/14 20:40, pins...@gmail.com wrote:
> >>
> >>
> >>> On May 16, 2014, at 3:23 AM, Kugan
>  wrote:
> >>>
> >>> I would like to know if there is anyway we can use registers from
> >>> particular register class just as spill registers (in places where
> >>> register allocator would normally spill to stack and nothing more),
> when
> >>> it can be useful.
> >>>
> >>> In AArch64, in some cases, compiling with -mgeneral-regs-only
> produces
> >>> better performance compared not using it. The difference here is
> that
> >>> when -mgeneral-regs-only is not used, floating point register are
> also
> >>> used in register allocation. Then IRA/LRA has to move them to core
> >>> registers before performing operations as shown below.
> >>
> >> Can you show the code with fp register disabled?  Does it use the
> stack to spill?  Normally this is due to register to register class
> costs compared to register to memory move cost.  Also I think it
> depends on the processor rather the target.  For thunder, using the fp
> registers might actually be better than using the stack depending if
> the stack was in L1.
> > Not all the LDR/STR combination match to fmov. In the testcase I
> have,
> >
> > aarch64-none-linux-gnu-gcc sha_dgst.c -O2  -S  -mgeneral-regs-only
> > grep -c "ldr" sha_dgst.s
> > 50
> > grep -c "str" sha_dgst.s
> > 42
> > grep -c "fmov" sha_dgst.s
> > 0
> >
> > aarch64-none-linux-gnu-gcc sha_dgst.c -O2  -S
> > grep -c "ldr" sha_dgst.s
> > 42
> > grep -c "str" sha_dgst.s
> > 31
> > grep -c "fmov" sha_dgst.s
> > 105
> >
> > I  am not saying that we shouldn't use floating point register here.
> But
> > from the above, it seems like register allocator is using it as more
> > like core register (even though the cost mode has higher cost) and
> then
> > moving the values to core registers before operations. if that is the
> > case, my question is, how do we just make this as spill register
> class
> > so that we will replace ldr/str with equal number of fmov when it is
> > possible.
> 
> I'm also seeing stuff like this:
> 
> => 0x7fb72a0928  Thread*)+2500>:
> add   x21, x4, x21, lsl #3
> => 0x7fb72a092c  Thread*)+2504>:
> fmov  w2, s8
> => 0x7fb72a0930  Thread*)+2508>:
> str   w2, [x21,#88]
> 
> I guess GCC doesn't know how to store an SImode value in an FP register
> into
> memory?  This is  4.8.1.
> 

Please can you try that on trunk and report back.

Thanks,
Ian
 





RE: soft-fp functions support without using libgcc

2014-05-16 Thread Ian Bolton
> On Fri, May 16, 2014 at 6:34 AM, Sheheryar Zahoor Qazi
>  wrote:
> >
> > I am trying to provide soft-fp support to a an 18-bit soft-core
> > processor architecture at my university. But the problem is that
> > libgcc has not been cross-compiled for my target architecture and
> some
> > functions are missing so i cannot build libgcc.I believe soft-fp is
> > compiled in libgcc so i am usable to invoke soft-fp functions from
> > libgcc.
> > It is possible for me to provide soft-fp support without using
> libgcc.
> > How should i proceed in defining the functions? Any idea? And does
> any
> > archoitecture provide floating point support withoput using libgcc?
> 
> I'm sorry, I don't understand the premise of your question.  It is not
> necessary to build libgcc before building libgcc.  That would not make
> sense.  If you have a working compiler that is missing some functions
> provided by libgcc, that should be sufficient to build libgcc.

If you replace "cross-compiled" with "ported", I think it makes sense.
Can one provide soft-fp support without porting libgcc for their
architecture?

Cheers,
Ian





Re: soft-fp functions support without using libgcc

2014-05-16 Thread Paul_Koning

On May 16, 2014, at 12:25 PM, Ian Bolton  wrote:

>> On Fri, May 16, 2014 at 6:34 AM, Sheheryar Zahoor Qazi
>>  wrote:
>>> 
>>> I am trying to provide soft-fp support to a an 18-bit soft-core
>>> processor architecture at my university. But the problem is that
>>> libgcc has not been cross-compiled for my target architecture and
>> some
>>> functions are missing so i cannot build libgcc.I believe soft-fp is
>>> compiled in libgcc so i am usable to invoke soft-fp functions from
>>> libgcc.
>>> It is possible for me to provide soft-fp support without using
>> libgcc.
>>> How should i proceed in defining the functions? Any idea? And does
>> any
>>> archoitecture provide floating point support withoput using libgcc?
>> 
>> I'm sorry, I don't understand the premise of your question.  It is not
>> necessary to build libgcc before building libgcc.  That would not make
>> sense.  If you have a working compiler that is missing some functions
>> provided by libgcc, that should be sufficient to build libgcc.
> 
> If you replace "cross-compiled" with "ported", I think it makes senses.
> Can one provide soft-fp support without porting libgcc for their
> architecture?

By definition, in soft-fp you have to implement the FP operations in software.  
That’s not quite the same as porting libgcc to the target architecture.  It 
should translate to porting libgcc (the FP emulation part) to the floating 
point format being used.

In other words, if you want soft-fp for IEEE float, the job should be very 
simple because that has already been done.  If you want soft-fp for CDC 6000 
float, you have to do a full implementation of that.
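
To make "implement the FP operations in software" concrete, here is a minimal sketch of one such operation for IEEE binary32: converting a signed 32-bit integer to float using only integer arithmetic, with round-to-nearest-even. The name soft_int_to_f32_bits is made up for this illustration and is not a libgcc entry point; it returns the raw bit pattern rather than a float so that no hardware FP is involved. It assumes the target has (or can emulate) 32-bit unsigned integers.

```c
#include <stdint.h>

/* Hypothetical helper (not part of libgcc): convert a signed 32-bit
   integer to the bit pattern of the nearest IEEE 754 binary32 value,
   rounding to nearest, ties to even.  Only integer operations are used. */
uint32_t soft_int_to_f32_bits(int32_t v)
{
    if (v == 0)
        return 0;                       /* +0.0 */

    uint32_t sign = v < 0 ? UINT32_C(0x80000000) : 0;
    /* Magnitude as unsigned; also correct for INT32_MIN. */
    uint32_t mag = v < 0 ? UINT32_C(0) - (uint32_t)v : (uint32_t)v;

    /* Position of the most significant set bit (mag is non-zero). */
    int msb = 31;
    while (!(mag & (UINT32_C(1) << msb)))
        msb--;

    uint32_t exp = 127 + (uint32_t)msb; /* biased exponent */
    uint32_t frac;

    if (msb <= 23) {
        /* Value fits in 24 bits: shift left, no rounding needed. */
        frac = (mag << (23 - msb)) & UINT32_C(0x7FFFFF);
    } else {
        /* Too many bits: drop the low ones and round to nearest even. */
        int shift = msb - 23;
        uint32_t rem = mag & ((UINT32_C(1) << shift) - 1);
        uint32_t half = UINT32_C(1) << (shift - 1);
        uint32_t m = mag >> shift;      /* 24 bits incl. the leading 1 */

        if (rem > half || (rem == half && (m & 1)))
            m++;
        if (m == (UINT32_C(1) << 24)) { /* rounding carried out */
            m >>= 1;
            exp++;
        }
        frac = m & UINT32_C(0x7FFFFF);
    }
    return sign | (exp << 23) | frac;
}
```

For example, soft_int_to_f32_bits(1) yields 0x3F800000, the encoding of 1.0f. A different floating point format would need its own field widths, bias and rounding rules, which is the "full implementation" case.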

paul



Re: we are starting the wide int merge

2014-05-16 Thread Gerald Pfeifer
On Sat, 10 May 2014, Gerald Pfeifer wrote:
> Since (at least) 16:40 UTC that day my i386-unknown-freebsd10.0 builds
> fail as follows:
> 
>   Comparing stages 2 and 3
>   warning: gcc/cc1obj-checksum.o differs
>   warning: gcc/cc1-checksum.o differs
>   warning: gcc/cc1plus-checksum.o differs
>   Bootstrap comparison failure!
>   gcc/fold-const.o differs
>   gcc/simplify-rtx.o differs
>   gcc/tree-ssa-ccp.o differs
> 
> (FreeBSD/i386 really builds for i486, but retains the original name;
> I'm traveling with limited access, but would not be surprised for this
> to also show up for i386-*-linux-gnu or i486-*-linux-gnu.)

Is anybody able to reproduce this, for example on a GNU/Linux system?

This tester of mine hasn't been able to bootstrap for nearly a week,
and timing-wise it would be really a coincidence were this not due to
wide-int.

Gerald


Re: RFC: Doc update for attribute

2014-05-16 Thread Carlos O'Donell
On 05/12/2014 11:13 PM, David Wohlferd wrote:
> After updating gcc's docs about inline asm, I'm trying to improve
> some of the related sections. One that I feel has problems with
> clarity is __attribute__ naked.
> 
> I have attached my proposed update. Comments/corrections are
> welcome.
> 
> In a related question:
> 
> To better understand how this attribute is used, I looked at the
> Linux kernel. While the existing docs say "only ... asm statements
> that do not have operands" can safely be used, Linux routinely uses
> asm WITH operands.

That's a bug. Period. You must not use naked with an asm that has
operands. Any kind of operand might inadvertently cause the compiler
to generate code and that would violate the requirements of the
attribute and potentially generate an ICE.

The correct solution, and we've talked about this in the past, is to
have the compiler generate a hard error if you use an asm statement
with operands and naked. I don't know whether anyone ever got around to it.

> Some examples:
> 
> memory clobber operand:
> http://lxr.free-electrons.com/source/arch/arm/kernel/kprobes.c#L377 

Is this needed?

> Input arguments:
> http://lxr.free-electrons.com/source/arch/arm/mm/copypage-feroceon.c#L17

This is a bug and it's wrong. The naked asm can just assume the use of
first and second arguments as per AAPCS.

> Since I don't know why "asm with operands" was excluded from the
> existing docs, I'm not sure whether what Linux does here is supported
> or not (maybe with some limitations?). If someone can clarify, I'll
> add it to this text.

The "asm with operands" was excluded because to allow them in the
implementation would require gcc to potentially copy the arguments
to temporary storage depending on their type. There is no prologue so
the compiler has no stack in which to place the arguments, therefore
the result is an impossible-to-satisfy constraint which usually results
in an ICE or compiler error.

Even if you said it was OK to use the incoming arguments with "r" type
operands the optimization level of the compile might inadvertently
try to force those values to the stack and that again is an impossible
to satisfy condition with a naked function.
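
As a minimal sketch of the safe shape (a body made of one basic asm statement with no operands, reading the arguments straight from the ABI registers), consider the following. The function name add_naked is made up for illustration. The x86-64 branch assumes a compiler newer than the ones discussed in this thread (GCC 8 or later, or Clang) and the System V calling convention; the ARM branch assumes AAPCS; anything else falls back to plain C so the example still builds.

```c
/* Sketch only: add_naked is a hypothetical example function. */
#if defined(__x86_64__) && !defined(_WIN32) \
    && (defined(__clang__) || (defined(__GNUC__) && __GNUC__ >= 8))
/* System V x86-64: first argument in %edi, second in %esi, result in
   %eax.  No prologue/epilogue is emitted, so the asm must return itself. */
__attribute__((naked)) int add_naked(int a, int b)
{
    __asm__("leal (%rdi,%rsi), %eax\n\t"
            "ret");
}
#elif defined(__arm__) && defined(__GNUC__)
/* AAPCS: first argument in r0, second in r1, result in r0. */
__attribute__((naked)) int add_naked(int a, int b)
{
    __asm__("add r0, r0, r1\n\t"
            "bx lr");
}
#else
/* Fallback for targets without the attribute: ordinary C. */
int add_naked(int a, int b)
{
    return a + b;
}
#endif
```

Note that the asm never names a or b: the parameters exist only so the compiler emits the right declaration, which is exactly the use the documentation describes.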

> Even without discussing "asm with operands," I believe this text is 
> an improvement.
> Thanks in advance,
> dw
> 
> extend.texi.patch
> 
> 
> Index: extend.texi
> ===================================================================
> --- extend.texi   (revision 210349)
> +++ extend.texi   (working copy)
> @@ -3330,16 +3330,15 @@
>  
>  @item naked
>  @cindex function without a prologue/epilogue code
> -Use this attribute on the ARM, AVR, MCORE, MSP430, NDS32, RL78, RX and SPU
> -ports to indicate that the specified function does not need prologue/epilogue
> -sequences generated by the compiler.
> -It is up to the programmer to provide these sequences. The
> -only statements that can be safely included in naked functions are
> -@code{asm} statements that do not have operands.  All other statements,
> -including declarations of local variables, @code{if} statements, and so
> -forth, should be avoided.  Naked functions should be used to implement the
> -body of an assembly function, while allowing the compiler to construct
> -the requisite function declaration for the assembler.
> +This attribute is available on the ARM, AVR, MCORE, MSP430, NDS32, RL78, RX 
> +and SPU ports.  It allows the compiler to construct the requisite function 
> +declaration, while allowing the body of the function to be assembly code. 
> +The specified function will not have prologue/epilogue sequences generated 
> +by the compiler; it is up to the programmer to provide these sequences if 
> +the function requires them. The expectation is that only Basic @code{asm} 
> +statements will be included in naked functions (@pxref{Basic Asm}). While it 
> +is discouraged, it is possible to write your own prologue/epilogue code 
> +using asm and use ``C'' code in the middle.

I would remove the last sentence, since IMO it's not the intent of the feature
to ever support that, the compiler doesn't guarantee it, and it may result
in wrong code given that `naked' is a fragile low-level feature.

>  
>  @item near
>  @cindex functions that do not handle memory bank switching on 68HC11/68HC12

Cheers,
Carlos.


Re: Offload Library

2014-05-16 Thread Thomas Schwinge
Hi!

On Fri, 16 May 2014 15:47:58 +0400, Kirill Yukhin wrote:
> To support the offloading features for Intel's Xeon Phi cards
> we need to add a foreign library (liboffload) into the gcc repository.

As written in the README, this library currently is specific to Intel
hardware (understandably, of course), and I assume also in the future is
to remain that way (?) -- should it thus get a more specific name in GCC,
than the generic liboffload?

> Additionally to that sources we going to add few headers [...]
> and couple of new sources

For interfacing with GCC, presumably.  You haven't stated it explicitly,
but do I assume right that this work will be going onto the
gomp-4_0-branch, integrated with the offloading work developed there, as
a plugin for libgomp?


Grüße,
 Thomas




Re: Using particular register class (like floating point registers) as spill register class

2014-05-16 Thread Vladimir Makarov

On 2014-05-16, 6:23 AM, Kugan wrote:

> I would like to know if there is anyway we can use registers from
> particular register class just as spill registers (in places where
> register allocator would normally spill to stack and nothing more), when
> it can be useful.
>
> In AArch64, in some cases, compiling with -mgeneral-regs-only produces
> better performance compared not using it. The difference here is that
> when -mgeneral-regs-only is not used, floating point register are also
> used in register allocation. Then IRA/LRA has to move them to core
> registers before performing operations as shown below.
>
>         ...
>         fmov    s1, w8          <--
>         mov     w21, 49622
>         movk    w21, 0xca62, lsl 16
>         add     w21, w16, w21
>         add     w21, w21, w2
>         eor     w10, w0, w10
>         add     w10, w21, w10
>         ror     w8, w7, 27
>         add     w7, w10, w8
>         ror     w7, w7, 27
>         fmov    w0, s1          <--
>         add     w7, w0, w7
>         add     w13, w13, w7
>         fmov    w0, s4          <--
>         add     w0, w0, w20
>         fmov    s4, w0          <--
>         ror     w18, w18, 2
>         fmov    w0, s2          <--
>         add     w0, w0, w18
>         fmov    s2, w0          <--
>         add     w12, w12, w27
>         add     w14, w14, w15
>         mov     w15, w24
>         fmov    x0, d3          <--
>         subs    x0, x0, #1
>         fmov    d3, x0          <--
>         bne     .L2
>         fmov    x0, d0          <--
>         ...
>
> In this case, costs for allocnos calculated by IRA based on the cost
> model supplied by the back-end is like:
> a0(r667,l0) costs: GENERAL_REGS:0,0 FP_LO_REGS:3960,3960
> FP_REGS:3960,3960 ALL_REGS:3960,3960 MEM:3960,3960
>
> Thus, changing the cost of floating point register class is not going to
> help. If I increase further, register allocated will just spill these
> live ranges to memory and will ignore floating point register in this case.
>
> Is there any other back-end in gcc that does anything to improve cases
> like this, that I can refer to?



There is a target hook spill_class.  You can see how it can be defined
in i386.c.  Instead of memory, the pseudos are stored in vector regs.
This is profitable for modern Intel processors, which have a fast path
between general regs and SSE regs.  It results in smaller generated
code too, as movd is shorter than ld/st insns.


So you can increase costs of fp regs and define the hook, then fp regs 
will be used for pseudos not getting general regs and fmov will be 
generated instead of ld/st.
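
The decision such a hook makes can be sketched in isolation.  The following is a minimal, self-contained model and not GCC source: the enums and the function sketch_spill_class are hypothetical stand-ins for GCC's register classes, machine modes and a port's spill_class implementation.  It assumes that only 32-bit and 64-bit integer pseudos that wanted a general reg may be parked in fp regs, and that everything else still goes to a stack slot.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's register classes and machine modes. */
enum sketch_reg_class { SK_NO_REGS, SK_GENERAL_REGS, SK_FP_REGS };
enum sketch_mode { SK_QImode, SK_SImode, SK_DImode, SK_TImode };

/* Model of a spill-class decision: return the register class to spill
   a pseudo of class RCLASS and mode MODE into, or SK_NO_REGS to ask
   for an ordinary stack slot.  FP_REGS_AVAILABLE models whether the fp
   register file may be used at all (assumed false for something like
   -mgeneral-regs-only).  */
enum sketch_reg_class
sketch_spill_class(enum sketch_reg_class rclass, enum sketch_mode mode,
                   bool fp_regs_available)
{
    /* Only values that fit one fp register move are candidates. */
    bool fits = (mode == SK_SImode || mode == SK_DImode);

    if (fp_regs_available && rclass == SK_GENERAL_REGS && fits)
        return SK_FP_REGS;

    return SK_NO_REGS;
}
```

With this shape, a pseudo that failed to get a general reg is offered an fp reg first, and the allocator falls back to memory only when the hook answers "no regs".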


I am working on improving spilling general regs into vector ones.  So I 
hope there will be more cases when GCC does it.






Re: [GSoC] a wiki page on the gcc wiki

2014-05-16 Thread Roman Gareev
Thank you!

--
Cheers, Roman Gareev


Re: [GSoC] How to get started with the isl code generation

2014-05-16 Thread Roman Gareev
Hi Tobias,

> what is the difference you see between ISL AST generation and code
> generation?

By “ISL AST generation”, I mean ISL AST generation without generation
of GIMPLE code.

> What are your plans to separate the ISL AST generation? Do you foresee any
> difficulties/problems?

According to the plan mentioned in my proposal, I wanted to get more
familiar with ISL AST generation by generation of ISL AST in a file,
which is separate from the GCC sources. This could help to avoid
problems with interpretation and verification of results, because I
worked with my own input to ISL AST generator instead of the input
built by Graphite from GIMPLE code. This could also help to avoid
rebuilding of GCC in the process of debugging. However, I've come to
the conclusion that the way you advised me is better, because it helps
to save the time of integration of ISL AST generation in GCC.

I've set up a second code generation path in parallel that generates
the ISL AST and can be enabled by a command-line flag. Could you please
advise me on how to verify its results?

The code for it is below.

--

Cheers, Roman Gareev


code
Description: Binary data


[GSoC] Status - 20140516

2014-05-16 Thread Maxim Kuvyrkov
Hi Community,

The community bonding period is coming to a close; students can officially
start coding on Monday, May 19th.

In the past month the students should have applied for FSF copyright assignment
and, hopefully, completed a couple of test tasks to get a feel for GCC
development.

The GSoC Reunion (an unconference to discuss the results of the concluded GSoC)
will be held in San Jose, CA, on 23-26 October 2014.  GCC gets to send 2
delegates on Google's dime (airfare, hotel, food), but more can attend via a
registration lottery if they cover their own expenses.  If you are interested
in going to the GSoC Reunion, please let me know.

Thank you,

--
Maxim Kuvyrkov
www.linaro.org





Re: [GSoC] Status - 20140516

2014-05-16 Thread Tobias Grosser



On 17/05/2014 00:27, Maxim Kuvyrkov wrote:
> Hi Community,
>
> The community bonding period is coming to a close, students can officially
> start coding on Monday, May 19th.
>
> In the past month the student should have applied for FSF copyright assignment
> and, hopefully, executed on a couple of test tasks to get a feel for GCC
> development.


In the last mail, I got the impression that you will keep track of the 
copyright assignments. Is this the case?


Cheers,
Tobias


Re: [GSoC] Status - 20140516

2014-05-16 Thread Maxim Kuvyrkov
On May 17, 2014, at 10:41 AM, Tobias Grosser  wrote:

> 
> 
> On 17/05/2014 00:27, Maxim Kuvyrkov wrote:
>> Hi Community,
>> 
>> The community bonding period is coming to a close, students can officially 
>> start coding on Monday, May 19th.
>> 
>> In the past month the student should have applied for FSF copyright 
>> assignment and, hopefully, executed on a couple of test tasks to get a feel 
>> for GCC development.
> 
> In the last mail, I got the impression that you will keep track of the 
> copyright assignments. Is this the case?

Yes.  Two of the students already have copyright assignment in place, and I 
have asked the other 3 about their assignment progress today.

Thank you,

--
Maxim Kuvyrkov
www.linaro.org



Re: [GSoC] Status - 20140516

2014-05-16 Thread Tobias Grosser

On 17/05/2014 00:43, Maxim Kuvyrkov wrote:
> On May 17, 2014, at 10:41 AM, Tobias Grosser wrote:
>
>> On 17/05/2014 00:27, Maxim Kuvyrkov wrote:
>>> Hi Community,
>>>
>>> The community bonding period is coming to a close, students can officially
>>> start coding on Monday, May 19th.
>>>
>>> In the past month the student should have applied for FSF copyright assignment
>>> and, hopefully, executed on a couple of test tasks to get a feel for GCC
>>> development.
>>
>> In the last mail, I got the impression that you will keep track of the
>> copyright assignments. Is this the case?
>
> Yes.  Two of the students already have copyright assignment in place, and I
> have asked the other 3 about their assignment progress today.


Great. Could you let me know when Roman's copyright assignment is in?

Thanks,
Tobias


Re: RFC: Doc update for attribute

2014-05-16 Thread David Wohlferd
Thank you for your response.  This is exactly what I wanted to know. One 
last question:



>> +While it
>> +is discouraged, it is possible to write your own prologue/epilogue code
>> +using asm and use ``C'' code in the middle.
>
> I wouldn't remove the last sentence since IMO it's not the intent of the feature
> to ever support that and the compiler doesn't guarantee it and may result
> in wrong code given that `naked' is a fragile low-level feature.


I'm assuming you meant "would remove."

I wasn't comfortable including that sentence, but I was following the 
existing docs.  Since they said you could "only" use basic asm, 
following that with a warning to "avoid" locals/if/etc was really 
confusing without this text.


Also, as ugly as this is, apparently some people really do this (comment 
6): https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43404#c6


We don't have to doc every crazy thing people try to do with gcc. But 
since it's out there, maybe we should this time?  If only to discourage it.


I'm *slightly* more in favor of keeping it.  But if you still feel it 
should go, it's gone.


Thanks,
dw