[Aarch64] Vector Function Application Binary Interface Specification for OpenMP

2017-03-15 Thread Sekhar, Ashwin
Hi GCC Team, Aarch64 Maintainers,


The rules in the Vector Function Application Binary Interface Specification for
OpenMP
(https://sourceware.org/glibc/wiki/libmvec?action=AttachFile&do=view&target=VectorABI.txt)
are used on x86 for generating the simd clones of a function.


Is there a similar one defined for Aarch64?


If not, I would like to start a discussion on one for Aarch64. To kick-start
it, a draft proposal for Aarch64 (along the same lines as the x86 ABI) is
included below. The only change from the x86 ABI is in the function name
mangling. Here the letter 'b' is used to indicate the ASIMD ISA.


Please review and comment.


Thanks and Regards,

Ashwin Sekhar T K



 CUT HERE --




 Aarch64 Vector Function Application Binary Interface Specification for OpenMP


1. Vector Function ABI Overview

The Aarch64 Vector Function ABI defines the ABI for vector functions generated
by a compiler supporting the SIMD constructs of OpenMP 4.0 [1] on Aarch64. It is
based on the x86 Vector Function Application Binary Interface Specification for
OpenMP [2].



2. Vector Function ABI

Vector Function ABI defines a set of rules that the caller and the callee
functions must obey.

These rules consist of:
  * Calling convention
  * Vector length (the number of concurrent scalar invocations to be processed
    per invocation of the vector function)
  * Mapping from element data types to vector data types
  * Ordering of vector arguments
  * Vector function masking
  * Vector function name mangling
  * Compiler generated variants of vector function



2.1. Calling Convention

The vector functions should use the calling convention described in the
Procedure Call Standard for the ARM 64-bit Architecture (AArch64) [3].



2.2. Vector Length

Every vector variant of a SIMD-enabled function has a vector length (VLEN). If
the OpenMP clause "simdlen" is used, the VLEN is the value of the argument of
that clause. The VLEN value must be a power of 2. Otherwise, the notion of the
function's "characteristic data type" (CDT) is used to compute the vector
length.

CDT is defined in the following order:
  a) For non-void function, the CDT is the return type.
  b) If the function has any non-uniform, non-linear parameters, then the CDT
 is the type of the first such parameter.
  c) If the CDT determined by a) or b) above is struct, union, or class type
 which is pass-by-value (except for the type that maps to the built-in
 complex data type), the characteristic data type is int.
  d) If none of the above three cases is applicable, the CDT is int.

VLEN = sizeof(vector_register) / sizeof(CDT)

For example, if the ISA is ASIMD, sizeof(vector_register) = 16, as the vector
registers are 128 bits wide. If the CDT of the function is "int", sizeof(CDT) = 4,
so VLEN = 4.
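
The VLEN rule above can be sketched in C roughly as follows. This is only an
illustration of the arithmetic, not part of the proposal: the helper names
vlen_from_cdt and is_valid_simdlen and the macro ASIMD_VECTOR_REGISTER_BYTES
are hypothetical, and the CDT is assumed to have been determined already by
rules a) to d).

```c
#include <assert.h>
#include <stddef.h>

/* Width of an ASIMD vector register in bytes (128 bits).
   The macro name is hypothetical, chosen for this sketch. */
#define ASIMD_VECTOR_REGISTER_BYTES 16u

/* Hypothetical helper: VLEN = sizeof(vector_register) / sizeof(CDT).
   cdt_size is the size in bytes of the characteristic data type. */
unsigned vlen_from_cdt(size_t vector_register_bytes, size_t cdt_size)
{
    /* The CDT must have a non-zero size that divides the register width. */
    assert(cdt_size > 0 && vector_register_bytes % cdt_size == 0);
    return (unsigned)(vector_register_bytes / cdt_size);
}

/* Hypothetical helper: a "simdlen" argument must be a power of 2. */
int is_valid_simdlen(unsigned simdlen)
{
    return simdlen != 0 && (simdlen & (simdlen - 1)) == 0;
}
```

With a 4-byte CDT such as "int" this yields a VLEN of 4 for ASIMD, and with
an 8-byte CDT such as "double" it yields 2.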



2.3. Element Data Type to Vector Data Type Mapping

The vector data types for parameters are selected depending on ISA, vector
length, data type of original parameter, and parameter specification.

For uniform and linear parameters (a detailed description can be found in [1]),
the original data type is preserved.

For vector parameters, vector data types are selected by the compiler. The
mapping from element data type to vector data type is described below.

  * The bit size of vector data type of parameter is computed as:

    size_of_vector_data_type = VLEN * sizeof(original_parameter_data_type) * 8

    For instance, for ASIMD version of vector function with parameter data type
    "int": If VLEN = 4, size_of_vector_data_type = 4 * 4 * 8 = 128 (bits), which
    means one argument of type __m128 to be passed.

  * If the size_of_vector_data_type is greater than the width of the vector
    register, multiple vector registers are selected and the parameter will be
    passed in multiple vector registers.

    For instance, for ASIMD version of vector function with parameter data type
    "int":

    If VLEN = 8, size_of_vector_data_type = 8 * 4 * 8 = 256 (bits), so the
    vector data type is __m256, which means 2 arguments of type __m128 are to
    be passed.
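
The two mapping rules above amount to the following arithmetic, sketched here
in C. The function names and the ASIMD_REGISTER_BITS macro are hypothetical,
and rounding up when the vector data type is narrower than one register is an
assumption of this sketch that the draft does not spell out.

```c
#include <assert.h>
#include <stddef.h>

/* Width of an ASIMD vector register in bits (hypothetical macro name). */
#define ASIMD_REGISTER_BITS 128u

/* size_of_vector_data_type = VLEN * sizeof(original_parameter_data_type) * 8 */
unsigned vector_data_type_bits(unsigned vlen, size_t element_size)
{
    assert(vlen > 0 && element_size > 0);
    return vlen * (unsigned)element_size * 8u;
}

/* Number of vector registers used to pass one vector parameter.
   Rounds up, so a type narrower than one register still takes one
   register (an assumption of this sketch). */
unsigned vector_registers_needed(unsigned vlen, size_t element_size,
                                 unsigned register_bits)
{
    unsigned bits = vector_data_type_bits(vlen, element_size);
    return (bits + register_bits - 1) / register_bits;
}
```

For an "int" parameter this gives 128 bits and one register at VLEN = 4, and
256 bits and two registers at VLEN = 8, matching the examples in the text.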



2.4. Ordering of Vector Arguments

  * When a parameter in the original data type results in one argument in the
    vector function, the ordering rule is a simple one to one match with the
    original argument order.
    
    For example, when the original  argument list is (int a, float b,

Re: Obsolete powerpc*-*-*spe*

2017-03-15 Thread Olivier Hainque
Hello Andrew,

> On Mar 13, 2017, at 19:01 , Andrew Jenner  wrote:
> 
> I volunteer to be the point of contact for the SPE port.
> 
> Over here at CodeSourcery/Mentor Embedded, we have a strong interest in SPE 
> *not* being deprecated (we actively ship toolchain products with SPE 
> multilibs, and have customers for which these are important). We are 
> therefore volunteering resources (specifically, me) to maintain SPE upstream 
> as well.
> 
> I am in the process of developing some patches to add VLE support upstream 
> (and expect to be maintainer of those once they are committed) so it would be 
> a good fit for me to be the SPE maintainer as well.
> 
> We have been regularly running tests on the SPE multilibs (on our internal 
> branches) and they are in better shape than the test results Segher found 
> from 2015. We may have some (not yet upstreamed) patches that improve the 
> test results - I will be tracking these down and upstreaming them ASAP. I 
> will be expanding our regular build and test runs to cover trunk as well, and 
> will send test results to gcc-testsuite and report regressions.
> 
> If there is no objection, I will submit patches tomorrow to un-obsolete SPE 
> and add myself to the appropriate section of the MAINTAINERS file. The other 
> changes will come once stage 1 opens.

Thanks for volunteering!

As mentioned upthread, we (AdaCore) also have a significant user base,
so a strong interest in the port remaining alive and we'll be happy to
keep submitting patches we might have.

The prospect of seeing VLE support come in is great news :)

Best Wishes,


Olivier








Re: Diagnostics that should not be translated

2017-03-15 Thread Richard Earnshaw (lists)
On 15/03/17 02:43, Martin Sebor wrote:
> On 03/12/2017 04:51 PM, Roland Illig wrote:
>> Hi,
>>
>> the gcc.pot file currently contains more than 12000 messages to be
>> translated, which is a very high number. Many of these messages are
>> diagnostics, and they can be categorized as follows:
>>
>> * errors in user programs, reported via error ()
>> * additional info for internal errors, reported via error ()
>> * internal errors, reported via gfc_internal_error ()
>> * others
>>
>> In my opinion, there is no point in having internal error messages
>> translated, since their only purpose is to be sent back to the GCC
>> developers, instead of being interpreted and acted upon by the GCC user.
>> By not translating the internal errors, the necessary work for
>> translators can be cut down by several hundred messages.
>>
>> Therefore the internal errors should not be translated at all.
>> https://gcc.gnu.org/codingconventions.html#Diagnostics currently
>> mentions a few ways of emitting diagnostics, but there is no way to
>> produce untranslated diagnostics. Therefore I'd like to have a new
>> function error_no_i18n that is used instead of error for nonfatal
>> internal errors, like this:
>>
>> @code{error_no_i18n} is for diagnostics that are printed before invoking
>> internal_error. They are not translated.
> 
> AFAIK, internal errors are reported using the internal_error
> or internal_error_no_backtrace APIs.
> 
> Would using the existing internal_error{,no_backtrace}, and
> sorry work for this? (I.e., not translating those.)  If my
> count is right there are nearly 500 calls to these three in
> GCC sources so I'm not sure that would put enough of a dent
> in the 12K messages to translate but I'm even less sure that
> adding yet another API would do even that much.
> 

I think sorry messages need to be translated: the user has hit a
limitation in the compiler that they need to be aware of (such as
unsupported option combination).

R.

> There are 20-some-odd functions to report diagnostics in GCC
> (counting those in diagnostic-core.h).  I haven't used them all
> or even understand all their differences and use cases yet so
> I would rather not add more to the list if it can be helped.
> 
> Martin
> 



Buildbot: how to add a new target

2017-03-15 Thread Fiodar Stryzhniou via gcc
I want to add a new target, arm-none-symbianelf. I have an sh script that does
the magic. You can download it here - https://github.com/fedor4ever/GCC-4-Symbian
Fiodar Stryzhniou



Prerequisites and using autoconf-2.64

2017-03-15 Thread Martin Liška

Hello.

We declare that one should use autoconf --version == 2.64. I have a package
that provides an autoconf-2.64 binary, and I would like to ask whether our
configure machinery can use the suffixed autoconf binary?

Thanks,
Martin


Re: Prerequisites and using autoconf-2.64

2017-03-15 Thread Andreas Schwab
On Mar 15 2017, Martin Liška wrote:

> We declare that one should use autoconf --version == 2.64. I have a package 
> that provides autoconf-2.64 binary
> and I would like to ask whether our configure machinery can use the suffixed 
> autoconf binary?

make AUTOCONF=autoconf-2.64

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: Prerequisites and using autoconf-2.64

2017-03-15 Thread Martin Liška

On 03/15/2017 02:23 PM, Andreas Schwab wrote:
> On Mar 15 2017, Martin Liška wrote:
>
>> We declare that one should use autoconf --version == 2.64. I have a package
>> that provides autoconf-2.64 binary
>> and I would like to ask whether our configure machinery can use the suffixed
>> autoconf binary?
>
> make AUTOCONF=autoconf-2.64
>
> Andreas.



Thanks!


March 2017 GNU Toolchain Update

2017-03-15 Thread Nick Clifton
Hi Guys,

  There is a lot to tell you about this time.  First up is glibc:
  
The GNU C Library version 2.25 is now available.  In this version you
will find:

  * Provisional support for ISO/IEC TR 24731-1:2010 which provides
replacement versions of many of the memory allocation functions
that include bounds checking.

  * Provisional support for ISO/IEC TR 24731-2:2010 which provides
replacement versions of many of the memory allocation functions
that use dynamically allocated memory to ensure that buffer
overflows do not occur.

  * Provisional support for *some* of the new features that are
defined in ISO/IEC TS 18661-4:2015, which is an implementation of
the ISO/IEC/IEEE 60559:2011 floating point standard.

  * Full support for ISO/IEC TS 18661-1:2014 standard which implements 
support for binary floating-point arithmetic conforming to
ISO/IEC/IEEE 60559:2011.  This does not cover decimal
floating-point arithmetic, nor does it cover many of the optional
features of IEC 60559.

  * Glibc can now be built with the stack smashing protector enabled.
This means that distributions can build glibc with 
--enable-stack-protector=strong defined and expect it to work. 

  * The function explicit_bzero, from OpenBSD, has been added to glibc.
It is intended to be used instead of memset() to erase sensitive
data after use; the compiler will not optimize out calls to
explicit_bzero even if they are "unnecessary" (in the sense that
no _correct_ program can observe the effects of the memory clear).

  * On ColdFire, MicroBlaze, Nios II and SH3, the float_t type is now
defined to float instead of double.  This does not affect the ABI of
any libraries that are part of the GNU C Library, but may affect the
ABI of other libraries that use this type in their interfaces.

  * On x86_64, when compiling with -mfpmath=387 or -mfpmath=sse+387,
the float_t and double_t types are now defined to long double
instead of float and double.  These options are not the default,
and this does not affect the ABI of any libraries that are part of
the GNU C Library, but it may affect the ABI of other libraries
that use this type in their interfaces, if they are compiled or
used with those options.

  * The getentropy and getrandom functions, and the <sys/random.h>
header file have been added.

  * The buffer size for byte-oriented stdio streams is now limited to
8192 bytes by default.  Previously, on Linux, the default buffer
size on most file systems was 4096 bytes (and thus remains
unchanged), except on network file systems, where the buffer size
was unpredictable and could be as large as several megabytes.

  * GDB pretty printers have been added for mutex and condition
variable structures in POSIX Threads. When installed and loaded in
gdb these pretty printers show various pthread variables in
human-readable form when read using the 'print' or 'display'
commands in gdb.

  * A tunables feature has been added to allow tweaking of the runtime
for an application program.  This feature can be enabled with the
'--enable-tunables' configure flag when glibc is built.  The GNU C
Library manual has details on usage and the README.tunables file has
instructions on adding new tunables to the library.
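
As a rough illustration of the explicit_bzero item in the list above, a
minimal sketch of wiping a buffer after use might look like this. The helper
checksum_and_wipe is hypothetical, and the sketch assumes glibc 2.25 or later,
where explicit_bzero is declared in <string.h>.

```c
#define _DEFAULT_SOURCE /* assumed sufficient to expose explicit_bzero on glibc */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: "use" a secret by summing its bytes, then erase it.
   Unlike a trailing memset(), the explicit_bzero() call is not optimized
   away even though the buffer is never read again here. */
unsigned checksum_and_wipe(char *secret, size_t len)
{
    unsigned sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += (unsigned char)secret[i];
    explicit_bzero(secret, len);
    return sum;
}
```

After the call the caller's buffer holds only zero bytes, while the returned
checksum is unaffected.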

Meanwhile in the mainline development sources of glibc you can also
find these new features:

  * Unicode 9.0.0 Support: Character encoding, character type info,
and transliteration tables are all updated to Unicode 9.0.0.

  * The rpcgen, librpcsvc and related headers will only be built and
installed when glibc is configured with --enable-obsolete-rpc.
This allows alternative RPC implementations, like TIRPC, to be
used by default.  Applications needing features missing from TIRPC
should consider the rpcsvc-proto project developed by Thorsten
Kukuk (SUSE).


GDB 7.12.1 has been released.  This is a point release over 7.12
and so it only contains bug-fixes rather than new features, but if you
have been using 7.12 then it is worth upgrading to 7.12.1.

Version 5 of the DWARF debug standard has now been officially
released.  This has improvements in many areas, such as: better data
compression, separation of debugging data from executable files,
improved description of macros and source files, faster searching for
symbols, improved debugging of optimized code, as well as numerous
improvements in functionality and performance.

If you want to read the new standard it is here:
  http://dwarfstd.org/doc/DWARF5.pdf

Meanwhile in the development sources:

  * GDB can now access the PKU register on x86_64 GNU/Linux. The
register is added by the Memory Protection Keys for Userspace
feature which will be available in future Intel CPUs.

  * Python scripts can now start, stop and access a running btrace
recording.

  * GDB now supports recording and replaying rdrand and rdseed Intel 

Re: Obsolete powerpc*-*-*spe*

2017-03-15 Thread Segher Boessenkool
Hi all,

On Wed, Mar 15, 2017 at 11:01:31AM +0100, Olivier Hainque wrote:
> > On Mar 13, 2017, at 19:01 , Andrew Jenner  wrote:
> > I volunteer to be the point of contact for the SPE port.
> > 
> > Over here at CodeSourcery/Mentor Embedded, we have a strong interest in SPE 
> > *not* being deprecated (we actively ship toolchain products with SPE 
> > multilibs, and have customers for which these are important). We are 
> > therefore volunteering resources (specifically, me) to maintain SPE 
> > upstream as well.
> > 
> > I am in the process of developing some patches to add VLE support upstream 
> > (and expect to be maintainer of those once they are committed) so it would 
> > be a good fit for me to be the SPE maintainer as well.
> > 
> > We have been regularly running tests on the SPE multilibs (on our internal 
> > branches) and they are in better shape than the test results Segher found 
> > from 2015. We may have some (not yet upstreamed) patches that improve the 
> > test results - I will be tracking these down and upstreaming them ASAP. I 
> > will be expanding our regular build and test runs to cover trunk as well, 
> > and will send test results to gcc-testsuite and report regressions.
> > 
> > If there is no objection, I will submit patches tomorrow to un-obsolete SPE 
> > and add myself to the appropriate section of the MAINTAINERS file. The 
> > other changes will come once stage 1 opens.
> 
> Thanks for volunteering!
> 
> As mentioned upthread, we (AdaCore) also have a significant user base,
> so a strong interest in the port remaining alive and we'll be happy to
> keep submitting patches we might have.
> 
> The prospect of seeing VLE support come in is great news :)

It is good to hear there are some parties who promise to do some actual
work for this :-)

I do not think VLE can get in, not in its current shape at least.  VLE
is very unlike PowerPC in many ways so it comes at a very big cost to
the port (maintenance and otherwise -- maintenance is what I care about
most).

Since SPE and VLE only share the part of the rs6000 port that doesn't
change at all (except for a bug fix once or twice a year), and everything
else needs special cases all over the place, it seems to me it would be
best for everyone if we split the rs6000 port in two, one for SPE and VLE
and one for the rest.  Both ports could then be very significantly
simplified.

I am assuming SPE and VLE do not support AltiVec or 64-bit PowerPC,
please correct me if that is incorrect.  Also, is "normal" floating
point supported at all?

Do you (AdaCore and Mentor) think splitting the port is a good idea?


Segher


Re: [RFC] Support register groups in inline asm

2017-03-15 Thread Andrew Senkevich
2016-12-05 16:31 GMT+01:00 Andrew Senkevich :
> 2016-11-16 8:02 GMT+03:00 Andrew Pinski :
>> On Tue, Nov 15, 2016 at 9:36 AM, Andrew Senkevich
>>  wrote:
>>> Hi,
>>>
>>> new Intel instructions AVX512_4FMAPS and AVX512_4VNNIW introduce use
>>> of register groups.
>>>
>>> Supporting the register groups feature in inline asm needs some extension
>>> with new constraints.
>>>
>>> Current proposal is the following syntax:
>>>
>>> __asm__ ("SMTH %[group], %[single]" :
>>>          [single] "+x"(v0) :
>>>          [group] "Yg4"(v1), "1+1"(v2), "1+2"(v3), "1+3"(v4));
>>>
>>> where "YgN" constraint specifies group of N consecutive registers
>>> (which is started from register having number as "0 mod
>>> 2^ceil(log2(N))"),
>>> and "1+K" specifies the next registers in the group.
>>>
>>> Is this syntax ok? How to implement it?
>>
>>
>> Have you looked into how the AARCH64 back-end handles this via OI, etc.?
>> Like:
>> /* Oct Int: 256-bit integer mode needed for 32-byte vector arguments.  */
>> INT_MODE (OI, 32);
>>
>> /* Opaque integer modes for 3 or 4 Neon q-registers / 6 or 8 Neon d-registers
>>    (2 d-regs = 1 q-reg = TImode).  */
>> INT_MODE (CI, 48);
>> INT_MODE (XI, 64);
>>
>>
>> And then it implements the TARGET_ARRAY_MODE_SUPPORTED_P target hook?
>> And the x2 types are defined as a struct of an array like:
>> typedef struct int8x8x2_t
>> {
>>   int8x8_t val[2];
>> } int8x8x2_t;
>
> Thanks!
>
> We have to update proposal with changing "+" symbol to "#" specifying
> offset in a group (to avoid overloading the other meaning of “+”
> specifying that operand is both input and output).
>
> So current proposal of syntax is:
>
> __asm__ ("INSTR %[group], %[single]" :
>          [single] "+x"(v0) :
>          [group] "Yg4"(v1), "1#1"(v2), "1#2"(v3), "1#3"(v4));
>
> where "YgN" constraint specifies group of N consecutive registers
> (which is started from register having number as "0 mod 2^ceil(log2(N))"),
> and "1#K" specifies the next registers in the group.
>
> Some other questions or comments?
>
> What about consensus on this syntax?

Hi Richard!

Can we have agreement on this syntax, what do you think?
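
To make the alignment rule concrete, here is a small C sketch of how the
starting register of a "YgN" group and the register selected by "1#K" could
be computed. This is only one reading of the rule quoted above; the helper
names are hypothetical and none of this is existing GCC code.

```c
#include <assert.h>

/* 2^ceil(log2(n)): the smallest power of two that is >= n. */
unsigned group_alignment(unsigned n)
{
    assert(n > 0);
    unsigned align = 1;
    while (align < n)
        align <<= 1;
    return align;
}

/* Non-zero if register number reg may start a group of n consecutive
   registers, i.e. reg is 0 mod 2^ceil(log2(n)). */
int is_valid_group_start(unsigned reg, unsigned n)
{
    return reg % group_alignment(n) == 0;
}

/* Register number selected by a "1#K" operand, assuming it means
   "the K-th register after the start of the group". */
unsigned group_member(unsigned start, unsigned k)
{
    return start + k;
}
```

Under this reading, a "Yg4" group may start at register 0, 4, 8 and so on,
and a group of 3 registers is aligned the same way as a group of 4.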


--
WBR,
Andrew


Re: Diagnostics that should not be translated

2017-03-15 Thread Roland Illig
On 15.03.2017 at 03:43, Martin Sebor wrote:
> Would using the existing internal_error{,no_backtrace}, and
> sorry work for this? (I.e., not translating those.)  If my
> count is right there are nearly 500 calls to these three in
> GCC sources so I'm not sure that would put enough of a dent
> in the 12K messages to translate but I'm even less sure that
> adding yet another API would do even that much.

In relative terms the 500 may seem like not so much, but in absolute
terms they are still worth 1 to 3 days of translating work. Especially
since many of the terms in the internal errors are not as carefully
worded as the diagnostics targeted at the GCC user, and they contain
lots of technical terms for which there is no obvious translation.

For the German translation I took the easy path of making the German
internal errors exactly the same as the English ones, so whether this is
addressed or not won't make a difference for the upcoming release. It's
just that I think the other translators shouldn't need to go through the
same steps as I did.

Regards,
Roland


Re: Obsolete powerpc*-*-*spe*

2017-03-15 Thread Olivier Hainque

> On Mar 15, 2017, at 15:26 , Segher Boessenkool  
> wrote:

> I do not think VLE can get in, not in its current shape at least.  VLE
> is very unlike PowerPC in many ways so it comes at a very big cost to
> the port (maintenance and otherwise -- maintenance is what I care about
> most).
> 
> Since SPE and VLE only share the part of the rs6000 port that doesn't
> change at all (except for a bug fix once or twice a year), and everything
> else needs special cases all over the place, it seems to me it would be
> best for everyone if we split the rs6000 port in two, one for SPE and VLE
> and one for the rest.  Both ports could then be very significantly
> simplified.
> 
> I am assuming SPE and VLE do not support AltiVec or 64-bit PowerPC,
> please correct me if that is incorrect.  Also, is "normal" floating
> point supported at all?
> 
> Do you (AdaCore and Mentor) think splitting the port is a good idea?

That's actually an option we considered.

We haven't gone very far in studying what this would entail and were still
unclear on how much of a clean separation we could get without risking the
introduction of (too much) instability.

It seemed like avoiding code duplication (that would otherwise be a maintenance
issue) might require refactoring in sensitive areas, e.g. prologue/epilogue
expansion, but the prospect of getting two variants that are simpler to grasp
on top of common code definitely sounds attractive and worth some effort.

Olivier



Re: Diagnostics that should not be translated

2017-03-15 Thread Martin Sebor

On 03/15/2017 10:07 AM, Roland Illig wrote:
> On 15.03.2017 at 03:43, Martin Sebor wrote:
>> Would using the existing internal_error{,no_backtrace}, and
>> sorry work for this? (I.e., not translating those.)  If my
>> count is right there are nearly 500 calls to these three in
>> GCC sources so I'm not sure that would put enough of a dent
>> in the 12K messages to translate but I'm even less sure that
>> adding yet another API would do even that much.
>
> In relative terms the 500 may seem like not so much, but in absolute
> terms they are still worth 1 to 3 days of translating work. Especially
> since many of the terms in the internal errors are not as carefully
> worded as the diagnostics targeted at the GCC user, and they contain
> lots of technical terms for which there is no obvious translation.
>
> For the German translation I took the easy path of making the German
> internal errors exactly the same as the English ones, so whether this is
> addressed or not won't make a difference for the upcoming release. It's
> just that I think the other translators shouldn't need to go through the
> same steps as I did.


I would suggest opening a request in Bugzilla then and explaining
that internal_error{,no_backtrace} strings don't need to be
translated.  (From Richard's reply it sounds like the "sorry"
ones still do).

Personally, I think it's less work for everyone not to have to
worry about translating these so it seems like a win-win.  It
would be helpful to put in place some sort of a checker to catch
some of these problems (or unnecessary annotation if there's
consensus about your proposal) early, during development.

Since there have been a number of general suggestions recently
to improve how GCC deals with internationalization it might
also be helpful to summarize those that end up adopted on
the GCC Diagnostic Guidelines Wiki:

  https://gcc.gnu.org/wiki/DiagnosticsGuidelines

Martin


Re: Obsolete powerpc*-*-*spe*

2017-03-15 Thread Sandra Loosemore

On 03/15/2017 08:26 AM, Segher Boessenkool wrote:
> Since SPE and VLE only share the part of the rs6000 port that doesn't
> change at all (except for a bug fix once or twice a year), and everything
> else needs special cases all over the place, it seems to me it would be
> best for everyone if we split the rs6000 port in two, one for SPE and VLE
> and one for the rest.  Both ports could then be very significantly
> simplified.
>
> I am assuming SPE and VLE do not support AltiVec or 64-bit PowerPC,
> please correct me if that is incorrect.  Also, is "normal" floating
> point supported at all?
>
> Do you (AdaCore and Mentor) think splitting the port is a good idea?


I can't speak on behalf of Mentor, and Andrew is the target expert here, 
but we currently do ship all of SPE, VLE, and AltiVec multilibs in the 
same powerpc-eabi toolchain.  Specifically, the list is


default (603e, e300c3, G2, etc)
-te500v1
-te500v2
-te500mc
-te600
-te200z0
-te200z3
-te200z4

plus some soft-float variants, etc.  Splitting these up into multiple 
toolchains that have to be statically configured for a particular 
architecture wouldn't be zero-cost either for us, other groups in Mentor 
that repackage our toolchains, or our end users.


I'm wondering whether the code in the rs6000 backend could be refactored 
to better abstract and separate the complicated processor-dependent bits?


-Sandra



Re: Obsolete powerpc*-*-*spe*

2017-03-15 Thread Segher Boessenkool
On Wed, Mar 15, 2017 at 11:12:53AM -0600, Sandra Loosemore wrote:
> >Do you (AdaCore and Mentor) think splitting the port is a good idea?
> 
> I can't speak on behalf of Mentor, and Andrew is the target expert here, 
> but we currently do ship all of SPE, VLE, and AltiVec multilibs in the 
> same powerpc-eabi toolchain.  Specifically, the list is
> 
> default (603e, e300c3, G2, etc)
> -te500v1
> -te500v2

These two are SPE.

> -te500mc
> -te600

These two are "classic" PowerPC.

> -te200z0
> -te200z3
> -te200z4

These are VLE?  Do some of those also support PowerPC?

> plus some soft-float variants, etc.  Splitting these up into multiple 
> toolchains that have to be statically configured for a particular 
> architecture wouldn't be zero-cost either for us, other groups in Mentor 
> that repackage our toolchains, or our end users.

SPE *always* has to be statically configured for; you do not get support
for the SPE ABIs unless you specifically configure for it.

> I'm wondering whether the code in the rs6000 backend could be refactored 
> to better abstract and separate the complicated processor-dependent bits?

All the abstraction and indirection is part of what makes things so
complex :-(

Things that are complicated are for example the things that touch the
code in many places.  Like, vector types, register classes, output
modifiers, ABIs.  All of which are all over the machine description
and hooks.


Segher


Re: Obsolete powerpc*-*-*spe*

2017-03-15 Thread David Edelsohn
On Wed, Mar 15, 2017 at 1:12 PM, Sandra Loosemore
 wrote:
> On 03/15/2017 08:26 AM, Segher Boessenkool wrote:
>
>> Since SPE and VLE only share the part of the rs6000 port that doesn't
>> change at all (except for a bug fix once or twice a year), and everything
>> else needs special cases all over the place, it seems to me it would be
>> best for everyone if we split the rs6000 port in two, one for SPE and VLE
>> and one for the rest.  Both ports could then be very significantly
>> simplified.
>>
>> I am assuming SPE and VLE do not support AltiVec or 64-bit PowerPC,
>> please correct me if that is incorrect.  Also, is "normal" floating
>> point supported at all?
>>
>> Do you (AdaCore and Mentor) think splitting the port is a good idea?
>
>
> I can't speak on behalf of Mentor, and Andrew is the target expert here, but
> we currently do ship all of SPE, VLE, and AltiVec multilibs in the same
> powerpc-eabi toolchain.  Specifically, the list is
>
> default (603e, e300c3, G2, etc)
> -te500v1
> -te500v2
> -te500mc
> -te600
> -te200z0
> -te200z3
> -te200z4
>
> plus some soft-float variants, etc.  Splitting these up into multiple
> toolchains that have to be statically configured for a particular
> architecture wouldn't be zero-cost either for us, other groups in Mentor
> that repackage our toolchains, or our end users.
>
> I'm wondering whether the code in the rs6000 backend could be refactored to
> better abstract and separate the complicated processor-dependent bits?

Sandra,

PowerPC, SPE and VLE are, to a large extent, different ISAs that share
some instruction mnemonics.  Supporting them together requires overloading
basic, fundamental patterns in the GCC machine description.  Regardless of
the way in which it is abstracted and refactored, it is going to be
complicated and difficult to maintain.

VLE is not going to be merged into the current rs6000 port of GCC.  If
AdaCore, Mentor and its customers want that functionality in FSF GCC,
they are going to have to take on the burden of a separate port.

I realize that splitting the port is not zero cost for Mentor, but
currently the vast majority of the maintenance cost is falling to the
IBM GCC Toolchain team.  That is not equitable and no longer
sustainable.  IBM can't shoulder the burden of lowering the
development expense for other vendors.

Thanks, David


Re: Obsolete powerpc*-*-*spe*

2017-03-15 Thread Andrew Jenner

On 15/03/2017 14:26, Segher Boessenkool wrote:
> I do not think VLE can get in, not in its current shape at least.


That's unfortunate. Disregarding the SPE splitting plan for a moment, 
what do you think would need to be done to get it into shape? I had 
thought we were almost there with the patches that I sent to you and 
David off-list last year.


> VLE is very unlike PowerPC in many ways so it comes at a very big cost to
> the port (maintenance and otherwise -- maintenance is what I care about
> most).


I completely understand.


> Since SPE and VLE only share the part of the rs6000 port that doesn't
> change at all (except for a bug fix once or twice a year), and everything
> else needs special cases all over the place, it seems to me it would be
> best for everyone if we split the rs6000 port in two, one for SPE and VLE
> and one for the rest.  Both ports could then be very significantly
> simplified.
>
> I am assuming SPE and VLE do not support AltiVec or 64-bit PowerPC,
> please correct me if that is incorrect.  Also, is "normal" floating
> point supported at all?


My understanding is that SPE is only present in the e500v1, e500v2 and 
e200z[3-7] cores, all of which are 32-bit only and do not have classic 
floating-point units. SPE and Altivec cannot coexist as they have some 
overlapping instruction encodings. The successor to e500v2 (e500mc) 
reinstated classic floating-point and got rid of SPE.



> Do you (AdaCore and Mentor) think splitting the port is a good idea?


It wouldn't have been my preference, but I can understand the appeal of 
that plan for you. I'm surprised that the amount of shared code between 
SPE and PowerPC is as little as you say, but you have much more 
experience with the PowerPC port than I do, so I'll defer to your 
expertise on that matter.


Are you proposing to take on the task of actually splitting it yourself? 
If so, that would make me a lot happier about it.



>> -te200z0
>> -te200z3
>> -te200z4
>
> These are VLE?

Yes.

> Do some of those also support PowerPC?

All the e200 cores apart from e200z0 can execute 32-bit instructions as 
well as VLE, though we'll always generate VLE code when targeting them 
(otherwise they're fairly standard).


Andrew