ARM Multilibs with --with-mode=thumb

2013-05-08 Thread gnubie gnubie
Hi,
I've noticed odd behaviour when building an ARM compiler with GCC 4.7,
--with-mode=thumb and multilibs enabled.

If I do a standard C/C++ newlib build with the following multilib options:
MULTILIB_OPTIONS += marm mthumb
MULTILIB_DIRNAMES+= arm thumb

we get the following static libs:
./arm-none-eabi/lib/libssp_nonshared.a
./arm-none-eabi/lib/libcc.a
./arm-none-eabi/lib/libc.a
./arm-none-eabi/lib/libsupc++.a
./arm-none-eabi/lib/libnosys.a
./arm-none-eabi/lib/libstdc++.a
./arm-none-eabi/lib/libm.a
./arm-none-eabi/lib/thumb/libssp_nonshared.a
./arm-none-eabi/lib/thumb/libcc.a
./arm-none-eabi/lib/thumb/libc.a
./arm-none-eabi/lib/thumb/libsupc++.a
./arm-none-eabi/lib/thumb/libnosys.a
./arm-none-eabi/lib/thumb/libstdc++.a
./arm-none-eabi/lib/thumb/libm.a
./arm-none-eabi/lib/thumb/libssp.a
./arm-none-eabi/lib/thumb/libg.a
./arm-none-eabi/lib/libssp.a
./arm-none-eabi/lib/libg.a
./lib/gcc/arm-none-eabi/4.7.3/libgcc.a
./lib/gcc/arm-none-eabi/4.7.3/libgcov.a
./lib/gcc/arm-none-eabi/4.7.3/thumb/libgcc.a
./lib/gcc/arm-none-eabi/4.7.3/thumb/libgcov.a
./lib/libarm-none-eabi-sim.a
./lib/libiberty.a

That's all great.  Now, if I enable thumb mode as the default with
--with-mode=thumb, I get the following libs:
./arm-none-eabi/lib/libcc.a
./arm-none-eabi/lib/libc.a
./arm-none-eabi/lib/libnosys.a
./arm-none-eabi/lib/libm.a
./arm-none-eabi/lib/thumb/libssp_nonshared.a
./arm-none-eabi/lib/thumb/libcc.a
./arm-none-eabi/lib/thumb/libc.a
./arm-none-eabi/lib/thumb/libsupc++.a
./arm-none-eabi/lib/thumb/libnosys.a
./arm-none-eabi/lib/thumb/libstdc++.a
./arm-none-eabi/lib/thumb/libm.a
./arm-none-eabi/lib/thumb/libssp.a
./arm-none-eabi/lib/thumb/libg.a
./arm-none-eabi/lib/libg.a
./lib/gcc/arm-none-eabi/4.7.3/libgcc.a
./lib/gcc/arm-none-eabi/4.7.3/libgcov.a
./lib/gcc/arm-none-eabi/4.7.3/thumb/libgcc.a
./lib/gcc/arm-none-eabi/4.7.3/thumb/libgcov.a
./lib/libarm-none-eabi-sim.a
./lib/libiberty.a

As you can see, we've lost several of the arm libs: libssp,
libssp_nonshared, libstdc++ and libsupc++.

I haven't tried 4.8 yet, but I can't see any bug reports to suggest
anything has changed.

What am I missing here?

Thanks,
Carlos


RE: BImode and STORE_VALUE_FLAG

2013-05-08 Thread Paulo Matos

> -Original Message-
> From: Mikael Pettersson [mailto:mi...@it.uu.se]
> Sent: 04 May 2013 11:51
> To: Paulo Matos
> Cc: gcc@gcc.gnu.org
> Subject: Re: BImode and STORE_VALUE_FLAG
> 
> I can't comment on the code in question, but the backend for m68k may be
> affected
> since it defines STORE_FLAG_VALUE as -1.  Do you have a testcase that would
> cause
> wrong code, or a patch to cure the issue?  I'd be happy to do some testing on
> m68k-linux.
>

Mikael,

Still related to this issue, I think I found a bug that affects m68k due to the 
use of STORE_FLAG_VALUE != 1.

Try the following example (this is a trimmed down version of vector-compare-1.c 
from gcc testsuite):

int main (int argc, char *argv[])
{
  int ires;
  volatile int i0 = 2;
  volatile int i1 = 2;

  ires = (i0 >= i1);

  if (ires != (i0 >= i1 ? -1 : 0))
    {
      __builtin_printf ("%i != ((%i >= %i ? -1 : 0) ", ires, i0, i1);
      return 1;
    }

  return 0;
}

I haven't tried to run it on m68k-linux since I don't have binutils-m68k 
installed, but I assume it will print something like:
-1 != ((2 >= 2 ? -1 : 0)

and return exit code 1.

I did run m68k cc1 (gcc-4.7.3) and dumped logs and found the problem (which 
matches what I am seeing with my port).
We get to vrp1 with:
  D.1392_5 = i0.0D.1390_3 >= i1.1D.1391_4;
  iresD.1386_6 = (intD.1) D.1392_5;
  # VUSE <.MEMD.1405_18>
  i0.3D.1394_7 ={v} i0D.1387;
  # VUSE <.MEMD.1405_18>
  i1.4D.1395_8 ={v} i1D.1388;
  if (i0.3D.1394_7 >= i1.4D.1395_8)
    goto <bb 4>;
  else
    goto <bb 3>;
  # SUCC: 4 [50.0%]  (true,exec) 3 [50.0%]  (false,exec)

  # BLOCK 3 freq:5000
  # PRED: 2 [50.0%]  (false,exec)
  # SUCC: 4 [100.0%]  (fallthru,exec)

  # BLOCK 4 freq:1
  # PRED: 2 [50.0%]  (true,exec) 3 [100.0%]  (fallthru,exec)
  # iftmp.2D.1393_1 = PHI <-1(2), 0(3)>
  if (iftmp.2D.1393_1 != iresD.1386_6)
    goto <bb 5>;
  else
    goto <bb 6>;
  # SUCC: 5 [62.2%]  (true,exec) 6 [37.8%]  (false,exec)

The important bits are:
  D.1392_5 = i0.0D.1390_3 >= i1.1D.1391_4;
  iresD.1386_6 = (intD.1) D.1392_5;
...
  # iftmp.2D.1393_1 = PHI <-1(2), 0(3)>
  if (iftmp.2D.1393_1 != iresD.1386_6)
    goto <bb 5>;
  else
    goto <bb 6>;

vrp1 will then proceed to find the ranges for D.1392_5 = i0.0D.1390_3 >= 
i1.1D.1391_4;
Since this is a comparison, set_value_range_to_truthvalue is called and 
returns the range [0, 1].
Then vrp1 simplifies the PHI node to iftmp.2D.1393_1 = PHI <0> since -1 
is not within the range.

From here on, a couple of simplifications break the remainder of the 
cgraph and generate incorrect code.

Can you reproduce this?

Cheers,

Paulo Matos


Re: BImode and STORE_VALUE_FLAG

2013-05-08 Thread Andreas Schwab
"Paulo Matos"  writes:

> I haven't tried to run it in m68k-linux since I don't have binutils-m68k 
> installed but I assume it will print something like:
> -1 != ((2 >= 2 ? -1 : 0)
>
> and return exit code 1.

I'm getting "1 != ((2 >= 2 ? -1 : 0)" with 4.7.3.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: How am I supposed to verify gcc-4.8.0 download when you provide no .sig file?...

2013-05-08 Thread Larry Evans
On 04/29/13 19:35, Scott Baldwin wrote:
> I was able to verify it with the .sig from gnu.org ftp, along with the info
> at http://ftp.gnu.org/ about where to obtain the gnu-keyring.gpg file.
> 
> A suggestion... In addition to making sure the .sig is copied to your
> mirrors, I recommend including the gnu-keyring.gpg info (from
> http://ftp.gnu.org) at http://gcc.gnu.org/mirrors.html instead of just
> saying "The archives on these mirrors will be signed by one of the following
> GnuPG keys: ..." and listing the fingerprints (but not providing the actual
> keys).
> 
> One more thing... 4.8.0 was signed with an expired key:
> 
>   $ gpg --verify --keyring ./gnu-keyring.gpg ./gcc-4.8.0.tar.gz.sig
>   gpg: Signature made Fri 22 Mar 2013 08:32:29 AM CDT using DSA key ID
> C3C45C06
>   gpg: Good signature from "Jakub Jelinek "
>   gpg: Note: This key has expired!
>   Primary key fingerprint: 33C2 35A3 4C46 AA3F FB29  3709 A328 C3A2
> C3C4 5C06
> 
[snip]
Using the following files:

http://open-source-box.org/gcc/gcc-4.8.0/gcc-4.8.0.tar.bz2
http://open-source-box.org/gcc/gcc-4.8.0/gcc-4.8.0.tar.bz2.sig
http://ftp.gnu.org/gnu/gnu-keyring.gpg

the verification command and result are:

~/download/gcc/4.8 $ gpg --verify --keyring ./gnu-keyring.gpg
./gcc-4.8.0.tar.bz2.sig
gpg: Signature made Fri Mar 22 08:32:18 2013 CDT using DSA key ID C3C45C06
gpg: Good signature from "Jakub Jelinek "
gpg: WARNING: This key is not certified with a trusted signature!
gpg:  There is no indication that the signature belongs to the
owner.
Primary key fingerprint: 33C2 35A3 4C46 AA3F FB29  3709 A328 C3A2 C3C4 5C06
~/download/gcc/4.8 $

Should I be worried about the gpg: WARNING?

TIA.
-Larry





Re: OpenACC support in 4.9

2013-05-08 Thread Torvald Riegel
On Tue, 2013-05-07 at 13:00 +0400, Evgeny Gavrin wrote:
> Hi, all!
> 
>  > Which accelerators do you intent to handle? "Accelerator" is a rather
>  > broad term, covering DSPs, GPUs, Intel's MIC, ...
> The idea is to emit OpenCL from high-GIMPLE, for now. So, any device 
> that has OpenCL support can be utilized by ACC.
> Maybe, we'll be able to reuse some parts from graphite/opengpu projects, 
> but this is not clear for now.

I don't disagree that this could be useful for a proof-of-concept, but
I'm wondering whether this is really useful to our users in the long
term.  We don't have any OpenCL implementation in GCC, so if we'd use
OpenCL below OpenACC, we'd bring in dependencies to the OpenCL
implementation, and GCC's OpenACC support would be like a
close-to-frontend translation layer to an OpenCL implementation (with
probably a glue library component too).

Also, if the representation that we eventually want to have in GCC for
accelerators isn't quite like OpenCL, then we'd be translating OpenACC
to this representation back to OpenCL.

Perhaps the HSA efforts that Richard mentioned could be a useful
lower-level target too?  Samsung is listed as an HSA founding member;
are you involved with HSA?

>  > Is there a specific reason for targeting 1.0 instead of 2.0 (besides 2.0
>  > still being declared as a draft)?
> You've named the main reason why we're targeting OpenACC1 - it's stable 
> and it's a good starting point for the initial implementation. BTW, 
> OpenACC2 does not differ much from the previous version; the major 
> improvements cover only the runtime library.

True, most of the differences seem to be in the Data API and such.
However, when I last looked at the list of proposed changes, there was a
proposal to allow calling non-inlined code.  It seems that this
would need to be part of general accelerator support (unless you'd want
to *require* inlining and LTO).  Vectorization features such as Cilk
also have something similar ("elemental functions"), so there seems to
be a need for this.

BTW, what's the state of OpenACC in general?  OpenACC 1.0 has been
released, but I had a few open questions after reading it a while ago.
Is 1.0 supposed to be a first draft, or indeed something that's expected
to be stable and useful for a long time?

>  > - but I don't think anyone will work on OpenMP 4.0's 'target' feature
>  > soon, as enough work on the non-'target' features remains.
> OpenMP's 'target' is definitely inspired by OpenACC. So, I think it'll 
> be possible to reuse/share most of the BE part of the OpenACC 
> implementation, once it's finished.

I'd agree with Jeff regarding this aspect: Even if we can't get a single
language-level standard for this in practice, we should try hard to have
somewhat compatible semantics in the different standards, and have an
internal representation in GCC that can represent those semantics.  It
seems unlikely that we can or want to support several incompatible
semantics, or have several different sets of middle-end IRs for
accelerators.  Thus, for GCC, I believe that the semantics that we want
to support are an important open question; those will also be affected
by the targets that we want to support (whether hardware or sth like
OpenCL).

Torvald



Re: OpenACC support in 4.9

2013-05-08 Thread Torvald Riegel
On Tue, 2013-05-07 at 17:34 +0200, Jakub Jelinek wrote:
> On Tue, May 07, 2013 at 11:02:08AM +0200, Tobias Burnus wrote:
> > Richard Biener wrote:
> > >We're going to look at supporting HSA from GCC (which would make
> > >it more or less trivial to also target openCL I think)
> > 
> > For the friends of link-time optimization (LTO):
> > 
> > Unless I missed some fine point in OpenACC and OpenMP's target, they
> > only work with directives which are locally visible. Thus, if one
> > does a function call in the device/target section, it can only be
> > placed on the accelerator if the function can be inlined.
> 
> No, OpenMP 4.0 has
> #pragma omp declare target
> ...
> #pragma omp end declare target
> where you can define/declare functions and variables in that ... and those
> are all marked for cloning for the target device (guess parsing of
> the above construct is going to just add "omp declare target" attribute
> to all those variables/functions and we'd then just clone the functions
> and map the variables into the target code).

Additional examples of such "special" functions are (1) the OpenMP /
Cilk+ SIMD functions (aka "elemental functions"), for which programmers
make assertions such as that they can do with weaker forward progress
guarantees (eg, no locks in SIMD code), and (2) transaction-safe
functions for TM (but there the programmer doesn't make an assertion,
but a requirement).  Both might or might not need special code being
generated.  OpenACC 2.0 also proposes a similar feature (but the
description didn't seem like a finished spec back when I read it).

So, this isn't just about accelerators.


Torvald



Re: OpenACC support in 4.9

2013-05-08 Thread Torvald Riegel
On Tue, 2013-05-07 at 10:27 +0200, Richard Biener wrote:
> On Mon, May 6, 2013 at 5:16 PM, Jeff Law  wrote:
> > On 05/06/2013 07:41 AM, Tobias Burnus wrote:
> >>
> >> Evgeny Gavrin wrote:
> >>>
> >>> What do you think about support of OpenACC 1.0
> >>> (http://www.openacc-standard.org/) in gcc?
> >>
> >>
> >> I like the idea - though, I wonder whether OpenMP 4.0's "target"* would
> >> be the better choice as it looks a bit more flexible and better defined.
> >> (Conceptually, they are very similar; I think the
> >> middle-end/back-end/library part would even be the same.)
> >
> > We're certainly hoping that OpenACC & OpenMP 4 & Cilk+ can share certain
> > parts of their implementations.  We're already seeing OpenMP 4 and Cilk
> > starting to converge on some stuff.
> >
> > In a perfect world, there'd only be one standard for this stuff.  That's not
> > likely, so I'd be happy with parsing/FE kinds of things being specific to
> > each standard with everything from initial gimple generation through the
> > optimizers being shared.  That may not ultimately be possible, but I think
> > that's the right way to look at the work.
> 
> We're going to look at supporting HSA from GCC

Could you elaborate on those plans?

>  (which would make it
> more or less trivial to also target openCL I think) and also hope to leverage
> parts of the GOMP infrastructure for this

Are you thinking about leveraging the compiler side of GOMP, or libgomp?
I can see reasons for the former, but I'm not sure the latter is the
best approach to support for HSA.

> (GOMP is currently the only
> way to annotate parallel regions, apart from autodetecting them).  If Cilk+
> and OpenACC provide additional ways of annotating parallel regions then
> it would be nice to have the middle-end see only a single consistent way
> of a parallel region.

I agree that having one way of annotating parallel regions or task in
code would be useful.  There's also the TM infrastructure, which isn't
about parallelism but very much about annotated regions with additional
constraints on code in the regions, etc.; so it might perhaps be useful
too.  I believe it's the latter that's important here (and HW
heterogeneity), not whether we want to execute them in parallel or not
(i.e., you don't need language constructs to support parallel
execution...).


Torvald




Re: BImode and STORE_VALUE_FLAG

2013-05-08 Thread Paulo J. Matos

On 08/05/13 14:54, Andreas Schwab wrote:


I'm getting "1 != ((2 >= 2 ? -1 : 0)" with 4.7.3.

Andreas.



As I expected. That doesn't sound good, but I am unsure what to do 
about it. I will investigate the case further tomorrow.


I expect m68k to also fail the vector-compare-1.c gcc test; is this correct?

--
PMatos



Re: OpenACC support in 4.9

2013-05-08 Thread Torvald Riegel
On Tue, 2013-05-07 at 12:46 +0200, Richard Biener wrote:
> On Tue, May 7, 2013 at 12:42 PM, Richard Biener
>  wrote:
> > On Tue, May 7, 2013 at 11:02 AM, Tobias Burnus  wrote:
> >> Richard Biener wrote:
> >>>
> >>> We're going to look at supporting HSA from GCC (which would make it more
> >>> or less trivial to also target openCL I think)
> >>
> >>
> >> For the friends of link-time optimization (LTO):
> >>
> >> Unless I missed some fine point in OpenACC and OpenMP's target, they only
> >> work with directives which are locally visible. Thus, if one does a 
> >> function
> >> call in the device/target section, it can only be placed on the accelerator
> >> if the function can be inlined.
> >>
> >> Thus, it would be useful, if LTO could be used to inline such function into
> >> device code. I know one OpenACC code which calls functions in different
> >> translation units (TU) - and the Cray compiler handles this via LTO. Thus,
> >> it would be great if the HSA/OpenMP target/OpenACC middle-end 
> >> infrastructure
> >> could do likewise, which also means deferring the error that an external
> >> function cannot be used to the middle-end/LTO FE and not placing it into 
> >> the
> >> FE. - In the mentioned code, the called function does not have any OpenACC
> >> annotation but only consists of constructs which are permitted by the
> >> accelerator - thus, no automatic code gen of accelerator code happens for
> >> that TU.
> >>
> >> (I just want to mention this to ensure that this kind of LTO/accelerator
> >> inlining is kept in mind when implementing the infrastructure for
> >> HSA/OpenACC/OpenMP target/OpenCL - even if cross-TU inlining is not
> >> supported initially.)
> >
> > In my view we'd get the "regular" OpenMP processing done during omp
> > lowering/expansion (which happens before LTO) which should mark the
> > generated worker functions appropriately.  Emitting accelerator code should
> > then happen at LTRANS time, thus after all IPA inlining took place.  The
> > interesting bits we can borrow from OMP is basically marking of functions
> > that are a) interesting, b) possible to transform.  Unmarked functions / 
> > loops
> > will have to go the autopar way, thus we have to prove via dependence 
> > analysis
> > that executing iterations in parallel is possible.
> 
> Btw, we plan to re-use the GOMP runtime as otherwise any synchronisation
> between accelerator code and regular thread code is impossible.

I can't follow this line of reasoning.  Can you elaborate?  Which kind
of synchronization are you referring to?

As far as parallel execution and resource management is concerned,
libgomp has just the kind of scheduler that you need for the OpenMP rule
set.  Work-stealing schedulers such as Cilk's are others, and might
actually become the more common approach.  And there are other thread
pools that programs might use; e.g., there's lots of discussion about
all this in ISO C++ study group 1 on parallelism and concurrency, and
several different proposals.

With that in mind, I'm wondering whether the cooperative scheduling that
we likely need should be at a lower level than libgomp or the Cilk
runtime.  Otherwise, libgomp needs to become the scheduler that runs
them all (that is, if you want it to work well when combined with other
abstractions for parallelism), and I'm not sure whether that's the right
approach.

> Which
> means changing the GOMP runtime in a way to be able to pass a descriptor
> which eventually has accelerator code (and a fallback regular function so
> you can disable accelerator usage at runtime).

It probably should be a list of different codes -- you might have more
than one suitable accelerator available.

BTW: What about putting this topic on the Cauldron agenda?  Is there
still time available to discuss what GCC might do regarding accelerators
and HW heterogeneity?


Torvald



Re: BImode and STORE_VALUE_FLAG

2013-05-08 Thread Andreas Schwab
"Paulo J. Matos"  writes:

> As I expected. That doesn't sound good

In which way is it not good?

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: BImode and STORE_VALUE_FLAG

2013-05-08 Thread Paulo J. Matos

On 08/05/13 21:29, Andreas Schwab wrote:

"Paulo J. Matos"  writes:


As I expected. That doesn't sound good


In which way is it not good?

Andreas.



Shouldn't we expect ires to be -1 (STORE_FLAG_VALUE) and therefore the 
condition of the if to be false if everything is fine?


Otherwise, if results of comparisons are 0 or 1 independently of 
STORE_FLAG_VALUE, can you explain the gcc test vector-compare-1.c 
(http://repo.or.cz/w/official-gcc.git/blob/gcc-4_7-branch:/gcc/testsuite/gcc.c-torture/execute/vector-compare-1.c)?


This compares two vectors and then checks that each of the elements obey 
the comparison. If you, like me, have no builtin vector comparisons then 
the vector comparison is exploded into the comparison of each of the 
elements and respective assignment into the comparison result vector.


Then, what worries me is:
  if ((res)[__i] != ((i0)[__i] op (i1)[__i] ? -1 : 0)) \

Can you explain the reasoning behind this?
[it just occurred to me that what I might be missing is that scalar 
comparisons return 0 or 1 and vector comparisons return a vector with 
all zeros or a vector with all -1s]


Let me add that I just noticed that this test disappeared from HEAD, so 
there might be a reason for that. I don't have a cloned repo here so I 
can't see why it was removed (or even whether it was just moved instead 
of removed).


--
PMatos



Re: BImode and STORE_VALUE_FLAG

2013-05-08 Thread Andreas Schwab
"Paulo J. Matos"  writes:

> Shouldn't we expect ires to be -1 (STORE_FLAG_VALUE)

??? Boolean expressions in C evaluate to 0/1.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: BImode and STORE_VALUE_FLAG

2013-05-08 Thread Paulo J. Matos

On 08/05/13 23:10, Andreas Schwab wrote:

"Paulo J. Matos"  writes:


Shouldn't we expect ires to be -1 (STORE_FLAG_VALUE)


??? Boolean expressions in C evaluate to 0/1.

Andreas.



Agreed; I worked too late yesterday, I am sorry.
Further to this matter, can you explain the reasoning behind 
vector-compare-1.c?