Re: [RFC] Offloading Support in libgomp

2013-08-29 Thread Michael V. Zolotukhin
> Perhaps instead of passing array of { void *hostaddr; size_t length; char kind; }
> and length we could pass 3 arrays and length (the same for all of them).
> I can see 2 advantages of doing that:
> 1) the sizes are often constant and the kinds are always constant, so
> we could often allocate those last 2 or just last array in .rodata, wouldn't
> need to initialize it dynamically
> 2) for the host fallback, we could just pass the first array unmodified as
> the .omp_target_data structure, no need to copy the host addresses
Agree with both points, very nice idea.

Michael
>   Jakub
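
A minimal sketch of the three-array layout discussed above, under stated assumptions: the names launch_target, fallback_body, MAP_TO and MAP_FROM are hypothetical and are not the libgomp interface. It only illustrates the two advantages Jakub lists: the sizes and kinds arrays can be static const (so a compiler may place them in .rodata), and the host fallback receives the host address array unmodified.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mapping kinds; the names are illustrative only. */
enum { MAP_TO = 1, MAP_FROM = 2 };

/* Host fallback body: receives the host address array unmodified. */
static void fallback_body(void **hostaddrs)
{
    const int *in = hostaddrs[0];
    int *out = hostaddrs[1];
    *out = in[0] + in[1];
}

/* Hypothetical entry point: three parallel arrays plus one shared length.
   Returns the number of bytes a device would have to copy in. */
static size_t launch_target(void (*fn)(void **), size_t n, void **hostaddrs,
                            const size_t *sizes, const unsigned char *kinds)
{
    size_t bytes_in = 0;

    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        if (kinds[i] & MAP_TO)
            bytes_in += sizes[i];

    /* No device in this sketch: run the host fallback directly. */
    fn(hostaddrs);
    return bytes_in;
}

int demo_sum(void)
{
    int in[2] = { 20, 22 };
    int out = 0;
    void *hostaddrs[2] = { in, &out };

    /* Constant per call site, so no dynamic initialization is needed. */
    static const size_t sizes[2] = { sizeof in, sizeof out };
    static const unsigned char kinds[2] = { MAP_TO, MAP_FROM };

    launch_target(fallback_body, 2, hostaddrs, sizes, kinds);
    return out;
}
```

Only hostaddrs has to be filled in at run time; with an array of structs, the constant length and kind fields would be interleaved with the addresses and rebuilt on every call.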


Re: Automated Toolchain Building and Testing

2013-08-29 Thread Jan-Benedict Glaw
On Thu, 2013-08-29 01:18:32 +, paul_kon...@dell.com wrote:
> On Aug 28, 2013, at 8:52 PM, Samuel Mi wrote:
> > On Thu, Aug 29, 2013 at 2:54 AM, Jan-Benedict Glaw wrote:
> > > On Thu, 2013-08-29 02:43:54 +0800, Samuel Mi wrote:
> > > > > ...or can you, instead of using the Java-based
> > > > > client part of Jenkins, issue all commands over a SSH (or maybe even
> > > > > Telnet...) session?  Is there a module for this available?
> > > > > If you want Jenkins running on your target systems, whether old or
> > > > > modern, then take a look at Jenkins-SSH
> > > > > (https://wiki.jenkins-ci.org/display/JENKINS/Jenkins+SSH) to control
> > > > > it remotely over SSH.
> > > This looks like a SSH connector for the Jenkins server side, no?
> > No. Actually, Jenkins implements a built-in SSH server within itself.
> > At this point, it can be considered a normal SSH server. So, you can
> > remotely access Jenkins server via SSH after setting up corresponding
> > configurations within it.
> 
> What non-antique Linux doesn't come with Python?

GCC and Binutils don't exclusively run on Linux platforms. Actually,
the Linux targets are those which are actually tested the most by
usual day-to-day usage.  I'd specifically like to run tests on
platforms that are _not_ used by large user bases, but still supported
by GCC.

Those platforms tend to be non-Linux, non-i386, non-recent, or any
combination of those. *This* is what I want to support. (It still
should also run on Linux, though.)

MfG, JBG

-- 
  Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481
Signature of:   ...und wenn Du denkst, es geht nicht mehr,
the second  :  kommt irgendwo ein Lichtlein her.




Re: Automated Toolchain Building and Testing

2013-08-29 Thread Rainer Orth
Jan-Benedict Glaw  writes:

> On Wed, 2013-08-28 23:26:29 +0800, Samuel Mi  wrote:
>> Looks like you for now have been trying to find out a solution
>> suitable for you to automatically build GCC from source combined with
>> certain continuous systems like Jenkins. As a matter of fact, Jenkins
>> is exactly a good choice to do such thing just mentioned, due to
>> itself with so many plugins[1] you can pick up to fit your needs.
>
> I'm not too sure if Jenkins is actually a good choice, just because I
> question that there's a working Java especially for old Unix-alike
> systems that GCC still (in theory) supports. What about eg. older IRIX
> or Ultrix systems?  ...or can you, instead of using the Java-based

I honestly wouldn't worry about such legacy systems: their respective
maintainers take care of testing them, and it would be hard nowadays to
even find both hardware and OS media to set up a new system.

FWIW, IRIX 6.5 was last supported in gcc 4.7, and that's the only IRIX
release I'm still testing.  IRIX 5.3/6.x was already deprecated in gcc 4.5,
a release that is no longer supported and thus irrelevant.  Same for
Tru64 UNIX: V5.1 support was deprecated in 4.7, and I'm still testing that
as well.

Hope this helps.

Rainer

-- 
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: Automated Toolchain Building and Testing

2013-08-29 Thread Jan-Benedict Glaw
On Thu, 2013-08-29 10:34:40 +0200, Rainer Orth wrote:
> Jan-Benedict Glaw  writes:
> > On Wed, 2013-08-28 23:26:29 +0800, Samuel Mi  wrote:
> > > Looks like you for now have been trying to find out a solution
> > > suitable for you to automatically build GCC from source combined with
> > > certain continuous systems like Jenkins. As a matter of fact, Jenkins
> > > is exactly a good choice to do such thing just mentioned, due to
> > > itself with so many plugins[1] you can pick up to fit your needs.
> >
> > I'm not too sure if Jenkins is actually a good choice, just because I
> > question that there's a working Java especially for old Unix-alike
> > systems that GCC still (in theory) supports. What about eg. older IRIX
> > or Ultrix systems?  ...or can you, instead of using the Java-based
> 
> I honestly wouldn't worry about such legacy systems: their respective
> maintainers take care of testing them, and it would be hard nowadays to
> even find both hardware and OS media to set up a new system.

Well, I do.

Just for example, I do care about the VAX backend. I've had something
similar to my current build robot running for a while, then it broke
(due to other circumstances) and it took me quite some time to track
down some "easter eggs" introduced in the meantime.  I'd like to
catch such things early, personally for the VAX backend, but if
possible in general.

Shouldn't be too hard to have something that dispatches commands
purely through SSH.  If there's nothing available, I'll just write it.

MfG, JBG

-- 
  Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481
Signature of:   http://www.eyrie.org/~eagle/faqs/questions.html
the second  :
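
A minimal sketch of what "dispatching commands purely through SSH" could look like, assuming only that an ssh client is installed on the controlling machine. The helper names build_ssh_command and run_remote and the host name used in the usage note are hypothetical; this is not the build robot's actual code. BatchMode=yes is a standard OpenSSH client option that makes ssh fail instead of prompting for a password.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: format an ssh invocation for one remote command.
   Returns 0 on success, -1 if the command cannot be quoted safely or the
   buffer is too small. */
int build_ssh_command(char *buf, size_t size, const char *host, const char *cmd)
{
    int n;

    assert(buf != NULL && host != NULL && cmd != NULL);

    /* Simplifying assumption: reject single quotes instead of escaping them. */
    if (strchr(cmd, '\'') != NULL)
        return -1;

    n = snprintf(buf, size, "ssh -o BatchMode=yes %s '%s'", host, cmd);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Run one command on the remote host and return whatever system() reports. */
int run_remote(const char *host, const char *cmd)
{
    char line[1024];

    if (build_ssh_command(line, sizeof line, host, cmd) != 0)
        return -1;
    return system(line);
}
```

A call such as run_remote("vax-builder", "make -k check") would then need nothing on the target beyond an SSH daemon and a shell, which is the property the thread is after.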




Re: all_ones_mask_p clarification

2013-08-29 Thread Richard Biener
On Wed, Aug 28, 2013 at 7:15 PM, Mike Stump  wrote:
> On Aug 28, 2013, at 2:40 AM, Richard Biener wrote:
>> Digging shows I at one point removed all this code - but people objected and
>> I had to revert it :/
>
> [ oh, sorry to hear ]  I got rid of it as well, and then the test suite beat
> on me til I relented.
>
>> I suppose this kind of cleanup should be done on trunk, without introducing
>> wide-int first.
>
> I expect you meant, should be done on trunk and not the branch, however it
> can be read as withholding the wide-int branch from trunk until it is fixed.
> You did mean the former, right?  If the latter I'd object.

Yes, the former.

>> Note that rejecting all unsigned masks doesn't make much sense to me,
>> so it must be some special-case in the caller that is wrong.
>
> Agreed, but, it's independent of wide-int.  When the wide-int branch goes in, 
> we will notice if trunk fixed the issue or not and will adjust our code to 
> match the then current trunk.

Fine.

Richard.


Re: [RFC] Offloading Support in libgomp

2013-08-29 Thread Richard Biener
On Wed, Aug 28, 2013 at 1:37 PM, Jakub Jelinek  wrote:
> On Wed, Aug 28, 2013 at 01:21:53PM +0200, Richard Biener wrote:
>> My thought was that we need to have control over scheduling and thus have
>> a single runtime to be able to execute the following in parallel on the
>> accelerator and the CPU:
>>
>> #pragma omp parallel
>> {
>> #pragma omp target
>>   for (;;)
>>     ...
>> #pragma omp for
>>   for (;;)
>>     ...
>> }
>> #pragma omp wait
>>
>> that is, the omp target dispatch may not block the CPU.  I can hardly
>
> OpenMP #pragma omp target blocks the host CPU until the accelerator code
> finishes.  So if the goal is to spawn some accelerator code in parallel with
> parallelized host code, you'd need to make the code more complicated.
> I guess you could
> #pragma omp parallel
> {
>   #pragma omp single
>   #pragma omp target
>   {
>     #pragma omp parallel
>     ...
>   }
>   #pragma omp for schedule(dynamic, N)
>   for (;;)
>     ...
> }
> or similar, then only one of the host parallel threads would spawn the
> target code and wait for it to be done, while the other threads in the
> meantime would do the worksharing (and the dynamic schedule would make sure
> that if the target region took a long time, then no work or almost no work
> would be scheduled for the thread executing the target region).
>
>> > In the Intel MIC case (the only thing I've looked briefly at for how the
>> > offloading works - the COI library) you can load binaries and shared
>> > libraries either from files or from host memory image, so e.g. you can
>> > embed the libgomp library, some kind of libm and some kind of libc
>> > (would that be glibc, newlib, something else?) compiled for the target
>> > into some data section inside of the plugin or something
>> > (or load it from files of course).  No idea how you do this in the
>> > HSAIL case, or PTX.
>>
>> For HSA you can do arbitrary calls to CPU code (that will then of course
>> execute on the CPU).
>
> GCC compiles into assembly or bytecode for HSAIL, right, and that then is
> further processed by some (right now proprietary?) blob.  The question is
> does this allow linking of multiple HSAIL bytecode objects/libraries, etc.
> Say you have something providing (a subset of) C library, math library,
> libgomp, then say for OpenMP one host shared library provides some
> #pragma omp declare target
> ...
> #pragma omp end declare target
> routine, and another shared library uses #pragma omp target and calls that
> routine from there.  So, I'd assume you have some HSAIL assembly/bytecode
> in each of the shared libraries, can you link that together and tell the
> runtime to execute some (named?) routine in there?

(un)fortunately the HSA runtime spec doesn't talk about the whole relocation
business so for now we end up passing all object addresses as arguments
to the HSAIL code and access everything indirectly.  For the above case it
means you have to glue everything together manually in some weird ways.
Eventually the HSA folks need to think about this of course.

Richard.

> Jakub
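
A compilable variant of the single + target pattern Jakub sketches in the quoted text, offered as an illustration rather than a recommendation. Two things here are assumptions added for the sketch: the nowait clause on single (a single construct ends with an implicit barrier unless nowait is given, so without it the other threads would wait there instead of starting the loop), and the chunk size of 16. Without OpenMP support enabled the pragmas are ignored and the code runs serially; without an accelerator the target region falls back to the host.

```c
#include <assert.h>
#include <stddef.h>

#define N 1000

/* Fills *target_sum from the target region and *host_sum from the
   worksharing loop; both add up 0 .. N-1. */
void overlap_target_and_host(long *target_sum, long *host_sum)
{
    long ts = 0, hs = 0;

    assert(target_sum != NULL && host_sum != NULL);

    #pragma omp parallel
    {
        /* One thread launches the target region and blocks on it.  nowait
           drops the barrier that otherwise ends a single construct, so the
           remaining threads go straight to the loop below. */
        #pragma omp single nowait
        {
            #pragma omp target map(tofrom: ts)
            for (int i = 0; i < N; i++)
                ts += i;
        }

        /* Dynamic scheduling hands chunks to whichever thread is free, so
           the thread stuck in the target region gets little or no work. */
        #pragma omp for schedule(dynamic, 16) reduction(+: hs)
        for (int i = 0; i < N; i++)
            hs += i;
    }

    *target_sum = ts;
    *host_sum = hs;
}
```

The implicit barrier at the end of the worksharing loop is what guarantees that the target region has finished before the results are read.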


Prototypes for builtin functions

2013-08-29 Thread Paulo Matos
Hi,

I would like to hear how other architectures organize their builtin/intrinsic 
headers.

Until recently we had a header that would look like:

/* Types */
typedef char   V8B  __attribute__ ((vector_size (8)));
...

/* Prototypes */
extern V8B __vec_put_v8b (V8B B, char C, unsigned char D);
...

The problem with this approach (I found out) is that GCC after seeing the 
prototype changes the location of the definition of the builtin from 
BUILTINS_LOCATION to the headerfile/linenumber and then when calling 
DECL_IS_BUILTIN on __vec_put_v8b tree it returns 0. This blocks a few 
optimizations (I noticed this when specifically checking why some functions 
were not being unrolled properly).

So, I commented out the prototypes from the intrinsics header and left only the 
type definitions, however, tests on intrinsics fail because if I do:
V8B put_v8b_test (V8B a, char value, char index)
{
 V8B b = __vec_put_v8b (a, value, index);
 return b;
}

GCC complains with:
error: incompatible type for argument 1 of '__vec_put_v8b'
note: expected '__vector(8) signed char' but argument is of type 'V8B'

What's the correct way to create the intrinsics header?

-- 
Paulo Matos
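
One thing the diagnostic above hints at, offered as a guess rather than a diagnosis of this port: the builtin expects a vector of signed char, while the header's V8B is a vector of plain char, and in C those element types are distinct even where char is signed. The sketch below declares the type with the element type spelled as the compiler prints it. V8SB, vec_put_v8sb and put_then_get are hypothetical names, and the inline function is only a portable stand-in for the target builtin so the example runs on any compiler with GCC's vector extension.

```c
#include <assert.h>

/* Element type spelled exactly as the diagnostic prints it. */
typedef signed char V8SB __attribute__ ((vector_size (8)));

/* In C, char and signed char are distinct types even where char is signed. */
_Static_assert (!__builtin_types_compatible_p (char, signed char),
                "char and signed char are different types");

/* Portable stand-in for the target builtin; a real intrinsics header
   would rely on the builtin itself instead. */
static inline V8SB
vec_put_v8sb (V8SB v, signed char c, unsigned char idx)
{
  assert (idx < 8);
  v[idx] = c;
  return v;
}

/* Writes one lane and reads it back. */
signed char
put_then_get (signed char value, unsigned char idx)
{
  V8SB a = { 0, 0, 0, 0, 0, 0, 0, 0 };
  V8SB b = vec_put_v8sb (a, value, idx);
  return b[idx];
}
```

Whether matching the element type is enough to drop the prototypes in this particular backend depends on how the builtin's argument types are registered, which the message does not show.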



Re: Automated Toolchain Building and Testing

2013-08-29 Thread Diego Novillo
On Thu, Aug 29, 2013 at 6:02 AM, Jan-Benedict Glaw  wrote:
> On Thu, 2013-08-29 10:34:40 +0200, Rainer Orth wrote:
>>
>> I honestly wouldn't worry about such legacy systems: their respective
>> maintainers take care of testing them, and it would be hard nowadays to
>> even find both hardware and OS media to set up a new system.
>
> Well, I do.

That's fine, but I don't think we should hold up a good solution in
the quest for the perfect one. How about we start with this version?
Whoever is interested in extending it to other systems, can do it
incrementally.

I have not yet caught up to the whole thread, but I suppose the
possibility of running it on the Compile Farm has been discussed?


Diego.


Re: Automated Toolchain Building and Testing

2013-08-29 Thread Jan-Benedict Glaw
On Thu, 2013-08-29 07:21:28 -0400, Diego Novillo  wrote:
> On Thu, Aug 29, 2013 at 6:02 AM, Jan-Benedict Glaw  wrote:
> > On Thu, 2013-08-29 10:34:40 +0200, Rainer Orth wrote:
> > > I honestly wouldn't worry about such legacy systems: their respective
> > > maintainers take care of testing them, and it would be hard nowadays to
> > > even find both hardware and OS media to set up a new system.
> >
> > Well, I do.
> 
> That's fine, but I don't think we should hold up a good solution in
> the quest for the perfect one. How about we start with this version?
> Whoever is interested in extending it to other systems, can do it
> incrementally.

It's already running :)  This thread is already about the next
version. The current variant will easily run on anything that is able
to run Git (or probably any other SCM) and SSH (and it'd be easy to adapt
it to anything else like telnet or rlogin.)

> I have not yet caught up to the whole thread, but I suppose the
> possibility of running it on the Compile Farm has been discussed?

David Edelsohn already pointed me to that and I requested an account
some time ago, but I'm still waiting for a reply. (Though it's holiday
time, so I guess the people doing account maintenance are just out for
a trip.)

MfG, JBG

-- 
  Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481
Signature of: Gib Dein Bestes. Dann übertriff Dich selbst!
the second  :




DX register is not a return value for i386?

2013-08-29 Thread Ilya Enkovich
Hi,

The function_value_regno_p hook implementation for the i386 target
(ix86_function_value_regno_p) always returns false for the DX register.
But the DX register is used to return 128-bit values as AX:DX. Is this
intentional or a bug?

I'm asking because it causes a problem with mode switching, which fails
if it sees a 'use' insn at the end of the function whose argument is not a
register holding the returned value. I'm deciding what should be fixed
here: mode switching or the hook implementation.

Thanks,
Ilya


Re: DX register is not a return value for i386?

2013-08-29 Thread H.J. Lu
On Thu, Aug 29, 2013 at 7:33 AM, Ilya Enkovich  wrote:
> Hi,
>
> The function_value_regno_p hook implementation for the i386 target
> (ix86_function_value_regno_p) always returns false for the DX register.
> But the DX register is used to return 128-bit values as AX:DX. Is this
> intentional or a bug?
>
> I'm asking because it causes a problem with mode switching, which fails
> if it sees a 'use' insn at the end of the function whose argument is not a
> register holding the returned value. I'm deciding what should be fixed
> here: mode switching or the hook implementation.
>

It is not just %dx.  %st1 and %xmm1 are used to return
complex values.  You need to check hard_regno_nregs for
how many hard registers are used.

-- 
H.J.
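
A toy model of the check H.J. describes, written as standalone C rather than against GCC's internal macros. Everything here is an assumption made for illustration: the register numbering (with DX directly after AX), the function names, and the idea that a value simply occupies consecutive word-sized registers starting at its first one. It shows why a test on the first register alone misses DX for a 128-bit value on a 64-bit target.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy register numbering, not GCC's internal tables. */
enum { TOY_AX = 0, TOY_DX = 1, TOY_CX = 2 };

/* Number of consecutive word-sized registers needed for a value. */
static unsigned toy_nregs(unsigned value_bytes, unsigned word_bytes)
{
    assert(word_bytes > 0);
    return (value_bytes + word_bytes - 1) / word_bytes;
}

/* True if regno is one of the registers holding a value whose first
   register is first_regno. */
bool toy_reg_holds_return_value(unsigned regno, unsigned first_regno,
                                unsigned value_bytes, unsigned word_bytes)
{
    unsigned n = toy_nregs(value_bytes, word_bytes);
    return regno >= first_regno && regno < first_regno + n;
}
```

With 8-byte words, a 16-byte value starting in TOY_AX also covers TOY_DX, while an 8-byte value does not.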


__mips16_xxx and .globl

2013-08-29 Thread Reed Kotler
I have implemented this gcc mips16 floating point scheme in llvm/clang 
and ran into one interesting issue.


In gcc mips16, for all the hard float routines, i.e. __mips16_xxx, gcc 
emits a .globl for them.


It does not do this for other routines like strcmp for example or puts.

If I don't emit the .globl's for these in -fPIC mode, then when I run
one math-heavy program, it runs really slowly because it seems
to be constantly in the loader doing something.


If I edit the .s file and add the .globl's then it runs at normal speed.

Does anyone know what the issue would be here?

Without the .globl, the type is UNDEFINED, and with the .globl the type
is OBJECT for these __mips16_xxx routines.


Reed





Re: __mips16_xxx and .globl

2013-08-29 Thread Reed Kotler

I forgot to mention that this only happens when I'm linking as C++.


Reed









gcc-4.8-20130829 is now available

2013-08-29 Thread gccadmin
Snapshot gcc-4.8-20130829 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.8-20130829/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.8 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_8-branch 
revision 202089

You'll find:

 gcc-4.8-20130829.tar.bz2 Complete GCC

  MD5=375e05d325008d30e90cfb29b227b497
  SHA1=96363cebc48d85b8464bec44e0b6d6731be5b638

Diffs from 4.8-20130822 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.8
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.