> -----Original Message-----
> From: Eray Ozkural [mailto:[EMAIL PROTECTED]]
> Sent: Friday, December 14, 2007 2:11 AM
> To: Tom Elken
> Cc: beowulf@beowulf.org
> Subject: Re: [Beowulf] Using Autoparallel compilers or
> Multi-Threaded libraries with MPI
>
> On Dec 12, 2007 7:35 PM, Tom Elken <[EMAIL PROTECTED]> wrote:
> > Results of the VERY non-scientific survey:
> >
> > # reporting use of Autoparallel features with MPI: 0
> >
> > # reporting use of multi-threaded math libraries with MPI: 1
> >
> Well, then, is there really such a thing that extracts threads from those
> horrible C codes and generates MPI code?
I have heard of software tools that try to do some of that, but they have not
achieved much commercial success.  That is not what I meant, though.  I was
relying on readers remembering my original post on this subject, and since
that post was back in November, that was a dangerous assumption.  Thankfully
we have an archive:

http://www.beowulf.org/archive/2007-November/020211.html

"Autoparallel features with MPI" came from this passage in the original post:

"I was wondering how many people use either auto-parallel compiler features,
or multi-threaded math libraries (Goto, MKL, ACML, etc.) to provide some
thread-level parallelism on a cluster where you primarily use MPI to achieve
your parallel execution."

So I meant that the source code is parallelized using MPI.  Then, in an
effort to create something like a hybrid MPI/OpenMP program, but without
having to add the OpenMP directives yourself, you use the automatic
parallelization feature of common compilers:

  -parallel  in the Intel compiler
  -apo       in the PathScale compiler
  -Mconcur   in the PGI compiler, etc.

to find loops which can profitably be parallelized using threads.  A sketch
of what this looks like in practice is in the P.S. below.

Here is the example from the original post:

"For example, if an autoparallelizing compiler could find effective 4-way
thread-level parallelism in an MPI code and you were running on a cluster of
8 nodes, each with two quad-core CPUs, 64 cores total, you might choose to
run with 16 MPI processes and set your NUM_THREADS variable to 4, so that
all 64 cores of the cluster execute work with reasonable efficiency."

No one responded that they have done this, let alone found it to be faster
than running purely with MPI ranks (no threads).

-Tom

> Not that I believe it is impossible (since I work for a company that does
> a similar thing), but I would like to know which autoparallel MPI code the
> posters had in mind.  Is there a market for that kind of a compiler?
>
> Best,
>
> --
> Eray Ozkural, PhD candidate.  Comp. Sci. Dept., Bilkent University, Ankara
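P.S.  To make the hybrid scheme concrete, here is a minimal sketch in C of
the kind of code I had in mind.  The file name, array names, and the
compile/launch lines are made up for illustration; the build line assumes
the Intel compiler behind an mpicc wrapper (substitute -apo or -Mconcur for
the other compilers), and the assumption that the auto-parallel runtime
honors OMP_NUM_THREADS is true for Intel's but may differ elsewhere.

```c
/*
 * autopar_hybrid.c -- hypothetical MPI code with a loop that an
 * auto-parallelizing compiler can turn into a threaded loop.
 *
 * Build (Intel flag shown; PathScale: -apo, PGI: -Mconcur):
 *   mpicc -std=c99 -O3 -parallel autopar_hybrid.c -o autopar_hybrid
 *
 * Run 16 ranks x 4 threads on the 64-core cluster from the example
 * (exact NUM_THREADS variable depends on the compiler's runtime;
 * Intel's honors OMP_NUM_THREADS):
 *   export OMP_NUM_THREADS=4
 *   mpirun -np 16 ./autopar_hybrid
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define N (1 << 22)  /* per-rank problem size */

int main(int argc, char **argv)
{
    int rank, nranks;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    double *x = malloc(N * sizeof *x);
    double *y = malloc(N * sizeof *y);
    for (int i = 0; i < N; i++) { x[i] = rank + 1.0; y[i] = i; }

    /* Independent iterations with a simple reduction: exactly the
     * pattern an auto-parallelizer looks for.  With -parallel the
     * compiler can split these iterations across threads within
     * each MPI rank -- no OpenMP directives in the source. */
    double local = 0.0;
    for (int i = 0; i < N; i++)
        local += x[i] * y[i];

    /* MPI still provides the across-node parallelism. */
    double total = 0.0;
    MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("dot product across %d ranks: %e\n", nranks, total);

    free(x); free(y);
    MPI_Finalize();
    return 0;
}
```

If the compiler finds effective 4-way parallelism in loops like that one,
the 16 x 4 layout keeps all 64 cores busy; whether it actually beats 64
pure MPI ranks is exactly what the survey was asking.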