Re: [Beowulf] Java vs C++ for interfacing to parallel library

Robert G. Brown Sun, 20 Aug 2006 20:55:54 -0700

On Mon, 21 Aug 2006, Jonathan Ennis-King wrote:

One way to alleviate the performance hit is of course to use a 90% Java
strategy, where the computationally intensive 10% (here, parallel sparse
matrix inversion) is handled in C.
It's the mixed language part that worries me with Java, especially in
the light of rgb's comments. It is claimed by some that Java and C++ are
largely incompatible. Or is this all solved in the Java native interface
(JNI)?


My specific question was whether anyone out there was running parallel
codes either written completely in Java, or with Java wrappering some
big numerical library for the hard part. Are there any additional issues
with parallel performance, or is this just a subcase of Java-C
interfacing in a scalar setting?


Well, GIYF.  For example,

  http://www.math.ucla.edu/~anderson/JAVAclass/JavaInterface/JavaInterface.html

turns up on the search string "java C interface", along with a few
zillion other hits and howto articles.

I think that the example given in the web hit above should serve to
guide you as a template on that route if that's what you want to try.
However, I wouldn't count on the result being terribly portable
cross-platform -- you'll face the double barrier of getting the same
result from your C or C++ compiler on both platforms/OS architectures
AND getting similar behavior from (and matching versions of) the java
ports on both platforms.

The other option is the Unix-like strategy suggested by rgb, where for
example the computational part is completely written in C, and then the
pre and post-processing which benefit from a GUI are written in some
other language (e.g. Java), or strung together from other unix tools and
wrapper languages.


Or a library-based strategy.  If you take your core code and develop a
plain old reusable library out of it with a fairly straightforward API
(which takes a lot of programming discipline and a certain amount of
practice, I think, but isn't particularly difficult) then you can USE
the library in a variety of UIs.  You can write a simple
vanilla-variable C interface to embed in java or perl.  You can write a
tty/ascii UI for command line users.  You can write a Gtk/Gnome
interface using glade and callbacks for native X/linux.  The UI code may
or may not be portable (tty/ascii I/O code using standard posix and
libc and libm calls is pretty much lowest common denominator and tends
to compile and run "anywhere", including on Windows boxes, with
minimal tweaking; more complex UIs become successively harder to port,
as a rule) but the library itself, if written in
"boring" C with a very clear and simple API, should be able to support
all the UIs and interactive languages in a straightforward manner.
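
As a rough illustration of the "vanilla-variable C interface" idea, here
is a minimal sketch of one library entry point that only passes plain
ints, doubles and pointers across the API boundary.  The names
(spm_matvec, SPM_OK, SPM_EBADARG) and the choice of compressed sparse
row storage are assumptions made for this example; they are not taken
from any library mentioned in this thread.

```c
/* Error codes returned by this hypothetical library. */
enum { SPM_OK = 0, SPM_EBADARG = 1 };

/*
 * Compute y = A*x for an n-by-n sparse matrix A stored in compressed
 * sparse row (CSR) form:
 *   row_ptr[i] .. row_ptr[i+1]-1  index the nonzeros of row i,
 *   col_idx[k]                    is the column of nonzero k,
 *   val[k]                        is its value.
 * Nothing but plain C types crosses the boundary, so a wrapper in
 * another language only has to hand over flat arrays.
 */
int spm_matvec(int n, const int *row_ptr, const int *col_idx,
               const double *val, const double *x, double *y)
{
    if (n < 0 || !row_ptr || !col_idx || !val || !x || !y)
        return SPM_EBADARG;

    for (int i = 0; i < n; i++) {
        double sum = 0.0;
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; k++)
            sum += val[k] * x[col_idx[k]];
        y[i] = sum;
    }
    return SPM_OK;
}
```

The function reports problems through its return value rather than
through globals or callbacks, which tends to make it easier to wrap from
a tty front end or from a foreign-language binding alike.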

That's actually what I'd recommend if you really want UI flexibility and
code maintainability.  The latter is a very important consideration that
hasn't been touched on yet.  If you actually WRITE your application
integrated with a complex top level language/tool like perl, python, or
java, then maintaining it becomes much more complicated (as I've learned
very much the hard way, alas). If the basic sparse matrix routine
library changes, it will likely lead to emergent bugs.  If your C/C++
encapsulation of that routine changes or your application goals change
over time along with your core code, it's another thing to debug.  If
java (or any other UI base) changes, it's yet another thing to debug.
If you get enough layers (and languages and data interfaces) in there,
running down bugs and deciding "whose fault" they are gets to be really
quite difficult.  Fault aside, just finding them and fixing them can
become painful in the extreme.

For that reason I think it is a really good idea to have as few distinct
layers as possible to work with and to SEPARATE those so that they can
be separately debugged with a clean layer (API) in between.  If your
program is wrapped up in a library with a very simple C tty/ascii UI,
you control it all and can be pretty certain that any errors you
encounter are in YOUR code in ONE language and ONE data representation.
In other words, you have a decent chance of efficiently debugging
things.

OTOH if the only way you observe the failure is by accessing your core
routines through a big, complicated GUI written in an entirely different
language with its own data representation and with all sorts of stuff
happening at the callback level and with multiple layers of event loops
or even multiple threads (GUI-based programs have the nasty habit of
blocking when they are in their core work loops UNLESS they are written
with multiple threads, and multiple threads of course are far more
complex and enable far more subtle bugs to surface) then you're looking
at a LOT more work to debug any problems.  I personally would rather
tattoo the windows logo on my left bicep with a dull needle and food
coloring than mess with it at all.

Worse still, if you write a GUI-based program WITHOUT a relatively clean
API between the UI part and the work part and have NO other UI or
encapsulation to work on just the actual work routines, then god help
you if you ever have to debug the code OR rework the GUI.  Even "simple"
stuff like adding a new graphical display of some result can become
nightmarish if the UI and operational code are all entangled together.

So even if you ultimately want to integrate with java, with R, with
perl, or write a native GUI for some platform or another, I'd strongly
suggest writing your core code as a de facto library with its own
#include files that define the shared interface and all externally
visible data structures.  Develop this code with a simple ASCII front
end -- basically a command line parser to input program parameters and
perform any needed initialization, a work routine that takes the input
data and calls the core library subroutine(s) that do the required work
and produce a desired result (and does no significant work itself), and
a minimal output layer that pulls the result out of the standard
interface variables altered or returned by the library calls according
to the program API and dumps it onto either stdout or into a
command-line specified file where it can be verified for selected input
test data.
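
The three-stage front end described above (parse, work, output) might be
sketched like this.  Everything here is hypothetical: lib_sum_to stands
in for the real library call, and cli_main is just an illustrative name
for the driver so that it can be exercised from a test harness.

```c
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the real library routine; here it just sums 1..n. */
static int lib_sum_to(int n, double *result)
{
    if (n < 0 || !result)
        return 1;
    double s = 0.0;
    for (int i = 1; i <= n; i++)
        s += i;
    *result = s;
    return 0;
}

/* Stage 1: turn the command line into plain program parameters. */
static int parse_args(int argc, char **argv, int *n)
{
    if (argc != 2)
        return 1;
    char *end;
    long v = strtol(argv[1], &end, 10);
    if (end == argv[1] || *end != '\0' || v < 0 || v > 1000000)
        return 1;
    *n = (int)v;
    return 0;
}

/* Stage 3: dump the result where it can be compared with test data. */
static void write_result(FILE *out, int n, double result)
{
    fprintf(out, "n=%d sum=%.1f\n", n, result);
}

/* Driver: parse, call the library (stage 2, no work of its own), print. */
int cli_main(int argc, char **argv, FILE *out)
{
    int n;
    double result;

    if (parse_args(argc, argv, &n) != 0) {
        fprintf(stderr, "usage: %s N\n", argc > 0 ? argv[0] : "prog");
        return 2;
    }
    if (lib_sum_to(n, &result) != 0)
        return 1;
    write_result(out, n, result);
    return 0;
}
```

A real program's main would do nothing more than
return cli_main(argc, argv, stdout), and a GUI or scripting wrapper
would replace parse_args and write_result while leaving the library call
untouched.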

With this minimal encapsulation and debugging system, you can then do
whatever you like with the core work routines, quickly and easily.  For
example it is absolutely straightforward to replace the command line
interface with a glade-constructed set of GUI input widgets, the work
interface with a callback on a "run" button, and to add whatever kind of
output interface you like (graphical or otherwise).  Or you can fancy up
your command line interface and wrap it up in python or perl.  Or you
can modify the command line interface so that it is suitable for turning
the library calls into java or perl subroutine calls and obtaining the
input from java variables and delivering the output back to java
variables.  If you always take care to maintain your minimal C/tty UI
along with the library, you can easily isolate any problems that emerge
to JUST one layer in the initialization, execution, postprocessing,
presentation sequence, in particular keeping the execution part isolated
from the rest (that are more likely to be tied to some particular UI
environment with quirks, an API and data representation, and even a
language of its own).

   rgb



Robert G. Brown wrote:

On Sun, 20 Aug 2006, Joe Landman wrote:

Jonathan:

Jonathan Ennis-King wrote:

Does anyone have experience writing parallel Java code (using MPI) with
calls to C libraries which also use MPI? Is this possible/sensible? Is
there a big performance hit relative to doing the same in C++?



Unless all of the important optimizable calculation is done in libraries
that you are stitching together with Java glue, the compiled languages
are likely to be quite a bit faster.

There is a sizeable abstraction penalty associated with OO languages.
Many of the design patterns that they encourage (object factories,
inheritance chains, etc) are anathema to high performance.



Hear, hear!

I'm considering writing some parallel code to do fluid flow in porous
media, the heart of which is solving systems of sparse linear equations.
There are some good libraries in C which provide the parallel solver
(e.g. PETSc), but I'm trying to resolve which language to use for my
code. The choice is between C++ and Java, and although I'm favouring
Java at present, I'm not sure about its performance in this context.



Hmmm.  For this, C or Fortran may be far more appropriate.  Depends upon
what it is you want to do with the code.  High performance using MPI
depends upon many factors.  If there is one particular part of the code
that is better served by an OO based language, then I might suggest
designing/implementing all the speed sensitive bits in a language which
lets you achieve high performance, and then interfacing them to your OO
language so that the OO system isn't being used for the critical time
sensitive portions.



<disclaimer>Parts of the stuff below are editorial comment and religious
belief and can be ignored or sniffed at by those of differing
belief.</disclaimer>

Remember well the observation that you can write object oriented code in
a procedural language (and ditto, you can write procedural code in an OO
language).  Matching the language to the kind of code -- or more
likely, the personal taste of the coder -- simply makes development a
bit more simple and natural.

Ultimately, OO vs procedural code is a matter of style as much as
anything else.  I write "real" code exclusively in C.  I'm in the
process of (re)writing a random number testing program (dieharder) into
a library-based tool that was originally (first pass) quite procedural
in its design.  In the second pass, as I came to fully understand the
data objects better in practice and could start to see how the code
could be simplified and compressed, I began to introduce a set of "lazy"
shared objects for certain parts of the code.

In the third (current) pass I'm splitting off all of the actual testing
code, as opposed to the startup/results/presentation UI code, into a
library.  Since most of the tests share a very similar implementation
structure and certain control variables in common, I can now see
precisely how to make the code very object oriented with a set of "test
objects" (structs and similarly structured test implementations that
read from them and fill them in) and a single set of "shell" code for
calling a standard test.  This reduces writing a UI to nothing but
simple, repetitive boilerplate for calling the actual tests and
displaying the returned results -- one can focus on the human side of
the UI and stop worrying about the tests, and one can relatively easily
and scalably add more tests or RNGs to test.
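
To make the "test objects plus one shell" layout concrete, here is a
minimal sketch in C.  It is NOT the actual dieharder API; the struct
fields and the two toy tests (run_mean, run_max) are invented purely to
show the shape: each test reads its controls from a struct, fills its
result into the same struct, and one loop drives them all.

```c
#include <stddef.h>

/* One "test object": controls and results live in the same struct. */
typedef struct test_obj {
    const char *name;      /* label the UI can display */
    size_t nsamples;       /* control variable shared by all tests */
    double statistic;      /* filled in by run() */
    int (*run)(struct test_obj *self, const double *data);
} test_obj;

/* Two toy tests with the same structure (hypothetical examples). */
int run_mean(test_obj *t, const double *data)
{
    if (t->nsamples == 0)
        return 1;
    double s = 0.0;
    for (size_t i = 0; i < t->nsamples; i++)
        s += data[i];
    t->statistic = s / (double)t->nsamples;
    return 0;
}

int run_max(test_obj *t, const double *data)
{
    if (t->nsamples == 0)
        return 1;
    double m = data[0];
    for (size_t i = 1; i < t->nsamples; i++)
        if (data[i] > m)
            m = data[i];
    t->statistic = m;
    return 0;
}

/* The single "shell": the same call sequence for every test.
 * Returns the number of tests that reported an error. */
int run_all(test_obj *tests, size_t ntests, const double *data)
{
    int failures = 0;
    for (size_t i = 0; i < ntests; i++)
        if (tests[i].run(&tests[i], data) != 0)
            failures++;
    return failures;
}
```

Adding a test then means writing one more function and one more table
entry; the UI only ever loops over the table and prints name and
statistic.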

Since the code is still both lazy OO and C, I can freely intersperse the
use of pointers, can choose to treat variables (including all
structs/objects) as "opaque" or not as makes sense in the code, and keep
the code as efficient as C can make it, which is to say damn near as
efficient as assembler.  The "objectness" of the encapsulated tests just
permits me to write a relatively clean API to the library (without too
many test specific global/shared variables or the even greater hassle of
dealing with passing variable length argument lists through layers of
encapsulating subroutines) so that when I'm done, adding a UI or GUI or
implementing the tests natively inside e.g. R or octave or whatever will
be fairly straightforward.

The point being that one CAN write non-lazy OO code in C or even in
Fortran -- that's more a question of program design and an understanding
of the basic data objects that a program requires, although it certainly
helps if the language permits the definition of a struct of one sort or
another.  One has the choice in C, though, of writing fully OO, lazy
(mixed) OO or fully procedural code when and where that is appropriate
for either ease of coding or program efficiency.  I suppose that choice
exists to some extent for at least some non-fascist OO environments
(e.g. C++ as a sort-of superset of C) but I think that the only people
who even know how to do so are those who have learned to code in a
non-OO language first -- people who learn C++ as their primary language
tend to be pretty clueless about pointers or the performance advantages
of NOT using protection and inheritance in your structs but just letting
everything access them directly.  C provides few safety nets but rather
permits you to do pretty much anything you like, at your own risk, in
code that is ultimately transparent.
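
The opaque-or-not choice mentioned above can be sketched in a few lines
of C.  The counter type and its functions are hypothetical names for
illustration only.  The first part is what a header would expose (a type
name and accessors); the struct layout below it would normally live in
the .c file, where code is free to touch the fields directly.

```c
#include <stdlib.h>

/* What a public header would show: the type name and accessors only. */
typedef struct counter counter;
counter *counter_new(void);
void counter_add(counter *c, long n);
long counter_total(const counter *c);
void counter_free(counter *c);

/* What stays private to the implementation file: the actual layout. */
struct counter {
    long total;   /* running sum of everything added */
    long calls;   /* how many times counter_add was called */
};

/* calloc zero-fills, so a new counter starts at total = 0, calls = 0. */
counter *counter_new(void)
{
    return calloc(1, sizeof(counter));
}

/* Inside the library the fields are accessed directly, no indirection. */
void counter_add(counter *c, long n)
{
    c->total += n;
    c->calls++;
}

long counter_total(const counter *c)
{
    return c->total;
}

void counter_free(counter *c)
{
    free(c);
}
```

Moving the struct definition into the public header makes the type
transparent again, which is the "at your own risk" end of the same
trade-off: callers gain direct field access and lose insulation from
layout changes.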

Now, I personally believe that all nontrivial programs go through stages
like the three described above no matter what language they are written
in.  This is one of the reasons that Wirth's Pascal had its day and that
it passed -- whether one starts at the top or at the bottom or both, one
is likely to encounter mismatches that require rethinking all or part of
the memory hierarchy one begins with in any difficult project.  In that
SECOND pass and beyond, both strict-topdown and strict-bottomup
languages tend to require MORE work to fix than one that is less
hierarchically prestructured.

Perhaps there are OO ubercoders that can just "see" what the data
objects appropriate to a complex application are from the beginning and
can start off with the right top level, mid, AND bottom level objects
all perfectly enmeshed and integrated but I have yet to meet one.  One
of the great (IMO) illusions promoted by OO fanatics is that by using an
OO language (per se) to write the code in the first place one can
somehow shorten this process and home in on the correct hierarchy of
data structures (objects or not) that optimally support the
application's efficient implementation from top to bottom.  This is not
my experience, but hey, the world is a big place and there may be people
who just think that way and for them it may be true.

For code like the specific stuff you want to implement above, which has
efficient libraries written in C, my guess is that you would do best
using C -- this is pretty much a no-brainer.  It is highly probable that
in C you have the best access to example programs using the library,
UIs, human support in the form of others who use the libraries in their
C code, and more.  Even communicating with the author/maintainers of the
library is bound to be simplest if you are implementing in C.  Second
best would almost certainly be C++, as C++ can (I believe) call C
libraries fairly transparently or with a minimal C++ encapsulation of
the C prototypes and data structures.

OTOH Fortran and C tend to have somewhat different subroutine call
mechanisms so binding a C library into fortran code or VV tends to be a
PITA -- for example, C always passes subroutine arguments by value,
fortran by reference.  In addition, C and fortran use slightly different
conventions for other simple stuff e.g. terminating a string.  Some of
the issues associated with the port are mentioned here:
http://star-www.rl.ac.uk/star/dvi/sun209.htx/node4.html as well as
elsewhere on the web.  Basically, calling C libraries in fortran code is
possible but requires some work and code encapsulation (and vice versa
for calling fortran routines from inside C code, IIRC -- fortran/C
compiler folks can check me on this:-).
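
A minimal sketch of the two mismatches just described, written from the
C side.  The assumptions are compiler dependent and should be checked
against your own toolchain: that the Fortran compiler appends a single
underscore to external names (the gfortran default), that it passes
every argument by reference, and that CHARACTER data arrives as a
blank-padded buffer with a separately supplied length.  The function
names are hypothetical.

```c
#include <string.h>

/*
 * C routine meant to be callable from Fortran as
 *     call scale_vec(n, alpha, x)
 * Every argument is a pointer because Fortran passes by reference;
 * the trailing underscore matches the assumed name mangling.
 */
void scale_vec_(const int *n, const double *alpha, double *x)
{
    for (int i = 0; i < *n; i++)
        x[i] *= *alpha;
}

/*
 * Copy a blank-padded Fortran CHARACTER buffer of length flen into a
 * NUL-terminated C string, dropping the trailing blanks.
 * Returns 0 on success, 1 if the output buffer is too small.
 */
int fstr_to_cstr(const char *fstr, size_t flen, char *out, size_t outsz)
{
    while (flen > 0 && fstr[flen - 1] == ' ')
        flen--;
    if (outsz == 0 || flen >= outsz)
        return 1;
    memcpy(out, fstr, flen);
    out[flen] = '\0';
    return 0;
}
```

Compilers that implement the Fortran 2003 ISO_C_BINDING module offer a
standard way to declare such interfaces without relying on name-mangling
conventions, which may be preferable where it is available.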

Java, octave, matlab, python, perl etc. are MUCH WORSE in this regard.
All require NONTRIVIAL encapsulation of the library into the interactive
environment.  I have never done an actual encapsulation into any of
them, but I'll wager that it is really quite difficult because each of
them has their very own internal data types that are REALLY opaque
objects that bear little overt resemblance to the simple "all data
objects can be viewed as a projection onto a block of memory with either
typed or pointer driven offset arithmetic" view of data in C or for that
matter C++ or Fortran (with slightly different projective views in both
cases).

These languages typically permit you to allocate memory by just using a
named variable.  This is marvelously convenient for an interactive
environment -- it is marvelously expensive in terms of program
efficiency because the underlying environment has to manage allocating
the memory transparently and extensibly (most of the languages permit
you to allocate whole vectors or matrices of variables by just
referencing them), tracking instances of the memory in code, and freeing
the memory when it is no longer referenced or being used.  They do this
conservatively: they tend to keep things if there is ANY CHANCE of their
ever being referenced, making them typically memory hogs almost as bad
as a C program would be if every memory reference in the program was to
static global memory -- no memory allocation or freeing at all, beyond
whatever goes on stack/heap in the course of subroutine calls or
internal function execution.  Complicated hashes or advanced list
structures are used to keep the execution itself moderately efficient
(but highly INefficient compared to a decent compiler with flat memory
layouts).
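
To give a feel for the bookkeeping involved, here is a small sketch of
one common scheme, reference counting (used by Perl and CPython among
others), next to the flat C equivalent.  It is illustrative only and
does not reproduce the internals of any particular interpreter; the
boxed type and box_* functions are hypothetical names.

```c
#include <stdlib.h>

/* A boxed, reference-counted number: every value carries a header. */
typedef struct {
    int refcount;
    double value;
} boxed;

static int live_boxes = 0;   /* counts boxes not yet freed */

/* Allocate a box holding v, with a single owner. */
boxed *box_new(double v)
{
    boxed *b = malloc(sizeof *b);
    if (!b)
        return NULL;
    b->refcount = 1;
    b->value = v;
    live_boxes++;
    return b;
}

/* Another name now refers to the same box. */
boxed *box_retain(boxed *b)
{
    b->refcount++;
    return b;
}

/* Drop one reference; free the box only when nobody refers to it. */
void box_release(boxed *b)
{
    if (--b->refcount == 0) {
        free(b);
        live_boxes--;
    }
}

int box_live_count(void)
{
    return live_boxes;
}

/* The flat C view: one contiguous block, no per-element bookkeeping. */
double sum_flat(const double *x, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += x[i];
    return s;
}
```

Each boxed value costs an allocation, a header and counter updates on
every copy, whereas the flat array is a single block the compiler can
walk directly.  Any wrapper around a C library has to convert between
these two views at the boundary.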

The point being that you have to interface these opaque and not
obviously documented data types to the C library calls.  This is surely
possible -- it is how all those perl libraries, matlab toolboxes, java
interfaces come about.  It will probably require that you learn WAY more
about how the language itself is implemented at the source level than
you are likely to want to know, and it is probably not going to be
terribly easy...

   rgb



  Jonathan Ennis-King

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf



- --
 Jonathan Ennis-King
 email: [EMAIL PROTECTED]
 post: CSIRO Petroleum, Private Bag 10, Clayton South, Victoria, 3169,
Australia
 ph: +61-3-9545 8355 fax: +61-3-9545 8380



--
Robert G. Brown                        http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:[EMAIL PROTECTED]

