On Wed, Nov 27, 2013 at 1:18 PM, John Hearns <hear...@googlemail.com> wrote:
> Regarding ISV codes, today you frequently see a vendor supplying an entire
> tree of software,
> complete with an MPI plus a version of Python with libraries, plus X, Y,Z.
> They do this because they have to ship software which works to users, and
> don't want the inevitable
> 'Oh - my sysadmin installed Python 2.7 and your software does not work with
> that' type of query.
> I don't see that it is that far a step to packaging up Docker containers.

I actually don't see the novelty here at all. For more than 10 years I have installed HPC-related software into versioned directories that also express their dependencies (e.g. openmpi/1.6.4/gcc/4.7.2 instead of simply openmpi/gcc, meaning that openmpi-1.6.4 was compiled with gcc-4.7.2). I also use -rpath with these versioned directories as much as possible, so that dynamic library dependencies never change. When I started using modules/dotkits/etc., I adopted the same convention there. This way repeatability is guaranteed both for the user (for reproducing scientific data) and for the admin (for reliable bug reports). For convenience, I can set softlinks like "current" or "latest" to make it easier for developers to stay in sync with updates; modules can even automate this.

Like possibly many on this list, I have also wondered about the explosion of different versions to maintain that such a scheme would imply. But, as Peter Clapham mentioned, all software (including libraries and compilers) has a certain lifetime, so in more than 95% of cases only a few combinations are needed; e.g. a single version of GROMACS is compiled against a single version of FFTW, one version of OpenMPI and one version of MVAPICH2, which are all compiled with a single version of gcc. OpenMPI is particularly nice in this context as it can discover and use the best interconnect at runtime: there is no need to keep several parallel versions which differ only in the supported hardware and then have to compile user apps against each of them.

A thorny problem for HPC is the rather strong dependency of certain libraries on drivers/kernel. E.g. certain versions of MPI libraries require certain versions of the OFED stack, which come with certain versions of IB/OF drivers, which require certain kernel versions. So, in the end, the much-touted independence from the underlying system is voided.
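
As a rough illustration of the versioned-directory and -rpath convention described above, here is a minimal Python sketch. The /opt/hpc root and the function names versioned_prefix and rpath_flags are hypothetical, chosen only for this example; the flag syntax assumes gcc passing options to the GNU linker.

```python
from pathlib import PurePosixPath


def versioned_prefix(root, name, version, compiler, compiler_version):
    """Return an install prefix of the form
    <root>/<name>/<version>/<compiler>/<compiler_version>."""
    return PurePosixPath(root) / name / version / compiler / compiler_version


def rpath_flags(prefixes):
    """Build -L and -Wl,-rpath flags for the lib directory of each prefix,
    so the resulting binary records exactly which versioned libs it uses."""
    flags = []
    for prefix in prefixes:
        lib = prefix / "lib"
        flags.append(f"-L{lib}")
        flags.append(f"-Wl,-rpath,{lib}")
    return flags


# Hypothetical root; the package and compiler versions are the ones
# used as an example in the text.
ompi = versioned_prefix("/opt/hpc", "openmpi", "1.6.4", "gcc", "4.7.2")
print(ompi)
print(" ".join(rpath_flags([ompi])))
```

The point of the sketch is only that the full dependency chain is readable from the path itself, and that the same path is baked into the binary at link time instead of being picked up from whatever the environment provides at run time.
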
Containers cannot solve this driver/kernel dependency problem by design, since they share the host's kernel; virtualization can (with some hardware support), as the VM carries a complete computing environment including drivers/kernel.

Packing only the required libraries (which AFAIU is what the Docker containers are doing to keep the size low) is something the HPC scene has known for ~10 years... Anyone remember bproc/Scyld? It transferred the required libs on demand; sure, one had to keep the versioned libs on the master to have them available to the app...

Cheers,
Bogdan
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf