Re: [Beowulf] anyone have modern interconnect metrics?

2024-01-16 Thread Benson Muite
Many clouds use COTS Ethernet, e.g. AWS, Alibaba, Oracle. My expectation is that most tightly coupled workloads run on hundreds of nodes. Networking is disaggregated, so variance in latency will be somewhat greater than on a typical cluster, though some of the newer topologies used in HPC c

Re: [Beowulf] Your thoughts on the latest RHEL drama?

2023-06-28 Thread Benson Muite
On 6/28/23 11:18, Tony Travis wrote: > On 28/06/2023 07:18, John Hearns wrote: >> Rugged individualist? I like that...    Me puts on plaid shirt and >> goes to wrestle with some bears... >> >>  > Maybe it is time for an HPC Linux distro, this is where >> Good move. I would say a lightweight distro

Re: [Beowulf] likwid vs stream (after HPCG discussion)

2022-03-19 Thread Benson Muite
On 3/19/22 1:28 PM, Mikhail Kuzminsky wrote: Just in the HPCG discussion, it was proposed to use the now widely used likwid benchmark to estimate memory bandwidth. It gives excellent estimates of hardware capabilities. Am I right that likwid uses its own optimized assembler code for each specifi

Re: [Beowulf] HPCG benchmark, again

2022-03-19 Thread Benson Muite
For memory bandwidth, single-node tests such as Likwid are helpful: https://github.com/RRZE-HPC/likwid MPI communication benchmarks are a good complement to this. Full applications do more than the above, but these are easier starting points that require less domain-specific application knowled
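As a rough sketch of what a STREAM-style single-node bandwidth test measures (this is not the likwid tool itself; the array size and the single-shot timing are illustrative assumptions):

```python
import time
import numpy as np

# STREAM-style triad: a = b + s * c, sized well beyond typical caches.
n = 20_000_000  # ~160 MB per double-precision array (illustrative choice)
b = np.random.rand(n)
c = np.random.rand(n)
s = 3.0

t0 = time.perf_counter()
a = b + s * c
t1 = time.perf_counter()

# The triad touches 3 arrays of 8-byte doubles: two reads plus one write.
gb_moved = 3 * n * 8 / 1e9
print(f"approx. bandwidth: {gb_moved / (t1 - t0):.1f} GB/s")
```

likwid-bench pins threads and uses hand-tuned assembly kernels, so it will report figures much closer to the hardware limit than a sketch like this.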

Re: [Beowulf] Ethernet switch OSes

2021-05-02 Thread Benson Muite
On 4/23/21 7:07 AM, Greg Lindahl wrote: I'm buying a 100 gig ethernet switch for my lab, and it seems that the latest gear is intended to run a switch OS. Being as cheap as I've always been, free software sounds good. It looks like Open Network Linux is kaput. It looks like SONiC is doing pret

Re: [Beowulf] [EXTERNAL] Re: perl with OpenMPI gotcha?

2020-11-21 Thread Benson Muite
GNU Parallel ( http://www.gnu.org/software/parallel/ ) might allow for similar workflows On 11/21/20 3:56 AM, Lux, Jim (US 7140) via Beowulf wrote: If Joe has interpreted your need correctly, I’ll second the suggestion of pdsh – it’s simple, it works pretty well, it’s “transport” independent (
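As a hedged local analogue of the pdsh/GNU Parallel fan-out pattern (the host names are hypothetical, and the `echo` stands in for the `ssh <host> <command>` a real setup would run):

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Hypothetical host list; pdsh or GNU Parallel would fan a command out
# over ssh to each of these. Here we just run one command per "host".
hosts = ["node01", "node02", "node03"]

def run(host):
    # In a real setup this would be: ssh <host> <command>
    out = subprocess.run(["echo", f"hello from {host}"],
                         capture_output=True, text=True)
    return host, out.stdout.strip()

with ThreadPoolExecutor(max_workers=len(hosts)) as pool:
    for host, reply in pool.map(run, hosts):
        print(f"{host}: {reply}")
```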

Re: [Beowulf] ***UNCHECKED*** Re: Re: [EXTERNAL] Re: Re: Spark, Julia, OpenMPI etc. - all in one place

2020-10-20 Thread Benson Muite
But even then, it was a pretty slow evolution – the Fortran compilers I was running in the 80s on microcomputers under MS-DOS weren’t materially different from the Fortran I was running in 1978 on a Z80, which wasn’t significantly different from the Fortran I ran on mainframes (IBM 360, CDC

Re: [Beowulf] [External] Spark, Julia, OpenMPI etc. - all in one place

2020-10-13 Thread Benson Muite
On 10/13/20 3:12 PM, Oddo Da wrote: Jim, Peter: by "things have not changed in the tooling" I meant that it is the same approach/paradigm as it was when I was in HPC back in the late 1990s/early 2000s. Even if you look at books about OpenMPI, you can go on their mailing list and ask what books t

Re: [Beowulf] [External] Re: HPCG

2020-08-11 Thread Benson Muite
that will definitely be handy. -- Prentice On 8/7/20 5:02 AM, Benson Muite wrote: Maybe the following are helpful: https://sx-aurora.github.io/posts/hpcg-tuning/ https://www.hpcadvisorycouncil.com/pdf/HPCG_Analysis_POWER8.pdf https://link.springer.com/chapter/10.1007/978-3-030-50743-5_21 htt

Re: [Beowulf] HPCG

2020-08-07 Thread Benson Muite
Maybe the following are helpful: https://sx-aurora.github.io/posts/hpcg-tuning/ https://www.hpcadvisorycouncil.com/pdf/HPCG_Analysis_POWER8.pdf https://link.springer.com/chapter/10.1007/978-3-030-50743-5_21 https://ulhpc-tutorials.readthedocs.io/en/latest/parallel/hybrid/HPCG/ https://www.intel.c

Re: [Beowulf] HPC for community college?

2020-02-26 Thread Benson Muite
Dear Doug, Might you be willing to give some indication of benchmarks you find useful for customers of Limulus systems? Regards, Benson On Sat, Feb 22, 2020, at 6:42 AM, Douglas Eadline wrote: > > That is the idea behind the Limulus systems -- a personal (or group) small > turn-key cluster t

Re: [Beowulf] HPC for community college?

2020-02-22 Thread Benson Muite
Enabling local manufacturing using CAD and additive manufacturing may be helpful. Machine learning applications such as voice recognition, image recognition and recommendation systems might also be interesting. For example, one could improve library catalog search. On Sat, Feb 22, 2020, at 1

Re: [Beowulf] HPC for community college?

2020-02-19 Thread Benson Muite
YES! Though national infrastructures such as XSEDE may be easier to sustain and for community colleges to use. On Wed, Feb 19, 2020, at 10:46 PM, Mark Kosmowski wrote: > Is there a role for a modest HPC cluster at the community college? > ___ > Beowul

Re: [Beowulf] Have machine, will compute: ESXi or bare metal?

2020-02-10 Thread Benson Muite
On Tue, Feb 11, 2020, at 9:31 AM, Skylar Thompson wrote: > On Sun, Feb 09, 2020 at 10:46:05PM -0800, Chris Samuel wrote: > > On 9/2/20 10:36 pm, Benson Muite wrote: > > > > > Take a look at the bootable cluster CD here: > > > http://www.littlefe.net/ > &g

Re: [Beowulf] Have machine, will compute: ESXi or bare metal?

2020-02-10 Thread Benson Muite
On Mon, Feb 10, 2020, at 9:46 AM, Chris Samuel wrote: > On 9/2/20 10:36 pm, Benson Muite wrote: > > > Take a look at the bootable cluster CD here: > > http://www.littlefe.net/ > > From what I can see BCCD hasn't been updated for just over 5 years, and > the la

Re: [Beowulf] Have machine, will compute: ESXi or bare metal?

2020-02-09 Thread Benson Muite
Take a look at the bootable cluster CD here: http://www.littlefe.net/ On Mon, Feb 10, 2020, at 1:54 AM, Mark Kosmowski wrote: > I purchased a Cisco UCS C460 M2 (4 @ 10 core Xeons, 128 GB total RAM) for > $115 in my local area. If I used ESXi (free license), I am limited to 8 vcpu > per VM. Coul

Re: [Beowulf] [External] Re: First cluster in 20 years - questions about today

2020-02-07 Thread Benson Muite
Charges depend a lot on the field. People in medical fields generally pay $2,000 per paper for open access, with their research grants supporting this - the idea being that medical information is useful, and it is cheaper for the funding body to make the research widely available than for librari

Re: [Beowulf] First cluster in 20 years - questions about today

2020-02-04 Thread Benson Muite
Generally, getting published does not depend on having an academic qualification if the work is sufficiently interesting. Choose the venue appropriately. There seems to be quite a split between domain-specific publications and HPC publications, with only a few venues able to reliably review both the

Re: [Beowulf] HPC demo

2020-01-20 Thread Benson Muite
a) For a technically knowledgeable audience, you could demonstrate some simple benchmark codes. b) For a more general audience, you might also look for some parallel open-source applications that are specific to the domain of interest. For example, for engineering, OpenFOAM (https://openfoam.com/
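For a general-audience demo, one option along these lines is a minimal Monte Carlo estimate of pi split across worker processes (a sketch only; the sample count and worker count are arbitrary choices), since it lets the audience watch the estimate improve as more cores join in:

```python
import random
from multiprocessing import Pool

def count_hits(samples):
    # Count random points that fall inside the unit quarter-circle.
    hits = 0
    for _ in range(samples):
        x, y = random.random(), random.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits

if __name__ == "__main__":
    total = 1_000_000
    workers = 4  # arbitrary; vary this to show parallel speedup
    with Pool(workers) as pool:
        hits = sum(pool.map(count_hits, [total // workers] * workers))
    print(f"pi is roughly {4 * hits / total:.4f}")
```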

Re: [Beowulf] [EXTERNAL] Re: Is Crowd Computing the Next Big Thing?

2019-11-30 Thread Benson Muite
MPI is not likely the best programming model at present for such a scenario, because fault tolerance is not great. However, there are people running many loosely coupled parallel programs on clusters because the computational capacity and storage mechanisms make it convenient, even if the inter

Re: [Beowulf] [EXTERNAL] Re: Is Crowd Computing the Next Big Thing?

2019-11-28 Thread Benson Muite
There are some machine learning workloads that could be done in this context. Renting smartphones may be more challenging, but desktop or server idle computing power could be used in distributed offline training algorithms. The physics community does make heavy use of grid computing (https://

Re: [Beowulf] PLCC84, FPGA compute array.

2019-09-22 Thread Benson Muite
Hi, You might check work by Ryohei Kobayashi: https://dblp.org/pers/hd/k/Kobayashi:Ryohei Benson

Re: [Beowulf] How to debug error with Open MPI 3 / Mellanox / Red Hat?

2019-05-02 Thread Benson Muite
Hi Faraz, Mellanox manuals can be found at: https://docs.mellanox.com/ Example setup instructions (not sure if these are correct for you, as I do not have exact details of your hardware): https://www.mellanox.com/related-docs/prod_software/Mellanox_OFED_Linux_User_Manual_v4_3.pdf Maybe also helpful (stu

Re: [Beowulf] How to debug error with Open MPI 3 / Mellanox / Red Hat?

2019-05-01 Thread Benson Muite
Hi Faraz, Have you tried any other MPI distributions (e.g. MPICH, MVAPICH)? Regards, Benson On 4/30/19 11:20 PM, Gus Correa wrote: It may be using IPoIB (TCP/IP over IB), not verbs/rdma. You can force it to use openib (verbs, rdma) with (vader is for in-node shared memory): mpirun --mca b
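A hedged sketch of the transport selection Gus describes (the `btl` names are standard Open MPI 3.x MCA parameters; the binary name and process count are placeholders):

```shell
# Force the openib BTL (verbs/RDMA), plus self and vader (shared memory
# within a node); "self" is needed for a process to message itself.
mpirun --mca btl openib,self,vader -np 8 ./your_mpi_app

# Alternatively, exclude a transport while debugging (here TCP):
mpirun --mca btl ^tcp -np 8 ./your_mpi_app
```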

Re: [Beowulf] Introduction and question

2019-03-21 Thread Benson Muite
"Many employers look for people who studied humanities and learned IT by themselves, for their wider appreciation of human values." Mark Burgess https://www.usenix.org/sites/default/files/jesa_0201_issue.pdf On 2/23/19 4:30 PM, Will Dennis wrote: Hi folks, I thought I’d give a brief introdu

Re: [Beowulf] Introduction and question

2019-02-23 Thread Benson Muite
Welcome! Do you also contribute to documentation? Even improving grammar would be helpful. On 2/23/19 4:30 PM, Will Dennis wrote: Hi folks, I thought I’d give a brief introduction, and see if this list is a good fit for my questions that I have about my HPC-“ish” infrastructure... I am a ~30y

Re: [Beowulf] Simulation for clusters performance

2019-01-03 Thread Benson Muite
There are a number of tools. A possible starting point is: http://spcl.inf.ethz.ch/Research/Scalable_Networking/SlimFly/ Regards, Benson On 1/4/19 12:44 AM, Alexandre Ferreira Ramos wrote: Hi Everybody, Happy New Year! I need to conduct a study on simulating the performance of a large scale c

[Beowulf] Benchmarking

2018-11-27 Thread Benson Muite
Thoughts and papers related to computer benchmarks are sought for the Benchmarking in the data center workshop. More details at: http://parallel.computer/

Re: [Beowulf] More about those underwater data centers

2018-11-05 Thread Benson Muite
Some power transformer oils do work - but performance measurements, performance/price evaluation and long-term system durability studies seem to be lacking. On 11/5/18 10:16 AM, Tony Brian Albers wrote: > Salt water is highly corrosive, that's why people use mineral or > silicone oil. > > I've hea

Re: [Beowulf] If I were specifying a new custer...

2018-10-11 Thread Benson Muite
On 10/11/18 10:08 PM, Douglas Eadline wrote: > All: > > Over the last several months I have been reading about: > > 1) Spectre/meltdown > 2) Intel Fab issues > 3) Supermicro MB issues > > I started thinking, if I were going to specify a > single rack cluster, what would I use? > > I'm assuming a g

Re: [Beowulf] C++ compilers and assembly

2018-09-10 Thread Benson Muite
Thanks. This is interesting. Possibly also of interest: https://clearlinux.org/ Not for HPC, but some aspects look useful. On 09/10/2018 06:50 AM, John Hearns via Beowulf wrote: Chris Samuel's recent post reminds me. I went to a fascinating and well delivered talk by Jason Hearne McGuiness https

Re: [Beowulf] Working for DUG, new thead

2018-06-19 Thread Benson Muite
On 06/19/2018 09:47 PM, Prentice Bisbal wrote: On 06/13/2018 10:32 PM, Joe Landman wrote: I'm curious about your next gen plans, given Phi's roadmap. On 6/13/18 9:17 PM, Stu Midgley wrote: low level HPC means... lots of things.  BUT we are a huge Xeon Phi shop and need low-level programme

Re: [Beowulf] Openfoam advice

2018-03-15 Thread Benson Muite
Hi Faraz, Last looked at OpenFOAM 4-X https://github.com/OpenFOAM/OpenFOAM-4.x Though development has migrated to https://develop.openfoam.com/Development/OpenFOAM-plus Steps I used in a bash build script for 4-X on a cluster with OpenMPI, qt, zlib and cmake already installed are below. Scaling

Re: [Beowulf] Theoretical vs. Actual Performance

2018-02-22 Thread Benson Muite
Consider trying: https://github.com/amd/blis https://github.com/clMathLibraries/clBLAS as well. On 02/23/2018 12:48 AM, Prentice Bisbal wrote: Just rebuilt OpenBLAS 0.2.20 locally on the test system with GCC 6.1.0, and I'm only getting 91 GFLOPS. I'm pretty sure OpenBLAS performance should be
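As a quick sanity check of whatever BLAS a given system is actually using, a rough sketch via NumPy works (NumPy dispatches `@` on 2-D arrays to the BLAS gemm it was linked against; the matrix size here is an illustrative assumption, and a single timing run understates a warmed-up library):

```python
import time
import numpy as np

n = 2000  # illustrative size; large enough to amortize call overhead
a = np.random.rand(n, n)
b = np.random.rand(n, n)

t0 = time.perf_counter()
c = a @ b  # dispatches to the BLAS dgemm NumPy was linked against
t1 = time.perf_counter()

# dgemm performs roughly 2*n^3 floating-point operations.
gflops = 2 * n**3 / (t1 - t0) / 1e9
print(f"approx. {gflops:.1f} GFLOPS")
```

Rebuilding NumPy (or swapping the BLAS it links) against OpenBLAS, BLIS, or MKL and rerunning this gives a crude side-by-side comparison.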

Re: [Beowulf] Theoretical vs. Actual Performance

2018-02-22 Thread Benson Muite
There is a very nice and simple max flops code that requires much less tuning than Linpack. It is described on p. 57 of: Rahman "Intel® Xeon Phi™ Coprocessor Architecture and Tools" https://link.springer.com/book/10.1007%2F978-1-4302-5927-5 An example Fortran code is here: https://github.com/bk

Re: [Beowulf] Monitoring and Metrics

2017-10-08 Thread Benson Muite
May also be of interest: JobDigest – Detailed System Monitoring-Based Supercomputer Application Behavior Analysis Dmitry Nikitenko, Alexander Antonov, Pavel Shvets, Sergey Sobolev, Konstantin Stefanov, Vadim Voevodin, Vladimir Voevodin and Sergey Zhumatiy http://russianscdays.org/files/pdf1

Re: [Beowulf] cold spare storage?

2017-08-17 Thread Benson Muite
On 08/17/2017 09:54 PM, mathog wrote: On 17-Aug-2017 11:10, Alex Chekholko wrote: The Google paper from a few years ago showed essentially no correlations between the things you ask about and failure rates. So... do whatever is most convenient for you. This one? http://research.google.c

Re: [Beowulf] ethernet performance testing

2017-03-17 Thread Benson Muite
This was nice (support has ended, but still runs): http://open-mx.gforge.inria.fr/ Comparisons with other things would be interesting. Benson On 03/17/2017 04:33 PM, Lux, Jim (337C) wrote: This is a bit of a blast from the past question.. Back when Beowulfs were the “new hot thing”, we all u
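In the netperf/NetPIPE spirit, a minimal throughput sketch can be written with plain sockets (this runs over loopback, so it measures the software stack rather than the wire; the message and transfer sizes are arbitrary choices):

```python
import socket
import threading
import time

MSG = b"x" * 65536          # 64 KiB messages (arbitrary)
TOTAL = 64 * 1024 * 1024    # send 64 MiB in total (arbitrary)

def sink(server):
    # Accept one connection and drain it until the peer closes.
    conn, _ = server.accept()
    while conn.recv(65536):
        pass
    conn.close()

server = socket.socket()
server.bind(("127.0.0.1", 0))   # OS-assigned free port
server.listen(1)
port = server.getsockname()[1]
threading.Thread(target=sink, args=(server,), daemon=True).start()

client = socket.socket()
client.connect(("127.0.0.1", port))
t0 = time.perf_counter()
for _ in range(TOTAL // len(MSG)):
    client.sendall(MSG)
client.close()
t1 = time.perf_counter()
print(f"approx. {TOTAL / (t1 - t0) / 1e6:.0f} MB/s over loopback")
```

Pointing the client at a second host (and running the sink there) turns this into a crude two-node Ethernet test, though real tools handle warm-up, message-size sweeps, and latency as well.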

Re: [Beowulf] Suggestions to what DFS to use

2017-02-13 Thread Benson Muite
Hi, Do you have any performance requirements? Benson On 02/13/2017 09:55 AM, Tony Brian Albers wrote: Hi guys, So, we're running a small (as in a small number of nodes (10), not storage (170 TB)) hadoop cluster here. Right now we're on IBM Spectrum Scale (GPFS) which works fine and has POSIX suppo