Re: [Beowulf] [External] RIP CentOS 8

2020-12-10 Thread Jon Tegner
On 12/10/20 10:55 AM, Jon Tegner wrote: What about https://linux.oracle.com/switch/centos/ Regards, /jon Possibly a good option - if I didn't trust Oracle even less than IBM. I wonder if ElRepo will work with the Oracle distro? RH has removed a lot of hardware support lately an

Re: [Beowulf] [External] RIP CentOS 8

2020-12-10 Thread Jon Tegner
What about https://linux.oracle.com/switch/centos/ Regards, /jon On 12/9/20 7:19 AM, Carsten Aulbert wrote: On 09.12.20 07:12, Tony Brian Albers wrote: > So, if you use CentOS, I can only recommend Springdale. Just fresh from my Twitter stream: https://github.com/hpcng/rocky " Rocky Linux

Re: [Beowulf] What is rdma, ofed, verbs, psm etc?

2017-09-21 Thread Jon Tegner
What kind of latency can one expect using RoCE/10G? On 09/21/2017 05:28 PM, Douglas Eadline wrote: I think the main reason it stopped was that IB is the choice of most clusters and most 10G NICs provide low latency with default tcp/ip (less than 10us in most cases).

Re: [Beowulf] What is rdma, ofed, verbs, psm etc?

2017-09-20 Thread Jon Tegner
What about RoCE? Is this something that is commonly used (I would guess not, since I have not found much)? Are there other protocols that are worth considering (like "gamma" which doesn't seem to be developed anymore)? My impression is that with RoCE you have to use specialized hardware (unlike

Re: [Beowulf] How to know if infiniband network works?

2017-08-03 Thread Jon Tegner
Isn't latency over RDMA a bit high? When I've tested QDR and FDR I tend to see around 1 us (using mpitests-osu_latency) between two nodes. /jon On 08/03/2017 06:50 PM, Faraz Hussain wrote: Here is the result from the tcp and rdma tests. I take it to mean that IB network is performing at the ex
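For reference, one-way numbers like these come from a simple ping-pong. Below is a minimal sketch in C of what mpitests-osu_latency measures - this is not the actual OSU benchmark, just an illustration, and the hostnames in the run line are placeholders:

    /* pingpong.c - minimal MPI ping-pong latency sketch.
     * Build: mpicc -O2 pingpong.c -o pingpong
     * Run:   mpirun -np 2 -host node1,node2 ./pingpong */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        const int iters = 10000;
        int rank;
        char byte = 0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Barrier(MPI_COMM_WORLD);

        double t0 = MPI_Wtime();
        for (int i = 0; i < iters; i++) {
            if (rank == 0) {
                MPI_Send(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                MPI_Send(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double t1 = MPI_Wtime();

        if (rank == 0)  /* one-way latency is half the round trip */
            printf("latency: %.2f us\n", (t1 - t0) / iters / 2.0 * 1e6);

        MPI_Finalize();
        return 0;
    }

The one-way latency is half the measured round trip, so on QDR/FDR this should land near the ~1 us figure mentioned above.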

[Beowulf] Timeout on eth0 when booting diskless

2017-03-12 Thread Jon Tegner
Hi, I'm booting some nodes diskless, using a root file system on NFS. The first phase (PXE, tftp etc) works without problems. In the second phase, when the system is actually supposed to boot over NFS, the machine hangs about every second time - and it seems to be a result of the relevant nic (eth
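For anyone hitting the same thing, the second phase is driven by kernel parameters in the PXE config. A sketch of a pxelinux.cfg entry follows - the server address and paths are placeholders, and the initrd must have NFS support built in:

    DEFAULT diskless
    LABEL diskless
      KERNEL vmlinuz
      APPEND initrd=initrd.img root=/dev/nfs nfsroot=10.0.0.1:/export/nfsroot,tcp ip=dhcp rootdelay=10 rw

If the hang happens while the kernel brings the nic up for its own DHCP request (the ip=dhcp step), a rootdelay= of a few seconds sometimes gives the link time to negotiate before the NFS mount is attempted.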

Re: [Beowulf] Suggestions to what DFS to use

2017-02-13 Thread Jon Tegner
BeeGFS sounds interesting. Is it possible to say something general about how it compares to Lustre regarding performance? /jon On 02/13/2017 05:54 PM, John Hanks wrote: We've had pretty good luck with BeeGFS lately running on SuperMicro vanilla hardware with ZFS as the underlying filesystem. I

Re: [Beowulf] Scaling issues on Xeon E5-2680

2016-02-29 Thread Jon Tegner
Found out what was wrong... Turned out the hardware was delivered with 15 of the 16 memory slots populated. No wonder we had performance issues! Anyway, thanks a lot, all who answered! /jon On 02/29/2016 06:48 PM, Josef Weidendorfer wrote: On 28.02.2016 at 16:27, Jon Tegner wrote

Re: [Beowulf] Scaling issues on Xeon E5-2680

2016-02-28 Thread Jon Tegner
ers and flags. Give more details about the software stack. Original Message From: Jon Tegner Sent: Sunday, February 28, 2016 22:28 To: beowulf@beowulf.org Subject: [Beowulf] Scaling issues on Xeon E5-2680 Hi, have issues with performance on E5-2680. Each of the nodes has 2 of these 1

[Beowulf] Scaling issues on Xeon E5-2680

2016-02-28 Thread Jon Tegner
Hi, have issues with performance on E5-2680. Each of the nodes has 2 of these 12-core CPUs on a SuperMicro SuperServer 1028R-WMR (i.e., 24 cores on each node). For one of our applications (CFD/OpenFOAM) we have noticed that the calculation runs faster using 12 cores on 4 nodes compared to whe
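Since OpenFOAM is typically memory-bandwidth bound, one quick sanity check is to measure per-node bandwidth directly. A rough triad loop in C, in the spirit of STREAM but not the official benchmark:

    /* triad.c - rough memory-bandwidth check (STREAM-like triad).
     * Build: gcc -O2 -fopenmp triad.c -o triad */
    #include <stdio.h>
    #include <stdlib.h>
    #include <omp.h>

    #define N 50000000L

    int main(void)
    {
        double *a = malloc(N * sizeof *a);
        double *b = malloc(N * sizeof *b);
        double *c = malloc(N * sizeof *c);
        if (!a || !b || !c) return 1;

        /* first-touch init in parallel so pages spread across NUMA nodes */
        #pragma omp parallel for
        for (long i = 0; i < N; i++) { a[i] = 0.0; b[i] = 1.0; c[i] = 2.0; }

        double t0 = omp_get_wtime();
        #pragma omp parallel for
        for (long i = 0; i < N; i++)
            a[i] = b[i] + 3.0 * c[i];
        double t1 = omp_get_wtime();

        /* three 8-byte arrays move through memory per iteration */
        printf("triad: %.1f GB/s\n", 3.0 * N * sizeof(double) / (t1 - t0) / 1e9);
        free(a); free(b); free(c);
        return 0;
    }

If 12 ranks on 4 nodes beat 24 ranks on 2, the nodes are likely starved for bandwidth rather than cores - consistent with the missing memory module reported in the 2016-02-29 follow-up above.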

Re: [Beowulf] installing MPI 1.x on RHEL 5.4

2014-04-27 Thread Jon Tegner
I haven't followed this guide http://www.admin-magazine.com/HPC/Articles/Warewulf-Cluster-Manager-Master-and-Compute-Nodes myself, but it certainly looks really good. /jon On 04/27/2014 07:08 AM, Chris Samuel wrote: On Sat, 26 Apr 2014 10:45:05 AM Bhabani Samantray wrote: I want to install MP

[Beowulf] mpitest between Qlogic and Mellanox/InfiniBand

2014-02-23 Thread Jon Tegner
Hi, when evaluating infiniband I usually try to run "osu_latency", "osu_bw" and the like (from the package mpitests-openmpi). However, I have not managed to run these between nodes where one has a Qlogic card and the other a Mellanox one. Is it a bad idea to try to use cards from different v
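One hedged workaround, since QLogic's native PSM layer and the Mellanox verbs stack don't interoperate: force Open MPI onto a transport both ends share, such as TCP (over IPoIB if the IB link is up). The hostnames below are placeholders, and latency will of course be far worse than native:

    mpirun --mca pml ob1 --mca btl tcp,self -np 2 -host node-qlogic,node-mellanox osu_latency

That at least confirms the nodes and the benchmark work. For real RDMA performance between the two, both ends would have to speak verbs, which QLogic cards reportedly do only with reduced performance.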

Re: [Beowulf] anyone using SALT on your clusters?

2013-06-28 Thread Jon Tegner
On 06/28/2013 10:56 PM, Joe Landman wrote: >> I don't understand your question, how can you eliminate configuration? >> At some point you have to tell the system what it's supposed to do. > It's done once. Then you don't have to install it again. It's installed. > It's done. I think one of the

Re: [Beowulf] PXE boot with X7DWT-INF IB onboard card

2012-11-13 Thread Jon Tegner
How are you planning to boot your nodes? I have used perceus (http://www.perceus.org/) and was happy with it. There is also Warewulf (http://warewulf.lbl.gov or http://hpc.admin-magazine.com/Articles/Warewulf-Cluster-Manager-Master-and-Compute-Nodes) which I haven't used. Anyone who has compa

Re: [Beowulf] Strange error, gluster/ext4/zone_reclaim_mode

2012-08-30 Thread Jon Tegner
Hi! And thanks for the answer, much appreciated! On 08/31/2012 12:47 AM, Mark Hahn wrote: >> However, at one point one of the machines serving the file system went down, after spitting out error messages as indicated in https://bugzilla.redhat.com/show_bug.cgi?id=770545 >> We used the a

[Beowulf] Strange error, gluster/ext4/zone_reclaim_mode

2012-08-30 Thread Jon Tegner
Hi, have this strange error. We run CFD calculations on a small cluster. Basically it consists of a bunch of machines connected to a file system. The file system consists of 4 servers, CentOS-6.2, ext4 and glusterfs (3.2.7) on top. Infiniband is used for the interconnect. For scheduling/resource m
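For the archive: the Red Hat bug linked in the reply above involves vm.zone_reclaim_mode, which can be checked and disabled like this (a common recommendation for NUMA file servers; verify against your own kernel before relying on it):

    # check the current setting
    cat /proc/sys/vm/zone_reclaim_mode
    # disable zone reclaim at runtime
    sysctl -w vm.zone_reclaim_mode=0
    # make it persistent across reboots
    echo "vm.zone_reclaim_mode = 0" >> /etc/sysctl.conf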

[Beowulf] Functionality of schedulers

2012-02-29 Thread Jon Tegner
Hi list! Is there any scheduler which has the functionality to automatically put a running job on hold when another job with higher priority is submitted? Preferably the state of the first job should be frozen and saved to disk, so that it can be restarted when the higher-priority job ha
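SLURM can do the first half of this out of the box: with preemption configured, a job in a higher-priority partition suspends (but does not checkpoint to disk) running jobs in a lower one. A slurm.conf sketch with placeholder partition and node names (parameter names may vary across SLURM versions):

    # jobs in "high" suspend running jobs in "low";
    # suspended jobs stay resident in memory, not on disk
    PreemptType=preempt/partition_prio
    PreemptMode=SUSPEND,GANG
    PartitionName=high Nodes=node[01-16] PriorityTier=2 Default=NO
    PartitionName=low  Nodes=node[01-16] PriorityTier=1 Default=YES

Freezing the job's state to disk, as asked for here, additionally needs checkpoint/restart support (e.g. BLCR integration), which is a separate and considerably messier setup.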

Re: [Beowulf] personal HPC

2011-12-23 Thread Jon Tegner
Cool! Impressive to have taken it this far! What are the dimensions of the system? And the mainboard for the compute nodes, are you using mini-ITX there? Regards, /jon On 12/22/2011 05:51 PM, Douglas Eadline wrote: > For those that don't know, I have been working > on a commodity "desk side" clu

Re: [Beowulf] building Infiniband 4x cluster questions

2011-11-10 Thread Jon Tegner
>>> DRIVERS: Drivers for cards now. Are those all open source, or does it require payment? Is the source released for all those cards' drivers, and do they integrate into linux? >> You should get everything you need from the Linux kernel and/or OFED. > You can also find the drivers o

Re: [Beowulf] Building a Beowulf - Noob

2010-04-17 Thread Jon Tegner
Take a look at perceus, it can be really easy, check www.infiscale.com/html/perceus_on_enterprise_linux_qu.html Richard Chang wrote: I am a bit new, trying to build a cluster from the ground up. I need to build a new 32-node cluster. Hardware has been decided, Nehalem based, but OS is

Re: Re: [Beowulf] 96 cores in silent and small enclosure

2010-04-14 Thread Jon Tegner
rt #, here’s what happens", because it depends on a lot of things. On 4/14/10 1:12 AM, "Jon Tegner" <> wrote: > the max temp spec is not some arbitrary knob that the chip vendors

Re: Re: [Beowulf] 96 cores in silent and small enclosure

2010-04-14 Thread Jon Tegner
> the max temp spec is not some arbitrary knob that the chip vendors > choose out of spiteful anti-green-ness. I wouldn't be surprised to see > some ... Issue is not the temp spec of current CPUs, the problem is that it

Re: [Beowulf] 96 cores in silent and small enclosure

2010-04-13 Thread Jon Tegner
Mark Hahn wrote: I find it strange with this rather large temp range, and 55 seems very low in my experience. Could they possibly stand for something else? Did not find any description of the numbers anywhere at that address. I think you should always worry about any temperature measured on a

Re: RE: [Beowulf] 96 cores in silent and small enclosure

2010-04-12 Thread Jon Tegner
On Apr 13, 2010 00:24 "Lux, Jim (337C)" wrote:

Re: [Beowulf] 96 cores in silent and small enclosure

2010-04-12 Thread Jon Tegner
well, you can look up the max operating spec for your particular chips. for instance, http://products.amd.com/en-us/OpteronCPUResult.aspx shows that OS8439YDS6DGN includes chips rated 55-71. (there must be some further package marking to determine which temp spec...) I find it strange with t

Re: [Beowulf] 96 cores in silent and small enclosure

2010-04-11 Thread Jon Tegner
Have done some preliminary tests on the system. They indicate a CPU temperature of 60-65 C after half an hour (will do a longer test soon). I have a few questions: * How high a CPU temperature is acceptable (our cluster is built on 6-core AMD Opterons)? I know life span is reduced if temperature is hi
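On reading the temperatures: the sysfs hwmon interface exports them in millidegrees C, which is what lm_sensors' `sensors` command parses. A small C sketch that dumps whatever the kernel exposes (the exact path layout varies with kernel and driver):

    /* hwmon_temps.c - print temperatures from sysfs hwmon (millidegrees C).
     * Build: gcc -O2 hwmon_temps.c -o hwmon_temps */
    #include <glob.h>
    #include <stdio.h>

    int main(void)
    {
        glob_t g;
        /* some older kernels nest these one level deeper, in a device subdirectory */
        if (glob("/sys/class/hwmon/hwmon*/temp*_input", 0, NULL, &g) != 0)
            return 1;
        for (size_t i = 0; i < g.gl_pathc; i++) {
            FILE *f = fopen(g.gl_pathv[i], "r");
            long mdeg;
            if (f && fscanf(f, "%ld", &mdeg) == 1)
                printf("%s: %.1f C\n", g.gl_pathv[i], mdeg / 1000.0);
            if (f) fclose(f);
        }
        globfree(&g);
        return 0;
    }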

Re: [Beowulf] 96 cores in silent and small enclosure

2010-04-07 Thread Jon Tegner
wrote: how much would something like this go for? On Wed, Apr 7, 2010 at 3:56 PM, Jon Tegner <teg...@renget.se> wrote: In this case we have large "home made" heat sinks, I'll put up more pictures when I get the time. You can che

Re: Re: [Beowulf] 96 cores in silent and small enclosure

2010-04-07 Thread Jon Tegner
On Apr 7, 2010 15:34 "Jonathan Aquilina" wrote: > is one of you an engineer? how well does the air flow in the case? what > gets air into and out of the case if there are just 2 fans on the box? > you guys ever consider marketing this product?

[Beowulf] 96 cores in silent and small enclosure

2010-04-07 Thread Jon Tegner
We (me and my brother) have been into silent computing, and clusters, for quite some time now. We just recently designed and built a unit equipped with 4 Supermicro boards (H8DMT) and 8 CPUs. In the actual unit the CPUs are Opterons with 6 cores each, but it would be easy enough to switch to CPUs w

Re: [Beowulf] test network link quality?

2010-04-02 Thread Jon Tegner
What about netpipe? www.scl.ameslab.gov/netpipe/ /jon David Mathog wrote: Is there a common method for testing the quality of a network link between two networked machines? This is for situations where the link works 99.9% of the time, but should work 99.99% of the time, with the failures
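NetPIPE's TCP test is two-sided; a usage sketch follows (the hostname is a placeholder - check the flags against your installed version's documentation):

    # on the receiving node
    NPtcp
    # on the sending node, pointing at the receiver
    NPtcp -h node2

It sweeps message sizes from 1 byte upward, so a flaky link tends to show up as erratic throughput at particular sizes rather than a clean curve.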

Re: [Beowulf] Small form computers as cluster nodes - any comments about the Shuttle brand ?

2009-08-08 Thread Jon Tegner
David Ramirez wrote: Due to space constraints I am considering implementing an 8-node (+ master) HPC cluster project using small form computers. Knowing that Shuttle is a reputable brand, with several years in the market, I wonder if any of you out there have already used them on clusters and ho

Re: [Beowulf] noobs: what comes next?

2009-06-24 Thread Jon Tegner
Hearns, John wrote: I would guess you are looking at using OpenFOAM for ... There is also Overture https://computation.llnl.gov/casc/Overture/ which uses overlapping grids. Complete with a grid generator and a bunch of solvers. Excellent software! /jon

Re: [Beowulf] Compute Node OS on Local Disk vs. Ram Disk

2008-10-01 Thread Jon Tegner
There seem to be significant advantages to using Scyld ClusterWare. I did try it (Scyld?) many years ago (when it was free?) and I was impressed then. However, when looking at penguincomputing.com I don't find any price quotes. It seems - unless I'm missing something - one has to fill in a rather leng

Re: [Beowulf] Parallel Development Tools

2007-10-17 Thread Jon Tegner
Robert G. Brown wrote: Fedora installs in the future will be done by yum. Yum enables something that is truly marvelous for people who have to install through thin pipes (e.g. DSL links): a two-stage interruptible install. It is possible to install a barebones system in the first pass in a rela

Re: [Beowulf] Parallel Development Tools

2007-10-17 Thread Jon Tegner
/2007, Tim Cutts <[EMAIL PROTECTED]> wrote: On 16 Oct 2007, at 10:19 pm, Robert G. Brown wrote: On Tue, 16 Oct 2007, Jon Tegner wrote: You should switch to a .deb-system, to save you some trouble: $ apt-cache search jove jove - Jonathan's Own Version of Emacs - a compact, po

Re: [Beowulf] Parallel Development Tools

2007-10-17 Thread Jon Tegner
You should switch to a .deb-system, to save you some trouble: $ apt-cache search jove jove - Jonathan's Own Version of Emacs - a compact, powerful editor Sorry, couldn't resist ;-) /jon Robert G. Brown wrote: I do realize (*ahem*) that I'm one of three living humans that still use jove, an

Re: [Beowulf] best linux distribution

2007-10-09 Thread Jon Tegner
Tony Travis wrote: I also prefer Debian-based distros and still run the openMosix kernel under an Ubuntu 6.06.1 LTS server installation on our Beowulf cluster. What I like about APT (the Debian package manager) is the dependency checking and conflict resolution capabilities of "aptitude", w

Re: [Beowulf] Which distro for the cluster?

2007-01-02 Thread Jon Tegner
Robert G. Brown wrote: All of this takes time, time, time. And I cannot begin to describe my life to you, but time is what I just don't got to spare unless my life depends on it. That's the level of triage here -- staunch the spurting arteries first and apply CPR as necessary -- the mere compo