Re: [Beowulf] Introduction and question

2019-02-25 Thread Andrew Holway
One of us. One of us. On Sat, 23 Feb 2019 at 15:41, Will Dennis wrote: > Hi folks, > > > > I thought I’d give a brief introduction, and see if this list is a good > fit for my questions that I have about my HPC-“ish” infrastructure... > > > > I am a ~30yr sysadmin (“jack-of-all-trades” type), co

Re: [Beowulf] Jupyter and EP HPC

2018-07-28 Thread Andrew Holway
Hi Jim, There is a group at JPL doing Kubernetes. It might be interesting to ask them if you can execute Jobs on their clusters. Cheers, Andrew On 27 July 2018 at 20:47, Lux, Jim (337K) wrote: > I’ve just started using Jupyter to organize my Pythonic ramblings.. > > > > What would be kind of

Re: [Beowulf] Bright Cluster Manager

2018-05-02 Thread Andrew Holway
On 1 May 2018 at 22:57, Robert Taylor wrote: > Hi Beowulfers. > Does anyone have any experience with Bright Cluster Manager? > I used to work for ClusterVision from which Bright Cluster Manager was born. Although my experience is now quite some years out of date I would still recommend it mainly

Re: [Beowulf] How to debug slow compute node?

2017-08-10 Thread Andrew Holway
I put €10 on the nose for a faulty power supply. On 10 August 2017 at 19:45, Gus Correa wrote: > + Leftover processes from previous jobs hogging resources. > That's relatively common. > That can trigger swapping, the ultimate performance killer. > "top" or "htop" on the node should show somethin

Re: [Beowulf] Anyone know whom to speak with at Broadcom about NIC drivers?

2017-07-05 Thread Andrew Holway
Broadcom's technology portfolio comes from a bunch of acquisitions over the years, so everything tends to be highly siloed. If you can work out which company the tech came from, you can sometimes track someone down with some LinkedIn sleuthing. Of course they may also just be sweating the assets and

Re: [Beowulf] Building OpenFOAM

2016-11-07 Thread Andrew Holway
> I don't have info for that exact version at my fingertips, but did an > OpenFOAM 3.0 build last week. It took ~3 hours on a 24-core Xeon Broadwell > server. > I propose we start a book on when it will finish. Who will give me odds on 30 days for the build to complete. ___

Re: [Beowulf] NFS + IB?

2015-02-22 Thread Andrew Holway
On 22 February 2015 at 19:09, Jeffrey Layton wrote: > Dell has been doing some really good things around NFS and IB using IPoIB. > NFS over IPoIB is just vanilla NFS and will work as normal, but with lots of lovely bandwidth. You still get the TCP overhead, so latency and therefore random IO performance is
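As an aside, a sketch of the IPoIB setup that usually goes with this (interface name, address and mount point are placeholders, not details from the thread):

    echo connected > /sys/class/net/ib0/mode   # IPoIB connected mode
    ip link set ib0 mtu 65520                  # large MTU helps NFS streaming
    mount -t nfs 10.149.0.1:/export /mnt/scratch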

Re: [Beowulf] Docker vs KVM paper by IBM

2015-01-29 Thread Andrew Holway
It seems that in environments where you don't care about security, Docker is a great enabler: scientists can make any kind of mess in a sandbox-type environment and no one cares because you're not on a public-facing network. There are however difficulties in using Docker with MPI so it's pr

Re: [Beowulf] Docker vs KVM paper by IBM

2015-01-28 Thread Andrew Holway
This is the problem that I think everyone using Docker now is looking to > solve. How can you distribute an app in a reasonable manner and remove all > of the silliness you don't need in the app distribution that the base OS > can solve. > It seems to encourage users to "do whatever they want in

Re: [Beowulf] Docker vs KVM paper by IBM

2015-01-26 Thread Andrew Holway
esults seem pretty obvious > to me. > > > On 01/21/2015 04:26 PM, Andrew Holway wrote: > >> *yawn* >> >> On 19 August 2014 at 18:16, Kilian Cavalotti >> wrote: >> >>> Hi all, >>> >>> On Tue, Aug 19, 2014 at 7:10 AM, Douglas E

Re: [Beowulf] Docker vs KVM paper by IBM

2015-01-21 Thread Andrew Holway
*yawn* On 19 August 2014 at 18:16, Kilian Cavalotti wrote: > Hi all, > > On Tue, Aug 19, 2014 at 7:10 AM, Douglas Eadline wrote: >> I ran across this interesting paper by IBM: >> An Updated Performance Comparison of Virtual Machines and Linux Containers > > It's an interesting paper, but I kin

Re: [Beowulf] strange problem with large file moving between server

2014-09-21 Thread Andrew Holway
> > Regarding ZFS: is that available for Linux now? I lost a bit track here. > Yes. http://zfsonlinux.org/ I would say it's ready for production now. Intel are about to start supporting it under Lustre in the next couple of months and they are typically careful about such things. Cheers, Andrew

[Beowulf] Docker and Infiniband

2014-06-25 Thread Andrew Holway
Hello, How would you get IP packets into your docker containers from a Mellanox Infiniband network? I am assuming EoIB with connectx-3 is the answer here? Ta, Andrew
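One simple approach, purely as a sketch (not necessarily the answer the thread arrived at): share the host network namespace so the container sees the IPoIB interface directly. Interface and image names below are placeholders.

    # the container inherits ib0 (IPoIB) from the host
    docker run --rm --net=host busybox ip addr show ib0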

[Beowulf] Fancy NAS systems in HPC

2014-06-04 Thread Andrew Holway
Hello, I work for a company that is selling a line of Illumos / ZFS / NFS based NAS appliances. I am trying to work out if these kinds of systems are interesting for HPC systems. Would you pay a premium for a system with a nice API and GUI or would you just download OpenIndiana and do it yourself.

Re: [Beowulf] Brian Guarraci (engineer at Twitter) is building a Parallella cluster in his spare time

2014-06-04 Thread Andrew Holway
> > Yes.. not everything has to be nuts and bolts.. Sticky tape and Velcro. > In the UK, we call this "Blue Peter Engineering". Jim Lux > > > -Original Message- > From: Beowulf [mailto:beowulf-boun...@beowulf.org] On Behalf Of Eugen > Leitl > Sent: 2014-Jun-03 8:27 AM > To: beowulf@beowu

Re: [Beowulf] Infiniband Support ?

2014-05-26 Thread Andrew Holway
Hi, This cluster is now a little bit ancient. I have a feeling that, for the price of upgrading your network to Infiniband (around $1 for QDR), you could buy a single, dual-socket server that will be more powerful. The PCIe bus on those systems is PCIe x8 Gen1, which would halve the speed anywa

Re: [Beowulf] Hadoop's Uncomfortable Fit in HPC

2014-05-19 Thread Andrew Holway
Important paragraph: "Some larger players in the HPC arena have begun to provide rich support for high-performance parallel file systems as a complete alternative to HDFS. IBM's GPFS file system has a file placement optimization (FPO) capability that allows GPFS to act as a drop-in replacement fo

Re: [Beowulf] Red Hat buys Inktank, maker of Ceph

2014-04-30 Thread Andrew Holway
software is a very difficult thing. Maybe Gluster has less web hits because there are very few people complaining about it. Maybe Lustre has 1,480,000 pages of excellent documentation! On 30 April 2014 17:52, Prentice Bisbal wrote: > On 04/30/2014 12:15 PM, Andrew Holway wrote: >> >> On 3

Re: [Beowulf] Red Hat buys Inktank, maker of Ceph

2014-04-30 Thread Andrew Holway
On 30 April 2014 15:05, Prentice Bisbal wrote: > > On 04/30/2014 08:34 AM, Chris Samuel wrote: >> >> So Red Hat now has Glusterfs and Ceph. Interesting.. >> >> http://www.redhat.com/inktank/ >> >> cheers, >> Chris > > > Gluster never seemed to gain the traction of Lustre or GPFS, though. In the

Re: [Beowulf] Lifespan of a cluster

2014-04-27 Thread Andrew Holway
Hi Jörg, Typically we need to be looking at the amount of performance per unit of power that computers give us in order to get an objective analysis. Let's assume that all computer cores consume 20W of power and cost £200. Models from 10 years ago give us 20 GFLOPS. Let's assume that this performance
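A back-of-the-envelope version of that sum, using only the figures given in the post (the present-day numbers are cut off in the preview, so they are left out rather than invented):

    old_gflops=20; old_watts=20; old_cost=200
    echo "scale=2; $old_gflops / $old_watts" | bc   # GFLOPS per watt  -> 1.00
    echo "scale=2; $old_gflops / $old_cost"  | bc   # GFLOPS per pound -> .10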

Re: [Beowulf] transcode Similar Video Processing on Beowulf?

2014-04-14 Thread Andrew Holway
> I’m TH and am interested with this > http://www.beowulf.org/pipermail/beowulf/2005-January/011626.html. I’m > currently looking at a solution to launch an object detection app on Host, > with the GUI running on Host and the compute nodes doing all the video > processing and analytics part. I see

Re: [Beowulf] Mutiple IB networks in one cluster

2014-02-03 Thread Andrew Holway
> There is no specific application. This is for a university-wide cluster that > will be used by many different researchers in many different fields using > many different applications. Unless you have a specific application that requires it, I would be fairly confident in saying that a secondary I

Re: [Beowulf] Mutiple IB networks in one cluster

2014-02-01 Thread Andrew Holway
On 30 January 2014 16:33, Prentice Bisbal wrote: > Beowulfers, > > I was talking to a colleague the other day about cluster architecture and > big data, and this colleague was thinking that it would be good to have two > separate FDR IB clusters within a single cluster: one for message-passing, >

[Beowulf] ZFS for HPC

2013-11-26 Thread Andrew Holway
Hello, I am looking at Lustre 2.4 currently and have it working in a test environment (actually with minimal shouting and grinding of teeth). Taking a holistic approach: what does ZFS mean for HPC? I am excited about on-the-fly data compression and snapshotting; however, for most scientific data I

Re: [Beowulf] Admin action request

2013-11-25 Thread Andrew Holway
It's been awfully quiet around here since Vincent Diepeveen was kicked On 25 November 2013 20:25, Prentice Bisbal wrote: > On 11/22/2013 02:41 PM, Ellis H. Wilson III wrote: >> On 11/22/13 16:15, Joe Landman wrote: >>> On 11/22/2013 02:00 PM, Ellis H. Wilson III wrote: I think "no support

[Beowulf] DMCA take down notices

2013-11-23 Thread Andrew Holway
http://www.lexology.com/library/detail.aspx?g=4472d242-ff83-430a-8df4-6be5d63422ca There is a legal mechanism to deal with this which protects both the copyright holder and the host. Please can we talk about supercomputers again because this "issue" is _EXTREMELY_ boring. One of the lurking issue

Re: [Beowulf] Admin action request

2013-11-22 Thread Andrew Holway
+1 for Option 1. On 22 November 2013 20:42, Joe Landman wrote: > Folks: > >We are seeing a return to the posting of multiple full articles > again. We've asked several times that this not occur. It appears to be > a strong consensus from many I spoke with at SC13 this year, that there > is

[Beowulf] zfs

2013-09-14 Thread Andrew Holway
Hello, Anyone using ZFS in production? Stories? Challenges? Caveats? I've been spending a lot of time with ZFS on FreeBSD and have found it thoroughly awesome. Thanks, Andrew
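For anyone who wants to kick the tyres, a minimal sketch of the sort of thing being praised here (device names are placeholders; lz4 compression assumes a reasonably recent pool version):

    # mirrored pool, compression, and a snapshot to roll back to
    zpool create tank mirror /dev/ada1 /dev/ada2
    zfs set compression=lz4 tank
    zfs snapshot tank@before-upgrade
    zfs list -t snapshot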

Re: [Beowulf] NVIDIA buying PGI

2013-08-01 Thread Andrew Holway
> enough. In no way is that a free > market, and the anti-competitive mechanism is obvious. /me starts a compiler company.

[Beowulf] Python Fabric WAS Re: anyone using SALT on your clusters?

2013-06-29 Thread Andrew Holway
> Saltstack and even Python Fabric are great tools for managing large > numbers of systems. I have not used them with a Beowulf system yet but > the threading and logging for "push" configurations is the best of > both worlds. One great aspect of Python Fabric is the local command > execution so th

[Beowulf] Borrowing / Renting some machines in Austin, Texas

2013-05-25 Thread Andrew Holway
Hiho, We need 3 demo machines to demo our stuff at HostingCon in a few weeks. Anyone got any idea where we can get them from? We will need 16 GB RAM, a single CPU and some disks. Server chassis. Ta, Andrew http://www.hybridcluster.com/

Re: [Beowulf] /. Swedish data center saves $1 million a year using seawater for cooling

2013-05-17 Thread Andrew Holway
Considering the quality and durability of modern computer components, anyone using AC chillers to cool their DC could be considered somewhat moronic. [When will | is it required for] computer manufacturers and DC's be forced to comply with similar stringent emissions regulations applied to the

Re: [Beowulf] Pony: not yours.

2013-05-16 Thread Andrew Holway
There is going to be a paradigm shift or some new kind of disruptive technology is going to pop up before 'exascale' happens. Quantum or some shizzle. It will do to clusters what the cluster did to IBM and Cray. On 16 May 2013 17:01, Eugen Leitl wrote: > On Thu, May 16, 2013 at 10:28:12AM -0400,

Re: [Beowulf] Stop the noise please

2013-05-12 Thread Andrew Holway
Perhaps you could inform us of exactly what kind of discourse is acceptable. On 12 May 2013 18:38, "C. Bergström" wrote: > Can you people please stop the noise, take it offlist, change the > subject.. and or add OT in the subject or something.. > > > Thanks > _

Re: [Beowulf] Programming for heartbeat

2013-05-07 Thread Andrew Holway
On 1 May 2013 04:32, Caio Freitas de Oliveira wrote: > Hi all, > > I'm completely new to all this beowulf thing. I've just now connected my > 2 PCs using Heartbeat + Pacemaker, but I don't have a clue of how to use > them as a single computer. Single computer - maybe google "Single System Image".

Re: [Beowulf] Mellanox ConnectX-3 MT27500 problems

2013-04-28 Thread Andrew Holway
Use a real operating system. On 28 April 2013 09:36, Jörg Saßmannshausen wrote: > Hi Josh, > > interesting. However, I am not using XEN on that machine at all and I don't > have the XEN kernel installed. Thus, that is not the problem. > > All the best from a sunny London > > Jörg > > > On Sonntag

Re: [Beowulf] Are disk MTBF ratings at all useful?

2013-04-20 Thread Andrew Holway
> The net of all this is that (and I'll bet you if you read all 21 of the > references, you'll find this).. Disk drive life time is very hard to > predict. I dunno about that; the error bars are not that big. Given a big enough sample size I think you could predict failure rates with some accuracy

Re: [Beowulf] Are disk MTBF ratings at all useful?

2013-04-20 Thread Andrew Holway
Did anyone post this yet? I think this is one of the definitive works on disk failure. http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//archive/disk_failures.pdf On 19 April 2013 17:56, Joe Landman wrote: > On 4/19/2013 11:47 AM, mathog wrote: >>> My

Re: [Beowulf] Clustering VPS servers

2013-03-20 Thread Andrew Holway
the hosts and a specialised bios and/or infiniband firmware to make it work. ta Andrew On 20 March 2013 19:33, Jonathan Aquilina wrote: > Yes that’s what I am curious about. > > ** ** > > *From:* Andrew Holway [mailto:andrew.hol...@gmail.com] > *Sent:* Wednesday, Marc

Re: [Beowulf] Clustering VPS servers

2013-03-20 Thread Andrew Holway
Do you mean a single system image? http://en.wikipedia.org/wiki/Single_system_image On 20 March 2013 19:16, Jonathan Aquilina wrote: > Combining each servers resources into one massive server. > > -Original Message- > From: Reuti [mailto:re...@staff.uni-marburg.de] > Sent: Wednesday, Ma

[Beowulf] Themes for a talk on beowulf clustering

2013-03-03 Thread Andrew Holway
Hello all, I am giving a talk on Beowulf clustering to a local LUG and was wondering if you had some interesting themes that I could talk about. Ta for now. Andrew

Re: [Beowulf] SSD caching for parallel filesystems

2013-02-11 Thread Andrew Holway
> You seem to have no idea what determines prices when you buy in a lot versus > just 1. Yes indeed. I am obviously clueless in the matter.

Re: [Beowulf] SSD caching for parallel filesystems

2013-02-10 Thread Andrew Holway
>>> So any SSD solution that's *not* used for latency sensitive workloads, it >>> needs thousands of dollars worth of SSD's. >>> In such case plain old harddrive technology that's at buy in price right >>> now $35 for a 2 TB disk >>> (if you buy in a lot, that's the actual buy in price for big sh

Re: [Beowulf] SSD caching for parallel filesystems

2013-02-10 Thread Andrew Holway
>> Find me an application that needs big bandwidth and doesn't need massive >> storage. Databases. Lots of databases.

Re: [Beowulf] SSD caching for parallel filesystems

2013-02-06 Thread Andrew Holway
Lustre is now implementing ZFS, which I think has the most advanced SSD caching stuff available. If you have a google around for "roadrunner Lustre ZFS" you might find something juicy. Ta, Andrew Am 6 Feb 2013 um 21:36 schrieb Prentice Bisbal : > Beowulfers, > > I've been reading a lot about
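The caching being referred to is the L2ARC (SSD read cache) plus the SLOG (separate intent log); a sketch of adding both to an existing pool, with made-up device names:

    zpool add tank cache /dev/sde               # L2ARC read cache on SSD
    zpool add tank log mirror /dev/sdf /dev/sdg # mirrored SLOG for sync writes
    zpool status tank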

Re: [Beowulf] single machine with 500 GB of RAM

2013-01-09 Thread Andrew Holway
> I would think faster memory would be the only thing that could be done > about it, Indeed, but I am imagining that the law of diminishing returns is going to kick in here hard and fast.

Re: [Beowulf] single machine with 500 GB of RAM

2013-01-09 Thread Andrew Holway
As it's a single thread, I doubt that faster memory is going to help you much. It's going to suck whatever you do. Am 9 Jan 2013 um 17:29 schrieb Jörg Saßmannshausen : > Dear all, > > many thanks for the quick reply and all the suggestions. > > The code we want to use is that one here: > > htt

Re: [Beowulf] single machine with 500 GB of RAM

2013-01-09 Thread Andrew Holway
> So if I am using a single socket motherboard, would that not be faster or does > a single CPU not cope with that amount of memory? I am not aware of a single-socket motherboard that can cope with 500GB of RAM. Two-socket motherboards support about 256GB (128GB per processor) or so at the moment and q

[Beowulf] NFS vs LVM

2012-12-21 Thread Andrew Holway
Dear Listeroons, I have been doing some testing with KVM and Virtuozzo (container-based virtualisation) and various storage devices and have some results I would like some help analyzing. I have a nice big ZFS box from Oracle (Yes, evil, but Solaris NFS is amazing). I have 10G and IB connecting t

Re: [Beowulf] What about ceph then?

2012-11-28 Thread Andrew Holway
http://ceph.com/docs/master/rados/ Ceph looks to be a fully distributed object store. 2012/11/28 Jonathan Aquilina > > > On Wed, Nov 28, 2012 at 9:21 AM, Andrew Holway wrote: > >> http://ceph.com/ > > > > What sets ceph apart from something like CIFS or

[Beowulf] What about ceph then?

2012-11-28 Thread Andrew Holway
http://ceph.com/

Re: [Beowulf] hadoop

2012-11-27 Thread Andrew Holway
ads of 64MB at a time. If you do not have petabytes of >> big data i would assume doing reads of 64MB at a time >> at your laptop isn't gonna make things better :) >> >> > >> > On Tue, Nov 27, 2012 at 9:21 AM, Andrew Holway >> > wrote: >> >

Re: [Beowulf] hadoop

2012-11-27 Thread Andrew Holway
2012/11/27 Jonathan Aquilina > Interesting indeed. Does LVM span across multiple storage servers? There is Clustered LVM but I don't think this is what you're looking for. CLVM allows you to have a shared storage target such as an iSCSI box and give one LV to one box and another LV to another box
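A sketch of what that looks like in practice, assuming clvmd is running on all nodes and the shared iSCSI LUN shows up as /dev/sdb (all names and sizes are placeholders):

    pvcreate /dev/sdb
    vgcreate --clustered y sharedvg /dev/sdb   # clustered VG visible to every node
    lvcreate -L 500G -n lv_node1 sharedvg      # this LV goes to box 1
    lvcreate -L 500G -n lv_node2 sharedvg      # this LV goes to box 2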

Re: [Beowulf] hadoop

2012-11-27 Thread Andrew Holway
machines and mac machines and if > someone wants windows machines? > > On Tue, Nov 27, 2012 at 9:21 AM, Andrew Holway wrote: > >> not as efficient as gluster I would venture. >> >> >> 2012/11/27 Jonathan Aquilina >> >>> Hey guys I was looking at th

Re: [Beowulf] hadoop

2012-11-27 Thread Andrew Holway
Not as efficient as Gluster, I would venture. 2012/11/27 Jonathan Aquilina > Hey guys I was looking at the hadoop page and it got me wondering. is it > possible to cluster together storage servers? If so how efficient would a > cluster of them be? > > -- > Jonathan Aquilina > > _

Re: [Beowulf] More AMD rumors

2012-11-20 Thread Andrew Holway
Intel never wanted a monopoly. 2012/11/20 mathog > Should Intel become the sole supplier of x86 chips we can expect > technological stagnation at ever increasing prices in both the x86 > desktop and laptop markets. At that point ARM will likely become the > chip du jour, since there is still com

Re: [Beowulf] A petabyte of objects

2012-11-14 Thread Andrew Holway
> Again, knowing what you need will help us a lot here. > Its primary function will be as an Origin for a content delivery network. As far as I understand it will act in a similar way to a DNS root. Changes will be pushed back to the origin and it will hold the master copy but it won't actually do

[Beowulf] A petabyte of objects

2012-11-13 Thread Andrew Holway
Hello, I've been asked to look at how we would provide a PB+50%/year of storage for objects between 0.5 and 10MB per file. It will need some kind of restful interface (only just understanding what this means, but it seems to me mostly "is there an HTTP server in front of it"). Gluster seems to do th
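Roughly what a "restful interface" boils down to in practice: objects go in and out over plain HTTP rather than through a filesystem mount (the endpoint and bucket below are invented purely for illustration):

    curl -X PUT -T report.pdf http://objects.example.com/bucket1/report.pdf
    curl http://objects.example.com/bucket1/report.pdf -o report-copy.pdf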

Re: [Beowulf] PXE boot with X7DWT-INF IB onboard card

2012-11-12 Thread Andrew Holway
Or put a USB thumb drive in each node. It would be somewhat simpler to set up :) 2012/11/12 Andrew Holway > > > > 2012/11/12 Vincent Diepeveen > >> Problem is not the infiniband NIC's. > > > Yes it is. You have to flash the device firmware in order

Re: [Beowulf] PXE boot with X7DWT-INF IB onboard card

2012-11-12 Thread Andrew Holway
2012/11/12 Vincent Diepeveen > Problem is not the infiniband NIC's. Yes it is. You have to flash the device firmware in order for the BIOS to recognize it as a bootable device. Yes, you're a bit screwed for booting over IB, but booting over 1GE is perfectly acceptable. 1GE bit rate is not 1000 mil
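For reference, a minimal sketch of PXE booting nodes over the 1GE network using dnsmasq as the DHCP/TFTP server (the interface, address range and paths are placeholders, and dnsmasq itself is just one option, not something from this thread):

    # /etc/dnsmasq.conf
    interface=eth0
    dhcp-range=10.141.0.10,10.141.0.250,12h
    dhcp-boot=pxelinux.0
    enable-tftp
    tftp-root=/var/lib/tftpboot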

Re: [Beowulf] Maker2 genomic software license experience?

2012-11-08 Thread Andrew Holway
> It's all a bit academic now (ahem) as the MPI component is a Perl > program, and Perl isn't supported on BlueGene/Q. :-( huh? perl mpi? Interpreted language? High performance message passing interface? confused.

Re: [Beowulf] Report critique

2012-10-25 Thread Andrew Holway
2012/10/25 Sabuj Pattanayek : > link? I should have explained, I'm not allowed to publish it. I was planning on emailing the link to those interested. > > Thanks, > Sabuj > > On Thu, Oct 25, 2012 at 9:32 AM, Andrew Holway > wrote: >> Hello. >> >> W

[Beowulf] Report critique

2012-10-25 Thread Andrew Holway
Hello. Would anyone like to volunteer to critique an engineer's report on storage benchmarks that I've just completed? Thanks, Andrew

Re: [Beowulf] Degree

2012-10-25 Thread Andrew Holway
> Well, there was the propeller-top beanie and hazing when I first arrived > at graduate school, the secret physics handshake, the decoder ring, and > the wierd robes they made us wear in quantum mechanics "to keep us safe > from virtual photons", but by in large, no. I mean, except for The > Ritu

[Beowulf] Degree

2012-10-24 Thread Andrew Holway
Hello, I have no education. I left school at 16 and, in my mid 20's, somehow got into supercomputing and now am doing all kinds of silly stuff. Thing is, I need some kind of degree in this stuff to do the kind of work I really want to do. Especially in Germany, organisations involved in HPC usual

Re: [Beowulf] Given all the recent infrastructure talk...

2012-10-04 Thread Andrew Holway
> bitter? sure. to me Canadian HPC is on the verge of extinction, > partly because of this issue. Is Canadian HPC a distinct entity from US HPC? For instance, although chock full of HPC and computational science, Ireland does not have enough HPC to support its own industry. They seem to buy from

Re: [Beowulf] let's standardize liquid cooling

2012-09-30 Thread Andrew Holway
Using rear-door chillers which were evaporatively cooled, we were removing 25kW per rack. This was 72 modern Supermicro nodes per rack at full power. It was cheap as chips and very effective, but we did just exhaust the heat into the atmosphere. 2012/9/28 Mark Hahn : > I have a modest proposal:

Re: [Beowulf] another crazy idea

2012-09-28 Thread Andrew Holway
2012/9/28 Mark Hahn : > in the spirit of Friday, here's another, even less realistic idea: > let's slide 1U nodes into a rack on their sides, and forget the > silly, fussy, vendor-specific, not-that-cheap rail mechanism entirely. That sounds almost as good as submerging your servers in oil. __

Re: [Beowulf] Quantum Chemistry scalability for large number of processors (cores)

2012-09-26 Thread Andrew Holway
This is probably wildly inaccurate and out of date but might be a good place to start :) http://en.wikipedia.org/wiki/Quantum_chemistry_computer_programs Let the benchmarks begin!!! 2012/9/26 Mark Hahn : > forwarding by request: > > From: Mikhail Kuzminsky > > Do somebody know any modern refere

Re: [Beowulf] Redundant Array of Independent Memory - fork(Re: Checkpointing using flash)

2012-09-25 Thread Andrew Holway
2012/9/24 Justin YUAN SHI : > I think the Redundant Memory paper was really mis-configured. It uses > a storage solution, trying to solve a volatle memory problem but > insisting on eliminating volatility. It looks very much messed up. http://thebrainhouse.ch/gse/silvio/74.GSE/Silvio's%20Corner%20

Re: [Beowulf] General cluster management tools - Re: Southampton engineers a Raspberry Pi Supercomputer

2012-09-24 Thread Andrew Holway
> Basically the transition nearly killed the project/community. Lots of > folks grew tired of this and moved on. We are one such ... I don't have > time for political battles between two nearly identical projects that > should merge. This "sectarianism" is deadly to open source projects ... > fo

Re: [Beowulf] Checkpointing using flash

2012-09-24 Thread Andrew Holway
> Haha, I doubt it -- probably the opposite in terms of development cost. > Which is why I question the original statement on the grounds that > "cost" isn't well defined. Maybe the costs just performance-wise, but > that's not even clear to me when we consider things at huge scales. 40 years a

Re: [Beowulf] Checkpointing using flash

2012-09-24 Thread Andrew Holway
> Of course the physical modelers won't bat an eyelash, > but the common programmer who still tries to figure out > this multithreading thing will be out to lunch. Whenever you push a problem from hardware to software you exponentially increase the cost of solving that problem.

Re: [Beowulf] Checkpointing using flash

2012-09-23 Thread Andrew Holway
2012/9/21 David N. Lombard : > Our primary approach today is recovery-base resilience, a.k.a., > checkpoint-restart (C/R). I'm not convinced we can continue to rely on that > at exascale. - Snapshotting seems to be an ugly and inelegant way of solving the problem. For me it is especially laughable

Re: [Beowulf] Checkpointing using flash

2012-09-22 Thread Andrew Holway
> To be exact, the OSI layers 1-4 can defend packet data losses and > corruptions against transient hardware and network failures. Layers > 5-7 provides no protection. MPI sits on top of layer 7. And it assumes > that every transmission must be successful (this is why we have to use > checkpoint in

Re: [Beowulf] In appropriate post (was "In the news again HPC in Iceland")

2012-09-21 Thread Andrew Holway
BMW make their 'flagship' new Mini in the UK near Oxford and also own Rolls-Royce (cars). 2012/9/21 Vincent Diepeveen : > A NATO bunker doesn't even have enough power to run 0.01% of the > crunching power of BMW, > which is of course a lot larger, as far as generic crunching > hardware, than what

Re: [Beowulf] In the news again HPC in Iceland

2012-09-21 Thread Andrew Holway
> which of course in combination with getting rid of nuclear reactors That won't last long.

Re: [Beowulf] Cannot use more than two nodes on cluster

2012-09-20 Thread Andrew Holway
Hi, Are you sure that you replicated your hostfile to all of your nodes? Please can I see the contents of your hostfile? Thanks, Andrew 2012/9/20 Antti Korhonen : > Hi Vincent > > Master works with all slaves. > M0+S1 works, M0+S2 works, M0+S3 works. > All nodes work fine as single nodes. > >
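A sketch of the kind of check being suggested, assuming Open MPI and placeholder node names:

    # hostfile (a copy is typically needed on every node):
    #   node01 slots=4
    #   node02 slots=4
    #   node03 slots=4
    for h in node01 node02 node03; do scp hostfile $h: ; done
    # should print one line per rank, spread across all three nodes
    mpirun --hostfile hostfile -np 12 hostname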

Re: [Beowulf] General cluster management tools - Re: Southampton engineers a Raspberry Pi Supercomputer

2012-09-16 Thread Andrew Holway
> I just saw anecdotes and opinions. Ditto

Re: [Beowulf] General cluster management tools - Re: Southampton engineers a Raspberry Pi Supercomputer

2012-09-16 Thread Andrew Holway
> Or, he could *PAY* someone to develop it for him. Last I saw, you can > download the source code to OI, and go to town. There is a full build > environment, and the stack is self hosting. Are you making the case that I or my company could get into the operating system development game? > So .

[Beowulf] ZFS / NFS and other stories - Re: General cluster management tools - Re: Southampton engineers a Raspberry Pi Supercomputer

2012-09-16 Thread Andrew Holway
> Choose more carefully next time? Just like you have to do a little due > diligence before deciding if a commercial vendor is a good bet, you > should also evaluate open source projects to see if they're a good > bet. I think at the time there was no other choice for ZFS/NFS. The Oracle e

Re: [Beowulf] General cluster management tools - Re: Southampton engineers a Raspberry Pi Supercomputer

2012-09-16 Thread Andrew Holway
>> This is also demonstrably false. Just because cluster vendor A is >> using a completely open source stack does not mean that you have any >> less risk then Cluster Vendor B with their proprietary closed source >> stack. > > Risk is a function of your control over the stack against small or large

Re: [Beowulf] General cluster management tools - Re: Southampton engineers a Raspberry Pi Supercomputer

2012-09-16 Thread Andrew Holway
> With regards to risk perception, I am still blown away at some of the > conversations I have with prospective customers who, still to this day, > insist that "larger company == less risk". This is demonstrably false. > > A company with open products (real open products), open software stacks, >

Re: [Beowulf] Servers Too Hot? Intel Recommends a Luxurious Oil Bath

2012-09-11 Thread Andrew Holway
Sorry. It was a general reply. I was being lazy. 2012/9/10 Chris Samuel : > On Monday 10 September 2012 20:08:53 hol...@th.physik.uni-frankfurt.de > wrote: > >> So please cease debasing these discussions with 'Iceland will never >> have datacenters because you cant use it for HFT". Thou speaks fro

Re: [Beowulf] Servers Too Hot? Intel Recommends a Luxurious Oil Bath

2012-09-11 Thread Andrew Holway
> That's very interesting! Where do you find out information on the banks' > setups? > The few times I have interviewed in the City they wouldn't let me see into > the server rooms. I just know a bit about RBS setup as I was interested in working for them a little while back. I've recently learn

Re: [Beowulf] Servers Too Hot? Intel Recommends a Luxurious Oil Bath

2012-09-11 Thread Andrew Holway
You make about as much sense as a Japanese VCR instruction manual! 2012/9/10 Vincent Diepeveen : > You are rather naive to believe that public released numbers, other > than total CPU's sold, will give you ANY data on > what holds the biggest secrets of society. Secrecy is a higher > priority here

Re: [Beowulf] NFSoIB and SRP benchmarking

2012-08-26 Thread Andrew Holway
Sorry, NFSoIB is more commonly known as NFSoRDMA. This link explains it quite well. http://www.opengridcomputing.com/nfs-rdma.html. Currently my export looks like this. /dev/shm 10.149.0.0/16(rw,fsid=1,no_root_squash,insecure) Where 10.149.0.0 is the IPoIB interf
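For completeness, the client side of that export looks roughly like this (20049 is the conventional NFS/RDMA port; the server address and mount point are placeholders):

    modprobe xprtrdma
    mount -t nfs -o rdma,port=20049 10.149.0.1:/dev/shm /mnt/nfsrdma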

Re: [Beowulf] RFC: Restrict posting to subscribers only ?

2012-08-24 Thread Andrew Holway
Maybe we should have a "Please send us a brief email explaining why you would like to join the list so that we know that you're not a spammer" 2012/8/24 Christopher Samuel : > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA1 > > Hi all, > > I'd like to solicit opinions from list members on changing

Re: [Beowulf] Doing i/o at a small cluster

2012-08-18 Thread Andrew Holway
> So some sort of distributed file system seems the best option, and a > lot cheaper and a lot faster than a dedicated fileserver > that will not be able to keep up. a) ZFS doesn't use hardware raid. at all. ever. b) 500MB/s (actually 1GB/s) of I/O will chew up quite a large amount of resource. J

Re: [Beowulf] Doing i/o at a small cluster

2012-08-18 Thread Andrew Holway
2012/8/17 Vincent Diepeveen : > The homepage looks very commercial and they have a free trial on it. > You refer to the free trial? http://nexentastor.org/ - Sorry, wrong link. It's a commercially backed open source project. > Means buy raid controller. That's extra cost. That depends upon what > i

Re: [Beowulf] Doing i/o at a small cluster

2012-08-17 Thread Andrew Holway
How about something like putting all your disks in one basket and getting a ZFS / NFSoRDMA solution such as Nexenta? They have a nice open source distribution. 2012/8/17 Vincent Diepeveen : > The idea someone brought me on by means of a private email is > to use a distributed file system and spli

Re: [Beowulf] lm_sensors and clusters and wrong intel cpu readings

2012-08-16 Thread Andrew Holway
On AMD sensors at least, the reading is a 'relative value' with 70C indicating an overheat. The processor is fine unless it is actually clocking itself down to a lower ACPI power state. It is seemingly impossible to overheat modern CPUs. When it's clocking down you should see messages in /var/log/

Re: [Beowulf] What is the right lubricant for computer rack sliding rails?

2009-02-08 Thread andrew holway
I wonder what BOFH would do? On Thu, Feb 5, 2009 at 4:49 PM, Gus Correa wrote: > Dear Beowulfers > > A mundane question: > > What is the right lubricant for computer rack sliding rails? > Silicone, paraffin, graphite, WD-40, machine oil, grease, other? > > Thank you, > Gus Correa > --

[Beowulf] Scandinavian companies working in the HPC sector

2008-10-22 Thread andrew holway
Hello, I want to compile a list of Nordic companies engaged with supercomputing technologies. Manufacturers, ISVs, integrators, designers, etc. Google isn't being very forthcoming. Thanks Andrew

Re: [Beowulf] slightly [OT] smp boxes

2008-10-20 Thread andrew holway
> Do you really mean SMP, or just 8, 16, or 32 nodes in the same box, > running a single system image (SSI) of the OS? Yes, I need a shared memory machine.

[Beowulf] slightly [OT] smp boxes

2008-10-17 Thread andrew holway
Hi, If you wanted to buy an SMP machine of 8, 16 or 32 sockets, what would be your options? Thanks

Re: [Beowulf] Has DDR IB gone the way of the Dodo?

2008-10-01 Thread andrew holway
> Has DDR IB gone the way of the dodo bird and been supplanted by QDR? I don't think front-side buses are fast enough for QDR yet.

Re: [Beowulf] Pretty High Performance Computing

2008-09-24 Thread andrew holway
> I call this Pretty High Performance Computing (PHPC). Or high-productivity computing?

Roll your own cluster management system with ClusterVisionOS v4 - was: [Beowulf] What services do you run on your cluster nodes?

2008-09-23 Thread andrew holway
cts please look here:- http://www.clustervision.com/CV_NEWS_summer2008.pdf Cheers Andrew Holway UK Office, ClusterVision tel: +44 7525 057462 [EMAIL PROTECTED] http://www.clustervision.com

Re: [Beowulf] What services do you run on your cluster nodes?

2008-09-23 Thread andrew holway
We've written our own lightweight monitoring daemon that reports back to a portal facility on the master node. On Mon, Sep 22, 2008 at 7:32 PM, Prentice Bisbal <[EMAIL PROTECTED]> wrote: > The more services you run on your cluster node (gmond, sendmail, etc.) > the less performance is availab
