[Beowulf] AMD is looking for expert HPC/AI sysadmins/SRE

2025-06-11 Thread Joe Landman
Hi folks: Quick post for the day job. AMD (my employer) is looking for expert systems administrators, both for a mix of our internal HPC systems and to help customers stand up their AI and HPC clusters. AMD systems include a small version of Frontier, some El Cap adjacent nodes, and a vari

[Beowulf] position adverts?

2024-02-22 Thread Joe Landman
Hi fellow beowulfers, I don't know if it's bad form to post job adverts here. Day job (@AMD) is looking for lots of HPC (and AI) folks, think debugging/support/etc. Happy to talk with anyone about this. Regards Joe

Re: [Beowulf] Your thoughts on the latest RHEL drama?

2023-06-26 Thread Joe Landman
but it's certainly not going to be cheap or easy. What are you thinking/doing about this? -- Prentice

Re: [Beowulf] [External] Re: old sm/sgi bios

2023-03-23 Thread Joe Landman
y was he an over-the-hill curmudgeon afraid of new technology, there was also a pretty clear conflict of interest for him to be pushing SGI, even though I'm sure our small purchase did nothing to improve SGI stock value. On 3/23/23 2:58 PM, Joe Landman wrote: They had laid off all the good

Re: [Beowulf] [External] Re: old sm/sgi bios

2023-03-23 Thread Joe Landman

Re: [Beowulf] [External] Re: old sm/sgi bios

2023-03-23 Thread Joe Landman

Re: [Beowulf] old sm/sgi bios

2023-03-23 Thread Joe Landman

Re: [Beowulf] milan and rhel7

2022-06-29 Thread Joe Landman
ap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold v_vmsave_vmload vgif umip pku ospke vaes vpclmulqdq overflow_recov succ
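The snippet above is a fragment of the CPU feature flags reported in /proc/cpuinfo. A minimal Python sketch, not from the thread itself, for checking whether a given flag (say sha_ni or clwb) is present on a node:

    # Minimal sketch, not from the original thread: parse the "flags" line
    # of /proc/cpuinfo (Linux) and test for specific CPU features.
    def cpu_flags(path="/proc/cpuinfo"):
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
        return set()

    if __name__ == "__main__":
        flags = cpu_flags()
        for wanted in ("sha_ni", "clwb", "avx2"):
            print(wanted, "present" if wanted in flags else "absent")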

Re: [Beowulf] Anaconda distribution sowing FUD to generate sales?

2022-04-13 Thread Joe Landman

Re: [Beowulf] SC21 Beowulf Bash

2021-11-09 Thread Joe Landman
Pid1 game ... heh ! On 11/9/21 2:13 PM, Douglas Eadline wrote: Here is the Hybrid Beowulf Bash info https://beowulfbash.com/

Re: [Beowulf] AMD and AVX512

2021-06-21 Thread Joe Landman
'll get a great HPC compiler for C/Fortran.

[Beowulf] AMD and AVX512

2021-06-20 Thread Joe Landman
especially ones waiting on (slow) RAM and (slower) IO. Make the RAM and IO faster (lower latency, higher bandwidth), and the system will be far more performant.

Re: [Beowulf] head node abuse

2021-03-26 Thread Joe Landman

Re: [Beowulf] [External] RIP CentOS 8 [EXT]

2020-12-08 Thread Joe Landman

Re: [Beowulf] perl with OpenMPI gotcha?

2020-11-20 Thread Joe Landman
most recently on large supers over the past few months. Thanks, David Mathog

Re: [Beowulf] Julia on POWER9?

2020-10-16 Thread Joe Landman

Re: [Beowulf] Julia on POWER9?

2020-10-15 Thread Joe Landman
t support the POWER architecture anymore because they no longer have access to POWER hardware. Most of this information comes from the Julia GitHub or Julia Discourse conversations.

Re: [Beowulf] First cluster in 20 years - questions about today

2020-02-03 Thread Joe Landman

Re: [Beowulf] 10G and rsync

2020-01-02 Thread Joe Landman

Re: [Beowulf] 10G and rsync

2020-01-02 Thread Joe Landman

Re: [Beowulf] 40kw racks?

2019-10-21 Thread Joe Landman
FWIW, have a look at scalematrix rack enclosures. I saw them last week. Can get to 50kW as far as I understand. Disclosure: I met with them last week as part of the day job. No financial relationship with them. Just interesting tech. On October 21, 2019 11:30:16 AM Michael Di Domenico w

[Beowulf] Brief OT: Open positions

2019-07-26 Thread Joe Landman
) in my group as well. More standard "cloudy" things there (yes, $dayjob does cloud!). Please ping me on my email in .sig or at $dayjob. Email there is my first initial + last name at cray dot com. Thanks, and back to your regularly scheduled cluster/super ... :D

Re: [Beowulf] Lustre on google cloud

2019-07-25 Thread Joe Landman
the hard problem in the mix. Not technically hard, but hard from a cost/time perspective.

Re: [Beowulf] help for metadata-intensive jobs (imagenet)

2019-06-28 Thread Joe Landman
r nodes. Then put a beegfs file system atop those. Stage in the images. Run. This is cheap compared to building the storage you actually need for this workload.

[Beowulf] OT: open positions in HPC, Cloud, networking, services and support etc

2019-05-01 Thread Joe Landman
at Cray as Director of Cloud Services and DevOps! Thanks! Joe

[Beowulf] LFortran ... a REPL/Compiler for Fortran

2019-03-24 Thread Joe Landman
See https://docs.lfortran.org/. Figured Jeff Layton would like this :D

Re: [Beowulf] Considering BeeGFS for parallel file system

2019-03-18 Thread Joe Landman
ou in a particular direction versus working with you to design what you need (the smaller shops do this). If you want to do this yourself atop your existing kit, go for it. It's not hard to set up/configure.

Re: [Beowulf] Large amounts of data to store and process

2019-03-05 Thread Joe Landman
data frame packages. R, Julia, and I think Python can all handle this without too much pain. [1] https://gssc.esa.int/navipedia/index.php/Relativistic_Clock_Correction [2] http://www.astronomy.ohio-state.edu/~pogge/Ast162/Unit5/gps.html
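The suggestion above is to lean on a data frame package rather than hand-rolled storage. A minimal pandas sketch of that approach (the file name and column are hypothetical, and pandas is assumed to be installed):

    import pandas as pd

    # Stream a large CSV in chunks so the whole dataset never sits in RAM.
    # "observations.csv" and the "offset_ns" column are made-up examples.
    total, rows = 0.0, 0
    for chunk in pd.read_csv("observations.csv", chunksize=1_000_000):
        total += chunk["offset_ns"].sum()
        rows += len(chunk)
    print("mean offset (ns):", total / rows)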

Re: [Beowulf] Large amounts of data to store and process

2019-03-04 Thread Joe Landman
oks like a nail" view as much as possible.

Re: [Beowulf] Anybody here still use SystemImager?

2019-02-28 Thread Joe Landman
On 2/27/19 9:08 PM, David Mathog wrote: Joe Landman wrote: [...] I'm about 98% of the way there now, with a mashup of parts from boel and Centos 7. The initrd is pretty large though. Wasted most of a day on a mysterious issue with "sh" (busybox) not responding to the

Re: [Beowulf] Anybody here still use SystemImager?

2019-02-26 Thread Joe Landman
oot-init.d file, of the form 'grep -q option= /proc/cmdline'. I use this for doing all my booting of immutable images. Just need the kernel, and the initramfs. I can build one for you if you want, and you can play with it.
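The technique described above is simply a test against the kernel command line from an early boot script; the original uses grep in a shell rc file. A rough Python equivalent, for illustration only:

    # Rough equivalent of `grep -q option= /proc/cmdline` in a boot script.
    def cmdline_option(name, path="/proc/cmdline"):
        """Return the value of name=value from the kernel command line, or None."""
        with open(path) as f:
            for token in f.read().split():
                if token.startswith(name + "="):
                    return token.split("=", 1)[1]
        return None

    if __name__ == "__main__":
        print("root =", cmdline_option("root"))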

Re: [Beowulf] USB flash drive bootable distro to check cluster health.

2019-01-11 Thread Joe Landman
and, or most drivers.

Re: [Beowulf] Fortran is Awesome

2018-11-29 Thread Joe Landman

Re: [Beowulf] Fortran is Awesome

2018-11-28 Thread Joe Landman

Re: [Beowulf] More about those underwater data centers

2018-11-08 Thread Joe Landman
make sure one does not ignore the vapor, or potential heat induced reaction products of the vapor. Fluorinert has some issues: https://en.wikipedia.org/wiki/Fluorinert#Toxicity if you overcook it ...

[Beowulf] SC18

2018-11-06 Thread Joe Landman
It feels weird attending SC18, and not being an exhibitor. Definitely looking forward to it. Beobash will (of course) be fun ... and I'm looking forward to (finally) being able to attend talks, poster sessions, panels.

Re: [Beowulf] Oh.. IBM eats Red Hat

2018-10-30 Thread Joe Landman
ts. I'll paraphrase Churchill here: Systemd is the worst, except for all the rest.

Re: [Beowulf] Oh.. IBM eats Red Hat

2018-10-30 Thread Joe Landman

Re: [Beowulf] Oh.. IBM eats Red Hat

2018-10-29 Thread Joe Landman
nother point of mine. And Greg K @Sylabs is getting free exposure here :D

Re: [Beowulf] Oh.. IBM eats Red Hat

2018-10-29 Thread Joe Landman
e that ubuntu non-LTS are potentially broken (bleeding edge).

Re: [Beowulf] Oh.. IBM eats Red Hat

2018-10-29 Thread Joe Landman
t in Debian, and significant effort/pain in RH/CentOS, usually employing modules or similar construct.

Re: [Beowulf] Oh.. IBM eats Red Hat

2018-10-29 Thread Joe Landman
Glad to see that! [...] Robert G. Brown, Duke University Dept. of Physics

Re: [Beowulf] Oh.. IBM eats Red Hat

2018-10-29 Thread Joe Landman

Re: [Beowulf] If I were specifying a new custer...

2018-10-12 Thread Joe Landman
e worked with some ARM product builders in the past, and have been burned by the misalignment between reality and rhetoric.

Re: [Beowulf] its going to be big

2018-10-12 Thread Joe Landman
/20181011005476/en/Australia’s-DownUnder-GeoSolutions-Selects-Skybox-Datacenters-Houston

Re: [Beowulf] SIMD exception kernel panic on Skylake-EP triggered by OpenFOAM?

2018-09-09 Thread Joe Landman
0-0xbfff) --8<-- snip snip --8<-- All the best! Chris

Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-08-18 Thread Joe Landman

Re: [Beowulf] Jupyter and EP HPC

2018-07-27 Thread Joe Landman
emble output. SGI turned that into a product.

Re: [Beowulf] Lustre Upgrades

2018-07-25 Thread Joe Landman
On 07/25/2018 04:36 PM, Prentice Bisbal wrote: Paging Dr. Joe Landman, paging Dr. Landman... My response was "I'd seen/helped build/benchmarked some very nice/fast CephFS based storage systems in $dayjob-1.  While it is a neat system, if you are focused on availability, scalab

Re: [Beowulf] Lustre Upgrades

2018-07-24 Thread Joe Landman
s of about 16TiB last I checked. If you need more, replace minio with another system (igneous, ceph, etc.). Ping me offline if you want to talk more. [...]

Re: [Beowulf] Lustre Upgrades

2018-07-24 Thread Joe Landman
ystem, if you are focused on availability, scalability, and performance, it's pretty hard to beat BeeGFS. We'd ($dayjob-1) deployed several very large/fast file systems with it on our spinning rust, SSD, and NVMe units.

Re: [Beowulf] Working for DUG, new thead

2018-06-19 Thread Joe Landman
On 6/19/18 2:47 PM, Prentice Bisbal wrote: On 06/13/2018 10:32 PM, Joe Landman wrote: I'm curious about your next gen plans, given Phi's roadmap. On 6/13/18 9:17 PM, Stu Midgley wrote: low level HPC means... lots of things.  BUT we are a huge Xeon Phi shop and need low-level p

Re: [Beowulf] Working for DUG, new thead

2018-06-13 Thread Joe Landman

Re: [Beowulf] Working for DUG, new thead

2018-06-13 Thread Joe Landman

Re: [Beowulf] Fwd: Project Natick

2018-06-07 Thread Joe Landman
so ... suddenly discovering that the neat little hole in the pipe enables this highly conductive ionic fluid to short ... somewhere between 1V and 12V DC. 10's to 100's of thousands of Amps. I wouldn't wanna be anywhere near that when it lets go.

Re: [Beowulf] FPGA storage accelerator

2018-06-06 Thread Joe Landman

Re: [Beowulf] OT, X11 editor which works well for very remote systems?

2018-06-06 Thread Joe Landman
typed on a phone, so auto-co-wrecked ... VNC On 06/06/2018 05:56 PM, David Mathog wrote: Thanks for all the responses. On 06-Jun-2018 14:40, Joe Landman wrote: When I absolutely need a gui for something this this, I'll light up BBC over ssh session.  Performance has been good even cro

Re: [Beowulf] OT, X11 editor which works well for very remote systems?

2018-06-06 Thread Joe Landman
Wait ... nedit? I wrote my thesis with that (LaTeX) some (mumble) decades ago ... On June 6, 2018 5:28:30 PM David Mathog wrote: Off Topic. I need to do some work on a system 3000 miles away. No problem connecting to it with ssh or setting X11 forwarding, but the delays are such that my us

Re: [Beowulf] OT, X11 editor which works well for very remote systems?

2018-06-06 Thread Joe Landman
When I absolutely need a gui for something this this, I'll light up BBC over ssh session. Performance has been good even crossing the big pond. This said, vim handles this nicely as well. On June 6, 2018 5:28:30 PM David Mathog wrote: Off Topic. I need to do some work on a system 3000 mile

Re: [Beowulf] Bright Cluster Manager

2018-05-03 Thread Joe Landman

Re: [Beowulf] Theoretical vs. Actual Performance

2018-02-22 Thread Joe Landman

Re: [Beowulf] Theoretical vs. Actual Performance

2018-02-22 Thread Joe Landman
em with my OpenBLAS build.

Re: [Beowulf] Theoretical vs. Actual Performance

2018-02-22 Thread Joe Landman
ough tuning per use case mattered significantly. I don't want this to be a discussion of what could be wrong at this point, we will get to that in future posts, I assure you!

Re: [Beowulf] Intel CPU design bug & security flaw - kernel fix imposes performance penalty

2018-01-03 Thread Joe Landman
part (the expectation of Intel fixing it in their newer HW) is all the more reason I'm inclined to believe the fix will be delivered as a tunable. Best, ellis

Re: [Beowulf] Openlava down?

2017-12-23 Thread Joe Landman
On 12/23/2017 05:49 PM, Jeffrey Layton wrote: I tried it but it doesn't come up as the job scheduler - just capabilities of a company. Hmm.. FYI: https://soylentnews.org/article.pl?sid=16/11/06/0254233

Re: [Beowulf] Julia Language

2017-09-19 Thread Joe Landman

Re: [Beowulf] Weird blade performs worse as more cpus are used?

2017-09-14 Thread Joe Landman
u have a fixed sized resource bandwidth contention issue you are fighting. The question is what.

Re: [Beowulf] Varying performance across identical cluster nodes.

2017-09-14 Thread Joe Landman

Re: [Beowulf] Varying performance across identical cluster nodes.

2017-09-13 Thread Joe Landman
y idea why this is only occurring with RHEL 6 w/ NFS root OS?

Re: [Beowulf] Varying performance across identical cluster nodes.

2017-09-08 Thread Joe Landman
.0°C) Core 2: +33.0°C (high = +82.0°C, crit = +92.0°C) Core 3: +34.0°C (high = +82.0°C, crit = +92.0°C) ...
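The readings above are lm-sensors style per-core temperatures. A small sketch, assuming the standard Linux coretemp/hwmon sysfs layout, that pulls the same numbers directly:

    import glob

    # Read per-core temperatures from /sys/class/hwmon (values are millidegrees C).
    for label_path in sorted(glob.glob("/sys/class/hwmon/hwmon*/temp*_label")):
        with open(label_path) as lf, open(label_path.replace("_label", "_input")) as tf:
            print(f"{lf.read().strip()}: {int(tf.read().strip()) / 1000:.1f} C")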

Re: [Beowulf] RAID5 rebuild, remount with write without reboot?

2017-09-05 Thread Joe Landman
on this system which have no match on the other. Regards, David Mathog

Re: [Beowulf] cluster deployment and config management

2017-09-05 Thread Joe Landman

Re: [Beowulf] Poor bandwith from one compute node

2017-08-17 Thread Joe Landman
that does this for you ... https://github.com/joelandman/pcilist :D On Thu, Aug 17, 2017 at 12:35 PM, Joe Landman wrote: On 08/17/2017 12:00 PM, Faraz Hussain wrote: I noticed an mpi job was taking 5X longer to run whenever it
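The linked pcilist repository is Joe's own tool; the sketch below is not that code, just an illustration of reading the negotiated PCIe link speed and width from sysfs, the kind of check that catches a card training at the wrong width:

    import glob, os

    # Print negotiated PCIe link speed/width per device (standard Linux sysfs
    # attributes; not every device exposes them).
    for dev in sorted(glob.glob("/sys/bus/pci/devices/*")):
        speed = os.path.join(dev, "current_link_speed")
        width = os.path.join(dev, "current_link_width")
        if os.path.exists(speed) and os.path.exists(width):
            with open(speed) as s, open(width) as w:
                print(os.path.basename(dev), s.read().strip(), "x" + w.read().strip())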

Re: [Beowulf] Poor bandwith from one compute node

2017-08-17 Thread Joe Landman

Re: [Beowulf] How to know if infiniband network works?

2017-08-03 Thread Joe Landman
enable ipoib and then rerun test? It would then show ~40GB/sec I assume. No. 9GB/s is about 80 Gb/s. Infiniband is working. Looks like you might have dual-rail IB setup, or you were doing a bidirectional/full duplex test.
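As a quick back-of-the-envelope check of the figure quoted above:

    # 9 GB/s of measured payload is 9 * 8 = 72 Gb/s on the wire (before
    # protocol overhead), i.e. roughly the ~80 Gb/s cited and well beyond a
    # single 40 Gb/s link -- consistent with dual-rail or a bidirectional test.
    gbytes_per_s = 9
    print(gbytes_per_s * 8, "Gb/s")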

Re: [Beowulf] How to know if infiniband network works?

2017-08-02 Thread Joe Landman
level activity that the subnet manager (OpenSM or a switch level version) enables. For OpenMPI, my recollection is that they expect the IB ports to have ethernet addresses as well (and will switch to RDMA after initialization). What does ifconfig -a report?

Re: [Beowulf] How to know if infiniband network works?

2017-08-02 Thread Joe Landman

[Beowulf] Anyone know whom to speak with at Broadcom about NIC drivers?

2017-07-05 Thread Joe Landman
Hi folks, I am trying to find contacts at Broadcom to speak to about NIC drivers. All my networking contacts seem to have moved on. Does anyone have a recommendation as to someone to speak with? Thanks! Joe

Re: [Beowulf] BeeGFS usage question

2017-06-28 Thread Joe Landman

Re: [Beowulf] Register article on Epyc

2017-06-21 Thread Joe Landman
Yeah, they should make very sweet storage units (single socket sku). Dual socket is also nice, as you'll have 64x lanes of fabric between sockets, as well as 64 from each socket to peripherals. I'd love to see the QPI contention issue just go away. This looks like it pushes back the problem

Re: [Beowulf] GPFS and failed metadata NSD

2017-05-21 Thread Joe Landman
others, but ... wow ... losing Ph.D. project data.

Re: [Beowulf] Not OT, but a quick link to an article on InsideHPC

2017-03-24 Thread Joe Landman
On 03/24/2017 12:02 PM, C Bergström wrote: On Fri, Mar 24, 2017 at 11:48 PM, Joe Landman wrote: On 3/23/17 5:27 PM, C Bergström wrote: [...] No issue, and I am sorry to see this happen. I enjoyed my time using the PathScale compilers. It's sad that an ecosystem chooses not to support

Re: [Beowulf] Not OT, but a quick link to an article on InsideHPC

2017-03-24 Thread Joe Landman
On 3/23/17 5:27 PM, C Bergström wrote: Tiz the season for HPC software to die? https://www.hpcwire.com/2017/03/23/hpc-compiler-company-pathscale-seeks-life-raft/ (sorry I don't mean to hijack your thread, but timing of both announcements is quite overlapping) No issue, and I am sorry to see th

[Beowulf] Not OT, but a quick link to an article on InsideHPC

2017-03-23 Thread Joe Landman
For those who I've not talked with yet ... http://insidehpc.com/2017/03/scalable-informatics-closes-shop/

Re: [Beowulf] Suggestions to what DFS to use

2017-02-14 Thread Joe Landman
https://scalability.org/2016/03/not-even-breaking-a-sweat-10gbs-write-to-single-node-forte-unit-over-100gb-net-realhyperconverged-hpc-storage/ Excellent performance and ease of configuration are what you should expect from BeeGFS.

Re: [Beowulf] solaris?

2017-02-14 Thread Joe Landman
et up some infrastructure with this before, and it was relatively painless to use. Think of it as a predecessor to CoreOS, RancherOS, and others.

Re: [Beowulf] Mobos for portable use

2017-01-20 Thread Joe Landman

Re: [Beowulf] non-stop computing

2016-10-26 Thread Joe Landman
On 10/26/2016 10:20 AM, Prentice Bisbal wrote: How so? By only having a single seat or node-locked license? Either ... for licensed code this is a non-starter. Which is a shame that we still are talking about node locked/single seat in 2016.

Re: [Beowulf] non-stop computing

2016-10-26 Thread Joe Landman
Licensing might impede this ... Usually does. On 10/26/2016 09:50 AM, Prentice Bisbal wrote: There is an amazing beauty in this simplicity. Prentice On 10/25/2016 02:46 PM, Gavin W. Burris wrote: Hi, Michael. What if the same job ran on two separate nodes, with IO to local scratch? What a

Re: [Beowulf] non-stop computing

2016-10-25 Thread Joe Landman
On 10/25/2016 02:24 PM, Michael Di Domenico wrote: here's an interesting thought exercise and a real problem i have to tackle. i have researchers that want to run magma codes for three weeks or so at a time. the process is unfortunately sequential in nature and magma doesn't support check poi

Re: [Beowulf] Underwater data centers -- the future?

2016-09-09 Thread Joe Landman
On 09/09/2016 07:20 AM, Tim Cutts wrote: 2. Surely, we heat up the oceans regardless of whether it's directly by cooling with the sea or indirectly by cooling in air, and atmospheric warming slowly warming the oceans. Ultimately it will all come to equilibrium (with possible disastrous consequen

Re: [Beowulf] Parallel programming for Xeon Phis

2016-08-24 Thread Joe Landman
On 08/24/2016 09:51 AM, Prentice Bisbal wrote: his is an old article, but it's relevant to the recent discussion on programming for Xeon Phis, 'code modernization', and the speedups 'code modernization' can provide. https://www.hpcwire.com/2015/08/24/cosmos-team-achieves-100x-speedup-on-co

Re: [Beowulf] Recent HPE support missive

2016-08-23 Thread Joe Landman
On 08/23/2016 10:01 AM, Peter St. John wrote: HPE is in the process of being bought by CSC. ??? On the scale of 12 months you will be contracting with CSC. I thought they were spinning out their services organization to CSC ... not the whole kit and kaboodle ... http://www.csc.com/invest

Re: [Beowulf] bring back 2012?

2016-08-23 Thread Joe Landman
Erp ... On 08/23/2016 09:58 AM, Prentice Bisbal wrote: How much power does that system use at full-tilt? I'm guessing about 2250 - 2500 kW. Prentice On 08/22/2016 07:40 PM, Stu Midgley wrote: I measured the power draw of our 2RU 8 phi nodes with and without fans... the fans draw about 20% pow

Re: [Beowulf] bring back 2012?

2016-08-17 Thread Joe Landman
On 08/17/2016 11:50 AM, Kilian Cavalotti wrote: On Wed, Aug 17, 2016 at 7:10 AM, Prentice Bisbal wrote: When Intel first started marketing the Xeon Phi, they emphasized that you wouldn't need to rewrite your code to use the Xeon Phi. This was a marketing moving to differentiate the Xeon Phi fro

Re: [Beowulf] HP Enterprise to buy SGI

2016-08-12 Thread Joe Landman
On 08/12/2016 10:46 AM, Douglas Eadline wrote: I remember when the old HP bought Convex. More like 1 + 1 = .2 in that case. And, then in recent years many of the old Convex crew emerged as Convey which was then bought by Micron last year. Maybe I am biased, but I see actual (strong) value in

Re: [Beowulf] HP Enterprise to buy SGI

2016-08-11 Thread Joe Landman
On 08/11/2016 07:22 PM, Christopher Samuel wrote: So SGI is getting bought (yet again), this time by HP Enterprise. http://investors.sgi.com/releasedetail.cfm?ReleaseID=984160 Go off to class for a few hours and stuff happens ... For their in memory analytics machines. Go figure.

[Beowulf] Any pointers to the what the fields in the sysfs for infiniband are?

2016-06-12 Thread Joe Landman
I am working on extracting meaningful data on various components for our monitoring tools, and realized that I don't have a good writeup anywhere (other than the source) of what the fields are. Anyone have or know of such a writeup? For example: root@n01:/sys/devices/pci0000:00/0000:00:03.0/
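For the monitoring use case in this question, a minimal sketch of walking the per-port counters, assuming the stock Linux /sys/class/infiniband/<hca>/ports/<port>/counters layout:

    import glob, os

    # Dump every InfiniBand per-port counter exposed under sysfs.
    for counter in sorted(glob.glob("/sys/class/infiniband/*/ports/*/counters/*")):
        parts = counter.split(os.sep)
        hca, port, name = parts[4], parts[6], parts[8]
        with open(counter) as f:
            print(f"{hca} port {port} {name} = {f.read().strip()}")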
