Re: [Beowulf] And wearing another hat ...

2023-11-13 Thread Joshua Mora
-relies-entirely-on-artificial-intelligence-for-its-future -- Original Message -- Received: 09:07 AM CST, 11/11/2023 From: "Douglas Eadline"  To: "Joshua Mora"  Cc: beowulf@beowulf.org Subject: Re: [Beowulf] And wearing another hat ... > > I was talking to a wr

Re: [Beowulf] And wearing another hat ...

2023-11-04 Thread Joshua Mora
It would be good to track how governments try to "regulate" technologies/materials/processes that have an impact on HPC (AI at scale fits into HPC), for good and for bad. It could be, for instance, something as convoluted as a DC emissions cap aligning to a climate policy. Joshua -- Original Message --

Re: [Beowulf] Best case performance of HPL on EPYC 7742 processor ...

2020-10-25 Thread Joshua Mora
Reach out to AMD, they have specific instructions (including BIOS/OS settings) and even binaries on how to get the best performance. Don't go trial and error as it is very time consuming. BLIS also has multiple parameters as it has nested loops, so you could also have to try multiple configurations to get t

Re: [Beowulf] [EXTERNAL] Re: HPE completes Cray acquisition

2019-09-27 Thread Joshua Mora
IMHO, it is about the competitive scaleout solutions they can put together: HW -> interconnect, fault tolerance and QOS. SW -> communication libraries(accelerated collectives) and application optimization with scalable frameworks. And to protect their current world wide portfolio of customers (HPE,

Re: [Beowulf] Fwd: SSD performance

2018-07-27 Thread Joshua Mora
Buy server grade, not consumer grade. Also use TRIM. Joshua -- Original Message -- Received: 11:30 PM PDT, 07/26/2018 From: Jonathan Engwall  To: Beowulf Mailing List  Subject: [Beowulf] Fwd: SSD performance > While tarring files after cloning the drive became hopeless, I realized all >

Re: [Beowulf] Question for the community

2015-11-23 Thread Joshua Mora
need to do the review on AMZN web site. Joshua Mora -- Original Message -- Received: 08:19 AM CST, 11/23/2015 From: "Douglas Eadline" To: "John Hearns" Cc: "beowulf@beowulf.org" Subject: [Beowulf] Question for the community > > > > >

Re: [Beowulf] Hyper Convergence Infrastructure

2015-10-03 Thread Joshua Mora
Without giving specific names of technologies, hypervisors can reduce performance by around 5-10%. I tested Linpack and Stream and got a small reduction, ~5%. For networking: bandwidth about 5% reduction, latency about 10%. On hyperconverged, since data is replicated, the write performance is h
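The overhead figures quoted here can be turned into a quick estimator. A minimal sketch, where the baseline numbers are hypothetical and only the 5-10% overheads come from the post:

```python
def virtualized(native, overhead_pct):
    # Apply a fractional hypervisor overhead to a native measurement.
    return native * (1 - overhead_pct / 100.0)

# Hypothetical native figures; overheads per the post
# (~5% compute/bandwidth hit, ~10% latency cost).
print(round(virtualized(500.0, 5), 1))   # Linpack GFLOP/s under a ~5% hit
print(round(virtualized(10.0, 5), 2))    # network GB/s under a ~5% hit
```

The same helper applies to latency if you flip the sign of the effect (latency grows rather than shrinks under virtualization).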

Re: [Beowulf] Hyper Convergence Infrastructure

2015-10-03 Thread Joshua Mora
Economic solutions will compromise those network pipes, so for I/O intensive workloads you have to be careful with the setup. Therefore understanding the I/O requirements of the applications is fundamental to understanding whether the hyperconverged solution of choice is going to choke. Best regards, Joshu

Re: [Beowulf] hadoop

2015-02-07 Thread Joshua Mora
Hello Jonathan. Here is a good document to get you thinking: http://www.cs.berkeley.edu/~rxin/db-papers/WarehouseScaleComputing.pdf Although Doug said "Oh, and Hadoop clusters are not going to supplant your HPC cluster", I believe that there is an ongoing effort to converge Cloud computing (eg.

Re: [Beowulf] Open source and the Draft Report of the Task Force on High Performance Computing

2014-08-28 Thread Joshua Mora
> This is something China readily admits, and is > working to address.* At the pace they're moving, though, I imagine it > won't be long before this is fixed, but a cultural change like that > would still probably take some time, I'd say 10 or more years. I read an interview with a scientist on

Re: [Beowulf] Open source and the Draft Report of the Task Force on High Performance Computing

2014-08-28 Thread Joshua Mora
The codesign effort pushed by the new requirements/constraints (power and performance) is shaking design decisions of existing SW frameworks, hence forcing them to be rewritten over time to add new fundamental functionality (eg. progress threads for asynchronous communication and fault tolerance). The

Re: [Beowulf] 8p 16 core x86_64 systems

2014-08-12 Thread Joshua Mora
s. Joshua -- Original Message -- Received: 10:53 AM PDT, 08/12/2014 From: "C. Bergström" To: Joshua Mora Cc: doug.latt...@l-3com.com, beowulf@beowulf.org Subject: Re: [Beowulf] 8p 16 core x86_64 systems > On 08/12/14 11:57 PM, Joshua Mora wrote: > > Hello Doug. > > AMD

Re: [Beowulf] 8p 16 core x86_64 systems

2014-08-12 Thread Joshua Mora
://www.numascale.com Best regards, Joshua Mora. -- Original Message -- Received: 09:02 AM PDT, 08/12/2014 From: To: Subject: [Beowulf] 8p 16 core x86_64 systems > Does anyone know of any manufactures who build an 8 processor (8-way) motherboard which can utilize 16 core opteron ch

Re: [Beowulf] The computing Crunch for CFD at Exascale

2014-06-10 Thread Joshua Mora
My 2 cents. For CFD: From math point of view: hybrid Eulerian-Lagrangian formulation, hybrid numerical + analytic models, automatic differentiation, interval arithmetic for sensitive analysis, machine learning (artificial intelligence). For HPC (that includes CFD): From sw implementation point of

Re: [Beowulf] Negative latency systems

2014-04-01 Thread Joshua Mora
Hi Joe. I don't think this is such an innovative thing. Isn't the government already applying these concepts? I mean spending the money they do not have beforehand? Joshua -- Original Message -- Received: 08:43 AM PDT, 04/01/2014 From: Joe Landman To: beowulf@beowulf.org Subject: [Beo

Re: [Beowulf] Intel Xeon E5-2600 v2

2013-09-14 Thread Joshua Mora
Sandy Bridge and Ivy Bridge do not have AVX2 extensions; Haswell does. Therefore SB and IB do 8 DP FLOPs/clk/core, while HSW does 16 DP FLOPs/clk/core. AMD processors Interlagos and Abu Dhabi support FMA4 and FMA3/4 respectively. They are capable as well of 8 DP FLOPs/clk/core. floating point operations in single
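The per-core rates above multiply out to peak throughput. A quick sketch; the core counts and clocks below are hypothetical examples, only the FLOPs/clk/core figures come from the post:

```python
def peak_dp_gflops(cores, ghz, flops_per_clk):
    # Peak double precision = cores x clock (GHz) x DP FLOPs/clk/core
    return cores * ghz * flops_per_clk

# 8 DP FLOPs/clk/core (SB/IB, per the post) vs 16 (Haswell with AVX2 FMA).
print(peak_dp_gflops(8, 2.6, 8))    # hypothetical 8-core chip at 2.6 GHz
print(peak_dp_gflops(8, 2.6, 16))   # same core count/clock at Haswell's rate
```

At equal core count and clock, the AVX2 FMA rate exactly doubles the peak, which is the gap the post is pointing at.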

Re: [Beowulf] breaking Amdahl's law

2013-06-19 Thread Joshua Mora
DR on gen2" about 3 years ago. Eugen, a link to the post would have been sufficient. Joshua Mora. -- Original Message -- Received: 02:26 AM PDT, 06/19/2013 From: Eugen Leitl To: Beowulf@beowulf.org, i...@postbiota.org Subject: [Beowulf] breaking Amdahl's law > > http

Re: [Beowulf] China to eclipse Titan with 48,000 Intel MICs?

2013-06-05 Thread Joshua Mora
Some apps scale up to several hundreds of thousands of cores. These are Gordon Bell award apps, with sustained levels around the PF. See this link: http://www.ncsa.illinois.edu/News/Stories/BW1year/apps.pdf Another one not in this list is DCA++, which has also been ported to GPUs. Jaguar had five

Re: [Beowulf] Register article on Linux State of the Union

2013-04-18 Thread Joshua Mora
Search on the web for instance "PGAS over Ethernet" to get an idea of where _some_ of those things are headed. Joshua -- Original Message -- Received: 05:50 PM CEST, 04/18/2013 From: "Douglas Eadline" To: "Hearns, John" Cc: "beowulf@beowulf.org" Subject: Re: [Beowulf] Register article o

Re: [Beowulf] May 1st 2013 (10 weeks) Coursera High Performance Scientific Computing

2013-04-10 Thread Joshua Mora
Thanks for the pointer. It seems rather complete but it is missing an important or fundamental topic for high performance computing: profiling, at least at introductory level. Joshua -- Original Message -- Received: 04:45 PM CEST, 04/10/2013 From: Eugen Leitl To: Beowulf@beowulf.org, i..

Re: [Beowulf] Often favorable to hire HPC specialists over more hardware

2013-04-05 Thread Joshua Mora
Sorry, I did not read the posts in order. Brian Dobbins made the same point a few hours earlier in the very same way. Joshua -- Original Message -- Received: 05:52 AM CEST, 04/06/2013 From: "Joshua Mora" To: Beowulf Mailing List Subject: Re: [Beowulf] Often favorable t

Re: [Beowulf] Often favorable to hire HPC specialists over more hardware

2013-04-05 Thread Joshua Mora
Similar rational arguments apply to why you want to invest in a good compiler. I find it easier, though, to justify these things in terms of the metric of money rather than performance metrics. It is just that HPC people are not that used to using money as the metric. In other words, it becomes a business decision. Exam
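That business framing can be sketched in a few lines; every figure here is a hypothetical illustration value, not from the post:

```python
def compiler_payback(cluster_usd_per_hour, busy_hours_per_year, speedup_pct, license_usd):
    # Money saved per year by a compiler-driven speedup, minus the license cost.
    saved = cluster_usd_per_hour * busy_hours_per_year * speedup_pct / 100.0
    return saved - license_usd

# Hypothetical: $50/h cluster, 8000 busy hours/yr, 10% speedup, $5k license.
print(compiler_payback(50.0, 8000, 10, 5000))
```

Expressing the speedup as dollars recovered per year makes the purchase a line-item comparison any budget owner can evaluate, which is exactly the point being made.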

Re: [Beowulf] Revelations on Roadrunner's Retirement

2013-04-05 Thread Joshua mora acosta
It would be good to know what the levels of efficiency of the applications were wrt FLOP/s and GB/s, and the typical node count for the runs. Then compare that against the current PF/s systems. Joshua -- Original Message -- Received: 05:49 PM CEST, 04/05/2013 From: Eugen Leitl To: Beowulf

Re: [Beowulf] Roadrunner shutdown

2013-04-01 Thread Joshua mora acosta
I can't wait to read the postmortem report if that becomes publicly available (ie. lessons learned). Joshua Mora. -- Original Message -- Received: 12:46 PM CEST, 04/01/2013 From: John Hearns To: Beowulf Mailing List Subject: [Beowulf] Roadrunner shutdown > I now we have all s

Re: [Beowulf] difference between accelerators and co-processors

2013-03-12 Thread Joshua mora acosta
Good comments. My comments inline. Joshua -- Original Message -- Received: 11:02 PM CDT, 03/11/2013 From: Brendan Moloney To: Joshua mora acosta Cc: Vincent Diepeveen , Mark Hahn , Beowulf List Subject: Re: [Beowulf] difference between accelerators and co-processors > I think t

Re: [Beowulf] difference between accelerators and co-processors

2013-03-10 Thread Joshua mora acosta
See this paper http://synergy.cs.vt.edu/pubs/papers/daga-saahpc11-apu-efficacy.pdf While discrete GPUs underperform wrt the APU on host to/from device transfers by a ratio of ~2X, they compensate by far with computing power and local bandwidth ~8-10X. You can cook though a test where you do little comp

Re: [Beowulf] Themes for a talk on beowulf clustering

2013-03-03 Thread Joshua mora acosta
How about: "Methodologies for a sanity check of your HPC solution in 30 minutes?" Joshua -- Original Message -- Received: 11:36 AM CST, 03/03/2013 From: Andrew Holway To: Bewoulf Subject: [Beowulf] Themes for a talk on beowulf clustering > Hello all, > > I am giving a talk on beowul

Re: [Beowulf] AMD performance (was 500GB systems)

2013-01-11 Thread Joshua mora acosta
Sorry I forgot to attach the file related with the comparison of performance/dollar of 6200 vs E2600. here it is. Thanks, Joshua <>___ Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing To change your subscription (digest mo

Re: [Beowulf] AMD performance (was 500GB systems)

2013-01-11 Thread Joshua mora acosta
f Intel went to match the lines of AMD, ie. to become Perf/USD competitive or on par without having to discount on AMD. Best regards, Joshua Mora

Re: [Beowulf] Any beowulfers attending SC12?

2012-11-07 Thread Joshua mora acosta
It is at the planetarium, walking distance from Convention Center. Joshua -- Original Message -- Received: 09:42 PM MST, 11/07/2012 From: "Douglas Eadline" To: "Ellis H. Wilson III" Cc: beowulf@beowulf.org Subject: Re: [Beowulf] Any beowulfers attending SC12? > > > > > >> Has anyone f

[Beowulf] Do theoretical FLOPs matter for real application's performance ?

2012-11-05 Thread Joshua mora acosta
, Joshua Mora.

Re: [Beowulf] Degree

2012-10-25 Thread Joshua mora acosta
The most exceptional people I have met in any field did not have a formal educational process. Formal education, though, sets in most cases a foundation to start building professional skills. These professionals are good because of learning as they needed, with high motivation, or better said with p

Re: [Beowulf] paralle computation question:please help

2012-09-03 Thread Joshua mora acosta
If a program scales in parallel you should be able to see a reduction in the elapsed time or number of clocks of the entire code. Look at both the total of your code and the key functions that are scaling. By scaling I mean that you would take the analysis at 1 core, 2 cores, 4 cores,..
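The core-by-core analysis described here reduces to speedup and parallel efficiency; the timings below are hypothetical:

```python
def speedup(t1, tn):
    # Ratio of single-core elapsed time to n-core elapsed time.
    return t1 / tn

def efficiency(t1, tn, n):
    # Speedup normalized by core count; 1.0 means ideal scaling.
    return speedup(t1, tn) / n

# Hypothetical elapsed times (seconds) measured at 1, 2 and 4 cores.
times = {1: 100.0, 2: 52.0, 4: 28.0}
for n, t in sorted(times.items()):
    print(n, round(speedup(times[1], t), 2), round(efficiency(times[1], t, n), 2))
```

Running the same two numbers per key function, not just for the whole run, shows which parts of the code are actually scaling.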

Re: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business

2012-01-31 Thread Joshua mora acosta
I agree with Joe. Plus I know that most of us, if not all, truly want to share knowledge and, why not, opinions as well, based on personal experiences, as long as "we all make the effort to be respectful with both the individual and the technology, and are open/receptive to being criticized as well". T

Re: [Beowulf] Intel buys QLogic InfiniBand business

2012-01-23 Thread Joshua mora acosta
Do you mean IB over QPI ? Either way, High Node Count Coherence will be an issue. In any case, by acquiring their IP it is a step forward towards SoC (System on Chip). A preliminary step (building block) for the Exascale strategy and for low cost enterprise/cloud solutions. Joshua -- Original

Re: [Beowulf] Westmere EX

2011-04-06 Thread Joshua mora acosta
_3D_ FFT scaling will allow you to see how well balanced the system is. Joshua -- Original Message -- Received: 07:40 PM CDT, 04/06/2011 From: Mark Hahn To: Beowulf Mailing List Subject: Re: [Beowulf] Westmere EX > > http://www.theregister.co.uk/2011/04/05/intel_xeon_e7_launch/ > > > >

[Beowulf] Second chance to learn HPC

2011-01-03 Thread Joshua mora acosta
-to-high-performance-scientific-computing/14408128 This is really a second chance to learn HPC. BTW, I learned from the author about 12 years ago, on solvers. I am sure he will value direct feedback from the folks in this list. Best regards, Joshua Mora

Re: [Beowulf] Begginers question # 1

2010-10-06 Thread Joshua mora acosta
Hi Gabriel. If your app is something single threaded (ie. runs on a single core) that works on a per frame basis and is fairly cache friendly, then the more cores the better from an economical point of view, without necessarily hurting performance. A fat node would do as well as a bunch of tiny no

Re: [Beowulf] Begginers question # 1

2010-10-04 Thread Joshua mora acosta
chnologies that will allow you to get to the next computational/science challenge. Best regards, Joshua Mora. -- Original Message -- Received: 08:53 PM CDT, 10/04/2010 From: Mark Hahn To: gabriel lorenzo Cc: beowulf@beowulf.org Subject: Re: [Beowulf] Begginers question # 1 > > IN

Re: [Beowulf] HPL efficiency on Magny-Cours and Westmere?

2010-07-06 Thread Joshua mora acosta
MC 12-core at 2.2GHz: 91% on die, 86.7% on a 2-socket node, above 82% on a cluster. Joshua -- Original Message -- Received: 10:34 AM CDT, 07/06/2010 From: Mark Hahn To: Beowulf Mailing List Subject: [Beowulf] HPL efficiency on Magny-Cours and Westmere? > Hi all, > can anyone tell me what k

RE: [Beowulf] dollars-per-teraflop : any lists like the Top500?

2010-06-30 Thread Joshua mora acosta
the caches. Joshua -- Original Message -- Received: 10:12 AM CDT, 06/30/2010 From: Bill Rankin To: Joshua mora acosta , Rahul Nabar , Beowulf Mailing List Subject: RE: [Beowulf] dollars-per-teraflop : any lists like the Top500? > > I think the money part will be difficult to get (it

Re: [Beowulf] dollars-per-teraflop : any lists like the Top500?

2010-06-30 Thread Joshua mora acosta
I think the money part will be difficult to get (it is like a politically incorrect question). Nevertheless, you can split the money in two parts: purchase (which I am sure you will never get) and the electric bill for keeping the system up and running while you run HPL and when you run stream. Then yo
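Splitting cost into purchase plus electric bill, as suggested, gives a simple dollars-per-teraflop estimate; every number below is hypothetical:

```python
def dollars_per_teraflop(purchase_usd, avg_kw, usd_per_kwh, hours, hpl_tflops):
    # Purchase price plus energy cost over `hours`, per sustained HPL TFLOP/s.
    energy_usd = avg_kw * hours * usd_per_kwh
    return (purchase_usd + energy_usd) / hpl_tflops

# Hypothetical: $2M purchase, 400 kW average draw, $0.10/kWh,
# 3 years of operation, 500 TFLOP/s sustained on HPL.
print(round(dollars_per_teraflop(2_000_000, 400, 0.10, 3 * 365 * 24, 500), 2))
```

Even when the purchase price is unobtainable, the energy term alone can be computed from the power draw during the HPL and Stream runs, which is the workaround the post proposes.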

Re: [Beowulf] Best Way to Use 48-cores For Undergrad Cluster?

2010-05-06 Thread Joshua mora acosta
onfigured as low power consumption where you would downclock the cores as much as you can without affecting the others. And before I forget, every device hanging from chipset (Eth, IB NICs,GPUs) can be also virtualized thanks to IOMMU features. Best regards, Joshua Mora. -- Original Message ---

Re: [Beowulf] AMD 6100 vs Intel 5600

2010-04-01 Thread Joshua mora acosta
It does not make sense to come up with a general/wide statement of product A being better than product B and/or product C. Each architecture/solution has its strong points and its weak points wrt others _for_a_given_feature_. There is also a certain level of overlap of features between those solutions,

Re: [Beowulf] Re: Sun/AMD HPC for Dummies ebook now as PDF

2010-01-13 Thread Joshua mora acosta
Hi, I think you misinterpreted the title. It is what it is, "HPC for dummies": enough to expose in a plain way to anyone what HPC is, and it is not that easy to make a good summary of such a broad topic in 46 pages. It would be great to see though a title like "HPC for the next decade" or "bey

Re: [Beowulf] Programming Help needed

2009-11-09 Thread Joshua mora acosta
Just try it and you'll understand what communication overhead means. Most of these apps are network latency dominated: small messages, but lots of them, because of i) many neighbor processors involved and ii) the iterative process. Packing all the faces that need to be exchanged is the right way to go. You can

Re: [Beowulf] Practicality of a Beowulf Cluster

2009-08-28 Thread Joshua mora acosta
n stresses that specific component of the cluster (eg. processor, networking, storage, OS, settings), or sw tools for management/debugging of the HW+SW clustered solution, among many other things... Best regards, Joshua Mora. -- Original Message -- Received: 11:42 PM CEST, 08/27/2009 From:

Re: [Beowulf] Approach For Diagnosing Heat Related Failure?

2009-07-21 Thread Joshua mora acosta
You can run HPL bound to a specific socket, maximizing also the memory associated to that socket, in order to try to shut it down by reaching the "hardware thermal control" due to lack of cooling. In the BIOS you can also have HW monitoring to tell you the speed of the fans and perhaps detect the diff of

[Beowulf] NWChem 5.1.1 + Infiniband OFED 1.3 + GA 4.1.1

2009-04-22 Thread Joshua mora acosta
Hello, I am trying to get NWChem 5.1.1 + Infiniband OFED 1.3 + GA 4.1.1 to work using HPMPI (or any other MPI) and PGI (or any other compiler). I get the well known problem of it not working over the network. Here is the configuration I am using, just in case someone spots the error. It runs fine in node

Re: [Beowulf] Lowered latency with multi-rail IB?

2009-03-27 Thread Joshua mora acosta
From: Håkon Bugge To: Craig Tierney Cc: Joshua mora acosta , dphu...@uncg.edu,beowulf@beowulf.org Subject: Re: [Beowulf] Lowered latency with multi-rail IB? > On Mar 27, 2009, at 18:20 , Craig Tierney wrote: > > > What about using multi-rail to increase message rate? That isn'

Re: [Beowulf] Lowered latency with multi-rail IB?

2009-03-27 Thread Joshua mora acosta
The only way I got under 1 usec in the PingPong test or with ib_[write/send/read]_lat is with QDR and back to back (ie. no switch). With a switch I get 1.1[3-7] usec [HP-MPI, OpenMPI, MVAPICH]. The MPI does not matter, although I have to agree with Greg that multirail also increases latency. Multirail is

Re: [Beowulf] Followup on the ConnectX EN 10 GbE driver

2009-02-16 Thread Joshua mora acosta
Hi Joe. Could you please get some dd runs to either read or write through NFS with lots of small chunks (ie. high request rate rather than high throughput) in order to find out how it correlates with the higher latency wrt Infiniband? Thanks, Joshua -- Original Message -- Received: 11:02

Re: [Beowulf] programming guidence request

2009-01-25 Thread Joshua mora acosta
Answers inline. Joshua -- Original Message -- Received: 12:49 AM CST, 01/23/2009 From: amjad ali To: Beowulf Mailing List Subject: [Beowulf] programming guidence request > Hello All, > I am developing my parallel CFD code on a small cluster. My system has > openmpi installed based on g

Re: [Beowulf] Nehalem and Shanghai code performance for our rzf example

2009-01-17 Thread Joshua mora acosta
Hi Joe. I guess it would be straight forward to get an openMP version run. Can you please share your results on 1,2,4,8 threads ? Use HT off on Nehalem. Use thread affinity through environment variables or explicitly in the code. Power management enabled or disabled, but disclosed. Use SSE3 (Shangh

Re: [Beowulf] MPI build with different compilers

2008-08-19 Thread Joshua mora acosta
Comments inline. Joshua -- Original Message -- Received: Mon, 18 Aug 2008 11:29:22 PM PDT From: "amjad ali" <[EMAIL PROTECTED]> To: "Beowulf Mailing List" Subject: [Beowulf] MPI build with different compilers > Hi, > Please reply me about followings: > > > 1) Is there any significant

Re: [Beowulf] MPI build with different compilers

2008-08-19 Thread Joshua mora acosta
. Joshua -- Original Message -- Received: Tue, 19 Aug 2008 12:37:17 AM PDT From: "Joshua mora acosta" <[EMAIL PROTECTED]> To: "amjad ali" <[EMAIL PROTECTED]>, "Beowulf Mailing List" Subject: Re: [Beowulf] MPI build with different compilers > Comm

Re: [Beowulf] mandatory use of command qsub

2008-05-16 Thread Joshua mora acosta
Hello Javier. For each node: /etc/ssh/sshd_config AllowUsers root sgeadmin /etc/init.d/sshd restart On SGE, disable interactive access to the queues. Regards, Joshua. -- Original Message -- Received: Thu, 15 May 2008 09:32:42 AM PDT From: "Javier Lazaro" <[EMAIL PROTECTED]> To: beowulf@beow

Re: Re[4]: [Beowulf] Recent comparisons of 1600 MHz external Harpertown vs.235x AMD processors?

2008-05-09 Thread Joshua mora acosta
It means NorthBridge -- Original Message -- Received: Fri, 09 May 2008 01:09:37 PM PDT From: Jan Heichler <[EMAIL PROTECTED]> To: "Joshua mora acosta" <[EMAIL PROTECTED]>Cc: Mark Hahn <[EMAIL PROTECTED]>, Tom Elken <[EMAIL PROTECTED]>, Beowulf Mail

RE: Re[2]: [Beowulf] Recent comparisons of 1600 MHz external Harpertown vs.235x AMD processors?

2008-05-09 Thread Joshua mora acosta
If you had a 2.3GHz at 2.0GHz NB you would get 17.5GB/sec. Joshua -- Original Message -- Received: Thu, 08 May 2008 02:18:30 PM PDT From: Mark Hahn <[EMAIL PROTECTED]> To: Tom Elken <[EMAIL PROTECTED]>Cc: Beowulf Mailing List Subject: RE: Re[2]: [Beowulf] Recent comparisons of 1600 MHz e

Re: [Beowulf] Recent comparisons of 1600 MHz external Harpertown vs. 235x AMD processors?

2008-05-08 Thread Joshua mora acosta
800MHz isn't there, but I can say that despite its improvements in multiple directions, it does not close the huge gap on those memory intensive applications. Joshua Mora. -- Original Message -- Received: Wed, 07 May 2008 01:47:40 PM PDT From: Bill Johnstone <[EMAIL PROTECTED]>

Re: [Beowulf] Purdue Supercomputer

2008-05-03 Thread Joshua mora acosta
Does anyone know what the detailed plan is for building that thing with 200 people in just 1 day? I am very curious to understand what things can be done in parallel and what things are serialized, from the point of view of installation, testing and evaluation/assessment. Even monitoring the progress,id

Re: [Beowulf] HPL Benchmarking and Optimization

2008-04-03 Thread Joshua mora acosta
For AMD based systems get ACML and gcc, pgi or pathscale. For Intel based systems get MKL and the Intel compiler. Run an N problem size around 90% workload, ie. a 1.8GB per core memory footprint. Run NB 192 on AMD; I don't know the best blocking factor for MKL. I've tried the same 192 and it does fairly well. Set
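The ~90% / 1.8 GB-per-core sizing can be computed directly: pick N so the NxN double-precision matrix fills about 90% of memory, then round down to a multiple of NB. A sketch, where the node memory is a hypothetical example:

```python
import math

def hpl_problem_size(total_mem_gib, fraction=0.9, nb=192):
    # N such that the N x N matrix of 8-byte doubles uses `fraction`
    # of memory, rounded down to a multiple of the blocking factor NB.
    n = int(math.sqrt(total_mem_gib * 2**30 * fraction / 8))
    return n - n % nb

# Hypothetical 32 GiB node, e.g. 16 cores at 2 GiB/core -- the 90% factor
# lands at the ~1.8 GB/core footprint mentioned above.
print(hpl_problem_size(32))
```

Rounding N to a multiple of NB keeps the panel decomposition even, which is why the blocking factor appears in the sizing and not just in the HPL.dat tuning.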

[Beowulf] Problems with HPMPI under Infiniband

2006-10-03 Thread Joshua mora acosta
faced this type of problem and know a solution/workaround to it. Best regards, Joshua Mora.