On Mon, Jan 22, 2024 at 11:16 AM Prentice Bisbal <pbis...@pppl.gov> wrote:
> <snip>
>
>> > Another interesting topic is that nodes are becoming many-core - any
>> > thoughts?
>>
>> Core counts are getting too high to be of use in HPC. High core-count
>> processors sound great until you realize that all those cores are now
>> competing for the same memory bandwidth and network bandwidth, neither of
>> which increases with core count.
>>
>> Last April we were evaluating test systems from different vendors for a
>> cluster purchase. One of our test users does a lot of CFD simulations
>> that are very sensitive to memory bandwidth. While he was getting a 50%
>> speed-up on AMD compared to Intel (which makes sense, since the AMDs
>> require 12 DIMM slots to be filled instead of Intel's 8), he asked us to
>> consider servers with FEWER cores. Even on the AMDs, he was saturating
>> the memory bandwidth before scaling to all the cores, causing his
>> performance to plateau. Buying cheaper processors with lower core counts
>> was better for him, since the savings would allow us to buy additional
>> nodes, which would be more beneficial to him.
>>
>
> We see this as well in DOE, especially when GPUs are doing a significant
> amount of the work.
>
> Yeah, I noticed that Frontier and Aurora will actually be single-socket
> systems w/ "only" 64 cores.
>

Yes, Frontier is a *single* *CPU* socket and *four GPUs* (actually eight
GPUs from the user's perspective). It works out to eight cores per
Graphics Compute Die (GCD). The FLOPS ratio is roughly 1:100 between the
CPU and the GPUs.

Note that Aurora is a dual-CPU, six-GPU node. I am not sure whether the
user sees six or more GPUs. The Aurora node is similar to our Summit node
but with more connectivity between the GPUs.
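
For anyone curious where that plateau comes from, here is a quick
back-of-the-envelope roofline-style sketch in Python. The per-core peak,
memory bandwidth, and arithmetic intensity below are made-up placeholder
numbers, not measurements from his test systems; the point is only that
once cores x per-core peak exceeds bandwidth x FLOPs-per-byte, adding
cores buys nothing:

    # Back-of-the-envelope roofline-style model of the plateau.
    # All numbers are illustrative placeholders, not measurements
    # from any of the vendor test systems discussed above.

    def achievable_gflops(cores,
                          peak_gflops_per_core=16.0,  # assumed per-core peak
                          mem_bw_gbs=400.0,           # assumed node memory bandwidth (GB/s)
                          flops_per_byte=0.5):        # assumed arithmetic intensity of the kernel
        """Node GFLOP/s is the lesser of the compute and bandwidth ceilings."""
        compute_ceiling = cores * peak_gflops_per_core
        bandwidth_ceiling = mem_bw_gbs * flops_per_byte
        return min(compute_ceiling, bandwidth_ceiling)

    for cores in (4, 8, 16, 32, 64, 96):
        print(f"{cores:3d} cores -> {achievable_gflops(cores):6.1f} GFLOP/s")

With those placeholder numbers the node tops out around 12 cores and is
flat from there on, which is qualitatively the behavior he was seeing.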