On Mon, Jan 22, 2024 at 11:16 AM Prentice Bisbal <pbis...@pppl.gov> wrote:
> <snip>
>
>> > Another interesting topic is that nodes are becoming many-core - any
>> > thoughts?
>>
>> Core counts are getting too high to be of use in HPC. High core-count
>> processors sound great until you realize that all those cores are now
>> competing for the same memory bandwidth and network bandwidth, neither of
>> which increases with core count.
>>
>> Last April we were evaluating test systems from different vendors for a
>> cluster purchase. One of our test users does a lot of CFD simulations
>> that are very sensitive to memory bandwidth. While he was getting a 50%
>> speed-up on AMD compared to Intel (which makes sense, since the AMDs
>> require 12 DIMM slots to be filled instead of Intel's 8), he asked us to
>> consider servers with FEWER cores. Even on the AMDs, he was saturating
>> the memory bandwidth before scaling to all the cores, causing his
>> performance to plateau. Buying cheaper processors with lower core counts
>> was better for him, since the savings would allow us to buy additional
>> nodes, which would be more beneficial to him.
>>
>
> We see this as well in DOE, especially when GPUs are doing a significant
> amount of the work.
>
> Yeah, I noticed that Frontier and Aurora will actually be single-socket
> systems w/ "only" 64 cores.
>

Yes, Frontier is a *single* *CPU* socket and *four GPUs* (actually eight
GPUs from the user's perspective). It works out to eight cores per
Graphics Compute Die (GCD). The FLOPS ratio is roughly 1:100 between the
CPU and the GPUs.

Note that Aurora is a dual-CPU, six-GPU node. I am not sure whether the
user sees six or more GPUs. The Aurora node is similar to our Summit node
but with more connectivity between the GPUs.
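
For anyone curious where that plateau comes from, here is a quick
back-of-the-envelope roofline-style sketch in Python. The per-core peak,
memory bandwidth, and arithmetic intensity below are made-up placeholder
numbers, not measurements from his test systems; the point is only that
once cores x per-core peak exceeds bandwidth x FLOPs-per-byte, adding
cores buys nothing:

    # Back-of-the-envelope roofline-style model of the plateau.
    # All numbers are illustrative placeholders, not measurements
    # from any of the vendor test systems discussed above.

    def achievable_gflops(cores,
                          peak_gflops_per_core=16.0,  # assumed per-core peak
                          mem_bw_gbs=400.0,           # assumed node memory bandwidth (GB/s)
                          flops_per_byte=0.5):        # assumed arithmetic intensity of the kernel
        """Node GFLOP/s is the lesser of the compute and bandwidth ceilings."""
        compute_ceiling = cores * peak_gflops_per_core
        bandwidth_ceiling = mem_bw_gbs * flops_per_byte
        return min(compute_ceiling, bandwidth_ceiling)

    for cores in (4, 8, 16, 32, 64, 96):
        print(f"{cores:3d} cores -> {achievable_gflops(cores):6.1f} GFLOP/s")

With those placeholder numbers the node tops out around 12 cores and is
flat from there on, which is qualitatively the behavior he was seeing.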