We have found that running in cached/quadrant mode gives excellent
performance. With our codes, the optimum is 2 threads per core. KNL broke
the model of KNC, which did a full context change every clock cycle (so you
HAD to have multiple threads per core), which has had the knock-on effect of
reducin
Well ...
KNL is (only?) superior for highly vectorizable codes that at scale can run out
of MCDRAM (slow scalar performance). Multiple memory and interconnect modes
(requiring a reboot to change) create programming complexity (e.g. managing
affinity across 8-9-9-8 tiles in quad mode) that fe
On 19/11/17 10:40, Jonathan Engwall wrote:
> I had no idea x86 began its life as a co-processor chip, now it is not
> even a product at all.
Ah no, this was when floating point was done via a co-processor for the
Intel x86.
--
Christopher Samuel        Senior Systems Administrator
Melbourne