So, Brice, can hwloc detect this and give a suitably large complaint?
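Something along the lines of the sketch below might be enough for it to spot the memoryless-node case. This is only a minimal sketch against the hwloc 2.x C API; the check itself and the warning wording are my own invention, not anything hwloc prints today:

/* Walk the topology and complain about NUMA nodes with no local memory,
 * the symptom Brice describes below when CoD is enabled but one half of
 * the socket has empty DIMM channels.
 * Build roughly as: cc cod_check.c $(pkg-config --cflags --libs hwloc) */
#include <stdio.h>
#include <hwloc.h>

int main(void)
{
    hwloc_topology_t topo;
    hwloc_obj_t node = NULL;
    int complaints = 0;

    hwloc_topology_init(&topo);
    hwloc_topology_load(topo);

    /* Iterate over every NUMA node the OS reported. */
    while ((node = hwloc_get_next_obj_by_type(topo, HWLOC_OBJ_NUMANODE,
                                              node)) != NULL) {
        if (node->attr->numanode.local_memory == 0) {
            fprintf(stderr,
                    "WARNING: NUMA node %u (os_index %u) has no local "
                    "memory; with Cluster On Die this usually means the "
                    "DIMM channels for that half of the socket are empty.\n",
                    node->logical_index, node->os_index);
            complaints++;
        }
    }

    hwloc_topology_destroy(topo);
    return complaints ? 1 : 0;
}

The same walk could also complain when the NUMA nodes' os_index values are non-contiguous (the node 0 / node 2, missing node 1 symptom John saw), since a node that has vanished entirely won't even show up with zero memory.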
-- Jim

James Cownie <[email protected]>
(and Intel, as I'm sure most of you know…)

> On 18 Dec 2016, at 21:28, Brice Goglin <[email protected]> wrote:
>
> Hello
>
> Do you know if all your CPU memory channels are populated? CoD requires
> that each half of the CPU has some memory DIMMs (so that each NUMA node
> actually contains some memory). If both channels of one half are empty,
> the NUMA node might somehow disappear.
>
> Brice
>
>
> On 16/12/2016 23:26, Elken, Tom wrote:
>> Hi John and Greg,
>>
>> You showed nodes 0 & 2 (no node 1) and a strange CPU assignment to nodes!
>> Even though you had Cluster On Die (CoD) enabled in your BIOS, I have
>> never seen that arrangement of NUMA nodes and CPUs. You may have a bug
>> in your BIOS or OS?
>> With CoD enabled, I would have expected 4 NUMA nodes, 0-3, with 6 cores
>> assigned to each one.
>>
>> The Omni-Path Performance Tuning User Guide
>> http://www.intel.com/content/dam/support/us/en/documents/network-and-i-o/fabric-products/Intel_OP_Performance_Tuning_UG_H93143_v6_0.pdf
>> does recommend disabling CoD in Xeon BIOSes (Table 2 on p. 12), but it's
>> not considered a hard prohibition.
>> Disabling it improves some fabric performance benchmarks, but enabling
>> it helps the performance of some single-node applications, which could
>> outweigh the fabric performance aspects.
>>
>> -Tom
>>
>>> -----Original Message-----
>>> From: Beowulf [mailto:[email protected]] On Behalf Of Greg Lindahl
>>> Sent: Friday, December 16, 2016 2:00 PM
>>> To: John Hearns
>>> Cc: Beowulf Mailing List
>>> Subject: Re: [Beowulf] NUMA zone weirdness
>>>
>>> Wow, that's pretty obscure!
>>>
>>> I'd recommend reporting it to Intel so that they can add it to the
>>> descendants of ipath_checkout / ipath_debug. It's exactly the kind of
>>> hidden gotcha that leads to unhappy systems!
>>>
>>> -- greg
>>>
>>> On Fri, Dec 16, 2016 at 03:52:34PM +0000, John Hearns wrote:
>>>> Problem solved.
>>>> I have changed the QPI Snoop Mode on these servers from Cluster On Die
>>>> Enabled to Disabled, and they now display what I take to be correct
>>>> behaviour, i.e.:
>>>>
>>>> [root@comp006 ~]# numactl --hardware
>>>> available: 2 nodes (0-1)
>>>> node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11
>>>> node 0 size: 32673 MB
>>>> node 0 free: 31541 MB
>>>> node 1 cpus: 12 13 14 15 16 17 18 19 20 21 22 23
>>>> node 1 size: 32768 MB
>>>> node 1 free: 31860 MB
>>>> node distances:
>>>> node   0   1
>>>>   0:  10  21
>>>>   1:  21  10
