We are (beta) releasing a drop-in package for SGE6.2u5, SGE6.2u5p1, and SGE6.2u5p2 for thread-binding:
http://gridscheduler.sourceforge.net/projects/hwloc/GridEnginehwloc.html

Mainly tested on Intel boxes -- would be great if AMD Magny-Cours server
owners offer help with testing! (Play it safe -- set up a 1 or 2-node test
cluster by using the non-standard SGE TCP ports).

Thanks!
Rayson

On Mon, Apr 18, 2011 at 2:26 PM, Rayson Ho <raysonlo...@gmail.com> wrote:
> For those who had issues with earlier versions, please try the latest
> loadcheck v4:
>
> http://gridscheduler.sourceforge.net/projects/hwloc/GridEnginehwloc.html
>
> I compiled the binary on Oracle Linux, which is compatible with RHEL
> 5.x, Scientific Linux or CentOS 5.x. I tested the binary on the
> standard Red Hat kernel, and Oracle enhanced "Unbreakable Enterprise
> Kernel", Fedora 13, Ubuntu 10.04 LTS.
>
> Optimizing for AMD's NUMA machine characteristics is on the ToDo list.
>
> Rayson
>
> On Wed, Apr 13, 2011 at 2:15 PM, Prakashan Korambath <p...@ats.ucla.edu>
> wrote:
>> Hi Rayson,
>>
>> Do you have a statically linked version? Thanks.
>>
>> ./loadcheck: /lib64/libc.so.6: version `GLIBC_2.7' not found (required by
>> ./loadcheck)
>>
>> Prakashan
>>
>> On 04/13/2011 09:21 AM, Rayson Ho wrote:
>>>
>>> Carlos,
>>>
>>> I notice that you have "lx24-amd64" instead of "lx26-amd64" for the
>>> arch string, so I believe you are running the loadcheck from standard
>>> Oracle Grid Engine, Sun Grid Engine, or one of the forks instead of
>>> the one from the Open Grid Scheduler page.
>>>
>>> The existing Grid Engine (including the latest Open Grid Scheduler
>>> releases: SGE 6.2u5p1 & SGE 6.2u5p2, or Univa's fork) uses PLPA, and
>>> it is known to be wrong on magny-cours.
>>>
>>> (i.e. SGE 6.2u5p1 & SGE 6.2u5p2 from:
>>> http://sourceforge.net/projects/gridscheduler/files/ )
>>>
>>> Chansup on the Grid Engine mailing list (it's the general purpose Grid
>>> Engine mailing list for now) tested the version I uploaded last night,
>>> and it seems to work on a dual-socket magny-cours AMD machine.
>>> It prints:
>>>
>>> m_topology SCCCCCCCCCCCCSCCCCCCCCCCCC
>>>
>>> However, I am still fixing the processor, core id mapping code:
>>>
>>> http://gridengine.org/pipermail/users/2011-April/000629.html
>>> http://gridengine.org/pipermail/users/2011-April/000628.html
>>>
>>> I compiled the hwloc enabled loadcheck on kernel 2.6.34 & glibc 2.12,
>>> so it may not work on machines running lower kernel or glibc versions,
>>> you can download it from:
>>>
>>> http://gridscheduler.sourceforge.net/projects/hwloc/GridEnginehwloc.html
>>>
>>> Rayson
>>>
>>> On Wed, Apr 13, 2011 at 3:03 AM, Carlos Fernandez Sanchez
>>> <carl...@cesga.es> wrote:
>>>>
>>>> This is the output of a 2 sockets, 12 cores/socket (magny-cours) AMD
>>>> system
>>>> (and seems to be wrong!):
>>>>
>>>> arch            lx24-amd64
>>>> num_proc        24
>>>> m_socket        2
>>>> m_core          12
>>>> m_topology      SCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTT
>>>> load_short      0.29
>>>> load_medium     0.13
>>>> load_long       0.04
>>>> mem_free        26257.382812M
>>>> swap_free       8191.992188M
>>>> virtual_free    34449.375000M
>>>> mem_total       32238.328125M
>>>> swap_total      8191.992188M
>>>> virtual_total   40430.320312M
>>>> mem_used        5980.945312M
>>>> swap_used       0.000000M
>>>> virtual_used    5980.945312M
>>>> cpu             0.0%
>>>>
>>>> Carlos Fernandez Sanchez
>>>> Systems Manager
>>>> CESGA
>>>> Avda. de Vigo s/n. Campus Vida
>>>> Tel.: (+34) 981569810, ext. 232
>>>> 15705 - Santiago de Compostela
>>>> SPAIN
>>>>
>>>> --------------------------------------------------
>>>> From: "Rayson Ho" <raysonlo...@gmail.com>
>>>> Sent: Tuesday, April 12, 2011 10:31 PM
>>>> To: "Beowulf List" <Beowulf@beowulf.org>
>>>> Subject: [Beowulf] Grid Engine multi-core thread binding enhancement
>>>> -pre-alpha release
>>>>
>>>>> If you are using the "Job to Core Binding" feature in SGE and running
>>>>> SGE on newer hardware, then please give the new hwloc enabled
>>>>> loadcheck a try.
>>>>>
>>>>> http://gridscheduler.sourceforge.net/projects/hwloc/GridEnginehwloc.html
>>>>>
>>>>> The current hardware topology discovery library (Portable Linux
>>>>> Processor Affinity - PLPA) used by SGE was deprecated in 2009, and new
>>>>> hardware topology may not be detected correctly by PLPA.
>>>>>
>>>>> If you are running SGE on AMD Magny-Cours servers, please post your
>>>>> loadcheck output, as it is known to be wrong when handled by PLPA.
>>>>>
>>>>> The Open Grid Scheduler is migrating to hwloc -- we will ship hwloc
>>>>> support in later releases of Grid Engine / Grid Scheduler.
>>>>>
>>>>> http://gridscheduler.sourceforge.net/
>>>>>
>>>>> Thanks!!
>>>>> Rayson
>>>>> _______________________________________________
>>>>> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
>>>>> To change your subscription (digest mode or unsubscribe) visit
>>>>> http://www.beowulf.org/mailman/listinfo/beowulf
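
For anyone comparing the two m_topology strings quoted in this thread: in Grid Engine's topology string, S marks a socket, C a core, and T a hardware thread. Below is a minimal Python sketch that counts cores and threads per socket so the two loadcheck outputs can be compared at a glance. It is not part of Grid Engine or hwloc; the function name parse_topology and the constant names are made up for illustration, and it assumes the string contains only the letters S, C and T.

```python
def parse_topology(topology):
    """Return one {"cores": n, "threads": m} dict per socket.

    Assumes the m_topology string uses only S (socket), C (core)
    and T (hardware thread), and that it starts with a socket.
    """
    sockets = []
    for ch in topology:
        if ch == "S":
            # A new socket starts; its cores and threads follow.
            sockets.append({"cores": 0, "threads": 0})
        elif ch in ("C", "T"):
            if not sockets:
                raise ValueError("core or thread listed before any socket")
            key = "cores" if ch == "C" else "threads"
            sockets[-1][key] += 1
        else:
            raise ValueError("unexpected character in topology: %r" % ch)
    return sockets


# Strings copied from the loadcheck outputs quoted in the thread.
PLPA_TOPOLOGY = "SCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTT"
HWLOC_TOPOLOGY = "SCCCCCCCCCCCCSCCCCCCCCCCCC"

if __name__ == "__main__":
    for label, topo in (("PLPA", PLPA_TOPOLOGY), ("hwloc", HWLOC_TOPOLOGY)):
        for index, socket in enumerate(parse_topology(topo)):
            print(label, "socket", index, socket)
```

Read this way, the PLPA-based string describes each socket as a handful of cores with two hardware threads apiece, whereas the hwloc-based string lists only cores, which is the layout Carlos describes for his 2-socket, 12 cores/socket machine.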