On Jun 23, 2008, at 9:12 PM, Mark Hahn wrote:

"how sure are we that a process (or thread) that allocated and initialized and writes to memory at a single specific memory node,
also keeps getting scheduled at a core on that memory node?"

numactl --cpubind=0 --membind=0

It seems to me that sometimes (like every second or so) threads jump from one memory node to another. I could be wrong,
but I certainly have that impression with the Linux kernels.

You can always tie a thread to a core.  For non-bound threads,
the question is really how long the kernel should leave a runnable thread "on" a busy CPU before running it on another (idle) CPU. The kernel does try to avoid this, but how hard it tries has in the past depended on the kernel's guess about the cache footprint of the thread and its "natural"
timeslice (how long it typically runs before yielding).
______

Mark, thanks for your input. I've tried that numactl several times, to no avail; it kept doing the wrong thing. That was a few years ago though, the last time I spent a few days toying with numactl, so it may well have improved by now.
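For the record, the same binding can also be done from inside the program instead of via numactl. A minimal sketch using libnuma (compile with -lnuma; node 0 is just the example node from the numactl line above, and note that numa_set_preferred() is only a soft preference, while numa_set_membind() would be the strict equivalent of --membind):

#include <numa.h>       /* libnuma: numa_available, numa_run_on_node, ... */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "no NUMA support on this machine\n");
        return EXIT_FAILURE;
    }

    const int node = 0;   /* example node, like --cpubind=0 --membind=0 */

    /* Run this process only on the CPUs of 'node' (like --cpubind=0). */
    if (numa_run_on_node(node) != 0) {
        perror("numa_run_on_node");
        return EXIT_FAILURE;
    }

    /* Prefer allocating new pages from 'node'; numa_set_membind() would
       make this a hard binding instead of a preference. */
    numa_set_preferred(node);

    /* ... start the real search work here ... */
    return EXIT_SUCCESS;
}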

The topic Michael Kuzminsky addresses is very relevant in itself, because it really matters when a thread allocates a lot of memory.

Now I assume what I'm doing is, on paper, the ideal situation. If an AMD machine with 2 to 4 memory controllers
has, say, 4 GB of RAM, I give each process (RAM - 500 MB) / 4.

So that is quite a lot of RAM per process. That RAM gets hammered nonstop with stores of better ways to reach the holy grail: each core writes about 150k entries a second through its memory controller on my AMD dual Opteron dual-core 2.4 GHz machine (probably considered old, energy-wasting junk by most of you nowadays, but well).

If a CPU has, say, 750 MB of RAM, then getting to a loading factor alpha of 0.5 (ignoring the chaining that happens a lot, by the way; accepting the efficiency decrease from chaining is faster, thanks to how latency to RAM works) takes roughly 0.5 * 750 MB / (20 bytes * 0.15M entries/s) = 0.5 * 750 / 3 = 125 seconds.
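Instead of binding the whole process's memory policy, one can also place just that big table explicitly on the local node. A sketch with libnuma (again link with -lnuma; the 750 MB, the 20 bytes per entry and node 0 are simply the numbers from the paragraph above, not anything universal):

#include <numa.h>       /* numa_available, numa_alloc_onnode, numa_free */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "no NUMA support on this machine\n");
        return EXIT_FAILURE;
    }

    /* The numbers from the paragraph above: ~750 MB per process,
       20 bytes per entry.  Node 0 is just an example. */
    const int    node        = 0;
    const size_t entry_bytes = 20;
    const size_t table_bytes = 750UL * 1024 * 1024;
    const size_t nentries    = table_bytes / entry_bytes;

    unsigned char *table = numa_alloc_onnode(table_bytes, node);
    if (table == NULL) {
        fprintf(stderr, "numa_alloc_onnode failed\n");
        return EXIT_FAILURE;
    }

    /* Touch the pages so they really end up on 'node'. */
    memset(table, 0, table_bytes);

    printf("allocated %zu entries (%zu MB) on node %d\n",
           nentries, table_bytes >> 20, node);

    /* ... hammer the table with ~150k stores per second per core ... */

    numa_free(table, table_bytes);
    return EXIT_SUCCESS;
}

The memset is only there to fault the pages in, so they actually land on that node before the timing starts to matter.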

A game can last six hours or so.

So even a single switch to another memory node within those six hours is very bad.

I tried to lock each process to a different core with commands (4 processes, 4 cores). I still saw a flip sometimes.
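For what it's worth, this is roughly how I would check for such flips from inside the process itself (a sketch; sched_getcpu() needs _GNU_SOURCE and a reasonably recent glibc, and the one-second sampling interval is arbitrary, so quick back-and-forth migrations between samples would be missed):

#define _GNU_SOURCE
#include <sched.h>      /* sched_getcpu */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int last = sched_getcpu();
    printf("started on core %d\n", last);

    for (;;) {
        sleep(1);                      /* arbitrary sampling interval */
        int now = sched_getcpu();
        if (now != last) {
            printf("migrated: core %d -> core %d\n", last, now);
            last = now;
        }
    }
}

If I remember right, on kernels built with scheduler debugging /proc/<pid>/sched also keeps a migration counter, which would catch what this sampling misses.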

Now of course on big clusters/supers there is software support from the manufacturers that allows automatic migration of processes and memory, with good reason. So I guess that type of scheduling happens at a different level: avoiding the latency of RAM over the network, by scheduling
nodes closer to each other, is really important there.

Yet within one node it is a different story.

Suppose we've got 4 search processes P0..P3 and 4 cores C0..C3.

I am guessing this happens; please tell me it is wrong: some OS service gets a timeslice on C1, so search process P1 gets pushed back in the run queue. Then C2's timeslice on memory node 1 finishes and P2 gets pushed back in the FIFO queue. P1 is ahead of P2 in the queue,
    so P1 runs on C2.

What I want is in fact that P2 starts running on C2 again and P1 just stays in the queue until the service's timeslice has finished.
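The only way I see to be sure of that is to not give the kernel the choice at all, i.e. to hard-bind P0..P3 to C0..C3 when they are started. A sketch of such a launcher ("./searchprocess" is just a placeholder for whatever binary each Pi actually is):

#define _GNU_SOURCE
#include <sched.h>      /* sched_setaffinity, CPU_ZERO, CPU_SET */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    const int ncores = 4;             /* P0..P3 on C0..C3 */

    for (int i = 0; i < ncores; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            return EXIT_FAILURE;
        }
        if (pid == 0) {               /* child: becomes Pi, pinned to Ci */
            cpu_set_t mask;
            CPU_ZERO(&mask);
            CPU_SET(i, &mask);
            if (sched_setaffinity(0, sizeof(mask), &mask) != 0) {
                perror("sched_setaffinity");
                _exit(EXIT_FAILURE);
            }
            execl("./searchprocess", "searchprocess", (char *)NULL);
            perror("execl");          /* only reached if exec failed */
            _exit(EXIT_FAILURE);
        }
    }

    while (wait(NULL) > 0)            /* parent waits for all children */
        ;
    return EXIT_SUCCESS;
}

With the affinity mask set like this the kernel is not allowed to move Pi to another core, so if flips still show up after that, the problem is somewhere else (the tool that set the mask, or how the flips were measured).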

Note that the scheduling story above is based purely on guessing, built on a hundred assumptions about things I *thought* I saw; I didn't look into the kernel code for it. Compared to that, Perry is not paranoid at all.

At a further stage you only call someone paranoid when the paranoia is about future occasions, seeing some kind of ghost or bottleneck that is bound to be there because "it has to suck". RGB, don't give up yet! Do your public duty and make sure Perry speaks out on those subjects, as we all have to deal with them, some more than others! Maybe tempt him to post on how he deals with potential nuke builders?

Best Regards,
Vincent


_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
