> On 5 Nov 2014, at 10:12, Mike Belopuhov <m...@belopuhov.com> wrote:
>
> On 5 November 2014 00:38, David Gwynne <da...@gwynne.id.au> wrote:
>>
>>> On 30 Oct 2014, at 07:52, Ted Unangst <t...@tedunangst.com> wrote:
>>>
>>> On Wed, Oct 29, 2014 at 07:25, David Gwynne wrote:
>>>
>>>> i dunno. im fine with either removing colouring altogether or setting it
>>>> from something else completely. i just want a decision to be made cos
>>>> right now ph_color isnt set, which is a bug.
>>>
>>> there. i fixed it.
>>
>> looks like we were both ignorant and wrong. mikeb@ points out this from the
>> original slab paper:
>>
>> 4.1. Impact of Buffer Address Distribution on Cache Utilization
>>
>> The address distribution of mid-size buffers can affect the system's
>> overall cache utilization. In particular, power-of-two allocators -
>> where all buffers are 2^n bytes and are 2^n-byte aligned - are
>> pessimal. Suppose, for example, that every inode (~300 bytes) is
>> assigned a 512-byte buffer, 512-byte aligned, and that only the first
>> dozen fields of an inode (48 bytes) are frequently referenced. Then
>> the majority of inode-related memory traffic will be at addresses
>> between 0 and 47 modulo 512. Thus the cache lines near 512-byte
>> boundaries will be heavily loaded while the rest lie fallow. In
>> effect only 9% (48/512) of the cache will be usable by inodes.
>> Fully-associative caches would not suffer this problem, but current
>> hardware trends are toward simpler rather than more complex caches.
>>
>> 4.3. Slab Coloring
>>
>> The slab allocator incorporates a simple coloring scheme that
>> distributes buffers evenly throughout the cache, resulting in
>> excellent cache utilization and bus balance. The concept is simple:
>> each time a new slab is created, the buffer addresses start at a
>> slightly different offset (color) from the slab base (which is always
>> page-aligned). For example, for a cache of 200-byte objects with
>> 8-byte alignment, the first slab's buffers would be at addresses 0,
>> 200, 400, ... relative to the slab base. The next slab's buffers
>> would be at offsets 8, 208, 408, ... and so on. The maximum slab
>> color is determined by the amount of unused space in the slab.
>>
>> we run on enough different machines that i think we should consider this.
>
> well, first of all, right now this is a rather theoretical gain. we need
> to test it to understand if it actually makes a difference. to see cache
> statistics we can use performance counters, however the current pctr code
> might be a bit out of date.

pctr is x86 specific though. how would you measure on all the other archs?
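to make the paper's scheme concrete, here is a rough userland sketch of a
per-pool running colour. the struct and names are invented for illustration
(this is not the real subr_pool.c layout); the idea is just that each new
page starts its items at a slightly different offset, stepping by the
alignment and wrapping within the page's unused space.

/*
 * illustrative sketch only: toycolor_next() advances a per-pool
 * running colour the way the paper describes. "struct toypool"
 * and its fields are made-up names, not the real pool structures.
 */
struct toypool {
	unsigned int	pr_align;	/* item alignment (or cacheline size) */
	unsigned int	pr_slack;	/* unused bytes at the end of a page */
	unsigned int	pr_color;	/* running colour, advances per page */
};

unsigned int
toycolor_next(struct toypool *pp)
{
	unsigned int color = pp->pr_color;

	/* step by the alignment, wrapping within the slack */
	pp->pr_color += pp->pr_align;
	if (pp->pr_color > pp->pr_slack)
		pp->pr_color = 0;

	/* items on the new page then start at page base + color */
	return (color);
}

with pr_align 8, successive pages get colours 0, 8, 16, ... which is what
produces the 0/200/400 then 8/208/408 item offsets in the paper's example.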
>> so the question is if we do bring colouring back, how do we calculate it?
>> arc4random? mask bits off ph_magic? atomic_inc something in the pool?
>> read a counter from the pool? shift bits off the page address?
>
> the way i read it is that you have a per-pool running value pr_color that
> you increment by the item alignment or the native cache line size, modulo
> the space available, for every page you are getting from uvm. however i
> can see that it might cause a problem for locating a page header (or was
> it the page boundary? don't have the code at hand) using simple math.

the stuff that finds a page header for a page doesnt care about the address
of individual items within a page, and colouring doesnt change an item being
wholly contained within a page. ive run with arc4random_uniform coloured
addresses for a couple of weeks now without problems of that nature.
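for reference, a random colour along the lines of the arc4random_uniform
approach mentioned above could look something like the sketch below. the
names are the same invented ones as in the earlier sketch, and this uses
libc's arc4random_uniform(3) so it compiles in userland; it is not the
actual diff. picking a multiple of the alignment no larger than the slack
keeps every item wholly contained in the page.

#include <stdlib.h>	/* arc4random_uniform(3) */

/*
 * illustrative sketch only: pick a random colour for a new page
 * instead of a running one. toypool/pr_align/pr_slack are the same
 * made-up names as above.
 */
unsigned int
toycolor_random(struct toypool *pp)
{
	unsigned int ncolors;

	if (pp->pr_align == 0 || pp->pr_slack < pp->pr_align)
		return (0);

	/* number of distinct colours that fit in the slack, including 0 */
	ncolors = (pp->pr_slack / pp->pr_align) + 1;

	return (arc4random_uniform(ncolors) * pp->pr_align);
}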