You sure that's 1.4.24? None of those fail for me :(
On Mon, 3 Aug 2015, Scott Mansfield wrote:
> The command line I've used that will start is:
>
> memcached -m 64 -o slab_reassign,slab_automove
>
>
> the ones that fail are:
>
>
> memcached -m 64 -o slab_reassign,slab_automove,lru_crawler,lru_maintainer
>
> memcached -o lru_crawler
>
>
> I'm sure I've missed something during compile, though I just used ./configure
> and make.
>
>
> On Monday, August 3, 2015 at 12:22:33 AM UTC-7, Scott Mansfield wrote:
> I've attached a pretty simple program to connect, fill a slab with
> data, and then fill another slab slowly with data of a different size. I've
> been trying to get memcached to run with the lru_crawler and lru_maintainer
> flags, but I get 'Illegal suboption "(null)"' every time I try to start with
> either in any configuration.
>
>
> I haven't seen it start to move slabs automatically with a freshly
> installed 1.4.24.
>
>
> On Tuesday, July 21, 2015 at 4:55:17 PM UTC-7, Scott Mansfield wrote:
> I realize I've not given you the tests to reproduce the behavior.
> I should be able to soon. Sorry about the delay here.
> In the meantime, I wanted to bring up a possible secondary use of the same
> logic used to move items during slab rebalancing. I think the system might
> benefit from using that logic to crawl the pages in a slab and compact the
> data in the background. Where memory is assigned to a slab but sits unused
> because of replaced or TTL'd out data, returning it to a pool of free memory
> would let a slab class grow with that memory first, instead of waiting for an
> event where memory is needed at that instant.
>
> It's a change in approach, from reactive to proactive. What do you think?
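>
> To make that concrete, here's a rough, self-contained toy of the bounded
> background pass I'm imagining (not memcached code; the structures, names,
> and limits are all made up):
>
>     /* Toy sketch only: hypothetical structures, not memcached internals. */
>     #include <stdbool.h>
>     #include <stddef.h>
>     #include <time.h>
>
>     #define CHUNKS_PER_PAGE 64
>     #define SCAN_PER_PASS    8   /* bounded work per call, so it stays O(1) */
>
>     enum chunk_state { CHUNK_LIVE, CHUNK_DEAD, CHUNK_FREE };
>
>     struct chunk {
>         enum chunk_state state;   /* DEAD == replaced or TTL'd out copy */
>         time_t exptime;           /* 0 == never expires */
>     };
>
>     struct page {
>         struct chunk chunks[CHUNKS_PER_PAGE];
>         size_t cursor;            /* where the previous pass stopped */
>         size_t free_count;        /* chunks handed back to the class pool */
>     };
>
>     /* One bounded pass: return dead memory to the free pool proactively,
>      * instead of waiting until a write needs a chunk right at that moment. */
>     void compact_pass(struct page *p, time_t now) {
>         for (size_t i = 0; i < SCAN_PER_PASS; i++) {
>             struct chunk *c = &p->chunks[p->cursor];
>             p->cursor = (p->cursor + 1) % CHUNKS_PER_PAGE;
>             if (c->state == CHUNK_LIVE && c->exptime != 0 && c->exptime < now)
>                 c->state = CHUNK_DEAD;        /* expired in place */
>             if (c->state == CHUNK_DEAD) {
>                 c->state = CHUNK_FREE;        /* reclaim it now */
>                 p->free_count++;
>             }
>         }
>     }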
>
> On Monday, July 13, 2015 at 5:54:11 PM UTC-7, Dormando wrote:
> > First, more detail for you:
> >
> > We are running 1.4.24 in production and haven't noticed any bugs as of yet.
> > The new LRUs seem to be working well, though we nearly always run memcached
> > scaled to hold all data without evictions. Those with evictions are behaving
> > well. Those without evictions haven't seen crashing or any other noticeable
> > bad behavior.
>
> Neat.
>
> >
> > OK, I think I see an area where I was speculating on functionality. If you
> > have a key in slab 21 and then the same key is written again at a larger
> > size in slab 23, I assumed that the space in 21 was not freed on the second
> > write. With that assumption, the LRU crawler would not free up that space.
> > Also, just by observation at the macro level, the space is not freed fast
> > enough to be effective, in our use case, to accept the writes that are
> > happening. Think in the hundreds of millions of "overwrites" in a 6-10 hour
> > period across a cluster.
>
> Internally, "items" (a key/value pair) are generally immutable. The only
> time they're not is for INCR/DECR, and it still becomes immutable if two
> INCR/DECRs collide.
>
> What this means is that the new item is staged in a piece of free memory
> while the "upload" stage of the SET happens. When memcached has all of the
> data in memory to replace the item, it does an internal swap under a lock.
> The old item is removed from the hash table and LRU, and the new item gets
> put in its place (at the head of the LRU).
>
> Since items are refcounted, this means that if other users are downloading
> an item which just got replaced, their memory doesn't get corrupted by the
> item changing out from underneath them. They can continue to read the old
> item until they're done. When the refcount reaches zero the old memory is
> reclaimed.
>
> Most of the time, the item replacement happens and the old memory is
> immediately reclaimed.
>
> However, this does mean that you need *one* piece of free memory to
> replace the old one. Then the old memory gets freed after that set.
>
> So if you take a memcached instance with 0 free chunks, and do a rolling
> replacement of all items (within the same slab class as before), the first
> one would cause an eviction from the tail of the LRU to get a free chunk.
> Every SET after that would use the chunk freed by the previous replacement.
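>
> In toy form (hypothetical names and structures, nothing like the real
> locking), the replacement flow is roughly:
>
>     /* Toy model of refcounted item replacement, not memcached source. */
>     #include <pthread.h>
>     #include <stdlib.h>
>
>     struct item {
>         int   refcount;   /* readers holding this copy, plus the hash table */
>         char *data;
>     };
>
>     static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;
>     static struct item *current_item;  /* what the hash table points at */
>
>     /* Readers call this when done; memory is reclaimed only at zero. */
>     void item_release(struct item *it) {
>         pthread_mutex_lock(&cache_lock);
>         int remaining = --it->refcount;
>         pthread_mutex_unlock(&cache_lock);
>         if (remaining == 0) {
>             free(it->data);   /* old chunk goes back to the pool */
>             free(it);
>         }
>     }
>
>     /* A SET: new_it is already staged in one free chunk. The swap happens
>      * under the lock; the old item is freed only once readers are done. */
>     void item_replace(struct item *new_it) {
>         pthread_mutex_lock(&cache_lock);
>         struct item *old = current_item;
>         new_it->refcount = 1;        /* the hash table's own reference */
>         current_item = new_it;       /* unlink old, link new at LRU head */
>         pthread_mutex_unlock(&cache_lock);
>         if (old != NULL)
>             item_release(old);       /* usually frees immediately */
>     }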
>
> > After that last sentence I realized I also may not have explained well
> > enough the access pattern. The keys are all overwritten every day, but it
> > takes some time to write them all (obviously). We see a huge increase in
> > the bytes metric as if the new data for the old keys was being written for
> > the first time. Since the "old" slab for the same key doesn't proactively
> > release memory, it starts to fill up the cache and then starts evicting
> > data in the new slab. Once that happens, we see evictions in the old slab
> > because of the algorithm you mentioned (random picking / freeing of
> > memory). Typically we don't see any use for "upgrading" an item as the new
> > data would be entirely new and should wholesale replace the old data for
> > that key. More specifically, the operation is always set, with different
> > data each day.
>
> Right. Most of your problems will come from two areas. One is that when you
> write data aggressively into the new slab class (unless you set the
> rebalancer to always-replace mode), the mover will make memory available
> more slowly than you can insert it, so you'll cause extra evictions in the
> new slab class.
>
> The secondary problem is the random evictions in the previous slab class as
> stuff is chucked on the floor to make memory movable.
>
> > As for testing, we'll be able to put it under real production workload. I
> > don't know what kind of data you mean you need for testing. The data stored
> > in the caches are highly confidential. I can give you all kinds of metrics,
> > since we collect most of the ones that are in the stats and some from the
> > stats slabs output. If you have some specific ones that need collecting,
> > I'll double check and make sure we can get those. Alternatively, it might
> > be most beneficial to see the metrics in person :)
>
> I just need stats snapshots here and there, and actually putting the thing
> under load. When I did the LRU work I had to beg for several months before
> anyone tested it with a production load. This slows things down and
> demotivates me from working on the project.
>
> Unfortunately my dayjob keeps me pretty busy, so ~internet~ would probably
> be best.
>
> > I can create a driver program to reproduce the behavior on a smaller
> > scale. It would write e.g. 10k keys of 10k size, then rewrite the same keys
> > with different size data. I'll work on that and post it to this thread when
> > I can reproduce the behavior locally.
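> >
> > Roughly, I'm picturing something like this self-contained sketch (host,
> > port, key names, TTL, and value sizes are just placeholders):
> >
> >     /* Rough reproduction driver: fill keys at one size, then rewrite the
> >      * same keys at a larger size so they land in a different slab class. */
> >     #include <arpa/inet.h>
> >     #include <netinet/in.h>
> >     #include <stdio.h>
> >     #include <stdlib.h>
> >     #include <string.h>
> >     #include <sys/socket.h>
> >     #include <unistd.h>
> >
> >     #define NKEYS       10000
> >     #define FIRST_SIZE  10240   /* ~10KB values on the first pass  */
> >     #define SECOND_SIZE 15360   /* ~15KB values on the second pass */
> >
> >     static int connect_mc(const char *host, int port) {
> >         int fd = socket(AF_INET, SOCK_STREAM, 0);
> >         struct sockaddr_in sa;
> >         memset(&sa, 0, sizeof(sa));
> >         sa.sin_family = AF_INET;
> >         sa.sin_port = htons(port);
> >         inet_pton(AF_INET, host, &sa.sin_addr);
> >         if (connect(fd, (struct sockaddr *)&sa, sizeof(sa)) != 0) {
> >             perror("connect");
> >             exit(1);
> >         }
> >         return fd;
> >     }
> >
> >     /* Issue one ASCII "set" and read back the reply (e.g. STORED). */
> >     static void set_key(int fd, int n, const char *val, size_t len) {
> >         char hdr[128], reply[128];
> >         int hlen = snprintf(hdr, sizeof(hdr),
> >                             "set key:%d 0 604800 %zu\r\n", n, len);
> >         write(fd, hdr, hlen);
> >         write(fd, val, len);
> >         write(fd, "\r\n", 2);
> >         read(fd, reply, sizeof(reply));
> >     }
> >
> >     static void fill(int fd, size_t len) {
> >         char *val = malloc(len);
> >         memset(val, 'x', len);
> >         for (int n = 0; n < NKEYS; n++)
> >             set_key(fd, n, val, len);
> >         free(val);
> >     }
> >
> >     int main(void) {
> >         int fd = connect_mc("127.0.0.1", 11211);
> >         fill(fd, FIRST_SIZE);    /* lands in one slab class           */
> >         fill(fd, SECOND_SIZE);   /* same keys rewritten, larger class */
> >         close(fd);
> >         return 0;
> >     }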
>
> Ok. There're slab rebalance unit tests in the t/ directory which do things
> like this, and I've used mc-crusher to slam the rebalancer. It's pretty easy
> to run one config to load up 10k objects, then flip to the other using the
> same key namespace.
>
> > Thanks,
> > Scott
> >
> > On Saturday, July 11, 2015 at 12:05:54 PM UTC-7, Dormando wrote:
> > Hey,
> >
> > On Fri, 10 Jul 2015, Scott Mansfield wrote:
> >
> > > We've seen issues recently where we run a cluster that typically has the
> > > majority of items overwritten in the same slab every day, and a sudden
> > > change in data size evicts a ton of data, affecting downstream systems.
> > > To be clear, that is our problem, but I think there's a tweak in
> > > memcached that might be useful and another possible feature that would
> > > be even better.
> > >
> > > The data that is written to this cache is overwritten every day, though
> > > the TTL is 7 days. One slab takes up the majority of the space in the
> > > cache. The application wrote e.g. 10KB (slab 21) every day for each key
> > > consistently. One day, a change occurred where it started writing 15KB
> > > (slab 23), causing a migration of data from one slab to another. We had
> > > -o slab_reassign,slab_automove=1 set on the server, causing large numbers
> > > of evictions on the initial slab. Let's say the cache could hold the data
> > > at 15KB per key, but the old data was not technically TTL'd out in its
> > > old slab. This means that memory was not being freed by the lru crawler
> > > thread (I think) because its expiry had not come around.
> > >
> > > lines 1199 and 1200 in items.c:
> > > if ((search->exptime != 0 && search->exptime < current_time)
> || is_flushed(search)) {
> > >
> > > If there was a check to see if this data was "orphaned," i.e. that the
> > > key, if accessed, would map to a different slab than the current one,
> > > then these orphans could be reclaimed as free memory. I am working on a
> > > patch to do this, though I have reservations about performing a hash on
> > > the key on the lru crawler thread (if the hash is not already available).
> > >
> > > I have very little experience in the memcached codebase so I don't know
> > > the most efficient way to do this. Any help would be appreciated.
> >
> > There seems to be a misconception about how the slab classes work. A key,
> > if already existing in a slab, will always map to the slab class it
> > currently fits into. The slab classes always exist, but the amount of
> > memory reserved for each of them will shift with slab_reassign. ie: 10
> > pages in slab class 21, then memory pressure on 23 causes it to move over.
> >
> > So if you examine a key that still exists in slab class 21, it has no
> > reason to move up or down the slab classes.
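> >
> > As a toy illustration (assumed defaults: ~96-byte smallest chunk, 1.25
> > growth factor, 1MB pages; the class numbering here won't line up exactly
> > with the real stats output): the class chunk sizes are fixed at startup,
> > and an item's total size alone decides which class it lives in. Moving
> > pages only changes how much memory each class owns.
> >
> >     /* Toy version of the size -> slab class mapping. */
> >     #include <stdio.h>
> >
> >     #define MAX_CLASSES 64
> >     #define PAGE_SIZE   (1024 * 1024)
> >
> >     static unsigned int class_size[MAX_CLASSES];
> >     static int nclasses;
> >
> >     static void init_classes(void) {
> >         double size = 96.0;                   /* assumed smallest chunk */
> >         while (nclasses < MAX_CLASSES && size <= PAGE_SIZE / 2) {
> >             class_size[nclasses++] = (unsigned int)size;
> >             size *= 1.25;                     /* assumed growth factor */
> >         }
> >     }
> >
> >     /* An item of a given total size always lands in the first class whose
> >      * chunk can hold it, regardless of which classes own the most pages. */
> >     static int clsid_for(unsigned int total_size) {
> >         for (int i = 0; i < nclasses; i++)
> >             if (total_size <= class_size[i])
> >                 return i;
> >         return -1;                            /* too big for any class */
> >     }
> >
> >     int main(void) {
> >         init_classes();
> >         printf("10KB item -> class %d (chunk %u)\n",
> >                clsid_for(10 * 1024), class_size[clsid_for(10 * 1024)]);
> >         printf("15KB item -> class %d (chunk %u)\n",
> >                clsid_for(15 * 1024), class_size[clsid_for(15 * 1024)]);
> >         return 0;
> >     }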
> >
> > > Alternatively, and possibly more beneficial, is compaction of data in a
> > > slab using the same set of criteria as lru crawling. Understandably,
> > > compaction is a very difficult problem to solve since moving the data
> > > would be a pain in the ass. I saw a couple of discussions about this on
> > > the mailing list, though I didn't see any firm thoughts about it. I think
> > > it can probably be done in O(1) like the lru crawler by limiting the
> > > number of items it touches each time. Writing and reading are doable in
> > > O(1) so moving should be as well. Has anyone given more thought to
> > > compaction?
> >
> > I'd be interested in hacking this up for you folks if you can provide me
> > testing and some data to work with. With all of the LRU work I did in
> > 1.4.24, the next thing I wanted to do is a big improvement on the slab
> > reassignment code.
> >
> > Currently it picks essentially a random slab page, empties it, and moves
> > the slab page into the class under pressure.
> >
> > One thing we can do is first examine for free memory in the existing
> > slab, IE:
> >
> > - Take a page from slab 21
> > - Scan the page for valid items which need to be moved
> > - Pull free memory from slab 21, migrate the item (moderately complicated)
> > - When the page is empty, move it (or give up if you run out of free
> >   chunks).
> >
> > The next step is to pull from the LRU on slab 21 (a rough toy sketch
> > follows the list):
> >
> > - Take page from slab 21
> > - Scan page for valid items
> > - Pull free memory from slab 21, migrate the item
> > - If no memory free, evict tail of slab 21. use that chunk.
> > - When the page is empty, move it.
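> >
> > In toy form (hypothetical structures; the real mover is more involved),
> > that loop is roughly:
> >
> >     /* Toy model of evacuating one page so it can move to another class. */
> >     #include <stdbool.h>
> >     #include <stddef.h>
> >
> >     #define CHUNKS_PER_PAGE 64
> >     #define CLASS_CHUNKS   256
> >
> >     struct chunk { bool live; };
> >
> >     /* One slab class: a pool of chunks, ordered oldest-first so that
> >      * the "oldest" index plays the role of the LRU tail in this toy. */
> >     struct slabclass {
> >         struct chunk chunks[CLASS_CHUNKS];
> >         size_t oldest;                       /* next eviction candidate */
> >     };
> >
> >     /* Find a free destination chunk in the same class; if none, evict
> >      * the least-recently-used one and reuse its memory. */
> >     static struct chunk *get_dst(struct slabclass *c) {
> >         for (size_t i = 0; i < CLASS_CHUNKS; i++)
> >             if (!c->chunks[i].live)
> >                 return &c->chunks[i];        /* free memory first */
> >         struct chunk *victim = &c->chunks[c->oldest];
> >         c->oldest = (c->oldest + 1) % CLASS_CHUNKS;
> >         victim->live = false;                /* evict the LRU tail */
> >         return victim;
> >     }
> >
> >     /* Empty one page so it can be handed to the class under pressure. */
> >     static void evacuate_page(struct slabclass *c,
> >                               struct chunk page[CHUNKS_PER_PAGE]) {
> >         for (size_t i = 0; i < CHUNKS_PER_PAGE; i++) {
> >             if (!page[i].live)
> >                 continue;
> >             struct chunk *dst = get_dst(c);  /* same-class destination */
> >             dst->live = true;                /* "migrate" the item */
> >             page[i].live = false;            /* source chunk is now free */
> >         }
> >     }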
> >
> > Then, when you hit this condition your least-recently-used data gets
> > culled as new data migrates your page class. This should match a natural
> > occurrence if you would already be evicting valid (but old) items to make
> > room for new items.
> >
> > A bonus to using the free memory trick is that I can use the amount of
> > free space in a slab class as a heuristic to more quickly move slab pages
> > around.
> >
> > If it's still necessary from there, we can explore "upgrading" items to a
> > new slab class, but that is much much more complicated since the item has
> > to shift LRUs. Do you put it at the head, the tail, the middle, etc? It
> > might be impossible to make a good generic decision there.
> >
> > What version are you currently on? If 1.4.24, have you seen any
> > instability? I'm currently torn between fighting a few bugs and starting
> > on improving the slab rebalancer.
> >
> > -Dormando
> >
>
> --
>
> ---
> You received this message because you are subscribed to the Google Groups
> "memcached" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>
>