We recently had problems on a cluster where the majority of items live in
the same slab class and are overwritten every day. A sudden change in data
size evicted a ton of data, which affected downstream systems. To be clear,
that is our problem, but I think there's a tweak to memcached that might be
useful, and another possible feature that would be even better.

The data written to this cache is overwritten every day, though the TTL is
7 days. One slab class takes up the majority of the space in the cache.
The application consistently wrote e.g. 10KB (slab 21) every day for each
key. One day, a change caused it to start writing 15KB (slab 23), which
migrated the data from one slab to another. We had -o
slab_reassign,slab_automove=1 set on the server, which caused large numbers
of evictions on the initial slab. Let's say the cache could hold the data at
15KB per key, but the old data had not technically expired in its old
slab. This means that memory was not being freed by the lru crawler thread
(I think) because its expiry had not come around.
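
To make the size-to-slab mapping described above concrete, here is a minimal sketch of how a value size can select a slab class when chunk sizes grow geometrically. The names slab_table, slab_table_init and slab_class_for are hypothetical, and the parameters passed in (first chunk size, growth factor, maximum item size) are assumptions; the actual class numbers on a given server depend on its settings and version, so this will not necessarily reproduce the 21 and 23 in the example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CLASSES 64
#define CHUNK_ALIGN 8

/* Hypothetical model of a slab class table; not memcached's own structures. */
typedef struct {
    size_t chunk_size[MAX_CLASSES]; /* index 0 is unused; class ids start at 1 */
    int    count;                   /* highest valid class id */
} slab_table;

/* Fill the table with chunk sizes growing by `factor`, capped at max_item. */
static void slab_table_init(slab_table *t, size_t first_chunk,
                            double factor, size_t max_item)
{
    assert(first_chunk > 0 && factor > 1.0);
    size_t size = first_chunk;
    int id = 1;
    while (id < MAX_CLASSES - 1 && size <= max_item / factor) {
        /* Round each chunk size up to the alignment boundary. */
        if (size % CHUNK_ALIGN)
            size += CHUNK_ALIGN - (size % CHUNK_ALIGN);
        t->chunk_size[id++] = size;
        size = (size_t)(size * factor);
    }
    t->chunk_size[id] = max_item;   /* the last class holds the largest items */
    t->count = id;
}

/* Return the smallest class whose chunk fits nbytes, or 0 if none does. */
static int slab_class_for(const slab_table *t, size_t nbytes)
{
    for (int id = 1; id <= t->count; id++)
        if (nbytes <= t->chunk_size[id])
            return id;
    return 0;
}
```

Under these assumptions, a value that grows from 10KB to 15KB no longer fits the chunk it used before, so every rewritten key lands in a higher class while the pages of the old class stay assigned to it.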

Lines 1199 and 1200 in items.c:

    if ((search->exptime != 0 && search->exptime < current_time) ||
        is_flushed(search)) {

If there were a check to see whether this data was "orphaned," i.e. that the
key, if accessed, would map to a different slab than the current one, then
these orphans could be reclaimed as free memory. I am working on a patch to
do this, though I have reservations about hashing the key on the lru crawler
thread (if the hash is not already available). I have very little
experience with the memcached codebase, so I don't know the most efficient
way to do this. Any help would be appreciated.
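
A rough sketch of the orphan check proposed above, written against a toy model rather than memcached's real data structures. Everything here is hypothetical: toy_item, toy_lookup (a linear scan standing in for the hash-table lookup), toy_is_orphaned and toy_should_reclaim are illustration names only. It also assumes, as this post does, that a stale copy of a key can remain in its old slab class after the key has been rewritten into a different one.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical, simplified item; memcached's real item struct differs. */
typedef struct toy_item {
    const char *key;
    int         slab_class; /* slab class this copy is stored in */
    time_t      exptime;    /* 0 means the item never expires */
} toy_item;

/* Stand-in for the hash-table lookup: find the live item for a key. */
static const toy_item *toy_lookup(const toy_item *const *live, size_t n,
                                  const char *key)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(live[i]->key, key) == 0)
            return live[i];
    return NULL;
}

/* An item is "orphaned" if its key now resolves to a different item
 * that lives in a different slab class. */
static bool toy_is_orphaned(const toy_item *it,
                            const toy_item *const *live, size_t n)
{
    const toy_item *cur = toy_lookup(live, n, it->key);
    return cur != NULL && cur != it && cur->slab_class != it->slab_class;
}

/* The existing expiry test, extended with the proposed orphan test. */
static bool toy_should_reclaim(const toy_item *it, time_t now,
                               const toy_item *const *live, size_t n)
{
    bool expired = it->exptime != 0 && it->exptime < now;
    return expired || toy_is_orphaned(it, live, n);
}
```

The lookup is the part that costs a hash per crawled item in a real implementation, which is the concern raised above.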

Alternatively, and possibly more beneficial, is compaction of data in a slab
using the same set of criteria as lru crawling. Understandably, compaction
is a very difficult problem to solve, since moving the data would be a pain
in the ass. I saw a couple of discussions about this on the mailing list,
though I didn't see any firm conclusions. I think it can probably be done in
O(1) per pass, like the lru crawler, by limiting the number of items it
touches each time. Writing and reading are doable in O(1), so moving should
be as well. Has anyone given more thought to compaction?
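
To illustrate the "limit the number of items touched each time" idea, here is a minimal sketch of a resumable compaction step over a toy array of slots. The types toy_slot and toy_compactor and the function toy_compact_step are hypothetical. A real implementation would also have to fix up hash-table and LRU links under the appropriate locks when it moves an item, which this sketch leaves out entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical slot in a slab page: either holds a live item or is free. */
typedef struct {
    bool live;
    int  payload;
} toy_slot;

/* Cursor state kept between steps so the work can be spread out. */
typedef struct {
    toy_slot *slots;
    size_t    n;     /* total number of slots */
    size_t    scan;  /* next slot to examine */
    size_t    dest;  /* next position to pack a live slot into */
} toy_compactor;

/* Examine at most `budget` slots, packing live ones toward the front.
 * Returns true once every slot has been examined. */
static bool toy_compact_step(toy_compactor *c, size_t budget)
{
    for (size_t i = 0; i < budget && c->scan < c->n; i++) {
        if (c->slots[c->scan].live) {
            if (c->dest != c->scan) {
                c->slots[c->dest] = c->slots[c->scan]; /* move the item */
                c->slots[c->scan].live = false;        /* free the old slot */
            }
            c->dest++;
        }
        c->scan++;
    }
    return c->scan == c->n;
}
```

Each call does a bounded amount of work set by the budget, so the cost per call stays constant while the total cost of a full pass is still linear in the number of slots.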