I think this issue has something to do with our access pattern (although we
run a very limited set of commands and traffic is not very high either).

We always start having issues on the same instance (I guess because the
system is accessing a specific key). When we notice the issue we bounce the
instance within 15-20 minutes; I don't know whether you think that is not
enough time for it to recover.

Sometimes the issue "moves" to other instances on other servers (our client
doesn't rebalance, so the system is trying to access completely different
keys). On the other servers the issue sometimes goes away on its own, or the
spike does not reach 100%.
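
For anyone trying to catch the same kind of spike, here is a minimal sketch of the per-minute polling discussed in this thread. It assumes the plain-text protocol on the default port 11211; the helper names (parse_stats, fetch_stats, per_second) are made up for illustration, and the counter names (cmd_get, cmd_set, cmd_flush) are the ones I understand the general "stats" command to report. It only covers flushes; as noted in the reply below, there does not seem to be a counter for "stats sizes" or "stats cachedump".

```python
import socket


def parse_stats(raw: str) -> dict:
    """Parse the 'STAT <name> <value>' lines of a memcached stats reply."""
    stats = {}
    for line in raw.splitlines():
        parts = line.split(" ", 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def fetch_stats(host: str, port: int = 11211, timeout: float = 2.0) -> dict:
    """Send 'stats' over the text protocol and read until the END marker."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"stats\r\n")
        buf = b""
        while not buf.endswith(b"END\r\n"):
            chunk = sock.recv(4096)
            if not chunk:  # server closed the connection early
                break
            buf += chunk
    return parse_stats(buf.decode("ascii", "replace"))


def per_second(prev: dict, curr: dict, name: str, interval_s: float = 60.0) -> float:
    """Turn two samples of a cumulative counter into a per-second rate."""
    return (int(curr[name]) - int(prev[name])) / interval_s


if __name__ == "__main__":
    import time

    prev = fetch_stats("127.0.0.1")  # hypothetical host, adjust as needed
    time.sleep(60)
    curr = fetch_stats("127.0.0.1")
    for name in ("cmd_get", "cmd_set", "cmd_flush"):
        print(name, per_second(prev, curr, name))
```

A non-zero cmd_flush rate around the time of a spike would point at someone issuing flushes. The same conversion applies to the figures quoted further down: 1.8 million gets in one minute is 30,000 gets per second.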
On Aug 7, 2014 6:36 PM, "dormando" <[email protected]> wrote:

> Those three stats commands aren't problematic. The others I listed are.
> Sadly there aren't stats counters for them, I think... Are you sure it's
> not completely crashing after the CPU spike? It actually recovers on its
> own?
>
> On Thu, 7 Aug 2014, Claudio Santana wrote:
>
> >
> > I run stats, stats items and stats slabs every minute.
> >
> > The only commands executed are remove, incr, add, get, set and cas.
> >
> > I'm now running 6 threads per instance, with 3 instances per server, and
> > haven't had the issue again, though I can't say this change fixed it.
> >
> > I'll definitely update.
> >
> > On Aug 7, 2014 6:13 PM, "dormando" <[email protected]> wrote:
> >       Please upgrade. If you have problems with the latest version we can look
> >       into it more.
> >
> >       You can also look at command counters for odd commands being given: make
> >       sure nobody's running flushes, or "stats sizes", or "stats cachedump"
> >       since those can cause CPU spikes and hangs.
> >
> >       With 1.4.20 you can use "stats conns" to see what the connections are
> >       doing during the cpu spike.
> >
> >       On Thu, 7 Aug 2014, Claudio Santana wrote:
> >
> >       > Forgot to say I'm running version 1.4.13 with libevent 2.0.16-stable
> >       >
> >       >
> >       >
> >       > On Thu, Aug 7, 2014 at 6:08 PM, Claudio Santana <[email protected]> wrote:
> >       >       Sorry for the late response.
> >       >
> >       > My CPU utilization normally is min 2.5% to 6.5% max.
> >       >
> >       > So it's interesting you ask this. The reason why I submitted the 1st question is because I've experienced some random CPU
> >       > utilization spikes. From about 6% CPU utilization, all of a sudden it spikes to 100% and I can see the offending
> >       > process is one of the Memcached instances. Sadly this CPU spike is accompanied by all requests timing out, causing the
> >       > whole system to become unusable.
> >       >
> >       > I collect minute-by-minute stats of all these memcached instances and according to my stats this issue happens within 2
> >       > minutes. I can see there's no increase in the number of commands being issued right before the CPU
> >       > spike, nor an increase in the number of bytes in/out.
> >       >
> >       > Does anybody have any ideas of what could be going on?
> >       >
> >       > I have all Memcached stats collected by minute in Graphite; I can provide other stats that could help explain this issue
> >       > if necessary.
> >       >
> >       >
> >       > On Mon, Aug 4, 2014 at 9:36 PM, dormando <[email protected]> wrote:
> >       >       You could run one instance with one thread and serve all of that just
> >       >       fine. Have you actually looked at graphs of the CPU usage of the host?
> >       >       memcached should be practically idle with load that low.
> >       >
> >       >       One with -t 6 or -t 8 would do it just fine.
> >       >
> >       >       On Mon, 4 Aug 2014, Claudio Santana wrote:
> >       >
> >       >       > Dormando, thanks for the quick response. Sorry for the confusion, I don't have exact metrics per second, but
> >       >       > per minute it is 1.12 million sets and 1.8 million gets, which translates to about
> >       >       > 18,666 sets per second and 30,000 gets per second.
> >       >       >
> >       >       > These stats are per Memcached instance, of which I currently run 3 on each server.
> >       >       >
> >       >       > Claudio.
> >       >       >
> >       >       >
> >       >       > On Mon, Aug 4, 2014 at 6:22 PM, dormando <[email protected]> wrote:
> >       >       >       On Mon, 4 Aug 2014, Claudio Santana wrote:
> >       >       >
> >       >       >       > I have this Memcached cluster where 3 instances of Memcached run in a single server. These servers have 24 cores, and each instance
> >       >       >       > is configured to have 8 threads. Each individual instance serves about 5000G gets/sets a day and has about 3k current
> >       >       >       > connections.
> >       >       >
> >       >       > I don't know what "5000G gets/sets a day" translates to in per-second (nor
> >       >       > what the G-unit even is?), can you define this?
> >       >       >
> >       >       > > What would be better? consolidate these 3 instances to a single instance per server with 24 threads? I've read in a few articles
> >       >       > > that Memcached's performance starts suffering with more than 4-6 threads per instance, is this generally true?
> >       >       > >
> >       >       > > How about keeping the 3 instances per server and decreasing the number of threads to say 4 or 6? or creating 4 instances in the
> >       >       > > same servers instead of 3 and decreasing the number of threads per instance to 6 so there is one thread per core.
> >       >       > >
> >       >       > > Is there a guide you could recommend to configure the right number of threads and strategies to get the most out of a Memcached
> >       >       > > server/instance?
> >       >       > >
> >       >       > > Thanks,
> >       >       > > Claudio
> >       >       > >
> >       >       > > --
> >       >       > >
> >       >       > > ---
> >       >       > > You received this message because you are subscribed to the Google Groups "memcached" group.
> >       >       > > To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
> >       >       > > For more options, visit https://groups.google.com/d/optout.
> >       >       > >
> >       >       > >
> >       >       >
> >       >

