The client is the EVCache client jar: https://github.com/netflix/evcache

When a user calls the batch get function on the client, the client spreads
that batch out over many servers, since it hashes each key to a different
server. Imagine many of these batch gets happening at the same time,
though, and each server's queue gets intermixed gets from many different
user-facing batch calls. These client-side read queues are rather large
(10,000 entries) and might end up sending a batch of a few hundred keys at
a time. These large batches are sent off to the servers as "one"
getq|getq|getq|getq|getq|getq|getq|getq|getq|getq|noop
packet and the responses are read back in that order. We read the
responses fairly efficiently internally, but the batch get call the user
made is waiting on the data from all of these separate servers to come
back before it can respond to the user synchronously.

Now on the memcached side, there are many servers all seeing this same
pattern of many large batch gets. Memcached will stop responding to a
connection after 20 requests on the same event and go serve other
connections. When that happens, any user-facing batch call waiting on a
getq command still queued on that connection can be delayed. It doesn't
normally end up causing timeouts, but it does at a low rate.
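A toy model (not memcached's actual event loop) of why that per-event limit interacts badly with large pipelined batches:

```python
def serve_events(pending, reqs_per_event):
    """Toy model of memcached's -R (reqs_per_event) limit: a connection
    is served at most reqs_per_event requests per event before the
    worker yields to other connections. Returns how many event-loop
    passes are needed to drain one pipelined batch."""
    passes = 0
    while pending:
        served = min(reqs_per_event, pending)
        pending -= served
        passes += 1  # leftover requests wait for the next event
    return passes

# A 300-key getq pipeline against the old default of 20 requests per
# event needs 15 separate passes; the trailing noop (and therefore the
# client's reply) waits for all of them.
passes_default = serve_events(300, 20)
passes_tuned = serve_events(300, 300)
```

Between those passes the server is off servicing other connections, which is exactly the latency our waiting batch calls observe.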

Our timeouts for this app in particular are 5 seconds for a single
user-facing batch get call; this client app is fine trading higher latency
for higher throughput.

At this point we have reqs_per_event set to a rather high 300, and it
seems to have solved our problem. I don't think it's causing any more
consternation (for now), but having a dynamic setting would have lowered
the operational complexity of the tuning.


*Scott Mansfield*

Product > Consumer Science Eng > EVCache > Sr. Software Eng
{
  M: 352-514-9452
  E: [email protected]
  K: {M: mobile, E: email, K: key}
}

On Wed, Jan 25, 2017 at 11:04 AM, dormando <[email protected]> wrote:

> I guess when I say dynamic I mostly mean runtime-settable. Dynamic is a
> little harder, so I tend to do those as a second pass.
>
> You're saying your client had head-of-line blocking for unrelated
> requests? I'm not 100% sure I follow.
>
> Big multiget comes in, multiget gets processed slightly slower than normal
> due to other clients making requests, so requests *behind* the multiget
> time out, or the multiget itself?
>
> How long is your timeout? :P
>
> I'll take a look at it as well and see about raising the limit in `-o
> modern` after some performance tests. The default is from 2006.
>
> thanks!
>
> On Wed, 25 Jan 2017, 'Scott Mansfield' via memcached wrote:
>
> > The reqs_per_event setting was causing a client that was doing large
> > batch gets (of a few hundred keys) to see some timeouts. Since memcached
> > will delay responding fully until other connections are serviced, and
> > our client will wait until the batch is done, we see some client-side
> > timeouts for the users of our client library. Our solution has been to
> > raise the setting during startup, but just as a thought experiment I was
> > asking whether we could have done it dynamically to avoid losing data.
> > At the moment there's quite a lot of machinery to change the setting
> > (deploy, copy data over with our cache warmer, flip traffic, tear down
> > old boxes) and I would have rather left everything as is and adjusted
> > the setting on the fly until our client's problem was resolved.
> >
> > I'm interested in patching this specific setting to be settable, but
> > having it fully dynamic in nature is not something I'd want to tackle.
> > There's a natural tradeoff of latency for other connections against
> > throughput for the one currently being serviced. I'm not sure it's a
> > good idea to change that dynamically; it might cause unexpected behavior
> > if one bad client sends huge requests.
> >
> >
> >
> > On Tue, Jan 24, 2017 at 11:53 AM, dormando <[email protected]> wrote:
> >       Hey,
> >
> >       Would you mind explaining a bit how you determined the setting
> >       was causing an issue, and what the impact was? The default there
> >       is very old and might be worth a revisit (or some kind of
> >       auto-tuning) as well.
> >
> >       I've been trending as much as possible to online configuration,
> >       including the actual memory limit. You can turn the LRU crawler
> >       on and off, automoving on and off, manually move slab pages, etc.
> >       I'm hoping to make the LRU algorithm itself modifiable at runtime.
> >
> >       So yeah, I'd take a patch :)
> >
> >       On Mon, 23 Jan 2017, 'Scott Mansfield' via memcached wrote:
> >
> >       > There was a single setting my team was looking at today that we
> >       > wish we could have changed dynamically: the reqs_per_event
> >       > setting. Right now, in order to change it, we need to shut down
> >       > the process and start it again with a different -R parameter. I
> >       > don't see a way to change many of the settings, though there are
> >       > some that are ad-hoc changeable through some stats commands. I
> >       > was going to see if I could patch memcached to be able to change
> >       > the reqs_per_event setting at runtime, but before doing so I
> >       > wanted to check whether that's something that would be amenable.
> >       > I also didn't want to do something specifically for that setting
> >       > if it was going to be better to add it as a general feature.
> >       >
> >       > I see some pros and cons:
> >       >
> >       > One easy pro is that you can change things at runtime to tune
> >       > performance without losing all of your data. If client request
> >       > patterns change, the process can react.
> >       >
> >       > A con is that the startup parameters won't necessarily match
> >       > what the process is doing, so they are no longer a useful way to
> >       > determine memcached's settings. Instead you would need to
> >       > connect and issue a "stats settings" command to read them. It
> >       > also introduces change in places that may have previously never
> >       > seen it; e.g., the reqs_per_event setting is simply read at the
> >       > beginning of the drive_machine loop. It might need some kind of
> >       > synchronization around it now instead. I don't think it
> >       > necessarily needs it on x86_64, but it might on other platforms
> >       > which I am not familiar with.
> >       >
> >       > --
> >       > ---
> >       > You received this message because you are subscribed to the
> >       > Google Groups "memcached" group.
> >       > To unsubscribe from this group and stop receiving emails from
> >       > it, send an email to [email protected].
> >       > For more options, visit https://groups.google.com/d/optout.
> >       >
> >       >
> >
> >
> >
> >
>
