Nope, NRT is within seconds at most in several cases. Sounds like cloud
needs to be what we plan for.

Thanks!

On Mon, Oct 2, 2017 at 5:39 PM Erick Erickson <erickerick...@gmail.com>
wrote:

> Short form: Use SolrCloud from what you've described.
>
> NRT and M/S are simply oil and water. The _very_ best you can do when
> searching slaves is
> master's commit interval + slave polling interval + time to transmit
> the index to the slave + autowarming time on the slave.
>
> Now, that said, if what you mean by NRT is really "10 minutes is OK", then
> M/S will work for you.
>
> But otherwise I'd be using SolrCloud.
>
> Best,
> Erick
>
> On Mon, Oct 2, 2017 at 1:48 PM, John Blythe <johnbly...@gmail.com> wrote:
> > thanks for the responses, guys.
> >
> > erick: we do need NRT in several cases. also in need of HA pending where
> > the line is drawn. we do need it relatively speaking, i.e. w/i our user
> > base. if the largest of our cores falters then our business is completely
> > stopped till we can get everything reindexed.
> >
> > is there a general rule when it comes to query rate and efficiency
> > between Cloud and M/S? in either case we need to add complexity to the
> > system so, if it's a jump ball, that will be the thing that likely tips
> > in favor.
> >
> > emir: the logs aren't write intensive. what are the core benefits to
> > splitting up the machine if there isn't a jvm load issue we're currently
> > experiencing?
> >
> > i can def provide more info that could help in the discussion. let me
> > know the best way / stuff to send if you can please.
> >
> > thanks again for the help guys-
> >
> > --
> > John Blythe
> >
> > On Fri, Sep 29, 2017 at 10:27 AM, Erick Erickson <erickerick...@gmail.com>
> > wrote:
> >
> >> SolrCloud. SolrCloud. SolrCloud.
> >>
> >> Well, it actually depends. I recommend people go to SolrCloud when any
> >> of the following apply:
> >>
> >> > The instant you need to break any collection up into shards because
> >> you're running into the constraints of your hardware (you can't just
> >> keep adding memory to the JVM forever).
> >>
> >> > You need NRT searching and need multiple replicas for either your
> >> traffic rate or HA purposes.
> >>
> >> > You find yourself dealing with lots of administrative complexity for
> >> various indexes. You have what sounds like 6-10 cores lying around. You
> >> can move them to different machines without going to SolrCloud, but then
> >> something has to keep track of where they all are and route requests
> >> appropriately. If that gets onerous, SolrCloud will simplify it.
> >>
> >> If none of the above apply, master/slave is just fine. Since you can
> >> rebuild in a couple of hours, most of the difficulties with M/S when the
> >> master goes down are manageable. With a master and several slaves, you
> >> have HA, and a load balancer will see to it that some are used.
> >> There's no real need to exclusively search on the slaves; I've seen
> >> situations where the master is used for queries as well as indexing.
> >>
> >> To increase your query rate, you can just add more slaves to the hot
> >> index, assuming you're content with the latency between indexing and
> >> being able to search newly indexed documents.
> >>
> >> SolrCloud, of course, comes with the added complexity of ZooKeeper.
> >>
> >> Best,
> >> Erick
> >>
> >>
> >>
> >> On Fri, Sep 29, 2017 at 5:34 AM, John Blythe <johnbly...@gmail.com>
> >> wrote:
> >> > hi all.
> >> >
> >> > complete noob as to solrcloud here. almost-non-noob on solr in
> >> > general.
> >> >
> >> > we're experiencing growing pains in our data and am thinking through
> >> > moving to solrcloud as a result. i'm hoping to find out if it seems
> >> > like a good strategy or if we need to get other areas of interest
> >> > handled first before introducing new complexities.
> >> >
> >> > here's a rundown of things:
> >> > - we are on a 30g ram aws instance
> >> > - we have ~30g tucked away in the ../solr/server/ dir
> >> > - our largest core is 6.8g w/ ~25 segments at any given time. this is
> >> > also the core that our business directly runs off of, users interact
> >> > with, etc.
> >> > - 5g is for a logs type of dataset that analytics can be built off of
> >> > to help inform the primary core above
> >> > - 3g are taken up by 3 different third party sources that we use solr
> >> > to warehouse and have available for query for the sake of linking
> >> > items in our primary core to these cores for data enrichment
> >> > - several others take up < 1g each
> >> > - and then we have dev- and demo- flavors for some of these
> >> >
> >> > we had been operating on a 16gb machine till a few weeks ago (actually
> >> > bumped while at lucene revolution bc i hadn't noticed how much we'd
> >> > outgrown the cache size's needs till the week before!). the load when
> >> > doing an import or running our heavier operations is much better and
> >> > doesn't fall under the weight of the operations like it had been doing.
> >> >
> >> > we have no master/slave replica. all of our data is 'replicated' by
> >> > the fact that it exists in mysql. if solr were to go down it'd be a
> >> > nice big fire but one we could recover from within a couple hours by
> >> > simply reimporting.
> >> >
> >> > i'd like to have a more sophisticated set up in place for fault
> >> > tolerance than that, of course. i'd also like to see our heavy,
> >> > many-query based operations be speedier and better capable of handling
> >> > multi-threaded runs at once w/ ease.
> >> >
> >> > is this a matter of getting still more ram on the machine? cpus for
> >> > faster processing? splitting up the read/write operations between
> >> > master/slave? going full steam into a solrcloud configuration?
> >> >
> >> > one more note. per discussion at the conference i'm combing through
> >> > our configs to make sure we trim any fat we can. also wanting to get
> >> > optimization scheduled more regularly to help out w segmentation and
> >> > garbage heap. not sure how far those two alone will get us, though.
> >> >
> >> > thanks for any thoughts!
> >> >
> >> > --
> >> > John Blythe
> >>
>
-- 
John Blythe
