Re: Concurrent Indexing and Searching in Solr.

Nitin Solanki Fri, 07 Aug 2015 12:16:34 -0700

Hi Erick,

posting files to Solr via curl =>
You advise against posting files via curl. Which is better, SolrJ or
post.jar? I use neither: I wrote a Python script that indexes data over
HTTP using urllib and urllib2, and I don't have the option of using SolrJ
right now. How can I do the same thing via post.jar from Python? Any help,
please.


indexing with 100 threads is going to eat up a lot of CPU cycles
=> So, how many concurrent indexing threads should I run? I also need
concurrent searching, so how many threads can I use for that?
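
As a starting point only, the suggestion quoted further down (roughly one
indexing thread per CPU) could be sketched like this in Python. The
function names are hypothetical, and `solr_cpus` is meant to be the CPU
count of the Solr nodes, which the script cannot detect remotely, so it
falls back to the local CPU count when no value is given.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def indexing_workers(solr_cpus=None):
    """Return a worker count of about one indexing thread per CPU.

    Assumption: `solr_cpus` is supplied by the caller; when it is None the
    local machine's CPU count is used as a rough stand-in.
    """
    n = solr_cpus if solr_cpus is not None else os.cpu_count()
    return max(1, n or 1)


def run_batches(send, batch_list, workers=None):
    """Apply `send` to every batch using a bounded thread pool.

    `send` is any callable that indexes one batch, for example a function
    that POSTs the batch to Solr. Results come back in input order.
    """
    workers = workers or indexing_workers()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(send, batch_list))
```

The right number still has to be found by measuring query latency while
indexing runs; the pool size is simply the knob to turn down when searches
slow.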

And thanks for the pointer to Solr 5.2, I will go through that. Thanks for
the reply. Please help me.

On Fri, Aug 7, 2015 at 11:51 PM Erick Erickson <erickerick...@gmail.com>
wrote:

> bq: What limits does Solr have on indexing and searching
> simultaneously? That is, how many simultaneous calls can I make for
> searching and indexing at once?
>
> None a priori. It all depends on the hardware you're throwing at it.
> Obviously, indexing with 100 threads is going to eat up a lot of CPU
> cycles that can't then be devoted to satisfying queries. You need to
> strike a balance. Do seriously consider using some method other than
> posting files to Solr via curl or the like; that's rarely a robust
> solution for production.
>
> As for adding commit=true, this shouldn't be affecting the index size;
> I suspect you were misled by something else happening.
>
> Really, remove it or you'll beat up your system hugely. As for the soft
> commit interval, that's totally irrelevant when you're committing every
> document. But do lengthen it as much as you can. Most of the time when
> people say "real time", it turns out that 10 seconds is OK. Or 60
> seconds is OK. You have to check what the _real_ requirement is; it's
> often not what's stated.
>
> bq: I am using Solr 5.0. Is 5.0 much the same as 5.2 with regard to
> indexing and searching data?
>
> Did you read the link I provided? With replicas, 5.2 will index almost
> twice as fast. That means (roughly) half the work on the followers is
> being done, freeing up cycles for performing queries.
>
> Best,
> Erick
>
>
> On Fri, Aug 7, 2015 at 2:06 PM, Nitin Solanki <nitinml...@gmail.com>
> wrote:
> > Hi Erick,
> >               You said that soft commit should be more than 3000 ms.
> > Actually, I need real-time searching, and that's why I need a fast
> > soft commit.
> >
> > commit=true => I set commit=true because it reduced my indexed data
> > size from 1.5GB to 500MB on *each shard*. With commit=false, my
> > indexed data size was 1.5GB. After changing it to commit=true, the
> > size dropped to only 500MB. I don't understand why.
> >
> > I am using Solr 5.0. Is 5.0 much the same as 5.2 with regard to
> > indexing and searching data?
> >
> > What limits does Solr have on indexing and searching
> > simultaneously? That is, how many simultaneous calls can I make for
> > searching and indexing at once?
> >
> >
> > On Fri, Aug 7, 2015 at 9:18 PM Erick Erickson <erickerick...@gmail.com>
> > wrote:
> >
> >> Your soft commit time of 3 seconds is quite aggressive,
> >> I'd lengthen it to as long as possible.
> >>
> >> Ugh, looked at your query more closely. Adding commit=true to every
> >> update request is horrible performance-wise. Letting your autocommit
> >> process handle the commits is the first thing I'd do. Second, I'd try
> >> going to SolrJ and batching up documents (I usually start with 1,000)
> >> or using the post.jar tool rather than sending them via a raw URL.
> >>
> >> I agree with Upayavira, 100 concurrent threads is a _lot_. Also, what
> >> version of Solr? There was a 2x speedup in Solr 5.2, see:
> >> http://lucidworks.com/blog/indexing-performance-solr-5-2-now-twice-fast/
> >>
> >> One symptom was that the followers were doing waaaaay more work than
> >> the leader (BTW, using master/slave when talking SolrCloud is a bit
> >> confusing...) which will affect query response rates.
> >>
> >> Basically, if query response is paramount, you really need to
> >> throttle your indexing, there's just a whole lot of work going on
> >> here.
> >>
> >> Best,
> >> Erick
> >>
> >> On Fri, Aug 7, 2015 at 11:23 AM, Upayavira <u...@odoko.co.uk> wrote:
> >> > How many CPUs do you have? 100 concurrent indexing calls seems like
> >> > rather a lot. You're gonna end up doing a lot of context switching,
> >> > hence degraded performance. Dunno what others would say, but I'd aim
> >> > for approx one indexing thread per CPU.
> >> >
> >> > Upayavira
> >> >
> >> > On Fri, Aug 7, 2015, at 02:58 PM, Nitin Solanki wrote:
> >> >> Hello Everyone,
> >> >>                           I have indexed 16 million documents in
> >> >> SolrCloud, using 4 nodes and 8 shards with a single replica.
> >> >> I am trying to index and search those documents concurrently,
> >> >> making 100 concurrent indexing calls along with 100 concurrent
> >> >> searching calls.
> >> >> This *degrades both searching and indexing* performance.
> >> >>
> >> >> Configuration :
> >> >>
> >> >>       "commitWithin":{"softCommit":true},
> >> >>       "autoCommit":{
> >> >>         "maxDocs":-1,
> >> >>         "maxTime":60000,
> >> >>         "openSearcher":false},
> >> >>       "autoSoftCommit":{
> >> >>         "maxDocs":-1,
> >> >>         "maxTime":3000}},
> >> >>
> >> >>       "indexConfig":{
> >> >>       "maxBufferedDocs":-1,
> >> >>       "maxMergeDocs":-1,
> >> >>       "maxIndexingThreads":8,
> >> >>       "mergeFactor":-1,
> >> >>       "ramBufferSizeMB":100.0,
> >> >>       "writeLockTimeout":-1,
> >> >>       "lockType":"native"}}}
> >> >>
> >> >> AND  <maxWarmingSearchers>2</maxWarmingSearchers>
> >> >>
> >> >> I don't know how master and slave work. I just created 8
> >> >> shards and indexed documents using:
> >> >>
> >> >> http://localhost:8983/solr/test_commit_fast/update/json?commit=true
> >> >> -H 'Content-type:application/json' -d ' [ JSON_Document ]'
> >> >>
> >> >> And searching using:
> >> >>
> >> >> http://localhost:8983/solr/test_commit_fast/select?q=<field_name:search_string>
> >> >>
> >> >> Any help would be appreciated on making searching and indexing
> >> >> fast when they run concurrently.
> >> >> Thanks.
> >> >>
> >> >>
> >> >> Regards,
> >> >> Nitin
> >>
>
