Re: Concurrent Indexing and Searching in Solr.

Erick Erickson Fri, 07 Aug 2015 11:21:52 -0700

bq: How much limitations does Solr has related to indexing and searching
simultaneously? It means that how many simultaneously calls, I made for
searching and indexing once?


None a priori. It all depends on the hardware you're throwing at it. Obviously
indexing with 100 threads is going to eat up a lot of CPU cycles that can't then
be devoted to satisfying queries. You need to strike a balance. Do seriously
consider using some other method than posting files to Solr via curl or the
like; that's rarely a robust solution for production.
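
To make the batching idea concrete, here is a rough sketch in Python (standard
library only) rather than SolrJ. It is not a definitive client: the helper
names (batches, build_update_request, index_all) and the 10-second commitWithin
value are my own assumptions for illustration, and the URL is just the one
from your message. It sends 1,000 documents per request and never passes
commit=true.

```python
import json
import urllib.request
from itertools import islice


def batches(docs, size=1000):
    """Yield lists of at most `size` documents from any iterable."""
    it = iter(docs)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def build_update_request(base_url, chunk, commit_within_ms=10000):
    """Build (but do not send) one JSON update request for a batch.

    No commit=true here: commits are left to autoCommit/autoSoftCommit.
    The commitWithin value is an illustrative assumption, not a recommendation.
    """
    url = f"{base_url}/update/json?commitWithin={commit_within_ms}"
    body = json.dumps(chunk).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-type": "application/json"},
        method="POST",
    )


def index_all(base_url, docs, size=1000):
    """Send every batch sequentially; returns the number of requests sent."""
    sent = 0
    for chunk in batches(docs, size):
        request = build_update_request(base_url, chunk)
        with urllib.request.urlopen(request) as response:
            response.read()  # drain the response so the connection is released
        sent += 1
    return sent
```

Calling index_all("http://localhost:8983/solr/test_commit_fast", docs) from a
handful of workers, rather than 100, would be the place to start tuning.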

As for adding the commit=true, this shouldn't be affecting the index size. I
suspect you were misled by something else happening.

Really, remove it or you'll beat up your system hugely. As for the soft commit
interval, that's totally irrelevant when you're committing every document. But
do lengthen it as much as you can. Most of the time when people say "real time",
it turns out that 10 seconds is OK. Or 60 seconds is OK. You have to check
what the _real_ requirement is; it's often not what's stated.
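
If you want to try a longer interval without editing solrconfig.xml by hand,
something like the following might do it. This is only a sketch: it assumes
the Config API is available on your install and that
updateHandler.autoSoftCommit.maxTime is settable through set-property there.
The function name and the 10-second value are placeholders of mine. It only
builds the request; pass it to urllib.request.urlopen to actually send it.

```python
import json
import urllib.request


def soft_commit_request(base_url, max_time_ms):
    """Build (but do not send) a Config API request that sets the
    autoSoftCommit interval for one collection.

    Assumes the collection exposes a /config endpoint.
    """
    payload = {
        "set-property": {"updateHandler.autoSoftCommit.maxTime": max_time_ms}
    }
    return urllib.request.Request(
        f"{base_url}/config",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-type": "application/json"},
        method="POST",
    )


# 10 seconds instead of 3 seconds; the value is only an example.
request = soft_commit_request(
    "http://localhost:8983/solr/test_commit_fast", 10000
)
```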

bq: I am using Solr 5.0 version. Is 5.0 almost similar to 5.2 regarding
indexing and searching data.

Did you read the link I provided? With replicas, 5.2 will index almost twice as
fast. That means (roughly) half the work on the followers is being done,
freeing up cycles for performing queries.

Best,
Erick


On Fri, Aug 7, 2015 at 2:06 PM, Nitin Solanki <nitinml...@gmail.com> wrote:
> Hi Erick,
>               You said that soft commit should be more than 3000 ms.
> Actually, I need Real time searching and that's why I need soft commit fast.
>
> commit=true => I made commit=true because , It reduces by indexed data size
> from 1.5GB to 500MB on* each shard*. When I did commit=false then, my
> indexed data size was 1.5GB. After changing it to commit=true, then size
> reduced to 500MB only. I am not getting how is it?
>
> I am using Solr 5.0 version. Is 5.0 almost similar to 5.2 regarding
> indexing and searching data.
>
> How much limitations does Solr has related to indexing and searching
> simultaneously? It means that how many simultaneously calls, I made for
> searching and indexing once?
>
>
> On Fri, Aug 7, 2015 at 9:18 PM Erick Erickson <erickerick...@gmail.com>
> wrote:
>
>> Your soft commit time of 3 seconds is quite aggressive,
>> I'd lengthen it to as long as possible.
>>
>> Ugh, looked at your query more closely. Adding commit=true to every update
>> request is horrible performance wise. Let your autocommit process
>> handle the commits is the first thing I'd do. Second, I'd try going to
>> SolrJ
>> and batching up documents (I usually start with 1,000) or using the
>> post.jar
>> tool rather than sending them via a raw URL.
>>
>> I agree with Upayavira, 100 concurrent threads is a _lot_. Also, what
>> version of Solr?
>> There was a 2x speedup in Solr 5.2, see:
>> http://lucidworks.com/blog/indexing-performance-solr-5-2-now-twice-fast/
>>
>> One symptom was that the followers were doing waaaaay more work than the
>> leader
>> (BTW, using master/slave when talking SolrCloud is a bit confusing...)
>> which will
>> affect query response rates.
>>
>> Basically, if query response is paramount, you really need to throttle
>> your indexing,
>> there's just a whole lot of work going on here..
>>
>> Best,
>> Erick
>>
>> On Fri, Aug 7, 2015 at 11:23 AM, Upayavira <u...@odoko.co.uk> wrote:
>> > How many CPUs do you have? 100 concurrent indexing calls seems like
>> > rather a lot. You're gonna end up doing a lot of context switching,
>> > hence degraded performance. Dunno what others would say, but I'd aim for
>> > approx one indexing thread per CPU.
>> >
>> > Upayavira
>> >
>> > On Fri, Aug 7, 2015, at 02:58 PM, Nitin Solanki wrote:
>> >> Hello Everyone,
>> >>                           I have indexed 16 million documents in Solr
>> >> Cloud. Created 4 nodes and 8 shards with single replica.
>> >> I am trying to make concurrent indexing and searching on those indexed
>> >> documents. Trying to make 100 concurrent indexing calls along with 100
>> >> concurrent searching calls.
>> >> It *degrades searching and indexing* performance both.
>> >>
>> >> Configuration :
>> >>
>> >>       "commitWithin":{"softCommit":true},
>> >>       "autoCommit":{
>> >>         "maxDocs":-1,
>> >>         "maxTime":60000,
>> >>         "openSearcher":false},
>> >>       "autoSoftCommit":{
>> >>         "maxDocs":-1,
>> >>         "maxTime":3000}},
>> >>
>> >>       "indexConfig":{
>> >>       "maxBufferedDocs":-1,
>> >>       "maxMergeDocs":-1,
>> >>       "maxIndexingThreads":8,
>> >>       "mergeFactor":-1,
>> >>       "ramBufferSizeMB":100.0,
>> >>       "writeLockTimeout":-1,
>> >>       "lockType":"native"}}}
>> >>
>> >> AND  <maxWarmingSearchers>2</maxWarmingSearchers>
>> >>
>> >> I don't have know that how master and slave works. Normally, I created 8
>> >> shards and indexed documents using :
>> >>
>> >> http://localhost:8983/solr/test_commit_fast/update/json?commit=true -H
>> >> 'Content-type:application/json' -d ' [ JSON_Document ]'
>> >>
>> >> And searching using:
>> >> http://localhost:8983/solr/test_commit_fast/select?q=<field_name:search_string>
>> >>
>> >> Please any help on it. To make searching and indexing fast concurrently.
>> >> Thanks.
>> >>
>> >>
>> >> Regards,
>> >> Nitin
>>
