Hi Dmitry,

I'm trying to evaluate your suggestion to create a frontend node. It sounds
pretty useful.
I saw that every node in a Solr cluster can serve requests for any collection,
even if it does not hold a core of that collection. Because of that, I
thought that adding a new node to the cluster (i.e., the frontend/gateway
server) and creating a dummy collection (with one dummy core) would solve
the problem.

But I see that a request sent to the gateway node is not then sent directly
to the shards. Instead, the request is proxied to a (random) core of the
requested collection, and from there it is sent to the shards. (This is
reasonable, because the SolrCore on the gateway might run with a different
configuration, etc.) This means that my new node isn't functioning as a
frontend (responsible for sorting, etc.), but as a poor load balancer. No
performance improvement will come from this implementation.

So, how would you suggest implementing a frontend? On the one hand, it has to
run a core of the target collection, but on the other hand, we don't want
it to hold any shard contents.
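One way around the proxying, as far as I understand Solr's distributed
search: if the query itself carries an explicit shards parameter, the core
that receives it performs the fan-out and merge, even when that core holds
no data. A minimal sketch of building such a request (host names, ports, and
core names are hypothetical):

```python
# Sketch: build a distributed-search request aimed at a data-less
# "frontend" core. All host/core names here are hypothetical.
# With an explicit shards= list, the receiving core fans the query
# out to each listed shard and merges/sorts the results itself,
# instead of proxying the whole request to a random replica.
from urllib.parse import urlencode

shards = [
    "shard1-host:8983/solr/collection1",
    "shard2-host:8983/solr/collection1",
]
params = {
    "q": "text:foo",       # the user query
    "rows": 10,            # top-N kept after the merge
    "fl": "id,score",      # return only ids, as in this thread
    "shards": ",".join(shards),
}
url = "http://gateway:8983/solr/frontend_core/select?" + urlencode(params)
print(url)
```

The frontend core would still need a schema and config compatible with the
target collection, which is exactly the tension described above: it must run
a core of the collection without holding any shard contents.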


On Fri, Sep 13, 2013 at 1:08 PM, Dmitry Kan <solrexp...@gmail.com> wrote:

> Manuel,
>
> Whether to have the frontend Solr as an aggregator of shard results depends
> on your requirements. To repeat, we found merging from many shards very
> inefficient for our use case. It can be the opposite for you (i.e., it
> requires testing). There are some limitations with distributed search, see here:
>
> http://docs.lucidworks.com/display/solr/Distributed+Search+with+Index+Sharding
>
>
> On Wed, Sep 11, 2013 at 3:35 PM, Manuel Le Normand <
> manuel.lenorm...@gmail.com> wrote:
>
> > Dmitry - currently we don't have such a front end; creating one sounds
> > like a good idea. And yes, we do query all 36 shards on every query.
> >
> > Mikhail - I do think 1 minute is enough data, as during this exact minute
> > I had a single query running (that took a qtime of 1 minute). I wanted to
> > isolate these hard queries. I repeated this profiling a few times.
> >
> > I think I will lower the termInterval from 128 to 32 and check the
> > results. I'm currently using NRTCachingDirectoryFactory.
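(For reference, a sketch of where that setting would live in solrconfig.xml,
assuming the classic termIndexInterval knob and the value of 32 mentioned
above. One caveat: with the default BlockTree postings format in Solr 4.x
this setting may be ignored, since the terms index there is an FST whose
granularity is governed by the terms-dictionary block sizes instead.)

```xml
<!-- solrconfig.xml sketch: lower the term index interval from the
     default 128 to 32. Caveat: the default BlockTree terms dictionary
     may ignore this value; it applies to the older terms-index formats. -->
<indexConfig>
  <termIndexInterval>32</termIndexInterval>
</indexConfig>
```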
> >
> >
> >
> >
> > On Mon, Sep 9, 2013 at 11:29 PM, Dmitry Kan <solrexp...@gmail.com>
> wrote:
> >
> > > Hi Manuel,
> > >
> > > The frontend solr instance is the one that does not have its own index
> > > and does the merging of the results. Is this the case? If yes, are all
> > > 36 shards always queried?
> > >
> > > Dmitry
> > >
> > >
> > > On Mon, Sep 9, 2013 at 10:11 PM, Manuel Le Normand <
> > > manuel.lenorm...@gmail.com> wrote:
> > >
> > > > Hi Dmitry,
> > > >
> > > > I have Solr 4.3, and every query is distributed and merged back for
> > > > ranking purposes.
> > > >
> > > > What do you mean by frontend solr?
> > > >
> > > >
> > > > On Mon, Sep 9, 2013 at 2:12 PM, Dmitry Kan <solrexp...@gmail.com>
> > wrote:
> > > >
> > > > > Are you querying your shards via a frontend Solr? We have noticed
> > > > > that querying becomes much faster if results merging can be avoided.
> > > > >
> > > > > Dmitry
> > > > >
> > > > >
> > > > > On Sun, Sep 8, 2013 at 6:56 PM, Manuel Le Normand <
> > > > > manuel.lenorm...@gmail.com> wrote:
> > > > >
> > > > > > Hello all,
> > > > > > Looking at the 10% slowest queries, I get very bad performance
> > > > > > (~60 sec per query).
> > > > > > These queries have lots of conditions on my main field (more than
> > > > > > a hundred), including phrase queries, and rows=1000. I do return
> > > > > > only ids though.
> > > > > > I can quite firmly say that this bad performance is due to a slow
> > > > > > storage issue (which is beyond my control for now). Despite this,
> > > > > > I want to improve my performance.
> > > > > >
> > > > > > As taught in school, I started profiling these queries; the data
> > > > > > from a ~1 minute profile is located here:
> > > > > > http://picpaste.com/pics/IMG_20130908_132441-ZyrfXeTY.1378637843.jpg
> > > > > >
> > > > > > Main observation: most of the time I wait in readVInt, whose
> > > > > > stack trace (2 out of 2 thread dumps) is:
> > > > > >
> > > > > > catalina-exec-3870 - Thread t@6615
> > > > > >  java.lang.Thread.State: RUNNABLE
> > > > > >  at org.apache.lucene.store.DataInput.readVInt(DataInput.java:108)
> > > > > >  at org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnumFrame.loadBlock(BlockTreeTermsReader.java:2357)
> > > > > >  at org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.seekExact(BlockTreeTermsReader.java:1745)
> > > > > >  at org.apache.lucene.index.TermContext.build(TermContext.java:95)
> > > > > >  at org.apache.lucene.search.PhraseQuery$PhraseWeight.<init>(PhraseQuery.java:221)
> > > > > >  at org.apache.lucene.search.PhraseQuery.createWeight(PhraseQuery.java:326)
> > > > > >  at org.apache.lucene.search.BooleanQuery$BooleanWeight.<init>(BooleanQuery.java:183)
> > > > > >  at org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:384)
> > > > > >  at org.apache.lucene.search.BooleanQuery$BooleanWeight.<init>(BooleanQuery.java:183)
> > > > > >  at org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:384)
> > > > > >  at org.apache.lucene.search.BooleanQuery$BooleanWeight.<init>(BooleanQuery.java:183)
> > > > > >  at org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:384)
> > > > > >  at org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:675)
> > > > > >  at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:297)
> > > > > >
> > > > > > So I do actually wait for IO as expected, but I might be page
> > > > > > faulting too many times while looking for the term blocks (the
> > > > > > .tim file), i.e. while locating the term.
> > > > > > As I am reindexing now, would it be useful to lower the
> > > > > > termInterval (default 128)? As the FSTs (the .tip files) are small
> > > > > > (a few tens to hundreds of MB), there is no memory contention, so
> > > > > > could I lower this param to 8, for example? The benefit of lowering
> > > > > > the term interval would be to force the FST into memory (into the
> > > > > > JVM, thanks to the NRTCachingDirectory), as I do not control the
> > > > > > term dictionary file (OS caching loads an average of 6% of it).
> > > > > >
> > > > > >
> > > > > > General configs:
> > > > > > Solr 4.3
> > > > > > 36 shards, each with a few million docs
> > > > > > These 36 servers (each server has 2 replicas) are virtual, with
> > > > > > 16GB of memory each (4GB for the JVM, 12GB left for OS caching),
> > > > > > and 260GB of disk mounted for the index files.
> > > > > >
> > > > >
> > > >
> > >
> >
>
