Re: Solr server becomes non-responsive.

Jack Krupansky Tue, 30 Dec 2014 07:12:06 -0800

I actually did that once as a test years ago, as well as support for
"paging" through the wildcard terms with a starting offset, and it worked
great.


One way to think of the feature is as the ability to "sample" the values of
the wildcard. I mean, not all queries require absolute precision. Sometimes
you just want to know whether "something" exists matching the pattern, or
"generally" what the values look like.
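
To make the idea concrete, here is a minimal sketch of capped, pageable wildcard expansion. It is plain Python over an in-memory sorted term list, not Solr or Lucene code; the names expand_wildcard, max_terms and offset are hypothetical and only illustrate the behavior described above.

```python
from fnmatch import fnmatchcase
from itertools import islice


def expand_wildcard(sorted_terms, pattern, max_terms=10, offset=0):
    """Return at most max_terms terms matching pattern.

    sorted_terms stands in for a field's term dictionary (an assumption),
    and offset skips that many matches so a caller can "page" through them.
    """
    # Lazily filter the terms; islice stops once the page is full, so the
    # full expansion is never materialized.
    matches = (t for t in sorted_terms if fnmatchcase(t, pattern))
    return list(islice(matches, offset, offset + max_terms))


terms = sorted(["solar", "sold", "solder", "sole", "solid", "solr", "solve", "song"])
first_page = expand_wildcard(terms, "sol*", max_terms=3)
second_page = expand_wildcard(terms, "sol*", max_terms=3, offset=3)
print(first_page)
print(second_page)
```

The cap bounds the work per wildcard token, and the offset lets a caller sample further into the matching terms when the first page is not enough.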

I think it would be worth a Jira.


-- Jack Krupansky

On Tue, Dec 30, 2014 at 6:16 AM, Modassar Ather <modather1...@gmail.com>
wrote:

> Hi,
>
> For queries with many wildcards, can we put a limit on the number of term
> expansions done for a wildcard token, something like
> maxBooleanClauses?
>
> Thanks,
> Modassar
>
> On Mon, Dec 29, 2014 at 11:15 AM, Modassar Ather <modather1...@gmail.com>
> wrote:
>
> > Thanks Jack for your suggestions.
> >
> > Regards,
> > Modassar
> >
> > On Fri, Dec 26, 2014 at 6:04 PM, Jack Krupansky <jack.krupan...@gmail.com> wrote:
> >
> >> Either you have too little RAM on each node or too much data on each node.
> >>
> >> You may need to shard the data much more heavily so that the total work on
> >> a single query is distributed in parallel to more nodes, each node having a
> >> much smaller amount of data to work on.
> >>
> >> First, always make sure that the entire Lucene index for each node fits
> >> entirely in the system memory available for file system caching. Otherwise
> >> the queries will be I/O bound. Check your current queries to see if that is
> >> the case - are the nodes compute bound or I/O bound? If I/O bound, add more
> >> system memory until the queries are no longer I/O bound. If compute bound,
> >> shard more heavily until the query latency becomes acceptable.
> >>
> >>
> >>
> >> -- Jack Krupansky
> >>
> >> On Fri, Dec 26, 2014 at 1:02 AM, Modassar Ather <modather1...@gmail.com> wrote:
> >>
> >> > Thanks for your suggestions Erick.
> >> >
> >> > This may be one of those situations where you really have to
> >> > push back at the users and understand why they insist on these
> >> > kinds of queries. They must be very patient since it won't be
> >> > very performant. That said, I've seen this pattern; there are
> >> > certainly valid conditions under which response times can be
> >> > many seconds if there are few users and they are doing very
> >> > complex/expert-level things.
> >> >
> >> > We have tried educating the users but it did not work because they are used
> >> > to the old way. They feel that wildcards give more control over the results
> >> > and may not fully understand stemming.
> >> >
> >> > Regards,
> >> > Modassar
> >> >
> >> > On Thu, Dec 25, 2014 at 3:17 AM, Erick Erickson <erickerick...@gmail.com> wrote:
> >> >
> >> > > There's no magic bullet here that I know of. If your requirements
> >> > > are to support these huge, many-wildcard queries then you only
> >> > > have a few choices:
> >> > >
> >> > > 1> redo the index. I was surprised at how little it bloated the
> >> > > index as far as memory required is concerned to add ngrams.
> >> > > The key here is that there really aren't very many unique terms.
> >> > > If you use bigrams, then there are only maybe 36^2 distinct
> >> > > combinations. (assuming English and including numbers).
> >> > >
> >> > > 2> Increase the number of shards, putting many fewer docs
> >> > > on each shard.
> >> > >
> >> > > 3> give each shard a lot more memory. This isn't actually one
> >> > > of my preferred solutions since GC issues may raise their ugly
> >> > > heads here.
> >> > >
> >> > > 4> insert creative solution here.
> >> > >
> >> > > This may be one of those situations where you really have to
> >> > > push back at the users and understand why they insist on these
> >> > > kinds of queries. They must be very patient since it won't be
> >> > > very performant. That said, I've seen this pattern; there are
> >> > > certainly valid conditions under which response times can be
> >> > > many seconds if there are few users and they are doing very
> >> > > complex/expert-level things.
> >> > >
> >> > > Now, all that said, wildcards are often examples of poor habits
> >> > > or habits learned in DB systems where the only hammer was
> >> > > %whatever%. I've seen situations where users didn't
> >> > > understand that Solr broke the input stream up into words. And
> >> > > stemmed. And WordDelimiterFilterFactory did all the magic
> >> > > for finding, say D.C. and DC. So it's worth looking at the actual
> >> > > queries that are sent, perhaps talking to users and understanding
> >> > > what they _want_ out of the system, then perhaps educating them
> >> > > as to better ways to get what they want.
> >> > >
> >> > > Literally I've seen people insist on entering queries that
> >> > > wildcarded _everything_ both pre and post wildcards because
> >> > > they didn't realize that Solr tokenizes...
> >> > >
> >> > > Once you hit an OOM, all bets are off as Shawn outlined.
> >> > >
> >> > > Best,
> >> > > Erick
> >> > >
> >> > > On Wed, Dec 24, 2014 at 1:57 AM, Modassar Ather <modather1...@gmail.com> wrote:
> >> > > > Thanks for your response.
> >> > > >
> >> > > > How many items in the collection ?
> >> > > > There are about 100 million documents.
> >> > > >
> >> > > > How are the caches configured in solrconfig.xml ?
> >> > > > Each cache has its size attribute set to 128.
> >> > > >
> >> > > > Can you provide a sample of the query ?
> >> > > > Does it fail immediately after solrcloud startup or after several hours ?
> >> > > > It is a query with many terms (more than a thousand) and phrases, where
> >> > > > the phrases have many wildcards in them.
> >> > > > Once such a query is executed there are many ZooKeeper-related exceptions,
> >> > > > and after a couple of such queries it runs into OutOfMemory.
> >> > > >
> >> > > > Thanks,
> >> > > > Modassar
> >> > > >
> >> > > >
> >> > > > On Wed, Dec 24, 2014 at 1:37 PM, Dominique Bejean <dominique.bej...@eolya.fr> wrote:
> >> > > >
> >> > > >> And you didn't say how much RAM is on each server ?
> >> > > >>
> >> > > >> 2014-12-24 8:17 GMT+01:00 Dominique Bejean <dominique.bej...@eolya.fr>:
> >> > > >>
> >> > > >> > Modassar,
> >> > > >> >
> >> > > >> > How many items in the collection ?
> >> > > >> > I mean how many documents per collection ? 1 million, 10 million, …?
> >> > > >> >
> >> > > >> > How are the caches configured in solrconfig.xml ?
> >> > > >> > What is the size attribute value for each cache ?
> >> > > >> >
> >> > > >> > Can you provide a sample of the query ?
> >> > > >> > Does it fail immediately after solrcloud startup or after several hours ?
> >> > > >> >
> >> > > >> > Dominique
> >> > > >> >
> >> > > >> > 2014-12-24 6:20 GMT+01:00 Modassar Ather <modather1...@gmail.com>:
> >> > > >> >
> >> > > >> >> Thanks for your suggestions.
> >> > > >> >>
> >> > > >> >> I will look into the link provided.
> >> > > >> >> http://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap
> >> > > >> >>
> >> > > >> >> This is usually an anti-pattern. The very first thing
> >> > > >> >> I'd be doing is trying to not do this. See ngrams for infix
> >> > > >> >> queries, or shingles or ReverseWildcardFilterFactory or.....
> >> > > >> >>
> >> > > >> >> We cannot avoid multiple wildcards since that is our users' requirement.
> >> > > >> >> We try to discourage it but the users insist on firing such queries. Also,
> >> > > >> >> ngrams etc. can be tried but our index is already huge and ngrams may
> >> > > >> >> add a lot more to it. We are OK with such queries failing as long as other
> >> > > >> >> queries are not affected.
> >> > > >> >>
> >> > > >> >>
> >> > > >> >> Please find the details below.
> >> > > >> >>
> >> > > >> >> So, how many nodes in the cluster ?
> >> > > >> >> There are 4 nodes in total in the cluster.
> >> > > >> >>
> >> > > >> >> How many shards and replicas for the collection ?
> >> > > >> >> There are 4 shards and no replicas for any of them.
> >> > > >> >>
> >> > > >> >> How many items in the collection ?
> >> > > >> >> If I understand the question correctly, there are two collections on each
> >> > > >> >> node and their sizes on each node are approximately 190GB and 130GB.
> >> > > >> >>
> >> > > >> >> What is the size of the index ?
> >> > > >> >> There are two collections on each node and their sizes on each node are
> >> > > >> >> approximately 190GB and 130GB.
> >> > > >> >>
> >> > > >> >> How is the collection updated (frequency, how many items per day, what is
> >> > > >> >> your hard commit strategy) ?
> >> > > >> >> It is an optimized, read-only index. There are no intermediate updates.
> >> > > >> >>
> >> > > >> >> How are the caches configured in solrconfig.xml ?
> >> > > >> >> Filter cache, query result cache and document cache are enabled.
> >> > > >> >> Auto-warming is also done.
> >> > > >> >>
> >> > > >> >> Can you provide all other JVM parameters ?
> >> > > >> >> -Xms20g -Xmx24g -XX:+UseConcMarkSweepGC
> >> > > >> >>
> >> > > >> >> Thanks again,
> >> > > >> >> Modassar
> >> > > >> >>
> >> > > >> >> On Wed, Dec 24, 2014 at 2:29 AM, Dominique Bejean <dominique.bej...@eolya.fr> wrote:
> >> > > >> >>
> >> > > >> >> > Hi,
> >> > > >> >> >
> >> > > >> >> > I agree with Erick, it would be a good thing to have more details about
> >> > > >> >> > your configuration and collection.
> >> > > >> >> >
> >> > > >> >> > Your heap size is 32Gb. How much RAM is on each server ?
> >> > > >> >> >
> >> > > >> >> > By « 4 shard Solr cluster », do you mean 4 Solr server nodes or a
> >> > > >> >> > collection with 4 shards ?
> >> > > >> >> >
> >> > > >> >> > So, how many nodes in the cluster ?
> >> > > >> >> > How many shards and replicas for the collection ?
> >> > > >> >> > How many items in the collection ?
> >> > > >> >> > What is the size of the index ?
> >> > > >> >> > How is the collection updated (frequency, how many items per day, what is
> >> > > >> >> > your hard commit strategy) ?
> >> > > >> >> > How are the caches configured in solrconfig.xml ?
> >> > > >> >> > Can you provide all other JVM parameters ?
> >> > > >> >> >
> >> > > >> >> > Regards
> >> > > >> >> >
> >> > > >> >> > Dominique
> >> > > >> >> >
> >> > > >> >> > 2014-12-23 17:50 GMT+01:00 Erick Erickson <erickerick...@gmail.com>:
> >> > > >> >> >
> >> > > >> >> > > Second most important part of your message:
> >> > > >> >> > > "When executing a huge query with many wildcards inside it the server"
> >> > > >> >> > >
> >> > > >> >> > > This is usually an anti-pattern. The very first thing
> >> > > >> >> > > I'd be doing is trying to not do this. See ngrams for infix
> >> > > >> >> > > queries, or shingles or ReverseWildcardFilterFactory or.....
> >> > > >> >> > >
> >> > > >> >> > > And if your corpus is very large with many unique terms it's even
> >> > > >> >> > > worse, but you haven't really told us about that yet.
> >> > > >> >> > >
> >> > > >> >> > > Best,
> >> > > >> >> > > Erick
> >> > > >> >> > >
> >> > > >> >> > > On Tue, Dec 23, 2014 at 8:30 AM, Shawn Heisey <apa...@elyograg.org> wrote:
> >> > > >> >> > > > On 12/23/2014 4:34 AM, Modassar Ather wrote:
> >> > > >> >> > > >> Hi,
> >> > > >> >> > > >>
> >> > > >> >> > > >> I have a setup of a 4 shard Solr cluster with embedded zookeeper on one of
> >> > > >> >> > > >> them. The zkClient timeout is set to 30 seconds, -Xms is 20g and -Xmx is
> >> > > >> >> > > >> 24g.
> >> > > >> >> > > >> When executing a huge query with many wildcards inside it the server
> >> > > >> >> > > >> crashes and becomes non-responsive. Even the dashboard does not respond
> >> > > >> >> > > >> and shows a connection lost error. This requires me to restart the servers.
> >> > > >> >> > > >
> >> > > >> >> > > > Here's the important part of your message:
> >> > > >> >> > > >
> >> > > >> >> > > > *Caused by: java.lang.OutOfMemoryError: Java heap space*
> >> > > >> >> > > >
> >> > > >> >> > > >
> >> > > >> >> > > > Your heap is not big enough for what Solr has been asked to do. You
> >> > > >> >> > > > need to either increase your heap size or change your configuration so
> >> > > >> >> > > > that it uses less memory.
> >> > > >> >> > > >
> >> > > >> >> > > > http://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap
> >> > > >> >> > > >
> >> > > >> >> > > > Most programs have pretty much undefined behavior when an OOME occurs.
> >> > > >> >> > > > Lucene's IndexWriter has been hardened so that it tries extremely hard
> >> > > >> >> > > > to avoid index corruption when OOME strikes, and I believe that works
> >> > > >> >> > > > well enough that we can call it nearly bulletproof ... but the rest of
> >> > > >> >> > > > Lucene and Solr will make no guarantees.
> >> > > >> >> > > >
> >> > > >> >> > > > It's very difficult to have definable program behavior when an OOME
> >> > > >> >> > > > happens, because you simply cannot know the precise point during program
> >> > > >> >> > > > execution where it will happen, or what isn't going to work because Java
> >> > > >> >> > > > did not have memory space to create an object.  Going unresponsive is
> >> > > >> >> > > > not surprising.
> >> > > >> >> > > >
> >> > > >> >> > > > If you can solve your heap problem, note that you may run into other
> >> > > >> >> > > > performance issues discussed on the wiki page that I linked.
> >> > > >> >> > > >
> >> > > >> >> > > > Thanks,
> >> > > >> >> > > > Shawn
> >> > > >> >> > > >
> >> > > >> >> > >
> >> > > >> >> >
> >> > > >> >>
> >> > > >> >>
> >> > > >> >
> >> > > >> >
> >> > > >>
> >> > >
> >> >
> >>
> >
> >
>
