Setting group.ngroups=true considerably slows down queries

2011-12-09 Thread Michael Jakl
Hi, I'm using the grouping feature of Solr to return a list of unique documents together with a count of the duplicates. Essentially I use Solr's signature algorithm to create the "signature" field and use grouping on it. To provide good numbers for paging through my result list, I'd like to comp
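The setup described above can be sketched as a request builder. This is a minimal sketch, not the poster's actual code: the endpoint URL, the helper name `build_grouped_query`, and the paging defaults are assumptions for illustration. Only the `group`, `group.field`, and `group.ngroups` parameters and the "signature" field name come from the thread.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; adjust host, port, and core for a real installation.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_grouped_query(query, group_field="signature", with_ngroups=True,
                        rows=10, start=0):
    """Build a select URL that groups results on one field.

    with_ngroups toggles group.ngroups, which asks Solr to also return the
    total number of groups (useful for paging, but reported in this thread
    to slow queries down when there are many unique groups).
    """
    params = [
        ("q", query),
        ("group", "true"),
        ("group.field", group_field),
        ("group.ngroups", "true" if with_ngroups else "false"),
        ("rows", rows),
        ("start", start),
        ("wt", "json"),
    ]
    return SOLR_SELECT_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Compare timings of these two requests to see the cost of ngroups.
    print(build_grouped_query("*:*", with_ngroups=True))
    print(build_grouped_query("*:*", with_ngroups=False))
```

Timing the two generated requests against the same index is one simple way to measure how much of the slowdown is attributable to `group.ngroups`.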

Re: Setting group.ngroups=true considerably slows down queries

2011-12-09 Thread Michael Jakl
l number of facets as well. I'm using Solr 3.5 (upgraded from Solr 3.4 without reindexing). Thanks, Michael > On 9 December 2011 12:46, Michael Jakl wrote: >> Hi, I'm using the grouping feature of Solr to return a list of unique >> documents together with a count of the

Re: Setting group.ngroups=true considerably slows down queries

2011-12-12 Thread Michael Jakl
Hi! On Mon, Dec 12, 2011 at 13:57, Martijn v Groningen wrote: > As far as I know, there currently isn't another way. Unfortunately the > performance degrades badly when there are a lot of unique groups. > I think an issue should be opened to investigate how we can improve this... > > Question: Does Solr

edismax/dismax/Lucene Query Parser converts some fields to be "mandatory"

2012-01-23 Thread Michael Jakl
Hi, I've been wondering why some of my queries did not return the results I expected. A debugQuery resulted in the following: "java"^0.0 OR "haskell"^0.0 OR "python"^0.0 OR ("ruby"^0.0) AND (("programming"^0.0)) OR "programming language"^0.0 OR "code coding"^0.0 OR -"mobile"^0.0 OR -"android"^0.0

Re: edismax/dismax/Lucene Query Parser converts some fields to be "mandatory"

2012-01-23 Thread Michael Jakl
Hi! On Mon, Jan 23, 2012 at 18:42, Erick Erickson wrote: > Count your parentheses (anyone here speak Lisp?) I think that + > is outside the entire clause, meaning it's saying that there is > a single mandatory clause, and it's the whole thing You're right in that case it's the whole query. P

Re: edismax/dismax/Lucene Query Parser converts some fields to be "mandatory"

2012-01-23 Thread Michael Jakl
On Mon, Jan 23, 2012 at 22:05, Erick Erickson wrote: > Right. Essentially, the precedence is given to AND, so this is parsed > as though it were python OR (ruby AND programming) OR "programming language" That's exactly what I'd expect, but the problem is that "ruby" is marked as mandatory, that i

Re: edismax/dismax/Lucene Query Parser converts some fields to be "mandatory"

2012-01-23 Thread Michael Jakl
On Tue, Jan 24, 2012 at 06:27, Erick Erickson wrote: > Well, at root the Lucene query parser makes no claim of > enforcing boolean logic. Think in terms of MUST, SHOULD > and NOT instead. > > Here's a good writeup... > > http://www.lucidimagination.com/blog/2011/12/28/why-not-and-or-and-not/ Hi,
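The MUST/SHOULD/NOT view mentioned above can be illustrated with a small model. This is a simplified sketch, not the Lucene parser's actual code: it assumes a flat list of clauses, a default operator of OR, and the commonly described rule that an AND marks both the preceding and the current clause as required. The function name `assign_occurs` and the tuple layout are hypothetical.

```python
def assign_occurs(clauses):
    """Assign MUST / SHOULD / MUST_NOT to a flat list of clauses.

    clauses: list of (conjunction, modifier, term) tuples, where
      conjunction is None (first clause), "AND" or "OR",
      modifier is None, "+" or "-".
    Returns a list of (occur, term) tuples.
    """
    result = []
    for conj, mod, term in clauses:
        # An AND also upgrades the *preceding* clause to MUST,
        # unless that clause is prohibited.
        if result and conj == "AND":
            prev_occur, prev_term = result[-1]
            if prev_occur != "MUST_NOT":
                result[-1] = ("MUST", prev_term)

        if mod == "-":
            occur = "MUST_NOT"
        elif mod == "+" or conj == "AND":
            occur = "MUST"
        else:
            occur = "SHOULD"
        result.append((occur, term))
    return result


if __name__ == "__main__":
    query = [
        (None, None, "python"),
        ("OR", None, "ruby"),
        ("AND", None, "programming"),
        ("OR", None, "programming language"),
        ("OR", "-", "mobile"),
    ]
    for occur, term in assign_occurs(query):
        print(occur, term)
```

Under this model "ruby" and "programming" both end up as MUST at the top level, rather than forming an optional `(ruby AND programming)` sub-clause, which is consistent with the behaviour questioned in this thread. Explicit parentheses around the AND group are the usual way to get the boolean reading.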

Re: MoreLikeThis Question

2012-02-15 Thread Michael Jakl
Hi! On Wed, Feb 15, 2012 at 07:27, Jamie Johnson wrote: > Is there any way with MLT to say get similar based on all fields or is > it always a requirement to specify the fields? It seems not to be the case. But you could append the fields parameter in the solrconfig.xml: ... Cheers, Micha

Too many values for UnInvertedField faceting on field topic

2012-02-29 Thread Michael Jakl
Our Solr started to throw the following exception when requesting the facets of a multivalued field holding a lot of terms. SEVERE: org.apache.solr.common.SolrException: Too many values for UnInvertedField faceting on field topic at org.apache.solr.request.UnInvertedField.uninvert(UnInver

Re: Too many values for UnInvertedField faceting on field topic

2012-03-01 Thread Michael Jakl
Hi! On Wed, Feb 29, 2012 at 22:21, Emmanuel Espina wrote: > No. But probably we can find another way to do what you want. Please > describe the problem and include some "numbers" to give us an idea of > the sizes that you are handling. Number of documents, size of the > index, etc. Thank you! Ou

Re: Too many values for UnInvertedField faceting on field topic

2012-03-02 Thread Michael Jakl
Hi! On Thu, Mar 1, 2012 at 23:54, Yonik Seeley wrote: > On Thu, Mar 1, 2012 at 3:34 AM, Michael Jakl wrote: >> The topic field holds roughly 5 >> values per doc, but I wasn't able to compute the correct number right >> now. > > How many unique values for that fi

Get all matching terms of an OR query

2012-07-04 Thread Michael Jakl
Hi, is there an easy way to get the matches of an OR query? If I'm searching for "android OR google OR apple OR iphone OR -ipod", I'd like to know which of these terms document X contains. I've been using debugQuery and tried to extract the info from the explain information, unfortunately this is

Re: Get all matching terms of an OR query

2012-07-04 Thread Michael Jakl
Hi! On 4 July 2012 17:01, Jack Krupansky wrote: > First, "OR -ipod" needs to be written as "OR (*:* -ipod)" due to an ongoing > deficiency in Lucene query parsing, but I wonder what you really think you > are OR'ing in that clause - all documents that don't contain "ipod"? That > seems odd. Maybe
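The rewrite suggested in the quoted reply, turning `OR -ipod` into `OR (*:* -ipod)`, can be applied mechanically to a query string. The following is a minimal sketch under stated assumptions: the helper name `wrap_negative_or_clauses` is hypothetical, and the regular expression only handles a bare term or a double-quoted phrase directly after `OR -`, not nested or fielded clauses.

```python
import re

# Matches "OR -term" or 'OR -"a phrase"'; the negated part is captured.
_PURE_NEGATIVE = re.compile(r'\bOR\s+-("[^"]*"|[^\s()]+)')


def wrap_negative_or_clauses(query):
    """Rewrite each 'OR -x' as 'OR (*:* -x)'.

    A purely negative clause matches nothing on its own, so it is paired
    with the match-all query *:* to mean "all documents except x".
    """
    return _PURE_NEGATIVE.sub(r'OR (*:* -\1)', query)


if __name__ == "__main__":
    print(wrap_negative_or_clauses(
        "android OR google OR apple OR iphone OR -ipod"))
```

As the reply points out, it is worth checking whether "all documents that don't contain ipod" is really the intended meaning before applying such a rewrite.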

Re: Get all matching terms of an OR query

2012-07-05 Thread Michael Jakl
Thank you! On 4 July 2012 17:37, Jack Krupansky wrote: > What exactly is it that is too slow? I was comparing queries with "debugQuery" enabled and disabled. The difference was 60 seconds versus 30 seconds for some (unusually) large queries (many terms over a large set of documents chosen by filter qu