Re: facet optimizing

2007-02-08 Thread Chris Hostetter
I freely admit that I'm totally lost on most of what you're suggesting ... it seems like you're suggesting that organizing the terms in a facet field into a tree structure would help us know which terms to compute the counts for first for a given query -- but it's not clear to me why that would be

Re: facet optimizing

2007-02-08 Thread Chris Hostetter
: > query would be too expensive -- instead we have structured metadata that : > drives the logic: only compute the constraint counts for this subset of : > manufacturers when looking at the Desktops category, only look at the : > Operating System facet when in these categories, etc... rules like

random order

2007-02-08 Thread Ryan McKinley
Is there a way to return documents in a random order? (obviously paging would not work) thanks ryan

Re: dismax without q=

2007-02-08 Thread Ryan McKinley
I'm switching from standard to dismax and ran into this. I'll post a little patch in a sec. ryan

Re: dismax without q=

2007-02-08 Thread Mike Klaas
On 2/8/07, Yonik Seeley <[EMAIL PROTECTED]> wrote: On 2/8/07, Ryan McKinley <[EMAIL PROTECTED]> wrote: > Assuming there is an fq=xxx in the query, could dismax support a > queryless query? It does seem reasonable for both dismax and the standard request handler, esp since we have faceting in the

Re: Spelling

2007-02-08 Thread Mike Klaas
On 2/8/07, Yonik Seeley <[EMAIL PROTECTED]> wrote: You could fix this specific case with either configuring protected words on the stemmer, or by using the synonym filter and mapping one of the alternatives to something that won't be stemmed (but the former is probably a better option). More ge

Re: dismax without q=

2007-02-08 Thread Yonik Seeley
On 2/8/07, Ryan McKinley <[EMAIL PROTECTED]> wrote: Assuming there is an fq=xxx in the query, could dismax support a queryless query? It does seem reasonable for both dismax and the standard request handler, esp since we have faceting in the mix (asking for facet constraints without supplying a

dismax without q=

2007-02-08 Thread Ryan McKinley
Assuming there is an fq=xxx in the query, could dismax support a queryless query? I know it may not seem too meaningful, but it would be nice to use the same handler / query for: ?fq=type:file&ft=published=true ?fq=type:file&ft=published=true&q=ryan ?fq=type:file&ft=published=true&q= Even

Re: Spelling

2007-02-08 Thread Yonik Seeley
Adding to Mike's comments, for this specific query, one can see that both words stem to "illeg": http://localhost:8983/solr/select?q=illegible+illegal&debugQuery=on You could fix this specific case with either configuring protected words on the stemmer, or by using the synonym filter and mapping

Re: SMILE/Rails/Babel and Dynamic Facets?

2007-02-08 Thread Erik Hatcher
On Feb 8, 2007, at 7:23 PM, Antonio Eggberg wrote: You are doing some pretty cool stuff with flare! I am amazed! Now I have some questions :-) Thanks! - Smile and Babel do everything and it's so easy, so I wonder why you need ruby/rails for flare? What I mean is that one could get XML fe

SMILE/Rails/Babel and Dynamic Facets?

2007-02-08 Thread Antonio Eggberg
Erik: You are doing some pretty cool stuff with flare! I am amazed! Now I have some questions :-) - Smile and Babel do everything and it's so easy, so I wonder why you need ruby/rails for flare? What I mean is that one could get XML feed directly from Solr to smile if the xml is formatted corr

Re: MultiPhraseQuery and .equals()

2007-02-08 Thread Mike Klaas
On 2/8/07, Mike Klaas <[EMAIL PROTECTED]> wrote: On 2/8/07, Yonik Seeley <[EMAIL PROTECTED]> wrote: > On 2/8/07, Mike Klaas <[EMAIL PROTECTED]> wrote: > > I had a filter type that was interacting poorly with the Solr > > filterCache--identical filters were causing _2_ filterCache insertions > > p

Re: Spelling

2007-02-08 Thread Mike Klaas
On 2/8/07, Michael Kimsal <[EMAIL PROTECTED]> wrote: Hello Solr friends: Mr. Klaas - I've not tested your patch yet (will try to get to it soon) but I've found almost the opposite problem now and people are questioning how/why things are happening as they are. I'm searching for the word "illega

Production application server recommendation...

2007-02-08 Thread escher2k
Hi, Thanks to the excellent support from the community and the application, we have made good progress towards building a solution using Solr. We currently use Lucene with Jakarta. I was wondering if anyone has recommendations on using Jetty vs. Jakarta. This will run on Solaris. Thanks. -

Re: MultiPhraseQuery and .equals()

2007-02-08 Thread Mike Klaas
On 2/8/07, Yonik Seeley <[EMAIL PROTECTED]> wrote: On 2/8/07, Mike Klaas <[EMAIL PROTECTED]> wrote: > I had a filter type that was interacting poorly with the Solr > filterCache--identical filters were causing _2_ filterCache insertions > per query. That is strange. What were the two parts of th

[Droids] Re: crawler feed?

2007-02-08 Thread Thorsten Scherler
On Thu, 2007-02-08 at 14:40 +0100, rubdabadub wrote: > Thorsten: > > First of all I read your lab idea with great interest as I am in need > of such a crawler. However there are certain things that I'd like to > discuss. I am not sure what forum will be appropriate for this but I > will do my idea sho

Re: Spelling

2007-02-08 Thread Michael Kimsal
Hello Solr friends: Mr. Klaas - I've not tested your patch yet (will try to get to it soon) but I've found almost the opposite problem now and people are questioning how/why things are happening as they are. I'm searching for the word "illegal" and the query results are coming back with an entry

Re: MultiPhraseQuery and .equals()

2007-02-08 Thread Yonik Seeley
On 2/8/07, Mike Klaas <[EMAIL PROTECTED]> wrote: I had a filter type that was interacting poorly with the Solr filterCache--identical filters were causing _2_ filterCache insertions per query. That is strange. What were the two parts of the code that added to the filterCache? If you re-run the

MultiPhraseQuery and .equals()

2007-02-08 Thread Mike Klaas
Hi, I had a filter type that was interacting poorly with the Solr filterCache--identical filters were causing _2_ filterCache insertions per query. The field is for a url's domain name, and for some reason WordDelimiterFilter was being used with generateParts=catenateWords=1. The filter query lo
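
The snippet above is cut off before it states the cause, so the following is only a rough analogy, assuming (from the thread subject) that the duplicate insertions come from a query class without value-based equality. It is plain Python, not Solr or Lucene code; the class names `IdentityQuery` and `ValueQuery` and the helper `cache_insertions` are hypothetical.

```python
class IdentityQuery:
    """Hypothetical query object that does NOT define value equality."""

    def __init__(self, terms):
        self.terms = tuple(terms)


class ValueQuery(IdentityQuery):
    """Same hypothetical query, with equality and hashing based on its terms."""

    def __eq__(self, other):
        return isinstance(other, ValueQuery) and self.terms == other.terms

    def __hash__(self):
        return hash(self.terms)


def cache_insertions(query_cls):
    """Build the same filter twice and count how many cache entries result."""
    cache = {}
    for _ in range(2):
        key = query_cls(["example", "com"])  # logically identical filter each time
        if key not in cache:
            cache[key] = "docset"  # stand-in for the cached result
    return len(cache)


identity_entries = cache_insertions(IdentityQuery)
value_entries = cache_insertions(ValueQuery)
```

With identity-based keys each logically identical filter is treated as new, so the cache grows on every lookup; with value-based keys the second lookup hits the existing entry.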

Re: facet optimizing

2007-02-08 Thread Erik Hatcher
And to add some fuel to this fire, I'm seeing in the (first 100k of UVa MARC records) data I'm processing that the facets are sparse with documents. There are a lot of documents that simply don't have a subject genre on them, for example... like almost 50%. Maybe the data will get cleaner

Re: facet optimizing

2007-02-08 Thread Yonik Seeley
A little more brainstorming on this... pruning by df is going to be one of the most important features here... so a variation (or optimization) would be to keep a list of the highest terms by df, and then build the facet tree excluding those top terms. That should lower the dfs in the tree nodes
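
As a minimal sketch of the split described above (set aside the highest-df terms, then build the tree from the rest), assuming df values are already available as a simple mapping. The function name `split_by_df`, the sample terms, and the counts are all made up for illustration; how the tree itself is built is not shown.

```python
from typing import Dict, List, Tuple


def split_by_df(term_df: Dict[str, int], top_n: int) -> Tuple[List[str], List[str]]:
    """Return (the top_n terms by document frequency, the remaining terms).

    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(term_df, key=lambda term: (-term_df[term], term))
    return ranked[:top_n], ranked[top_n:]


# Made-up document frequencies for a facet field.
term_df = {"book": 900, "map": 40, "score": 25, "video": 700, "thesis": 10}

# top_terms would be counted directly; tree_terms would go into the facet tree.
top_terms, tree_terms = split_by_df(term_df, top_n=2)
```

Here the two most frequent terms are kept in their own list, and only the lower-df terms remain as candidates for the tree.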

RE: facet optimizing

2007-02-08 Thread Binkley, Peter
Yonik wrote: Thinking all this stuff up from scratch seems like the hard way... Does anyone know how other people have implemented this stuff? It's not really what Yonik was asking for, but on the semantic front, one thing that might help is OCLC's FAST project (Faceted Application of Subject

Re: crawler feed?

2007-02-08 Thread Otis Gospodnetic
Alright, this is good! (R2D2s) You may want to post this to nutch-dev. I was kind of asking for and about this the other day on nutch-(user?)... Otis - Original Message From: Thorsten Scherler <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Wednesday, February 7, 2007 5:15:0

Re: crawler feed?

2007-02-08 Thread rubdabadub
Thorsten: First of all I read your lab idea with great interest as I am in need of such a crawler. However there are certain things that I'd like to discuss. I am not sure what forum will be appropriate for this but I will do my idea shooting here first then please tell me where I should post further