Re: spellcheck: issues

2008-10-10 Thread Jason Rennie
popularity
> if (freq > a.freq) {
>   return 1;
> }
>
> if (freq < a.freq) {
>   return -1;
> }
> return 0;
> }
>
> I could see you opening a JIRA issue in Lucene against the SC to make it so
> that the sorting could be overridden/pl
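The quoted fragment orders suggestions by frequency, and the reply floats the idea of making that ordering overridable. A minimal sketch of what a pluggable ordering could look like follows. The `Suggestion` class, its fields, and both comparators are hypothetical names made up for illustration; this is not Lucene's SpellChecker API, and the default ordering shown is an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a spelling suggestion; not a Lucene class.
final class Suggestion {
    final String word;
    final float score; // string-similarity score, higher means closer to the query
    final int freq;    // how often the word occurs in the index

    Suggestion(String word, float score, int freq) {
        this.word = word;
        this.score = score;
        this.freq = freq;
    }

    // Best first: similarity decides, frequency only breaks ties (assumed default).
    static final Comparator<Suggestion> BY_SCORE_THEN_FREQ =
        Comparator.comparingDouble((Suggestion s) -> s.score)
                  .thenComparingInt(s -> s.freq)
                  .reversed();

    // Best first: popularity decides, similarity only breaks ties (the override).
    static final Comparator<Suggestion> BY_FREQ_THEN_SCORE =
        Comparator.comparingInt((Suggestion s) -> s.freq)
                  .thenComparingDouble(s -> s.score)
                  .reversed();

    // Returns the candidate words sorted by the supplied ordering.
    static List<String> rank(List<Suggestion> candidates, Comparator<Suggestion> order) {
        List<Suggestion> sorted = new ArrayList<>(candidates);
        sorted.sort(order);
        List<String> words = new ArrayList<>();
        for (Suggestion s : sorted) {
            words.add(s.word);
        }
        return words;
    }
}
```

With made-up scores, a rare but slightly closer word wins under the first ordering, while a much more frequent word wins under the second. Passing the comparator in as a parameter is the design choice that lets a caller pick between the two.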

Re: spellcheck: issues

2008-10-08 Thread Jason Rennie
On Wed, Oct 8, 2008 at 3:31 PM, Jason Rennie <[EMAIL PROTECTED]> wrote: > I just tried J-W and *yes* it seems to do a much better job! I'd certainly > vote for that becoming the default :) > Ack! I did some more testing and J-W results started to get weird (including sugg

Re: spellcheck: issues

2008-10-08 Thread Jason Rennie
ght try the Jaro-Winkler measure, too, as it is a bit more > sophisticated than Levenshtein when it comes to scoring. > I just tried J-W and *yes* it seems to do a much better job! I'd certainly vote for that becoming the default :) Thanks for all the help! Much appreciated. Jason
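For readers unfamiliar with the measure mentioned above, here is a minimal, self-contained sketch of the textbook Jaro-Winkler similarity. The class name `JaroWinkler` is hypothetical, the prefix weight of 0.1 and prefix cap of 4 are the commonly cited defaults, and the prefix bonus is applied unconditionally; Lucene's own implementation may differ in such details, so treat this as an illustration rather than the library's code.

```java
// Hypothetical helper class; a textbook sketch, not Lucene's implementation.
final class JaroWinkler {

    // Jaro similarity in [0, 1]: based on matching characters within a
    // sliding window and on how many of those matches are out of order.
    static double jaro(String a, String b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0;
        }
        int window = Math.max(0, Math.max(a.length(), b.length()) / 2 - 1);
        boolean[] aMatched = new boolean[a.length()];
        boolean[] bMatched = new boolean[b.length()];
        int matches = 0;
        for (int i = 0; i < a.length(); i++) {
            int lo = Math.max(0, i - window);
            int hi = Math.min(b.length() - 1, i + window);
            for (int k = lo; k <= hi; k++) {
                if (!bMatched[k] && a.charAt(i) == b.charAt(k)) {
                    aMatched[i] = true;
                    bMatched[k] = true;
                    matches++;
                    break;
                }
            }
        }
        if (matches == 0) {
            return 0.0;
        }
        // Walk both strings' matched characters in order; each mismatch
        // is half a transposition.
        int halfTranspositions = 0;
        int j = 0;
        for (int i = 0; i < a.length(); i++) {
            if (!aMatched[i]) {
                continue;
            }
            while (!bMatched[j]) {
                j++;
            }
            if (a.charAt(i) != b.charAt(j)) {
                halfTranspositions++;
            }
            j++;
        }
        double m = matches;
        double t = halfTranspositions / 2.0;
        return (m / a.length() + m / b.length() + (m - t) / m) / 3.0;
    }

    // Jaro-Winkler: boosts the Jaro score for a shared prefix of up to 4 chars.
    static double jaroWinkler(String a, String b) {
        double jaro = jaro(a, b);
        int limit = Math.min(4, Math.min(a.length(), b.length()));
        int prefix = 0;
        while (prefix < limit && a.charAt(prefix) == b.charAt(prefix)) {
            prefix++;
        }
        return jaro + prefix * 0.1 * (1.0 - jaro);
    }
}
```

Because of the prefix bonus, a candidate that shares its first few letters with the query scores higher than one that differs early, which is one way this measure can rank candidates differently from plain edit distance.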

Re: spellcheck: issues

2008-10-08 Thread Jason Rennie
On Wed, Oct 8, 2008 at 1:24 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
> Token: chane OMP: false
> Oct 8, 2008 1:19:56 PM org.apache.solr.core.SolrCore execute
> INFO: [spell] webapp=null path=/select
> params={q=description%3Achane&spellcheck=true&spellcheck.onlyMorePopular=false&spellcheck.e

Re: spellcheck: issues

2008-10-08 Thread Jason Rennie
Hi Grant, Here are solr config files (attached) and java code (included below) to recreate the test case. Jason
List<Pair<String, Integer>> terms = new ArrayList<Pair<String, Integer>>();
terms.add(new Pair<String, Integer>("chanel", 834));
terms.add(new Pair<String, Integer>("chant", 10));
terms.add(new Pair<String, Integer>("chang", 8));
terms.add

Re: spellcheck: issues

2008-10-07 Thread Jason Rennie
I > would like to reproduce it and see what's going on. > > > > > On Oct 7, 2008, at 2:18 PM, Jason Rennie wrote: > > On Tue, Oct 7, 2008 at 11:56 AM, Grant Ingersoll <[EMAIL PROTECTED] >> >wrote: >> >> Is there any way you can write up a small te

Re: spellcheck: issues

2008-10-07 Thread Jason Rennie
On Tue, Oct 7, 2008 at 11:56 AM, Grant Ingersoll <[EMAIL PROTECTED]> wrote: > Is there any way you can write up a small test case? This definitely sounds > like a bug. I tried adding single-word documents according to the top ten suggestions and frequencies for "chanl". I.e. I created a fresh in

Re: spellcheck: issues

2008-10-06 Thread Jason Rennie
pellchecker only run when there are no document hits? Btw, is there a better place to be posting comments/questions like this? Jason On Mon, Oct 6, 2008 at 4:08 PM, Jason Rennie <[EMAIL PROTECTED]> wrote: > I've noticed a few issues with spellcheck as I've been testing

spellcheck: issues

2008-10-06 Thread Jason Rennie
I've noticed a few issues with spellcheck as I've been testing it out for use on our site... 1. Rebuild breaks requests - I'm using rebuildOnCommit ATM. If a commit is going on and files are being rebuilt in the spellcheck data dir, spellcheck requests yield bogus answers. I.e. I can is

Re: required keyword in all a document

2008-10-06 Thread Jason Rennie
> (k1_en:france^100 OR k2_en:france^10 OR k3_en:france) > AND > (k1_en:flag^100 OR k2_en:flag^10 OR k3_en:flag) > AND > (k1_en:french^100 OR k2_en:french^10 OR k3_en:french) > > Is there a better/more simple way to do this? > > Thx in advance! > > -- > ~

Re: Transitioning from Solr 1.2 to Solr 1.3

2008-10-06 Thread Jason Rennie
hat is the best way? I have been > using google and solr wiki but haven't found a way to do this. > > Mike Tedesco > > -- Jason Rennie Head of Machine Learning Technologies, StyleFeeder http://www.stylefeeder.com/ Samantha's blog & pictures: http://samanthalyrarennie.blogspot.com/

Re: How to tokenize/analyze docs for the spellchecker - at indexing and query time

2008-10-03 Thread Jason Rennie
Hi Martin, I'm a relative newbie to solr, have been playing with the spellcheck component and seem to have it working. I certainly can't explain what all is going on, but with any luck, I can help you get the spellchecker up-and-running. Additional replies in-lined below. On Wed, Oct 1, 2008 at

Re: using spellcheckcomponent via solrj

2008-10-03 Thread Jason Rennie
). Anyone know what might be going on here? Thanks, Jason On Wed, Sep 24, 2008 at 4:22 PM, Jason Rennie <[EMAIL PROTECTED]> wrote: > On Wed, Sep 24, 2008 at 4:07 PM, Grant Ingersoll <[EMAIL PROTECTED]>wrote: > >> Just mimic the configuration for the spellCheckCompRH

Re: spellcheck: buildOnOptimize?

2008-09-30 Thread Jason Rennie
On Fri, Sep 26, 2008 at 9:33 AM, Shalin Shekhar Mangar < [EMAIL PROTECTED]> wrote: > Jason, can you please open a jira issue to add this feature? > Done. https://issues.apache.org/jira/browse/SOLR-795 Jason

spellcheck: substitutions, but no inserts or deletes

2008-09-30 Thread Jason Rennie
I've been testing the SpellCheckComponent for use on StyleFeeder. It seems to do a great job of suggesting character substitutions, but I haven't seen any deletion/insertion suggestions. I've tried decreasing the "accuracy" parameter to 0.5. Some queries I've tried are: bluea: suggests "blues"
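To make the distinction between substitutions, insertions, and deletions concrete, here is a minimal sketch of the classic Levenshtein edit distance, in which all three operations cost 1. The class `EditDistance` and the `similarity` normalization (1 minus distance over the longer length) are assumptions for illustration and are not taken from Solr or Lucene; the candidate word "blue" is likewise a hypothetical example, not one reported in the message.

```java
// Hypothetical helper class; a textbook sketch, not Solr/Lucene code.
final class EditDistance {

    // Classic dynamic-programming Levenshtein distance using two rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int substitution = prev[j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1);
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                curr[j] = Math.min(substitution, Math.min(insertion, deletion));
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumed normalization to [0, 1]; 1.0 means the strings are identical.
    static double similarity(String a, String b) {
        int longer = Math.max(a.length(), b.length());
        if (longer == 0) {
            return 1.0;
        }
        return 1.0 - (double) levenshtein(a, b) / longer;
    }
}
```

Under this definition, "bluea" is one substitution away from "blues" and one deletion away from the hypothetical candidate "blue", so a pure edit-distance scorer would rate both equally. If only substitutions show up in practice, the cause is more likely in candidate generation or filtering than in the distance measure itself, though that is a guess rather than something the thread confirms.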

spellcheck: buildOnOptimize?

2008-09-25 Thread Jason Rennie
I see that there's an option to automatically rebuild the spelling index on a commit. That's a nice feature that we'll consider using, but we run commits every few thousand document updates, which would yield ~100 spelling index rebuilds a day. OTOH, we run an optimize about once/day which seems

Re: using spellcheckcomponent via solrj

2008-09-24 Thread Jason Rennie
On Wed, Sep 24, 2008 at 4:07 PM, Grant Ingersoll <[EMAIL PROTECTED]>wrote: > Just mimic the configuration for the spellCheckCompRH in the handler that > you use for querying. Sounds even better. Let me make sure I'm reading you correctly. Is the idea to add lines like this to the requestHandle

Re: using spellcheckcomponent via solrj

2008-09-24 Thread Jason Rennie
On Wed, Sep 24, 2008 at 3:43 PM, Erik Hatcher <[EMAIL PROTECTED]>wrote: > query.setQueryType("/spellCheckCompRH") > That's the trick I needed. Thanks! Jason

using spellcheckcomponent via solrj

2008-09-24 Thread Jason Rennie
I've got SpellCheckComponent working on my index using queries like so: /solr/spellCheckCompRH?q=shart&spellcheck.q=shart&spellcheck=true&qt=sfdismax But, I haven't had any luck getting solrj to produce such queries. I can't find any way to change the url from /solr/select to /solr/spellCheckCom

Re: What's the bottleneck?

2008-09-12 Thread Jason Rennie
Thanks for all the replies! Mike: we're not using pf. Our qf is always "status:0". The "status" field is "0" for all good docs (90%+) and some other integer for any docs we don't want returned. Jeyrl: federated search is definitely something we'll consider. On Fri, Sep 12, 2008 at 8:39 AM, Gra

Re: What's the bottleneck?

2008-09-11 Thread Jason Rennie
On Thu, Sep 11, 2008 at 1:29 PM, <[EMAIL PROTECTED]> wrote: > what is your index configuration??? Not sure what you mean. We're using 1.2, though we've tested with a recent nightly and didn't see a significant change in performance... > What is your average size from the returned fields ???

Re: What's the bottleneck?

2008-09-11 Thread Jason Rennie
On Thu, Sep 11, 2008 at 11:54 AM, Mark Miller <[EMAIL PROTECTED]> wrote: > What kind of traffic are you getting when it takes seconds? 1 request? 12? > I'd estimate concurrency around 3, though the speed doesn't change much when we run the same query on a server with zero traffic. Jason

What's the bottleneck?

2008-09-11 Thread Jason Rennie
s the bottleneck, is there anything we could do to easily trim-down computation time (besides removing common words from the query)? Jason

Re: Question on how index works - runs out of disk space!

2008-09-11 Thread Jason Rennie
are easy to make via the solrj client we use. Though, for one of our indexes, we perform all of the updates offline and run an optimize before putting the index into production. Hope this helps. Cheers, Jason

Re: Question on how index works - runs out of disk space!

2008-09-10 Thread Jason Rennie

Re: Index partitioning

2008-09-10 Thread Jason Rennie
fferent beast from the "Index Partitioning" > topic this thread was discussing ... there's some good info on the wiki > about the various options (they each have their trade offs to consider) > > http://wiki.apache.org/solr/MultipleIndexes > >

Re: Less aggressive stemmer?

2008-08-22 Thread Jason Rennie
Kevin & Guillaume, Many thanks for the pointers. It sounds like one of these two solutions will fit our needs. Cheers, Jason On Thu, Aug 21, 2008 at 5:33 PM, Guillaume Smet <[EMAIL PROTECTED]>wrote: > On Thu, Aug 21, 2008 at 11:23 PM, Jason Rennie <[EMAIL PROTECTED]> wro

Less aggressive stemmer?

2008-08-21 Thread Jason Rennie
Is there an option to perform less aggressive stemming in solr? We're using the Porter stemmer. I see that there is an option for Snowball, but my understanding is that Snowball is a refinement of Porter rather than something radically different. I think we'd be best off with something very basi

Re: How to boost the score higher in case user query matches entire field value than just some words within a field

2008-08-21 Thread Jason Rennie
e:"cordless drill" will hit all three documents. So >> how can I make Doc1 score higher than the other two? >> BTW, I am using solr1.2. >> thanks! >> -Simon

Re: Administrative questions

2008-08-14 Thread Jason Rennie
On Wed, Aug 13, 2008 at 1:52 PM, Jon Drukman <[EMAIL PROTECTED]> wrote: > Duh. I should have thought of that. I'm a big fan of djbdns so I'm quite > familiar with daemontools. > > Thanks! > :) My pleasure. Was nice to hear recently that DJB is moving toward more flexible licensing terms. For

Re: Administrative questions

2008-08-13 Thread Jason Rennie
nd directs output to a set of rotated log files. Very handy for a production environment. A bit tricky to set up, but solid once you have it in place. http://cr.yp.to/daemontools.html Jason

Re: concurrent optimize and update

2008-08-12 Thread Jason Rennie
On Mon, Aug 11, 2008 at 6:41 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote: > It's safe... the adds will block until the commit or optimize has finished. > By block, do you mean that the update connection(s) will be held open? Our optimizes take many minutes to complete. I'm thinking that this cou

dismax bq

2008-08-05 Thread Jason Rennie
I'd like to be able to specify query term weights/boosts, which it sounds like bq was created for. I think my understanding from the wiki is a bit rough, so I'm hoping I might be able to get some questions answered here. Any thoughts/comments are much appreciated. I initially tried simply passing

Re: diversity in results

2008-08-04 Thread Jason Rennie
wrote: > not out of the box, but I would use the mlt handler on the first result and > remove all the ones that appear in both the MLT and query response. > > B

Re: diversity in results

2008-08-04 Thread Jason Rennie
Thanks for the pointers. Looks interesting, at least as a starting point for something more sophisticated. Cheers, Jason On Mon, Aug 4, 2008 at 4:38 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote: > See https://issues.apache.org/jira/browse/SOLR-236 and > http://wiki.apache.org/solr/FieldCollap

diversity in results

2008-08-04 Thread Jason Rennie
any plans to add it? Thanks, Jason

Re: pf nixes fl

2008-07-22 Thread Jason Rennie
Doh! I mistakenly changed the request handler from dismax to standard. Ignore me... Jason On Tue, Jul 22, 2008 at 2:59 PM, Jason Rennie <[EMAIL PROTECTED]> wrote: > I'm using solrj and all I did was add a pf entry to solrconfig.xml. I > don't think it could be an ampersa

Re: pf nixes fl

2008-07-22 Thread Jason Rennie
name^1.5 tags description^0.6 vendorname^0.3 manufacturer^0.3 category name description id score 0 status:0 The above query returns all document fields, no "score" field. Jason On Tue, Jul 22, 2008 at 2:55 PM, Mike Klaas <[EMAIL PROTECTED]

pf nixes fl

2008-07-22 Thread Jason Rennie
Just tried adding a pf field to my request handler. When I did this, solr returned all document fields for each doc (no "score") instead of returning the fields specified in fl. Bug? Feature? Anyone know what the reason for this behavior is? I'm using solr 1.2. Thanks, Jason

Re: Internal Server Error and waitSearcher="false" for commit/optimize

2007-10-11 Thread Jason Rennie
at we needed to know. Our query threads are separate from the commit/optimize thread, so this option would not affect operations. In case you're curious, we use solr as the search engine for www.stylefeeder.com. It has served us very well so far, handling over 3000 queries/day. Tha

Internal Server Error and waitSearcher="false" for commit/optimize

2007-10-10 Thread Jason Rennie
Hello, We're using solr 1.2 and a nightly build of the solrj client code. We very occasionally see things like this:
org.apache.solr.client.solrj.SolrServerException: Error executing query
    at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:86)
    at org.