Re: LucidWorks Solr

2010-04-22 Thread Robert Muir
On Wed, Apr 21, 2010 at 1:38 PM, Shashi Kant wrote: > Why do these approaches have to be mutually exclusive? > Do a dictionary lookup, if no satisfactory match found use an > algorithmic stemmer. Would probably save a few CPU cycles by > algorithmic stemming iff necessary. > > by the way, if you

Re: LucidWorks Solr

2010-04-21 Thread MitchK
e tolerant and made for highly relevant search results without exact matching. Kind regards - Mitch -- View this message in context: http://n3.nabble.com/LucidWorks-Solr-tp727341p741090.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: LucidWorks Solr

2010-04-21 Thread Robert Muir
On Wed, Apr 21, 2010 at 3:29 PM, Mark Miller wrote: > > Stemming/lemmatization will pretty much always improve recall at the cost of > precision - that's nothing new. If you stem instead, are you going to want > documents that had run and water when you searched for running water? I just > don't s

Re: LucidWorks Solr

2010-04-21 Thread Mark Miller
On 4/21/10 3:22 PM, Robert Muir wrote: On Wed, Apr 21, 2010 at 2:26 PM, Mark Miller wrote: It's an orthogonal issue - running will have that problem no matter what. It doesn't affect whether a user that types running may be just as interested in a doc that matches all of their other terms bu

Re: LucidWorks Solr

2010-04-21 Thread Robert Muir
On Wed, Apr 21, 2010 at 2:26 PM, Mark Miller wrote: > > It's an orthogonal issue - running will have that problem no matter what. It > doesn't affect whether a user that types running may be just as interested > in a doc that matches all of their other terms but has ran instead of > running. Its al

Re: Stemming [was: LucidWorks Solr]

2010-04-21 Thread Darren Govoni
IMHO, a 'stemmer' (being a specific 'thing') is exactly that: an algorithm for stemming. A database or lexicon is not referred to as a 'stemmer'. One can perform "stemming" using a lexicon if that's their need. For me, it's more than just stemming because some words have morphology totally separat

Re: LucidWorks Solr

2010-04-21 Thread Mark Miller
On 4/21/10 2:20 PM, Robert Muir wrote: On Wed, Apr 21, 2010 at 2:09 PM, Mark Miller wrote: Right - I agree they both have their strengths and weaknesses - but you usually don't get things like running->ran with stemming. Like most things, it's a tradeoff. There is always a hybrid approach as

Re: LucidWorks Solr

2010-04-21 Thread Robert Muir
On Wed, Apr 21, 2010 at 2:09 PM, Mark Miller wrote: > > Right - I agree they both have their strengths and weaknesses - but you > usually don't get things like running->ran with stemming. Like most things, > it's a tradeoff. There is always a hybrid approach as well. > > I think running/ran has mor

Re: LucidWorks Solr

2010-04-21 Thread Mark Miller
On 4/21/10 2:02 PM, Robert Muir wrote: On Wed, Apr 21, 2010 at 1:49 PM, Mark Miller wrote: I believe that's covered by morphology? The problem is typically a morphological analyzer emits multiple solutions, which include POS. So morphology can tell you that "building" has two s

Re: LucidWorks Solr

2010-04-21 Thread Robert Muir
On Wed, Apr 21, 2010 at 1:49 PM, Mark Miller wrote: > > I believe that's covered by morphology? > > The problem is typically a morphological analyzer emits multiple solutions, which include POS. So morphology can tell you that "building" has two solutions: the gerund form which you might stem t
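The ambiguity described above can be sketched roughly as follows. This is only an illustration: the `ANALYSES` table, the `Analysis` type, and the POS labels are hypothetical stand-ins for whatever a real morphological analyzer would emit, and the two analyses given for "building" are an assumed example rather than output from any actual tool.

```python
from typing import List, NamedTuple


class Analysis(NamedTuple):
    lemma: str
    pos: str


# Hypothetical analyzer output: one surface form can have several analyses.
ANALYSES = {
    "building": [Analysis("build", "VERB"), Analysis("building", "NOUN")],
    "ran": [Analysis("run", "VERB")],
}


def analyze(word: str) -> List[Analysis]:
    """Return all known analyses, or the word itself tagged UNKNOWN."""
    key = word.lower()
    return ANALYSES.get(key, [Analysis(key, "UNKNOWN")])


def index_terms(word: str) -> List[str]:
    """Collect every distinct lemma, in analyzer order, without picking one."""
    lemmas: List[str] = []
    for analysis in analyze(word):
        if analysis.lemma not in lemmas:
            lemmas.append(analysis.lemma)
    return lemmas
```

Without POS disambiguation, an indexer has to either pick one analysis or, as in `index_terms`, keep all of them and accept the extra matches.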

Re: LucidWorks Solr

2010-04-21 Thread Mark Miller
On 4/21/10 1:43 PM, Walter Underwood wrote: On Apr 21, 2010, at 10:30 AM, Mark Miller wrote: But they don't usually call 'non algorithmic' stemming 'stemming'. Stemming usually means using a simple heuristic process. When you use vocabulary and morphology, it's usually called lemmatization

Re: LucidWorks Solr

2010-04-21 Thread Mark Miller
On 4/21/10 1:43 PM, Robert Muir wrote: On Wed, Apr 21, 2010 at 1:30 PM, Mark Miller wrote: But they don't usually call 'non algorithmic' stemming 'stemming'. Stemming usually means using a simple heuristic process. When you use vocabulary and morphology, it's usually called lemmatization

Re: LucidWorks Solr

2010-04-21 Thread Robert Muir
On Wed, Apr 21, 2010 at 1:30 PM, Mark Miller wrote: > > But they don't usually call 'non algorithmic' stemming 'stemming'. > Stemming usually means using a simple heuristic process. When you use > vocabulary and morphology, it's usually called lemmatization rather than > stemming. > > Lemmatizati

Re: LucidWorks Solr

2010-04-21 Thread Walter Underwood
On Apr 21, 2010, at 10:30 AM, Mark Miller wrote: > But they don't usually call 'non algorithmic' stemming 'stemming'. Stemming > usually means using a simple heuristic process. When you use vocabulary and > morphology, it's usually called lemmatization rather than stemming. > "stemmer" is jargo

Re: LucidWorks Solr

2010-04-21 Thread Shashi Kant
Why do these approaches have to be mutually exclusive? Do a dictionary lookup, if no satisfactory match found use an algorithmic stemmer. Would probably save a few CPU cycles by algorithmic stemming iff necessary. On Wed, Apr 21, 2010 at 1:31 PM, Robert Muir wrote: > sy to look at the "faults" o
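A minimal sketch of the hybrid idea above: try the dictionary first and fall back to an algorithmic stemmer only on a miss. The `LEXICON` table and the suffix rules are made up for illustration; `crude_stem` is a deliberately simplistic stand-in for a real algorithmic stemmer and is not taken from Solr or any library.

```python
# Hypothetical lexicon mapping surface forms to lemmas.
# A real morphological dictionary would be far larger.
LEXICON = {
    "ran": "run",
    "running": "run",
    "went": "go",
    "mice": "mouse",
}

# Crude suffix rules standing in for a real algorithmic stemmer.
SUFFIXES = ("ing", "ed", "es", "s")


def crude_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(word: str) -> str:
    """Dictionary lookup first; algorithmic stemming only when it misses."""
    word = word.lower()
    lemma = LEXICON.get(word)
    if lemma is not None:
        return lemma
    return crude_stem(word)
```

With this arrangement irregular forms such as "ran" are resolved by the dictionary, while unknown words (slang, new terms) still get some normalization from the fallback.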

Re: LucidWorks Solr

2010-04-21 Thread Robert Muir
On Wed, Apr 21, 2010 at 1:18 PM, Chris Hostetter wrote: > > Strictly speaking: you haven't "ditched" stemmers altogether -- you've > ditched *algorithmic* stemmers and moved to a *dictionary* based stemmer > -- but it's still a stemmer. > > (i just don't want people reading this thread to be confus

Re: LucidWorks Solr

2010-04-21 Thread Mark Miller
On 4/21/10 1:18 PM, Chris Hostetter wrote: : Regarding stemmers, I ditched them altogether a long time ago in favor : of a dictionary of morphologies of all known words (for any given : language). A simple lookup of any word morphology thus produces the set, : including the correct stem. Strictl

Re: LucidWorks Solr

2010-04-21 Thread Chris Hostetter
: Regarding stemmers, I ditched them altogether a long time ago in favor : of a dictionary of morphologies of all known words (for any given : language). A simple lookup of any word morphology thus produces the set, : including the correct stem. Strictly speaking: you haven't "ditched" stemmers a

Re: LucidWorks Solr

2010-04-19 Thread Andy
> Andy, > > This will help with smooth injection of your multilingual > documents into Solr (multilingual either in the sense of 1 > doc containing fields in multiple languages or 1 index > containing documents in different languages): > >   http://sematext.com/products/multilingual-indexer/inde

Re: LucidWorks Solr

2010-04-19 Thread Otis Gospodnetic
http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message > From: Andy > To: solr-user@lucene.apache.org > Sent: Mon, April 19, 2010 8:45:40 AM > Subject: Re: LucidWorks Solr > > Thanks for the explanation Mitc

Re: LucidWorks Solr

2010-04-19 Thread Erick Erickson
itive > >> for > >> a given word. The idea is that it always produces the same infinitive for > >> any > >> derived form of the word. > >> > >> What happens if there is an unknown word? For example something like > >> slang? How does your so

Re: LucidWorks Solr

2010-04-19 Thread darren
For example something like >> slang? How does your solution work here? Does it scale? >> >> Thank you for sharing experiences. :) >> >> - Mitch >> -- >> View this message in context: >> http://n3.nabble.com/LucidWorks-Solr-tp727341p730059.html >> Sent from the Solr - User mailing list archive at Nabble.com. >> >

Re: LucidWorks Solr

2010-04-19 Thread darren
your solution work here? Does it scale? > > Thank you for sharing experiences. :) > > - Mitch > -- > View this message in context: > http://n3.nabble.com/LucidWorks-Solr-tp727341p730059.html > Sent from the Solr - User mailing list archive at Nabble.com. >

Re: LucidWorks Solr

2010-04-19 Thread MitchK
s the application works as expected. - Mitch -- View this message in context: http://n3.nabble.com/LucidWorks-Solr-tp727341p730160.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: LucidWorks Solr

2010-04-19 Thread Erick Erickson
always produces the same infinitive for > any > derived form of the word. > > What happens if there is an unknown word? For example something like > slang? How does your solution work here? Does it scale? > > Thank you for sharing experiences. :) > > - Mitch > -- > View t

Re: LucidWorks Solr

2010-04-19 Thread MitchK
work here? Does it scale? Thank you for sharing experiences. :) - Mitch -- View this message in context: http://n3.nabble.com/LucidWorks-Solr-tp727341p730059.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: LucidWorks Solr

2010-04-19 Thread darren
> --- On Mon, 4/19/10, Darren Govoni wrote: > >> From: Darren Govoni >> Subject: Re: LucidWorks Solr >> To: solr-user@lucene.apache.org >> Date: Monday, April 19, 2010, 7:39 AM >> Regarding stemmers, I ditched them >> altogether a long time ago in favor

Re: LucidWorks Solr

2010-04-19 Thread Andy
Thanks for the tip. Is there any publicly available dictionary of morphologies that I could use? Or did you build your own? --- On Mon, 4/19/10, Darren Govoni wrote: > From: Darren Govoni > Subject: Re: LucidWorks Solr > To: solr-user@lucene.apache.org > Date: Monday, April

Re: LucidWorks Solr

2010-04-19 Thread Andy
ded way to deal with documents in multiple languages? --- On Mon, 4/19/10, MitchK wrote: > From: MitchK > Subject: Re: LucidWorks Solr > To: solr-user@lucene.apache.org > Date: Monday, April 19, 2010, 4:36 AM > > Andy, I think it is important to know what a stemmer reall

Re: LucidWorks Solr

2010-04-19 Thread Darren Govoni
Regarding stemmers, I ditched them altogether a long time ago in favor of a dictionary of morphologies of all known words (for any given language). A simple lookup of any word morphology thus produces the set, including the correct stem. Works great. 100% of the time. Just a tip from me. On Mon

Re: LucidWorks Solr

2010-04-19 Thread MitchK
ntext: http://n3.nabble.com/LucidWorks-Solr-tp727341p729110.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: LucidWorks Solr

2010-04-18 Thread Andy
--- On Sun, 4/18/10, Grant Ingersoll wrote: > > Sure, but I'm biased. ;-) Hopefully, you will find it > useful, but choose the one that best fits your needs (and > let me know if you need help assessing that.) > Thanks for the explanation Grant. What is the advantage of KStem over the sta

Re: LucidWorks Solr

2010-04-18 Thread Grant Ingersoll
On Apr 18, 2010, at 3:53 AM, Andy wrote: > Just wanted to know if anyone has used LucidWorks Solr. > > - How do you compare it to the standard Apache Solr? We take a release of Solr. We wrap it w/ an installer, tomcat/jetty, our reference guide, Luke, etc. We also add in an

Re: LucidWorks Solr

2010-04-18 Thread Paolo Castagna
Thanks for asking, I am interested as well in reading the response to your questions. Paolo Andy wrote: Just wanted to know if anyone has used LucidWorks Solr. - How do you compare it to the standard Apache Solr? - the non-blocking IO of LucidWorks Solr -- is that for networking IO or disk

LucidWorks Solr

2010-04-18 Thread Andy
Just wanted to know if anyone has used LucidWorks Solr. - How do you compare it to the standard Apache Solr? - the non-blocking IO of LucidWorks Solr -- is that for networking IO or disk IO? what are its effects? - LucidWorks website also talked about "significantly improved fac

Re: LucidWorks Solr

2010-03-16 Thread Kevin Osborn
s. -Kevin From: blargy To: solr-user@lucene.apache.org Sent: Tue, March 16, 2010 12:31:09 PM Subject: Re: LucidWorks Solr Kevin, When you say you just included the war you mean the /packs/solr.war correct? I see that the KStemmer is nicely packed in there but I don'

Re: LucidWorks Solr

2010-03-16 Thread blargy
_ > From: blargy > To: solr-user@lucene.apache.org > Sent: Tue, March 16, 2010 11:52:17 AM > Subject: LucidWorks Solr > > > Has anyone used this?: > http://www.lucidimagination.com/Downloads/LucidWorks-for-Solr > > Other than the KStemmer and installer what are

Re: LucidWorks Solr

2010-03-16 Thread AJ Chen
> Other than the KStemmer and installer what are the other "enhancements" > that > this download offers? Is it worth using over the default Solr installation? > > Thanks > > -- > View this message in context: > http://old.nabble.com/LucidWorks-Solr-tp27922870p

Re: LucidWorks Solr

2010-03-16 Thread Kevin Osborn
y To: solr-user@lucene.apache.org Sent: Tue, March 16, 2010 11:52:17 AM Subject: LucidWorks Solr Has anyone used this?: http://www.lucidimagination.com/Downloads/LucidWorks-for-Solr Other than the KStemmer and installer what are the other "enhancements" that this download offers? Is

What is the process to build Lucidworks Solr?

2010-01-07 Thread Micah Koga
I am using LucidWorks Solr v1.4 and I would like to compile in a search component, however it does not seem like a very straightforward process. The ant script in the solr directory is that of the stock solr installation which does not compile out of the box. Has anyone been able to successfully