Re: LucidWorks 1.4 compilation

2010-07-27 Thread Eric Grobler
I did not realize the LucidWorks.jar comes with an option to install the sources :-) On Tue, Jul 27, 2010 at 10:59 AM, Eric Grobler wrote: > Good Morning, afternoon or evening... > > If someone installed Solr using the LucidWorks.jar (1.4) installation how > can one make a small change and recomp

Re: Lucidworks

2010-05-04 Thread joyce chan
Sorry, please ignore my previous message, I figured it out. (That is, use the console mode) On Tue, May 4, 2010 at 11:01 AM, joyce chan wrote: > Hi > > Does anybody know how to install LucidWorks Solr (LucidWorks.jar) without > the gui installer? Or maybe to do it as a silent install? > > Than

Re: LucidWorks Solr

2010-04-22 Thread Robert Muir
On Wed, Apr 21, 2010 at 1:38 PM, Shashi Kant wrote: > Why do these approaches have to be mutually exclusive? > Do a dictionary lookup, if no satisfactory match found use an > algorithmic stemmer. Would probably save a few CPU cycles by > algorithmic stemming iff necessary. > > by the way, if you

Re: LucidWorks Solr

2010-04-21 Thread MitchK
I like this discussion very much. It is a really complex topic. I want to add another example. In English, you say "it is a red dress". In German that would be "es ist ein rotes Kleid" (the words can be translated in the same order). However, the basic form of "rotes" is "rot". If your user

Re: LucidWorks Solr

2010-04-21 Thread Robert Muir
On Wed, Apr 21, 2010 at 3:29 PM, Mark Miller wrote: > > Stemming/lemmatization will pretty much always improve recall at the cost of > precision - that's nothing new. If you stem instead, are you going to want > documents that had run and water when you searched for running water? I just > don't s
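The recall/precision tradeoff quoted above can be illustrated with a small sketch. It is not Solr code: the `STEMS` table, `normalize`, and `matches` are hypothetical names invented for this example, and the table merely stands in for whatever stemmer or lemmatizer is configured.

```python
# Hypothetical stem table standing in for a configured stemmer or lemmatizer.
STEMS = {"running": "run", "runs": "run", "ran": "run"}


def normalize(term, stem):
    """Lowercase the term and, if requested, map it to its stem."""
    term = term.lower()
    return STEMS.get(term, term) if stem else term


def matches(query, doc, stem=False):
    """Return True if every query term occurs in the document (AND semantics)."""
    doc_terms = {normalize(t, stem) for t in doc.split()}
    return all(normalize(t, stem) in doc_terms for t in query.split())
```

Without stemming, the query "running water" does not match the text "run the water". With `stem=True` it does: more documents are found (higher recall), and some of them may not be what the user meant (lower precision).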

Re: LucidWorks Solr

2010-04-21 Thread Mark Miller
On 4/21/10 3:22 PM, Robert Muir wrote: On Wed, Apr 21, 2010 at 2:26 PM, Mark Miller wrote: It's an orthogonal issue - running will have that problem no matter what. It doesn't affect whether a user that types running may be just as interested in a doc that matches all of their other terms bu

Re: LucidWorks Solr

2010-04-21 Thread Robert Muir
On Wed, Apr 21, 2010 at 2:26 PM, Mark Miller wrote: > > It's an orthogonal issue - running will have that problem no matter what. It > doesn't affect whether a user that types running may be just as interested > in a doc that matches all of their other terms but has ran instead of > running. Its al

Re: LucidWorks Solr

2010-04-21 Thread Mark Miller
On 4/21/10 2:20 PM, Robert Muir wrote: On Wed, Apr 21, 2010 at 2:09 PM, Mark Miller wrote: Right - I agree they both have their strengths and weaknesses - but you usually don't get things like running->ran with stemming. Like most things, it's a tradeoff. There is always a hybrid approach as

Re: LucidWorks Solr

2010-04-21 Thread Robert Muir
On Wed, Apr 21, 2010 at 2:09 PM, Mark Miller wrote: > > Right - I agree they both have their strengths and weaknesses - but you > usually don't get things like running->ran with stemming. Like most things, > it's a tradeoff. There is always a hybrid approach as well. > > I think running/ran has mor

Re: LucidWorks Solr

2010-04-21 Thread Mark Miller
On 4/21/10 2:02 PM, Robert Muir wrote: On Wed, Apr 21, 2010 at 1:49 PM, Mark Miller wrote: I believe that's covered by morphology? The problem is typically a morphological analyzer emits multiple solutions, which include POS. So morphology can tell you that "building" has two s

Re: LucidWorks Solr

2010-04-21 Thread Robert Muir
On Wed, Apr 21, 2010 at 1:49 PM, Mark Miller wrote: > > I believe that's covered by morphology? > > The problem is typically a morphological analyzer emits multiple solutions, which include POS. So morphology can tell you that "building" has two solutions: the gerund form which you might stem t
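A minimal sketch of the point above, that a morphological analyzer can emit several solutions for one surface form, each carrying a part of speech. The `Analysis` type, the `ANALYSES` table, and the specific lemmas and tags for "building" are assumptions made for illustration; they are not taken from the thread or from any particular analyzer.

```python
from typing import List, NamedTuple


class Analysis(NamedTuple):
    lemma: str
    pos: str  # part-of-speech tag


# Hypothetical analyzer output; the entries are illustrative assumptions.
ANALYSES = {
    "building": [Analysis("build", "VERB"), Analysis("building", "NOUN")],
}


def analyze(word: str) -> List[Analysis]:
    """Return every candidate analysis; unknown words map to themselves."""
    key = word.lower()
    return ANALYSES.get(key, [Analysis(key, "UNKNOWN")])


def lemma_for(word: str, pos: str) -> str:
    """Pick the lemma whose POS matches; fall back to the lowercased word."""
    for analysis in analyze(word):
        if analysis.pos == pos:
            return analysis.lemma
    return word.lower()
```

The lookup alone cannot decide which analysis applies. Something else, such as a POS tagger run over the surrounding context, has to supply the `pos` argument.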

Re: LucidWorks Solr

2010-04-21 Thread Mark Miller
On 4/21/10 1:43 PM, Walter Underwood wrote: On Apr 21, 2010, at 10:30 AM, Mark Miller wrote: But they don't usually call 'non algorithmic' stemming 'stemming'. Stemming usually means using a simple heuristic process. When you use vocabulary and morphology, it's usually called lemmatization

Re: LucidWorks Solr

2010-04-21 Thread Mark Miller
On 4/21/10 1:43 PM, Robert Muir wrote: On Wed, Apr 21, 2010 at 1:30 PM, Mark Miller wrote: But they don't usually call 'non algorithmic' stemming 'stemming'. Stemming usually means using a simple heuristic process. When you use vocabulary and morphology, it's usually called lemmatization

Re: LucidWorks Solr

2010-04-21 Thread Robert Muir
On Wed, Apr 21, 2010 at 1:30 PM, Mark Miller wrote: > > But they don't usually call 'non algorithmic' stemming 'stemming'. > Stemming usually means using a simple heuristic process. When you use > vocabulary and morphology, it's usually called lemmatization rather than > stemming. > > Lemmatizati

Re: LucidWorks Solr

2010-04-21 Thread Walter Underwood
On Apr 21, 2010, at 10:30 AM, Mark Miller wrote: > But they don't usually call 'non algorithmic' stemming 'stemming'. Stemming > usually means using a simple heuristic process. When you use vocabulary and > morphology, it's usually called lemmatization rather than stemming. > "stemmer" is jargo

Re: LucidWorks Solr

2010-04-21 Thread Shashi Kant
Why do these approaches have to be mutually exclusive? Do a dictionary lookup, if no satisfactory match found use an algorithmic stemmer. Would probably save a few CPU cycles by algorithmic stemming iff necessary. On Wed, Apr 21, 2010 at 1:31 PM, Robert Muir wrote: > sy to look at the "faults" o
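The hybrid suggested above (dictionary lookup first, algorithmic stemming only on a miss) might be sketched as follows. `LEMMA_DICT`, `SUFFIXES`, and both functions are hypothetical: the dictionary is a three-entry stand-in for a real morphological lexicon, and the suffix stripper is a toy, not Porter or KStem.

```python
# Hypothetical lookup table; a real system would load a full morphological dictionary.
LEMMA_DICT = {
    "ran": "run",
    "running": "run",
    "walked": "walk",
}

SUFFIXES = ("ing", "ed", "s")  # toy suffix list, not a real stemming algorithm


def algorithmic_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def hybrid_stem(word: str) -> str:
    """Try the dictionary first; fall back to the algorithmic stemmer on a miss."""
    key = word.lower()
    if key in LEMMA_DICT:
        return LEMMA_DICT[key]
    return algorithmic_stem(key)
```

Here `hybrid_stem("ran")` is resolved by the dictionary, while a word missing from it, such as "jumping", falls through to the suffix rules. What counts as a "satisfactory match" is left as a plain membership test in this sketch.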

Re: LucidWorks Solr

2010-04-21 Thread Robert Muir
On Wed, Apr 21, 2010 at 1:18 PM, Chris Hostetter wrote: > > Strictly speaking: you haven't "ditched" stemmers altogether -- you've > ditched *algorithmic* stemmers and moved to a *dictionary* based stemmer > -- but it's still a stemmer. > > (i just don't want people reading this thread to be confus

Re: LucidWorks Solr

2010-04-21 Thread Mark Miller
On 4/21/10 1:18 PM, Chris Hostetter wrote: : Regarding stemmers, I ditched them altogether a long time ago in favor : of a dictionary of morphologies of all known words (for any given : language). A simple lookup of any word morphology thus produces the set, : including the correct stem. Strictl

Re: LucidWorks Solr

2010-04-21 Thread Chris Hostetter
: Regarding stemmers, I ditched them altogether a long time ago in favor : of a dictionary of morphologies of all known words (for any given : language). A simple lookup of any word morphology thus produces the set, : including the correct stem. Strictly speaking: you haven't "ditched" stemmers a

Re: LucidWorks Solr

2010-04-19 Thread Andy
> Andy, > > This will help with smooth injection of your multilingual > documents into Solr (multilingual either in the sense of 1 > doc containing fields in multiple languages or 1 index > containing documents in different languages): > >   http://sematext.com/products/multilingual-indexer/inde

Re: LucidWorks Solr

2010-04-19 Thread Otis Gospodnetic
http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message > From: Andy > To: solr-user@lucene.apache.org > Sent: Mon, April 19, 2010 8:45:40 AM > Subject: Re: LucidWorks Solr > > Thanks for the explanation Mitc

Re: LucidWorks Solr

2010-04-19 Thread Erick Erickson
no big deal, just wanted to mention. On Mon, Apr 19, 2010 at 1:24 PM, wrote: > > This is a little bit of hijacking going on here, but > You are right. Accept my regrets. > > > > It's algorithmic. That is, there isn't a list of variants that > > stem to the same infinitive, and your state

Re: LucidWorks Solr

2010-04-19 Thread darren
> This is a little bit of hijacking going on here, but You are right. Accept my regrets. > It's algorithmic. That is, there isn't a list of variants that > stem to the same infinitive, and your statement > "always the same infinitive for any derivative of the word" > isn't quite what happens. >

Re: LucidWorks Solr

2010-04-19 Thread darren
My use requires more correct processing of language than what you define as a stemmer. In my experience, stemmers make up a new word even for some words that have no stem. I consider those false positives. My approach is based on the need to recognize that walk, walked, walking

Re: LucidWorks Solr

2010-04-19 Thread MitchK
Yes, you are right, thank you Erick. I lost sight of this point and thought only of common cases, not of special ones. However, one can combine the mentioned solutions and different stem-filters in different fields, so that one can be quite (not absolutely) sure that in most cases the applicat

Re: LucidWorks Solr

2010-04-19 Thread Erick Erickson
This is a little bit of hijacking going on here, but It's algorithmic. That is, there isn't a list of variants that stem to the same infinitive, and your statement "always the same infinitive for any derivative of the word" isn't quite what happens. Stemmers will always produce the same infiniti
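To make "it's algorithmic" concrete, here is a toy rule-based stemmer. It is an assumption-laden sketch: the function name and suffix list are invented for illustration and are far cruder than the stemmers shipped with Lucene/Solr.

```python
def toy_stem(word: str) -> str:
    """Strip one common English suffix; purely rule-based, no word list."""
    word = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word
```

The same input always gives the same output, and "walk", "walked", and "walking" all reduce to "walk". No list of variants is consulted, though, so "ran" is left untouched and "running" becomes "runn", which is not a real word and does not match "run".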

Re: LucidWorks Solr

2010-04-19 Thread MitchK
I am curious: the idea behind a stemmer is not that it produces the correct infinitive for a given word. The idea is that it always produces the same infinitive for any derivative of the word. What happens if there is an unknown word, for example slang? How does your solution work

Re: LucidWorks Solr

2010-04-19 Thread darren
> --- On Mon, 4/19/10, Darren Govoni wrote: > >> From: Darren Govoni >> Subject: Re: LucidWorks Solr >> To: solr-user@lucene.apache.org >> Date: Monday, April 19, 2010, 7:39 AM >> Regarding stemmers, I ditched them >> altogether a long time ago in favor

Re: LucidWorks Solr

2010-04-19 Thread Andy
Thanks for the tip. Is there a publicly available dictionary of morphologies that I could use? Or did you build your own? --- On Mon, 4/19/10, Darren Govoni wrote: > From: Darren Govoni > Subject: Re: LucidWorks Solr > To: solr-user@lucene.apache.org > Date: Monday, April

Re: LucidWorks Solr

2010-04-19 Thread Andy
ded way to deal with documents in multiple languages? --- On Mon, 4/19/10, MitchK wrote: > From: MitchK > Subject: Re: LucidWorks Solr > To: solr-user@lucene.apache.org > Date: Monday, April 19, 2010, 4:36 AM > > Andy, I think it is important to know what a stemmer reall

Re: LucidWorks Solr

2010-04-19 Thread Darren Govoni
Regarding stemmers, I ditched them altogether a long time ago in favor of a dictionary of morphologies of all known words (for any given language). A simple lookup of any word morphology thus produces the set, including the correct stem. Works great. 100% of the time. Just a tip from me. On Mon

Re: LucidWorks Solr

2010-04-19 Thread MitchK
Andy, I think it is important to know what a stemmer really is. It reduces words to their infinitives. These are not always the real infinitive, but for the system they act as one, since all derivatives of a word can be reduced to the same form. That's a stemmer. Acc

Re: LucidWorks Solr

2010-04-18 Thread Andy
--- On Sun, 4/18/10, Grant Ingersoll wrote: > > Sure, but I'm biased. ;-) Hopefully, you will find it > useful, but choose the one that best fits your needs (and > let me know if you need help assessing that.) > Thanks for the explanation Grant. What is the advantage of KStem over the sta

Re: LucidWorks Solr

2010-04-18 Thread Grant Ingersoll
On Apr 18, 2010, at 3:53 AM, Andy wrote: > Just wanted to know if anyone has used LucidWorks Solr. > > - How do you compare it to the standard Apache Solr? We take a release of Solr. We wrap it w/ an installer, tomcat/jetty, our reference guide, Luke, etc. We also add in an optimized versio

Re: LucidWorks Solr

2010-04-18 Thread Paolo Castagna
Thanks for asking, I am interested as well in reading the response to your questions. Paolo Andy wrote: Just wanted to know if anyone has used LucidWorks Solr. - How do you compare it to the standard Apache Solr? - the non-blocking IO of LucidWorks Solr -- is that for networking IO or disk

Re: LucidWorks Solr

2010-03-16 Thread Kevin Osborn
s. -Kevin From: blargy To: solr-user@lucene.apache.org Sent: Tue, March 16, 2010 12:31:09 PM Subject: Re: LucidWorks Solr Kevin, When you say you just included the war you mean the /packs/solr.war correct? I see that the KStemmer is nicely packed in there but I don'

Re: LucidWorks Solr

2010-03-16 Thread blargy
Kevin, When you say you just included the war you mean the /packs/solr.war correct? I see that the KStemmer is nicely packed in there but I don't see LucidGaze anywhere. Have you had any experience using this? So I'm guessing you would suggest using the LucidWorks solr.war over the apache-solr-

Re: LucidWorks Solr

2010-03-16 Thread AJ Chen
I'm trying it out right now. I hope it will work well out-of-box for indexing/searching a set of documents with frequent update. -aj On Tue, Mar 16, 2010 at 11:52 AM, blargy wrote: > > Has anyone used this?: > http://www.lucidimagination.com/Downloads/LucidWorks-for-Solr > > Other than the KStem

Re: LucidWorks Solr

2010-03-16 Thread Kevin Osborn
I used it mostly for KStemmer, but I also liked the fact that it included about a dozen or so stable patches since Solr 1.4 was released. We just use the included WAR in our project however. We don't use the installer or anything like that. From: blargy To: