Re: Spellchecker Question

2008-04-22 Thread Otis Gospodnetic
I'm not looking at the sources, but from memory, no, I don't think you should be supplying "q". cmd=rebuild sucks data from a search index field and rebuilds the spellchecker index, so a query string should not be needed for such a request. Otis -- Sematext -- http://sematext.com/ -- Lucene -
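
A minimal sketch of the two request shapes discussed in this thread: a rebuild call that carries only cmd=rebuild, and an ordinary lookup that carries "q". The host, port, and handler path below are assumptions for illustration, as is the reading of "q" as the word to check; adjust them to your own deployment.

```python
from urllib.parse import urlencode

# Hypothetical handler URL: host, port, and path are assumptions for illustration.
SPELLCHECKER_URL = "http://localhost:8983/solr/spellchecker"


def rebuild_url(base=SPELLCHECKER_URL):
    """Build a rebuild request that carries only cmd=rebuild (no q)."""
    return base + "?" + urlencode({"cmd": "rebuild"})


def lookup_url(word, base=SPELLCHECKER_URL):
    """Build an ordinary lookup request, assuming q holds the word to check."""
    return base + "?" + urlencode({"q": word})


if __name__ == "__main__":
    print(rebuild_url())
    print(lookup_url("solr"))
```

Keeping the two builders separate makes it harder to send a stray "q" along with a rebuild by accident.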

Spellchecker Question

2008-04-22 Thread Matt Mitchell
I'm using the Spellchecker handler but am a little confused. The docs say to run the cmd=rebuild when building the first time. Do I need to supply a "q" param with that cmd=rebuild? The examples show a url with the "q" param set while rebuilding, but the main section on the "cmd" param doesn't say

Re: better stemming engine than Porter?

2008-04-22 Thread Otis Gospodnetic
I actually doubt Porter's is slow. From what I recall, it's a bunch of simple if/elses. KStem can't get added to Lucene core due to its license (search Lucene JIRA for an issue that covered this several years ago). Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Origi
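
To illustrate the "simple if/elses" point, here is a sketch of step 1a of the Porter algorithm only (plural suffix handling). It is not the Lucene implementation and omits every later step; the function name is made up for this example.

```python
def porter_step_1a(word):
    """Apply only step 1a of the Porter stemmer to a lowercase word."""
    if word.endswith("sses"):
        return word[:-2]  # caresses -> caress
    elif word.endswith("ies"):
        return word[:-2]  # ponies -> poni
    elif word.endswith("ss"):
        return word       # caress -> caress
    elif word.endswith("s"):
        return word[:-1]  # cats -> cat
    return word


if __name__ == "__main__":
    for w in ["caresses", "ponies", "caress", "cats"]:
        print(w, "->", porter_step_1a(w))
```

Each rule is a constant-time suffix test, which is consistent with the expectation that the stemmer itself is not a bottleneck.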

Re: Highlighted field gets truncated

2008-04-22 Thread Christian Wittern
Mike Klaas wrote: On 19-Apr-08, at 3:02 AM, Christian Wittern wrote: So it could be that the match is not part of the fragment? This sounds a bit strange. Is there a way to make sure the fragment contains the match other than returning the whole field and doing the fragmenting myself? [...]

RE: logging through log4j

2008-04-22 Thread Will Johnson
Henri, There are some bridges out there but none had a version number > 0.1. I found the simplest way was to configure JUL using a custom config file and then tell it to use my custom handler to forward all messages to log4j. There are obvious performance implications but it is doable and fairly

logging through log4j

2008-04-22 Thread Henrib
Hi, I'm (still) seeking more advice on this deployment issue, which is to use org.apache.log4j instead of java.util.logging. I'm not seeking to restart any discussion on the respective benefits of slf4j/commons/log4j/jul; I'm seeking a way to bridge jul to log4j with the minimum specific per-container c

Re: Highlighted field gets truncated

2008-04-22 Thread Mike Klaas
On 19-Apr-08, at 3:02 AM, Christian Wittern wrote: Mike Klaas wrote: Fragments are generated independently from matching (I realize this isn't an ideal algorithm). So it could be that the match is not part of the fragment? This sounds a bit strange. Is there a way to make sure the fragm

RE: better stemming engine than Porter?

2008-04-22 Thread Wagner,Harry
Hi Jay, I did not do a timing comparison either, but any change in performance after switching to Kstem was not noticeable. Cheers... h -Original Message- From: Jay [mailto:[EMAIL PROTECTED] Sent: Tuesday, April 22, 2008 12:26 PM To: solr-user@lucene.apache.org Subject: Re: better stemm

Re: More Like This boost

2008-04-22 Thread Francisco Sanmartin
Yep, it would be nice for MLT to have this feature, which is why I am trying to do it in the queries before sending them to Solr. These are the steps I'm following: 1. execute a mlt.like() with the text document_example.getTitle() against the field "Title" of all the other documents. This

Re: better stemming engine than Porter?

2008-04-22 Thread Jay
Hi Wagner, Thanks for the intro to KStem! I quickly scanned the original paper on KStem by Robert Krovetz but could not find any timing comparison data for KStem and the Porter stemmer. I wonder how slow/fast KStem is compared to the Porter stemmer, based on your use in your application? Jay Wagner,Harry wr

Enhancing the query language

2008-04-22 Thread Kamran Shadkhast
For the kind of searching we do over "news" content, we need a more sophisticated query language. Currently the Solr query language is not enough for our needs. I understand it is possible to add our own customized query parser to the system, but I was wondering if anybody has done that and i

Re: More Like This boost

2008-04-22 Thread Walter Underwood
It should help to weight the terms with their frequency in the original document. That will distinguish between two documents with the same terms, but different focus. wunder On 4/22/08 7:46 AM, "Erik Hatcher" <[EMAIL PROTECTED]> wrote: > No, the MLT feature does not have that kind of field-spec
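
A rough sketch of the idea above, assuming a naive regex tokenizer and no stop-word removal: count how often each term occurs in the source document and use that count as a Lucene-style `^` boost. The helper names are hypothetical, and this is not how the MoreLikeThis component itself builds its query.

```python
import re
from collections import Counter


def weighted_terms(text, max_terms=5):
    """Return the most frequent terms in text with their counts."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(tokens).most_common(max_terms)


def boosted_query(field, text, max_terms=5):
    """Build a query string in which each term is boosted by its frequency."""
    return " ".join(
        f"{field}:{term}^{count}" for term, count in weighted_terms(text, max_terms)
    )


if __name__ == "__main__":
    print(boosted_query("Title", "solr search solr index solr search"))
```

Two documents that share the same vocabulary but emphasise different terms would then produce differently weighted queries.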

Re: More Like This boost

2008-04-22 Thread Erik Hatcher
No, the MLT feature does not have that kind of field-specific boosting capability. It sounds like it could be a useful enhancement though. Of course you do get boosts for "interesting terms" already, but maybe having an additional field-specific boost would be a nice touch too.

Re: More Like This boost

2008-04-22 Thread Francisco Sanmartin
I know that a single query of that type does not change anything. But when there are two or more with different boosts, I hope it does. Here is the situation: My docs have "Title" and "Description". What I want to do is give more relevancy to the morelikethis on the title than on the description.

Re: More Like This boost

2008-04-22 Thread Erik Hatcher
On Apr 21, 2008, at 5:02 PM, Francisco Sanmartin wrote: Is it possible to boost the query that MoreLikeThis returns before sending it to Solr? I mean, technically it is possible, because you can add a factor to the whole query but...does it make sense? (Remember that MoreLikeThis can already b

RE: better stemming engine than Porter?

2008-04-22 Thread Wagner,Harry
Thanks Ryan. I just opened SOLR-546. Please let me know if I can provide further help. Cheers! h -Original Message- From: Ryan McKinley [mailto:[EMAIL PROTECTED] Sent: Monday, April 21, 2008 2:33 PM To: solr-user@lucene.apache.org Subject: Re: better stemming engine than Porter? Hey- to

RE: better stemming engine than Porter?

2008-04-22 Thread Wagner,Harry
Mathieu, It's not my Kstem. It was written by someone at Umass, Amherst. More info here: http://ciir.cs.umass.edu/cgi-bin/downloads/downloads.cgi Someone else had already ported it to Lucene. I simply modified that wrapper to work with Solr. I'll open an issue for it so that it can (hopefully)

Re: XSLT transform before update?

2008-04-22 Thread Noble Paul നോബിള്‍ नोब्ळ्
Hi, There is this new patch which implements these features. I shall update the wiki with the documentation. I guess we do not need to be too worried about the memory consumption. A few MB of memory should be fine (unless you are using a file which is in the tens of MB). Consider using XPathEntityP

Re: CorruptIndexException

2008-04-22 Thread Michael McCandless
Robert Haschart <[EMAIL PROTECTED]> wrote: > To answer your questions: I completely deleted the index each time > before retesting, and the java command as shown by "ps" does show -Xbatch. > The program is running on: > > uname -a > Linux lab8.betech.virginia.edu 2.6.18-53.1.14.el5 #1 SM

Re: better stemming engine than Porter?

2008-04-22 Thread Mathieu Lecarme
The Porter stemmer is not only aggressive, it is ugly, too. The generated code is too old, not object-oriented enough, and is probably too slow. If your KStem compiles with Java 1.4, why don't you suggest it for Lucene core? M. Wagner,Harry wrote: Hi HH, Here's a note I sent Solr-dev a while back: ---