searching multiple fields

2007-07-30 Thread Daniel Naber
Hi,

I want to search multiple fields by default (which is not supported by 
StandardRequestHandler), but I also want to be able to use Lucene's 
boolean syntax (AND/OR/NOT). This doesn't seem to be supported by 
DisMaxRequestHandler. Will I need to copy or extend StandardRequestHandler 
and modify it, including the query parser it calls, or am I missing an 
easier alternative?

Regards
 Daniel

-- 
http://www.danielnaber.de


Re: searching multiple fields

2007-08-01 Thread Daniel Naber
On Wednesday 01 August 2007 09:47, Chris Hostetter wrote:

> for the record, using the Lucene boolean options "+" and "-" do work in
> the "q" expression for the dismax handler ... for that matter, the
> boolean keywords AND, OR, and NOT work as well

The only case that doesn't seem to work (and that's the one I'm interested 
in) is to have AND by default. With DisMaxReqHandler you can have AND by 
default for all terms, but as you don't have the OR operator you have 
*only* AND...

Regards
 Daniel

-- 
http://www.danielnaber.de


Re: searching multiple fields

2007-08-02 Thread Daniel Naber
On Thursday 02 August 2007 18:46, Walter Underwood wrote:

> Use the minimum match spec for a flexible version of all-terms
> matching.

I think this is too difficult and unpredictable. I also don't know how I 
should justify a setting like "75%" just because it may work fine for 
some examples.
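
To make the concern concrete, here is a minimal sketch of what a "75%" 
setting would demand for queries of different lengths. It assumes the 
percentage is applied to the number of optional clauses and rounded down, 
as the dismax minimum-match documentation describes; the class and method 
names are made up for illustration and are not Solr code.

```java
// Hypothetical helper, not part of Solr: estimates how many optional
// clauses a percentage-based minimum-match setting would require.
class MinMatchSketch {
    // Assumption: the percentage is applied to the clause count and the
    // result is rounded down.
    static int requiredClauses(int optionalClauses, int percent) {
        return (optionalClauses * percent) / 100; // integer division rounds down
    }
}
```

Under that assumption, requiredClauses(2, 75) is 1, requiredClauses(3, 75) 
is 2, and requiredClauses(4, 75) and requiredClauses(5, 75) are both 3, so 
the number of words that may be missing changes with the query length.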

> One wrong or misspelled word means no matches, and searchers don't
> know how to fix their query. If they couldn't spell it the first time,
> why should they be able to spell it a second time?

That's what the spell checker is for.

Regards
 Daniel

-- 
http://www.danielnaber.de


Re: searching multiple fields

2007-08-02 Thread Daniel Naber
On Thursday 02 August 2007 20:18, Walter Underwood wrote:

> I agree about the fussiness and mystery of good values for minimum
> match, but the requestor wanted 100% all the time. That is easy.

But I want it only by default, with an easy way to go back to OR for parts 
of the query, e.g. doing a search like: linux (speed OR performance)
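
The behaviour asked for here can be sketched as a toy matcher: adjacent 
clauses are ANDed unless an explicit OR appears, and parentheses group 
sub-expressions. This is only an illustration of the intended semantics 
against a set of document terms. It is not Lucene or Solr code, the class 
name is invented, and it assumes a well-formed, non-empty query.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Toy matcher (hypothetical, not Lucene/Solr): AND is the default operator,
// OR must be written explicitly, parentheses group clauses.
class DefaultAndSketch {
    private final List<String> tokens = new ArrayList<>();
    private final Set<String> docTerms;
    private int pos = 0;

    private DefaultAndSketch(String query, Set<String> docTerms) {
        this.docTerms = docTerms;
        // Pad parentheses with spaces so they become separate tokens.
        String spaced = query.replace("(", " ( ").replace(")", " ) ").trim();
        for (String t : spaced.split("\\s+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
    }

    // Returns true if the document (given as lowercase terms) satisfies the query.
    static boolean matches(String query, Set<String> docTerms) {
        return new DefaultAndSketch(query, docTerms).parseOr();
    }

    // or := and ("OR" and)*
    private boolean parseOr() {
        boolean result = parseAnd();
        while (pos < tokens.size() && tokens.get(pos).equals("OR")) {
            pos++;
            boolean right = parseAnd(); // always consume the right-hand side
            result = result || right;
        }
        return result;
    }

    // and := clause (["AND"] clause)*  (a missing operator means AND)
    private boolean parseAnd() {
        boolean result = parseClause();
        while (pos < tokens.size()
                && !tokens.get(pos).equals("OR")
                && !tokens.get(pos).equals(")")) {
            if (tokens.get(pos).equals("AND")) pos++;
            boolean right = parseClause();
            result = result && right;
        }
        return result;
    }

    // clause := "NOT" clause | "(" or ")" | term
    private boolean parseClause() {
        String t = tokens.get(pos++);
        if (t.equals("NOT")) return !parseClause();
        if (t.equals("(")) {
            boolean inner = parseOr();
            pos++; // skip the closing ")"
            return inner;
        }
        return docTerms.contains(t.toLowerCase());
    }
}
```

With this sketch, "linux (speed OR performance)" matches a document 
containing linux and speed, or linux and performance, but not one that 
contains only linux or only speed and performance.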

Regards
 Daniel

-- 
http://www.danielnaber.de


Re: Different search results for (german) singular/plural searches - looking for a solution

2007-10-10 Thread Daniel Naber
On Wednesday 10 October 2007 12:00, Martin Grotzke wrote:

> Basically I see two options: stemming and the usage of synonyms. Are
> there others?

A large list of German words and their forms is available from a Windows 
program called Morphy 
(http://www.wolfganglezius.de/doku.php?id=public:cl:morphy). You can use 
it to map full forms to base forms (Häuser -> Haus). You can also have 
a look at www.languagetool.org, which uses this data in Java software. 
LanguageTool also comes with jWordSplitter, which can find a compound's 
parts (Autowäsche -> Auto + Wäsche).
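
As a rough sketch of the two steps (base-form lookup and compound 
splitting), assuming you have exported a full-form list and a word list 
into plain Java collections: the class below is invented for illustration 
and is neither Morphy's data format nor the jWordSplitter API. The 
splitter only tries a single split into two known words.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Illustration only (hypothetical class): dictionary-based normalization
// that could be applied to both indexed text and queries.
class GermanNormalizerSketch {
    private static final int MIN_PART = 3; // assumed minimum length of a compound part

    private final Map<String, String> baseForms; // full form -> base form
    private final Set<String> knownWords;        // lowercase words used for splitting

    GermanNormalizerSketch(Map<String, String> baseForms, Set<String> knownWords) {
        this.baseForms = baseForms;
        this.knownWords = knownWords;
    }

    // Returns the listed base form, or the word itself if it is not listed.
    String baseForm(String word) {
        return baseForms.getOrDefault(word, word);
    }

    // Tries every split point from left to right and returns the first split
    // into two known words; returns the lowercased word alone if none is found.
    List<String> splitCompound(String word) {
        String lower = word.toLowerCase(Locale.GERMAN);
        for (int i = MIN_PART; i <= lower.length() - MIN_PART; i++) {
            String head = lower.substring(0, i);
            String tail = lower.substring(i);
            if (knownWords.contains(head) && knownWords.contains(tail)) {
                return Arrays.asList(head, tail);
            }
        }
        return Collections.singletonList(lower);
    }
}
```

Given a map containing Häuser -> Haus and a word list containing auto and 
wäsche, baseForm("Häuser") returns Haus and splitCompound("Autowäsche") 
returns the parts auto and wäsche.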

Regards
 Daniel

-- 
http://www.danielnaber.de


Re: Search results problem

2007-10-16 Thread Daniel Naber
On Tuesday 16 October 2007 12:03, Maximilian Hütter wrote:

> the content of one document is completely contained in another,
> but when I search for a special word I only get one document as result.
> I am absolutely sure it is contained in the other document, but I will
> only get the "parent" doc if I add a word.

You should try debugging the problem with Luke, e.g. use "reconstruct & 
edit" to see if the term is really indexed in both documents.

Regards
 Daniel

-- 
http://www.danielnaber.de


Re: Problems with Basic Install (newbie question)

2007-11-16 Thread Daniel Naber
On Thursday, 15 November 2007, Paul21 wrote:

> I never did install Tomcat. Maybe that's the problem?

Are you sure you have installed the JDK, not just the JRE?

Regards
 Daniel

-- 
http://www.danielnaber.de


Re: Is there a way to retrieve the "analyzed tokens" (e.g. the stemmed values) of a field from the SOLR index ?

2007-12-10 Thread Daniel Naber
On Sunday, 9 December 2007, s d wrote:

> Is there a way to retrieve the "analyzed tokens" (e.g. the stemmed
> values) of a field from the SOLR index ?

You could have a look at how Luke implements its "Reconstruct & Edit" 
feature. Or you could just re-analyze your text, using an analyzer 
directly. But both approaches work on the Lucene level, not in Solr.

Regards
 Daniel

-- 
http://www.danielnaber.de


Re: wildcards and German umlauts

2008-01-15 Thread Daniel Naber
On Tuesday, 15 January 2008, Alexey Shakov wrote:

> Index-searching works, if i type complete word (such as "übersicht").
> But there are no hits, if i use wildcards (such as "über*")
> Searching with wildcards and without umlauts works as well.

Maybe this describes your problem on the Lucene level?
http://wiki.apache.org/lucene-java/LuceneFAQ#head-133cf44dd3dff3680c96c1316a663e881eeac35a

If that doesn't help, try Luke to see how your queries are parsed.
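
One possible cause, sketched under assumptions: if the analyzer used at 
index time folds umlauts (for example "ü" to "u") while wildcard terms 
skip analysis, then "über*" is compared against the folded term 
"ubersicht" and finds nothing, whereas the complete word is analyzed and 
matches. Whether this applies depends on the analyzer configured for the 
field. The toy analyzer below is hypothetical and not a Lucene class.

```java
import java.text.Normalizer;
import java.util.Locale;

// Illustration only (hypothetical class): shows how an unanalyzed wildcard
// prefix can miss a term that was folded at index time.
class WildcardFoldingSketch {
    // Toy stand-in for an analyzer that lowercases and strips diacritics.
    static String analyze(String text) {
        String decomposed = Normalizer.normalize(
                text.toLowerCase(Locale.GERMAN), Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", ""); // drop combining marks
    }

    // Prefix match using the prefix exactly as typed, mimicking a wildcard
    // term that bypasses analysis.
    static boolean rawPrefixMatches(String indexedTerm, String prefix) {
        return indexedTerm.startsWith(prefix);
    }

    // Same match, but the prefix is run through the analyzer first.
    static boolean analyzedPrefixMatches(String indexedTerm, String prefix) {
        return indexedTerm.startsWith(analyze(prefix));
    }
}
```

If analyze("Übersicht") is stored as the indexed term, the raw prefix 
"über" does not match it, while the analyzed prefix does. Normalizing the 
wildcard term on the client side before sending the query is one way to 
work around this, assuming folding is really what happens in your index.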

Regards
 Daniel

-- 
http://www.danielnaber.de


Re: strange results from lucene

2007-04-17 Thread Daniel Naber
On Tuesday 17 April 2007 21:51, Bill Tantzen wrote:

> However, when I search with 'q=ethics' in solr, I get almost 10,000
> matches. With my client, I get 0.

If you don't specify a field, your client will use this code:

Query query = new TermQuery( new Term("", "ethics") );

This is legal, but you will get no hits, as there's no field "". Also see 
the Lucene FAQ at
http://wiki.apache.org/lucene-java/LuceneFAQ#head-3558e5121806fb4fce80fc022d889484a9248b71

Regards
 Daniel

-- 
http://www.danielnaber.de



Re: Solr on JBOSS 4.0.3

2007-05-31 Thread Daniel Naber
On Thursday 31 May 2007 09:58, Thierry Collogne wrote:

> Is there someone who can explain to me what the dependencies are with
> the above jar files? Or perhaps offer another solution?

You need to find the right version of those files (probably newer than the 
ones in JBoss?) and place them in WEB-INF/lib of Solr. Then Solr should 
use them and the rest of the system should not be affected. I'm not sure 
how to find those versions other than trying.

Regards
 Daniel

-- 
http://www.danielnaber.de