Re: AW: Leading wildcards

Maarten . De . Vilder Mon, 23 Apr 2007 05:05:36 -0700

hey,

i'm sorry for the confusion: our "custom query parser" is not a Lucene
query parser.

it is something we built for the client side of Solr.

it basically transforms some search arguments into a Solr query URL

example: the method query(searchID, searchQuery, category, ....) returns
http://solrhost/solr/select/?q=id%3AsearchString+OR+query%3AsearchString&version=2.2&start=0&rows=10&indent=on
(that is what i mean by "query parsing")
this method performs a series of operations on the keywords and returns
a working Solr query
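
To make the idea concrete, here is a minimal Java sketch of such a URL-building method. It is not the framework's actual code: the class name SolrUrlSketch, the helper encode and the reduced parameter list are made up for illustration, and only the URL layout is taken from the example above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical sketch, not the real framework class.
class SolrUrlSketch {

    // Builds a select URL that searches the "id" and "query" fields
    // for the same search string, as in the example URL above.
    static String query(String baseUrl, String searchString, int start, int rows) {
        String q = "id:" + searchString + " OR query:" + searchString;
        return baseUrl + "/select/?q=" + encode(q)
                + "&version=2.2&start=" + start
                + "&rows=" + rows + "&indent=on";
    }

    // URL-encodes a parameter value (':' becomes %3A, ' ' becomes '+').
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Prints the example URL shown above.
        System.out.println(query("http://solrhost/solr", "searchString", 0, 10));
    }
}
```

Encoding the whole q value in one place keeps the field names and operators readable in the code while still producing a valid URL.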

we are using the Java solr client and we built a framework around it to 
simplify our actions.

example for the wildcards:
we check (with a regular expression) whether a keyword both starts and
ends with a *,
and if such a keyword is found, we add a second * at the end, so *foo*
is sent as *foo**.
this makes sure we send a working query to the Solr server
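
A minimal Java sketch of that check could look like the following. The class and method names are hypothetical, and the exact regular expression is an assumption (the real one is not shown in this mail): it treats a keyword as "leading and trailing wildcard" only when it has exactly one * at each end.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of the "add a second *" workaround.
class WildcardFixSketch {

    // One '*', then at least one character that is neither '*' nor
    // whitespace, then one '*'. Example match: *foo*
    private static final Pattern BOTH_ENDS = Pattern.compile("\\*[^*\\s]+\\*");

    // Appends a second '*' to keywords such as *foo*; other keywords
    // (foo, foo*, *foo, *foo**) are returned unchanged.
    static String fixKeyword(String keyword) {
        return BOTH_ENDS.matcher(keyword).matches() ? keyword + "*" : keyword;
    }

    // Applies fixKeyword to every whitespace-separated keyword.
    static String fixQuery(String searchQuery) {
        StringBuilder out = new StringBuilder();
        for (String keyword : searchQuery.trim().split("\\s+")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(fixKeyword(keyword));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(fixQuery("*lega* style"));
    }
}
```

Because the pattern rejects keywords that already end in **, running the fix twice does not keep adding asterisks.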

we also escape special characters and other wildcards this way

we also built in highlighting for wildcard queries:
if we see that the user is using wildcards, we don't use the standard
Solr highlighting (which doesn't work with wildcards).
instead we use a regular expression to highlight the results after we get
them back from the server.
example:
*foo* in the Solr query becomes .*foo.* as a regular expression (.* matches
any run of characters in a regex)
then we check whether our result matches this regular expression and put
<b> tags around the matching words,
and before we knew it, our wildcard searches were highlighted
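
The highlighting step can be sketched in Java roughly as follows. This is an illustration under stated assumptions, not the framework's code: the class and method names are hypothetical, and the sketch translates * to \S* (and ? to \S) instead of .* so that each match stays inside a single word, which is a deliberate swap from the .*foo.* form described above. It does not HTML-escape the result text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of client-side highlighting for wildcard queries.
class WildcardHighlightSketch {

    // Turns a wildcard keyword into a regex: '*' -> \S*, '?' -> \S,
    // everything else is quoted so it is matched literally.
    static Pattern toPattern(String wildcard) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*' || c == '?') {
                if (literal.length() > 0) {
                    regex.append(Pattern.quote(literal.toString()));
                    literal.setLength(0);
                }
                regex.append(c == '*' ? "\\S*" : "\\S");
            } else {
                literal.append(c);
            }
        }
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
        }
        return Pattern.compile(regex.toString());
    }

    // Wraps every non-empty match of the wildcard keyword in <b> tags.
    static String highlight(String text, String wildcard) {
        Matcher m = toPattern(wildcard).matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String hit = m.group();
            // Empty matches (possible for a bare "*") are left untouched.
            String replacement = hit.isEmpty() ? "" : "<b>" + hit + "</b>";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(highlight("a touch of elegance here", "*lega*"));
    }
}
```

With a literal .* the greedy match would span the whole result text, so restricting the wildcard to non-whitespace characters is what keeps the <b> tags around individual words.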

whether this is a good way of handling these things is open for discussion.
if we have more time we might actually change the Solr server code to fix
these things.
for now it's just a foolproof workaround.

grts,m





"Michael Kimsal" <[EMAIL PROTECTED]> 
20/04/2007 16:30
Reply-To: solr-user@lucene.apache.org
To: solr-user@lucene.apache.org
Subject: Re: AW: Leading wildcards

Maarten:

Would you mind sharing your custom query parser?


On 4/20/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>
> thanks, this worked like a charm !!
>
> we built a custom "QueryParser" and we integrated the *foo** in it, so
> basically we can now search leading, trailing and both ...
>
> only crappy thing is the max Boolean clauses, but i'm going to look into
> that after the weekend
>
> for the next release of Solr :
> do not make this default, too many risks
> but do make an option in the config to enable it, it's a very nice
> feature
>
>
> thanks everybody for the help and have a nice weekend,
> maarten
>
>
>
>
>
> "Burkamp, Christian" <[EMAIL PROTECTED]>
> 19/04/2007 12:37
> Reply-To: solr-user@lucene.apache.org
> To: <solr-user@lucene.apache.org>
> Subject: AW: Leading wildcards
>
> Hi there,
>
> Solr does not support leading wildcards, because it uses Lucene's
> standard QueryParser class without changing the defaults. You can easily
> change this by inserting the line
>
> parser.setAllowLeadingWildcard(true);
>
> in QueryParsing.java line 92. (This is after creating a QueryParser
> instance in QueryParsing.parseQuery(...))
>
> This obviously means that you have to change Solr's source code. It
> would be nice to have an option in the schema to switch leading
> wildcards on or off per field. Leading wildcards really make no sense on
> richly populated fields because queries tend to result in "too many
> clauses" exceptions most of the time.
>
> This works for leading wildcards. Unfortunately it does not enable
> searches with leading AND trailing wildcards. (E.g. searching for
> "*lega*" does not find results even if the term "elegance" is in the
> index. If you put a second asterisk at the end, the term "elegance" is
> found: search for "*lega**" to get hits.)
> Can anybody explain this, even though it seems to be more of a Lucene
> QueryParser issue?
>
> -- Christian
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> Sent: Thursday, 19 April 2007 08:35
> To: solr-user@lucene.apache.org
> Subject: Leading wildcards
>
>
> hi,
>
> we have been trying to get the leading wildcards to work.
>
> we have been looking around the Solr website, the Lucene website, wikis
> and the mailing lists etc ...
> but we found a lot of contradictory information.
>
> so we have a few questions:
> - is the latest version of lucene capable of handling leading wildcards ?
> - is the latest version of solr capable of handling leading wildcards ?
> - do we need to make adjustments to the solr source code ?
> - if we need to adjust the solr source, what do we need to change ?
>
> thanks in advance !
> Maarten
>
>
>


-- 
Michael Kimsal
http://webdevradio.com
