Re: Use SOLR like the "MySQL LIKE"

Aleksander M. Stensby Tue, 18 Nov 2008 00:49:01 -0800

Hi there,

You should use LowerCaseTokenizerFactory as you point out yourself. As far as I know, the StandardTokenizer "recognizes email addresses and internet hostnames as one token". In your case, I guess you want an email, say "[EMAIL PROTECTED]", to be split into four tokens: average joe apache org, or something like that, which would indeed allow you to search for "joe" or "average j*" and match. To do so, you could use the WordDelimiterFilterFactory and split on intra-word delimiters (I think the defaults here are non-alphanumeric chars).
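To illustrate the idea, here is a toy sketch in Python. It is not Solr code and not how Solr implements its analysis chain; it only mimics "lowercase, then split on non-alphanumeric chars" and a single-term prefix match. The function names analyze and matches and the address average.joe@example.org are made up for the example, and it only handles ASCII letters and digits.

```python
import re


def analyze(text):
    """Lowercase the text and split it on runs of non-alphanumeric (ASCII) characters."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def matches(term, field_value):
    """Check one query term against the tokens of a field value.

    A trailing '*' makes the term a prefix query; otherwise the term
    must equal one of the tokens exactly (after lowercasing).
    """
    tokens = analyze(field_value)
    if term.endswith("*"):
        prefix = term[:-1].lower()
        return any(tok.startswith(prefix) for tok in tokens)
    return term.lower() in tokens


# Hypothetical address, used only for illustration.
email = "Average.Joe@example.org"
print(analyze(email))
print(matches("joe", email), matches("j*", email), matches("bob", email))
```

With the address split into separate lowercase tokens, both the exact term "joe" and the prefix "j*" find a token to match, which is the behaviour you are after. A multi-word query such as "average j*" would have to be checked term by term; the sketch leaves that out.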

Take a look at http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters for more info on tokenizers and filters.


cheers,
 Aleks

On Tue, 18 Nov 2008 08:35:31 +0100, Carsten L <[EMAIL PROTECTED]> wrote:


Hello.

The data:
I have a dataset containing ~500.000 documents.
In each document there is an email, a name and a user ID.

The problem:
I would like to be able to search in it, but it should be like the "MySQL
LIKE".

So when a user enters the search term: "carsten", then the query looks like:

        "name:(carsten) OR name:(carsten*) OR email:(carsten) OR
email:(carsten*) OR userid:(carsten) OR userid:(carsten*)"

Then it should match:
carsten l
carsten larsen
Carsten Larsen
Carsten
CARSTEN
etc.

And when the user enters the term: "carsten l" the query looks like:
        "name:(carsten l) OR name:(carsten l*) OR email:(carsten l) OR
email:(carsten l*) OR userid:(carsten l) OR userid:(carsten l*)"

Then it should match:
carsten l
carsten larsen
Carsten Larsen

Or, written in MySQL syntax: "... WHERE `name` LIKE 'carsten%' OR
`email` LIKE 'carsten%' OR `userid` LIKE 'carsten%'..."

I know that I need to use the "solr.LowerCaseTokenizerFactory" on my name
and email fields, to ensure case-insensitive behavior.
The problem seems to be the wildcards and the whitespaces.




--
Aleksander M. Stensby
Senior software developer
Integrasco A/S
www.integrasco.no
