Oh, and I forgot to mention that you should try your field type and query
terms on the Solr Admin analysis page. There you can see what term sequence
is generated for the query.
-- Jack Krupansky
-----Original Message-----
From: Farkas István
Sent: Wednesday, October 17, 2012 9:33 AM
To: solr-user@lucene.apache.org
Subject: Re: WordDelimiterFilter and the dot character
Hm, that makes sense, thank you, I will try this one.
Regards,
Istvan
You need to have separate "index" and "query" analyzers for that field
type. The "query" analyzer should not have preserveOriginal="1", because at
query time that option would generate an extra term that would not match
the exact term sequence that was indexed.
A query of "123 2012" would not split any terms and hence not generate the
extra "preserved" term.
But a query of "123/2012" would actually query "123/2012 123 2012", which
is not a term sequence that was indexed.
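
A minimal sketch of what such a field type might look like, assuming the
same tokenizer and filter settings as in the original fieldType below; the
only intended difference is that the "query" analyzer turns
preserveOriginal off. Treat it as a starting point to check on the
analysis page rather than a verified configuration:

```xml
<fieldType name="text_split" class="solr.TextField" positionIncrementGap="100">
  <!-- Index time: keep the original token alongside the split parts -->
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" splitOnNumerics="1" preserveOriginal="1" />
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
  <!-- Query time: same chain, but no extra "preserved" term is emitted -->
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" splitOnNumerics="1" preserveOriginal="0" />
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
</fieldType>
```

With this split, a query of "123/2012" should analyze to just "123 2012",
which is a sequence that was indexed.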
-- Jack Krupansky
-----Original Message-----
From: Farkas István
Sent: Wednesday, October 17, 2012 8:58 AM
To: solr-user@lucene.apache.org
Subject: WordDelimiterFilter and the dot character
Hello,
I've run into an interesting problem. I am using Solr 3.5 on an Ubuntu
server.
I have some data with a code field, which contains some identifiers
(mostly) in the following format: E.123/2012.
I've set up a fieldType for this code field:
<fieldType name="text_split" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" splitOnNumerics="1" preserveOriginal="1" />
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
</fieldType>
If I search for the exact code ("E.123/2012."), I will get the expected
result. If I search for "123 2012", I also get the expected results. If
I search for the string "123/2012", the result set is empty. I tried it
with catenateNumbers and catenateWords enabled, with the same result.
The interesting thing here is that using the field analysis tool, the
123/2012 gives a match if I select the "highlight matches" option. But
the same query yields nothing when I try to use it in the query debug
tool in the Solr admin. The query works if I use a wildcard search
(*123/2012*), but I would like to avoid that. What am I missing here?
Regards,
Istvan