Just to be clear for other readers: if you have "speedpost" in the index and you query for "speedPost" using the "OR" operator, with the WDF set to "catenate all" and a lower case filter after it, the query should work fine. If it fails in your case, then something else is probably wrong somewhere in your setup.

I tried this with the standard 4.4 example schema, adding this field type:

<fieldType name="text_wdf" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.WordDelimiterFilterFactory"
            splitOnCaseChange="1" generateWordParts="1" generateNumberParts="1"
            catenateWords="1"
            catenateNumbers="1" catenateAll="1" preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
</fieldType>

and adding this field:

<field name="wdf_text" type="text_wdf" indexed="true" stored="true" multiValued="false"/>

And indexing this data:

curl "http://localhost:8983/solr/update?commit=true" \
-H 'Content-type:application/json' -d '
[{"id": "doc-1", "wdf_text": "This is the speedpost case."},
{"id": "doc-2", "wdf_text": "This is the speed post case."},
{"id": "doc-3", "wdf_text": "This is the speedPost case."},
{"id": "doc-4", "wdf_text": "This is the SpeedPost case."},
{"id": "doc-5", "wdf_text": "This is the Speed Post case."}]'

And this query:

curl "http://localhost:8983/solr/select/?q=speedpost&df=wdf_text&indent=true&wt=json"

Returns the first, third, and fourth docs, as expected.

And this query:

curl "http://localhost:8983/solr/select/?q=speedPost&df=wdf_text&indent=true&wt=json"

Returns all five docs, as expected.

Note: the default for q.op is "OR".
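
For readers who want to see why those two result sets come out the way they do, here is a rough Python sketch of the analysis chain. It is only an approximation written for this test data, not Solr's actual implementation: the function names (split_parts, analyze, search) are made up for illustration, and the three catenate options are collapsed into a single "join all parts" step, which is enough for these five docs but not for mixed word/number tokens.

```python
import re

# The five test documents from the indexing command.
DOCS = {
    "doc-1": "This is the speedpost case.",
    "doc-2": "This is the speed post case.",
    "doc-3": "This is the speedPost case.",
    "doc-4": "This is the SpeedPost case.",
    "doc-5": "This is the Speed Post case.",
}

def split_parts(token):
    """Approximate the word delimiter filter's splitting of one token.

    Breaks on lower->upper case changes and letter<->digit transitions,
    then on any non-alphanumeric character.
    """
    marked = re.sub(
        r"(?<=[a-z])(?=[A-Z])|(?<=[A-Za-z])(?=[0-9])|(?<=[0-9])(?=[A-Za-z])",
        " ",
        token,
    )
    return re.findall(r"[A-Za-z0-9]+", marked)

def analyze(text):
    """Whitespace tokenize, split, catenate, keep the original, lowercase."""
    terms = set()
    for token in text.split():
        parts = split_parts(token)
        # Simplification: one catenation of all parts stands in for
        # catenateWords / catenateNumbers / catenateAll.
        candidates = parts + ["".join(parts), token]
        terms.update(c.lower() for c in candidates if c)
    return terms

def search(query):
    """OR semantics: a doc matches if it shares any term with the query."""
    query_terms = analyze(query)
    return sorted(
        doc_id for doc_id, text in DOCS.items() if query_terms & analyze(text)
    )

print(search("speedpost"))  # ['doc-1', 'doc-3', 'doc-4']
print(search("speedPost"))  # ['doc-1', 'doc-2', 'doc-3', 'doc-4', 'doc-5']
```

The query "speedpost" stays a single term, so it only matches docs whose index terms include the catenated "speedpost". The query "speedPost" expands to "speed", "post", and "speedpost", and with OR any one of those is enough to match.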

So, please try the same experiment yourself, and then tell us how your config/schema differs from this test case.

-- Jack Krupansky

-----Original Message-----
From: vicky desai
Sent: Tuesday, August 20, 2013 7:50 AM
To: solr-user@lucene.apache.org
Subject: Re: struggling with solr.WordDelimiterFilterFactory

Hi Erik,

I was going to come to that. Now if I have the word *speedpost* in the index
and if I don't use catenation at the query end, then a query for the word
speedPost won't fetch me the results. It might then make sense to
remove the entire WDFF from the query and search for a few possible combinations
to find all matching docs.



--
View this message in context: http://lucene.472066.n3.nabble.com/struggling-with-solr-WordDelimiterFilterFactory-tp4085021p4085650.html
Sent from the Solr - User mailing list archive at Nabble.com.
