What is the full query you're issuing to Solr and the corresponding
request handler configuration?
Chances are you're using the dismax query parser, which does not
support wildcards. Other things to check: be sure you've tied the
field to your new textIntact type, and that you're searching that
field (see defaultSearchField in schema.xml).
Try something like /solr/select?q=field_name:blah*
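For reference, a minimal sketch of the schema.xml pieces involved.
field_name is just the placeholder from the query above, and the
indexed/stored settings are assumptions, not taken from your schema:

```xml
<!-- Inside <fields>: hypothetical declaration tying field_name
     to the textIntact type -->
<field name="field_name" type="textIntact" indexed="true" stored="true"/>

<!-- At the top level of <schema>: the field searched when the query
     does not name one explicitly -->
<defaultSearchField>field_name</defaultSearchField>
```

If the field was previously indexed with a different type, the
documents would need to be reindexed for the new analysis to apply.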
Erik
On Mar 12, 2009, at 9:09 AM, Bruno Aranda wrote:
Thanks for your answer, I am trying now with this custom text field:
<fieldType name="textIntact" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="0"
            catenateWords="0" catenateNumbers="0" catenateAll="0"
            expand="0" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
It still does not find "blah" when I search with the wildcard
"blah*". Am I missing something?
Thanks,
Bruno
2009/3/12 Erik Hatcher <e...@ehatchersolutions.com>
Remove the EnglishPorterFilterFactory from your "text" analyzer
configuration (both index and query sides), then reindex all
documents.
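A rough sketch of that change, based on the index-side analyzer of the
example "text" type with only the stemming filter dropped (the
query-side analyzer would need the same filter removed; everything
else is left as in the example schema):

```xml
<analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.StopFilterFactory"
          ignoreCase="true"
          words="stopwords.txt"
          enablePositionIncrements="true"/>
  <filter class="solr.WordDelimiterFilterFactory"
          generateWordParts="1" generateNumberParts="1" catenateWords="1"
          catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <!-- solr.EnglishPorterFilterFactory removed here to disable stemming -->
  <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
```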
Erik
On Mar 12, 2009, at 8:28 AM, Bruno Aranda wrote:
Hi,
I am trying to disable stemming in the analyzer, but I am not sure
how to do it.
For instance, I have a field that contains "blah", but when I search
for "blah*" it cannot find it, whereas if I search for "bla*" it does.
I was using the "text" field type from the example schema.xml. How
should I modify it so that stemming is not done and I can find "blah"
when I search for "blah*"?
<fieldType name="text" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory"
            synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <!-- Case insensitive stop word removal.
         add enablePositionIncrements=true in both the index and query
         analyzers to leave a 'gap' for more accurate phrase queries.
    -->
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
            />
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory"
            synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
            />
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
I have tried using the "textTight" type to no avail. Most of the
fields in my documents have this structure:
DOC1 field> gene name:brca2
DOC2 field> gene name:brca23
If I search for "brca2*", I would like to find both documents. My
field values normally contain colons ':' that should be treated as
stop words.
Thank you in advance,
Bruno