What release of Solr?

Do you have autoGeneratePhraseQueries="true" on the field?

And when you said "But any of these does", did you mean "But NONE of these does"?

-- Jack Krupansky

-----Original Message-----
From: heaven
Sent: Tuesday, August 19, 2014 2:34 PM
To: solr-user@lucene.apache.org
Subject: Help with StopFilterFactory

Hi, I have the following text field:

<fieldType name="words_ngram" class="solr.TextField" omitNorms="false">
 <analyzer>
   <tokenizer class="solr.PatternTokenizerFactory" pattern="[^\w]+" />
   <filter class="solr.StopFilterFactory" words="url_stopwords.txt" ignoreCase="true" />
   <filter class="solr.LowerCaseFilterFactory" />
 </analyzer>
</fieldType>

url_stopwords.txt looks like:
http
https
ftp
www

So it is very simple. In the index I have:
* twitter.com/testuser

All these queries do match:
* twitter.com/testuser
* com/testuser
* testuser

But any of these does:
* https://twitter.com/testuser
* https://www.twitter.com/testuser
* www.twitter.com/testuser

What am I doing wrong? The analysis output makes me think something is wrong
with the token positions:
<http://lucene.472066.n3.nabble.com/file/n4153839/oi7o69.jpg>
I thought StopFilterFactory was supposed to remove the
https/http/ftp/www keywords. Why do they still show up there at all? That
doesn't make much sense.
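
To make the token-position question concrete, here is a rough sketch in Python (not Solr or Lucene code) that imitates the words_ngram chain: split on runs of non-word characters, drop the stop words, lowercase. The analyze function is a hypothetical helper, and it assumes that a removed stop word still consumes a position slot, which is how a stop filter behaves when position increments are preserved. It also relies on Python's \w, which is Unicode-aware by default, so it only approximates the Java pattern for non-ASCII input.

```python
import re

# Contents of url_stopwords.txt from the message above.
STOPWORDS = {"http", "https", "ftp", "www"}


def analyze(text):
    """Return (position, token) pairs, imitating the words_ngram analyzer.

    Assumption: a removed stop word leaves a gap, so later tokens keep the
    position they would have had if nothing had been removed.
    """
    # PatternTokenizerFactory with pattern="[^\w]+" splits on non-word runs.
    raw_tokens = [t for t in re.split(r"[^\w]+", text) if t]

    result = []
    position = 0
    for token in raw_tokens:
        position += 1  # every token, kept or dropped, advances the position
        if token.lower() in STOPWORDS:  # ignoreCase="true"
            continue  # token is dropped, its position slot stays empty
        result.append((position, token.lower()))  # LowerCaseFilterFactory
    return result


print(analyze("twitter.com/testuser"))
# [(1, 'twitter'), (2, 'com'), (3, 'testuser')]
print(analyze("https://www.twitter.com/testuser"))
# [(3, 'twitter'), (4, 'com'), (5, 'testuser')]
```

Under that assumption the stop words are gone from the token stream, but the surviving tokens of the longer query start at position 3 instead of 1. Whether this gap is what prevents the match depends on the Solr release and on how the query parser turns the tokens into a query, which the sketch does not model.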

Regards,
Alexander



--
View this message in context: http://lucene.472066.n3.nabble.com/Help-with-StopFilterFactory-tp4153839.html
Sent from the Solr - User mailing list archive at Nabble.com.
