What release of Solr?
Do you have autoGeneratePhraseQueries="true" on the field?
And when you said "But any of these does", did you mean "But NONE of these
does"?
-- Jack Krupansky
-----Original Message-----
From: heaven
Sent: Tuesday, August 19, 2014 2:34 PM
To: solr-user@lucene.apache.org
Subject: Help with StopFilterFactory
Hi, I have the following text field:
<fieldType name="words_ngram" class="solr.TextField" omitNorms="false">
  <analyzer>
    <tokenizer class="solr.PatternTokenizerFactory" pattern="[^\w]+" />
    <filter class="solr.StopFilterFactory" words="url_stopwords.txt"
            ignoreCase="true" />
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
</fieldType>
url_stopwords.txt looks like:
http
https
ftp
www
So it is very simple. In the index I have:
* twitter.com/testuser
All these queries do match:
* twitter.com/testuser
* com/testuser
* testuser
But any of these does:
* https://twitter.com/testuser
* https://www.twitter.com/testuser
* www.twitter.com/testuser
What am I doing wrong? The analysis makes me think something is wrong with
the token positions:
<http://lucene.472066.n3.nabble.com/file/n4153839/oi7o69.jpg>
but I thought StopFilterFactory was supposed to remove the
https/http/ftp/www keywords. Why do they appear there at all? That doesn't
make much sense.
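
To make the position question concrete, here is a rough Python simulation of
the analyzer chain above. It is only a sketch, not Solr code: the function
name analyze is made up for illustration, and it assumes the stop filter
preserves position increments, so a removed stopword still advances the
position counter and leaves a gap.

```python
import re

# Contents of url_stopwords.txt from the field definition above.
STOPWORDS = {"http", "https", "ftp", "www"}


def analyze(text, stopwords=STOPWORDS):
    """Return (token, position) pairs, roughly mimicking the words_ngram chain.

    Hypothetical helper for illustration only; positions are 1-based.
    """
    # PatternTokenizerFactory with pattern="[^\w]+" splits on runs of
    # non-word characters; empty pieces are dropped.
    raw_tokens = [t for t in re.split(r"[^\w]+", text) if t]

    result = []
    position = 0
    for token in raw_tokens:
        # Assumption: every token advances the position, including the
        # ones the stop filter removes, so removed tokens leave a hole.
        position += 1
        if token.lower() in stopwords:  # ignoreCase="true"
            continue
        result.append((token.lower(), position))  # LowerCaseFilterFactory
    return result


if __name__ == "__main__":
    for sample in (
        "twitter.com/testuser",
        "www.twitter.com/testuser",
        "https://www.twitter.com/testuser",
    ):
        print(sample, "->", analyze(sample))
```

Under that assumption the stopwords themselves never reach the output, but
the surviving tokens start at position 2 or 3 instead of 1, which would be
consistent with the stopwords still showing up as gaps in the analysis view.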
Regards,
Alexander
--
View this message in context:
http://lucene.472066.n3.nabble.com/Help-with-StopFilterFactory-tp4153839.html