Hi Yonik,

I got it working, but I think the stopword filter is not behaving as expected: the document is found only when I disable the stopword filter (details later in this mail).
On 20.08.2010 16:57, Yonik Seeley wrote:
> On Thu, Aug 19, 2010 at 11:33 AM, Nikolas Tautenhahn
> <nik_s...@livinglogic.de> wrote:
>> But when I search for q=at%26s (=at&s), I get nothing.
>
> That's the correct encoding if you're typing it directly into a
> browser address box.
> http://localhost:8983/solr/select?defType=dismax&qf=text&q=at%26s&debugQuery=true
>
> But you should be able to verify that solr is getting the correct
> query string by checking out "params" in the response (in the example
> server, by default they are echoed back). And adding debugQuery=true
> to the request should show you exactly what query is being generated.
>
> But the real issue likely lies with your fieldType definition. Can
> you show that?

As I (normally) query multiple fields, I changed my request URL to

http://127.0.0.1:8983/solr/select?q=at%26s&fl=titel&qt=dismax&qf=titel&debugQuery=truefl=*&qt=dismax&qf=titel&debugQuery=true

in order to narrow it down, and got this response (cut down to what I think is the relevant part):

> <str name="rawquerystring">at&s</str>
> <str name="querystring">at&s</str>
> <str name="parsedquery">+DisjunctionMaxQuery((titel:"(at&s at) s")~0.1) ()</str>
> <str name="parsedquery_toString">+(titel:"(at&s at) s")~0.1 ()</str>
> <lst name="explain"/>
> <str name="QParser">DisMaxQParser</str>

This is on my local debugging instance, using the standard dismax config (from the examples directory of Solr).
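For reference, the `at%26s` encoding discussed above can be reproduced programmatically. The following is a minimal sketch using only Python's standard library; `build_select_url` is a hypothetical helper introduced for illustration and is not part of Solr or any Solr client. The parameter names are the ones used in the request above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_select_url(base, params):
    """Hypothetical helper: append percent-encoded query parameters to base."""
    return base + "?" + urlencode(params)


url = build_select_url(
    "http://127.0.0.1:8983/solr/select",
    {"q": "at&s", "qt": "dismax", "qf": "titel", "debugQuery": "true"},
)
print(url)

# Decode the query string again to confirm that the server side would
# see the literal "&" inside the value of q, not a parameter separator.
decoded = parse_qs(urlsplit(url).query)
print(decoded["q"])
```

Building the URL from a dictionary also avoids accidentally repeating or fusing parameters when the URL is edited by hand.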
The "titel" field is configured like this:

> <field name="titel" type="textgen" indexed="true" stored="true"/>

and "textgen" is configured like this:

> <fieldType name="textgen" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.HTMLStripStandardTokenizerFactory"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="0" preserveOriginal="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0" preserveOriginal="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>

The document is indexed correctly: a search for "at s" found it, and all fields looked fine (they contain "at&s" and not, for example, "at&amp;s"). As my stopword list contains neither "at" nor "&" nor "&amp;", I don't quite understand why the document is found only when I disable the stopword list. My stopword list can be found here: http://pastebin.com/RfLuBHqd

Do you happen to see anything problematic for a string like "at&s" here?
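Since WordDelimiterFilter emits sub-tokens, it can help to check every candidate term against the stopword list, not only "at" and "&". The following is a rough sketch, assuming the list uses the usual one-word-per-line format with `#` comment lines. `load_stopwords`, `check_terms`, and the `SAMPLE` list are made up for illustration; the real list is the one at the pastebin link and should be substituted.

```python
def load_stopwords(text, ignore_case=True):
    """Parse a one-word-per-line stopword list, skipping blanks and # comments."""
    words = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        words.add(line.lower() if ignore_case else line)
    return words


def check_terms(terms, stopwords, ignore_case=True):
    """Return {term: True/False} telling whether each term is a stopword."""
    return {t: (t.lower() if ignore_case else t) in stopwords for t in terms}


# Made-up sample content, NOT the actual list from the pastebin link.
SAMPLE = """\
# example stopwords
a
the
S
"""

stopwords = load_stopwords(SAMPLE)
# Candidate terms: the original token plus the parts a word-delimiter
# filter could plausibly produce from it.
result = check_terms(["at&s", "at", "s", "ats", "&"], stopwords)
for term, hit in result.items():
    print(f"{term!r}: {'stopword' if hit else 'kept'}")
```

The check lowercases both sides to mirror `ignoreCase="true"` in the configuration above.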
The analysis page in the admin panel shows these steps for the index analyzer:

(HTMLStripStandardTokenizer) at&s => at&s
(SynonymFilter) at&s => at&s
(WordDelimiterFilter) at&s => term position 1: at&s, at; term pos 2: s, ats
(LowerCaseFilter) 1: at&s, at; 2: s, ats => 1: at&s, at; 2: s, ats
(StopFilter) 1: at&s, at; 2: s, ats => 1: at&s, at; 2: ats

So, according to this, it should be found even with my stopwords enabled...

Best regards, and thanks for your response,
Nikolas Tautenhahn