Hi, I've been wondering why some of my queries did not return the results I expected. A debugQuery resulted in the following:
<str name="querystring">
"java"^0.0 OR "haskell"^0.0 OR "python"^0.0 OR ("ruby"^0.0) AND (("programming"^0.0)) OR "programming language"^0.0 OR "code coding"^0.0 OR -"mobile"^0.0 OR -"android"^0.0 OR -"microsoft"^0.0 OR -"windows"^0.0
</str>
<str name="parsedquery">
+(DisjunctionMaxQuery((stemmedText:java)) DisjunctionMaxQuery((stemmedText:0.0)) DisjunctionMaxQuery((stemmedText:haskell)) DisjunctionMaxQuery((stemmedText:0.0)) DisjunctionMaxQuery((stemmedText:python)) DisjunctionMaxQuery((stemmedText:0.0)) DisjunctionMaxQuery((stemmedText:ruby)) +DisjunctionMaxQuery((stemmedText:0.0)) DisjunctionMaxQuery((stemmedText:program)) DisjunctionMaxQuery((stemmedText:0.0)) DisjunctionMaxQuery((stemmedText:"program language")) DisjunctionMaxQuery((stemmedText:0.0)) DisjunctionMaxQuery((stemmedText:"code code")) DisjunctionMaxQuery((stemmedText:0.0)) -DisjunctionMaxQuery((stemmedText:mobile)) DisjunctionMaxQuery((stemmedText:0.0)) -DisjunctionMaxQuery((stemmedText:android)) DisjunctionMaxQuery((stemmedText:0.0)) -DisjunctionMaxQuery((stemmedText:microsoft)) DisjunctionMaxQuery((stemmedText:0.0)) -DisjunctionMaxQuery((stemmedText:window)) DisjunctionMaxQuery((stemmedText:0.0))) ()
</str>

Why is the "java" part marked mandatory (using the + notation)? These rewritings seem to happen when the queries get quite long. Is there a way to prevent Solr from assuming I wanted "java" to be a mandatory term, or from deducing any mandatory terms at all? I've tried it with the ExtendedDismaxQParser and the DismaxQParser; both yield the same parsedquery.
The LuceneQParser yielded the following:

<str name="querystring">
"java"^0.0 OR "haskell"^0.0 OR "python"^0.0 OR ("ruby"^0.0) AND (("programming"^0.0)) OR "programming language"^0.0 OR "code coding"^0.0 OR -"mobile"^0.0 OR -"android"^0.0 OR -"microsoft"^0.0 OR -"windows"^0.0
</str>
<str name="parsedquery">
stemmedText:java^0.0 stemmedText:haskell^0.0 stemmedText:python^0.0 +stemmedText:ruby^0.0 +stemmedText:program^0.0 PhraseQuery(stemmedText:"program language"^0.0) PhraseQuery(stemmedText:"code code"^0.0) -stemmedText:mobile^0.0 -stemmedText:android^0.0 -stemmedText:microsoft^0.0 -stemmedText:window^0.0
</str>

Now Solr thinks I want "ruby" (and "program", the stemmed version of "programming") to be mandatory.

I'm running Solr 3.5 on 64-bit Linux.

Any suggestions would be greatly appreciated,
Michael
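
P.S. In case it helps anyone answer: my current guess is that the LuceneQParser does not give AND/OR real operator precedence, but instead turns each operator into a mandatory/optional/prohibited flag on the neighbouring clauses. The sketch below is only a toy model I wrote to test that guess. It is not Lucene code, the function names and the tuple layout are made up, and it assumes the default operator is OR. Fields and boosts are left out, and the terms are written in their stemmed form so the result can be compared with the parsedquery above.

```python
# Toy model (an assumption, not Lucene source code) of a default-OR parser
# that maps AND/OR onto per-clause flags instead of building a precedence tree.

def assign_occurs(clauses):
    """clauses: list of (conjunction, modifier, term) tuples.

    conjunction is None for the first clause, otherwise "AND" or "OR".
    modifier is "", "+" or "-".
    Returns (flag, term) pairs where flag is "+" (mandatory),
    "-" (prohibited) or "" (optional).
    """
    result = []
    for conjunction, modifier, term in clauses:
        # An AND also makes the previous clause mandatory,
        # unless that clause is prohibited.
        if conjunction == "AND" and result and result[-1][0] != "-":
            result[-1] = ("+", result[-1][1])

        if modifier == "-":
            flag = "-"
        elif modifier == "+" or conjunction == "AND":
            flag = "+"
        else:
            flag = ""
        result.append((flag, term))
    return result


def render(pairs):
    """Format the flagged clauses the way debugQuery prints them."""
    return " ".join(flag + term for flag, term in pairs)


# The query from this mail, already stemmed, boosts and field names dropped.
query = [
    (None, "", "java"),
    ("OR", "", "haskell"),
    ("OR", "", "python"),
    ("OR", "", "ruby"),
    ("AND", "", "program"),
    ("OR", "", '"program language"'),
    ("OR", "", '"code code"'),
    ("OR", "-", "mobile"),
    ("OR", "-", "android"),
    ("OR", "-", "microsoft"),
    ("OR", "-", "window"),
]

print(render(assign_occurs(query)))
```

If this model is right, only the two clauses around the AND end up with a +, which is what the LuceneQParser output shows. It does not explain the dismax/edismax output, where the boosts appear to be parsed as separate "0.0" terms.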