: I confirmed this behavior in trunk with the following query:
: http://localhost:8983/solr/select?qt=dismax&q=6'2"&debugQuery=on&qf=cat&pf=cat
:
: The result is that the double quote is dropped:
: +DisjunctionMaxQuery((cat:6'2)~0.01) DisjunctionMaxQuery((cat:6'2)~0.01)
:
: This seems like it's a bug (rather than by design), but I could be
: wrong... Hoss?

It was by design ... but it could be handled better.

The idea is that if the input has balanced quotes (i.e. an even number), leave them alone so they are dealt with as phrase delimiters. If there is an odd number, strip them out, since we don't know whether they are a mistake (i.e. an unclosed phrase) or intended to be literal.

Auto-escaping them probably would have been a better way to go (i.e. let the analyzer decide whether or not to strip them) ... I'm not sure why I didn't do that in the first place (I think at the time the Lucene QueryParser didn't deal with escaped quotes very well).

The thing to keep in mind is that even if it did escape them, this still wouldn't work if the user input were...

   the 6'2" man dating the 5'3" woman

...because it would assume the even number of double-quote characters means that " man dating the 5'3" is a phrase.

I remember spending a day going over query logs trying to figure out a good set of heuristic rules for guessing when quote characters in user input should be interpreted as phrase delims vs "inch" markers, before a coworker smacked me and made me realize it was a fairly intractable problem and simple rules would be easier to understand anyway.

FYI: this is all happening in SolrPluginUtils.stripUnbalancedQuotes(CharSequence), which DisMax(RequestHandler) calls before passing the string to DisjunctionMaxQueryParser.

-Hoss
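
For anyone who wants to experiment with the rule, here is a minimal sketch of the even/odd quote check as described above. It is not copied from the Solr source; the class name QuoteSketch and the helper countQuotes are made up for illustration, and only the method name stripUnbalancedQuotes(CharSequence) comes from the message.

```java
// Illustrative sketch only: the class name and helper are hypothetical,
// and this is not the actual SolrPluginUtils code.
final class QuoteSketch {
    private QuoteSketch() {}

    /** Counts the double-quote characters in the raw user input. */
    static int countQuotes(CharSequence s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == '"') {
                count++;
            }
        }
        return count;
    }

    /**
     * Even number of quotes: return the input untouched so the quotes
     * can act as phrase delimiters. Odd number: remove every quote,
     * since we cannot tell an unclosed phrase from a literal quote.
     */
    static CharSequence stripUnbalancedQuotes(CharSequence s) {
        if (countQuotes(s) % 2 == 0) {
            return s;
        }
        return s.toString().replace("\"", "");
    }
}
```

With this sketch, the input 6'2" has one quote and comes back as 6'2, which matches the cat:6'2 seen in the debug output. The input with two heights has an even number of quotes and comes back unchanged, so a query parser downstream would still read the text between the two quotes as a phrase.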