dismax and WordDelimiterFilterFactory+PreserveOriginal

Geoffrey Young Mon, 16 Mar 2009 16:29:59 -0700

hi all :)

I have two filters combined with dismax on the query side:


  WordDelimiterFilterFactory { preserveOriginal=1, generateNumberParts=1,
    catenateWords=0, generateWordParts=1, catenateAll=0, catenateNumbers=0 }

followed by LowerCaseFilterFactory.  the analyzer shows the phrase

  gUYS and dOLLS

being tokenized as

  guys  uys     and     dolls   olls
  g                     d

and matching against an index where everything looks the way you would
expect (lowercased, etc).
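
to make the analyzer output above easier to reason about, here is a toy python model of the two filters.  it is only a sketch: the split rule (break on every lower-to-upper case change) and the names word_delimiter and lowercase are assumptions for illustration, not the solr implementation, and the real filter has many more rules.

```python
import re

def word_delimiter(text, preserve_original=True):
    """Toy word-delimiter: split each word on lower->upper case changes.

    Returns (token, position) pairs. Illustrative only, not Solr code.
    """
    tokens = []
    position = 0
    for word in text.split():
        # insert a break at every lower->upper transition, e.g. gUYS -> g UYS
        parts = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", word).split()
        position += 1
        if preserve_original and len(parts) > 1:
            # the original token shares a position with the first part
            tokens.append((word, position))
        for offset, part in enumerate(parts):
            tokens.append((part, position + offset))
        position += len(parts) - 1
    return tokens

def lowercase(tokens):
    """Toy lowercase filter: lowercases tokens, leaves positions alone."""
    return [(tok.lower(), pos) for tok, pos in tokens]

stream = lowercase(word_delimiter("gUYS and dOLLS"))
for tok, pos in stream:
    print(pos, tok)
```

in this model "guys" and "g" land on the same position and "uys" on the next one, which is the same stacking the analyzer page displays.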

anyway, dismax is failing to get a match, even though the analyzer says
all is ok.  dismax reports the following:

  "rawquerystring":"gUYS and dOLLS",
  "querystring":"gUYS and dOLLS",

  "parsedquery":"+((DisjunctionMaxQuery((search:\"(guys g) uys\")) DisjunctionMaxQuery((search:\"(dolls d) olls\")))~2) ()",

  "parsedquery_toString":"+(((search:\"(guys g) uys\") (search:\"(dolls d) olls\"))~2) ()",
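
one possible reading of that output, sketched in python: the quoted form looks like a phrase built from the token positions, with same-position terms grouped in parentheses.  the helpers render_phrase and phrase_matches are hypothetical names, the indexed token list is an assumption about what the index holds, and the matching is a simplified exact-position check rather than what lucene actually runs.

```python
from itertools import groupby

# (token, position) pairs for "gUYS", as shown by the analyzer on the query side
query_tokens = [("guys", 1), ("g", 1), ("uys", 2)]

def position_slots(tokens):
    """Group tokens (already ordered by position) into one list per position."""
    return [[tok for tok, _ in group]
            for _, group in groupby(tokens, key=lambda t: t[1])]

def render_phrase(tokens):
    """Render a phrase with same-position terms in parentheses."""
    rendered = []
    for terms in position_slots(tokens):
        rendered.append("(" + " ".join(terms) + ")" if len(terms) > 1 else terms[0])
    return '"' + " ".join(rendered) + '"'

def phrase_matches(tokens, indexed):
    """True if, at some start offset, every slot has one of its terms in place.

    Assumes consecutive positions and no slop; a simplification.
    """
    slots = position_slots(tokens)
    for start in range(len(indexed) - len(slots) + 1):
        if all(indexed[start + i] in slot for i, slot in enumerate(slots)):
            return True
    return False

# assumption: the index side holds plain lowercased words, one per position
indexed = ["guys", "and", "dolls"]

print(render_phrase(query_tokens))            # "(guys g) uys"
print(phrase_matches(query_tokens, indexed))  # False
```

under these assumptions the phrase needs "uys" directly after "guys" or "g", and the indexed stream has "and" there instead, so the toy check fails in the same way the real query does.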

so it seems like PreserveOriginal is mucking with the token order in a
way that makes dismax very unhappy.

thoughts?

--Geoff
