Mark,

The flexible query parser framework has been there for ages, including in 3.x:
http://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/queryParser/core/package-summary.html

You are welcome!
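
For reference, the application-level preprocessing described in the original message below (split on whitespace, drop stop words, wrap each token as +token~2) can be sketched in a few lines of Python. This is only an illustration of the approach being avoided, not a recommendation: the function name and the stop-word set are made up for the example, and nothing here consults the field's real analyzer or escapes query syntax.

```python
# Hypothetical stop-word set for illustration only; a real setup would have
# to mirror whatever stop filter the Solr field's analyzer is configured with.
STOP_WORDS = {"a", "an", "the", "of"}


def naive_fuzzy_must(text, fuzziness=2, stop_words=STOP_WORDS):
    """Rewrite free text so every surviving token becomes +token~N.

    Tokens are split on whitespace only, so punctuation stays attached
    and query-syntax characters are passed through unescaped.
    """
    clauses = []
    for token in text.split():
        # Drop stop words by hand, since "+the~2" would otherwise be required.
        if token.lower() in stop_words:
            continue
        clauses.append("+" + token + "~" + str(fuzziness))
    return " ".join(clauses)


print(naive_fuzzy_must("the chair"))  # prints: +chair~2
```

Every token filter in the field's analyzer chain would need a matching branch in that loop, which is the kind of duplication a processor step inside the framework linked above should let you avoid.
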
On Mon, Mar 4, 2013 at 2:42 AM, Mark Bennett <mbenn...@ideaeng.com> wrote:

> Hi Mikhail,
>
> Thanks for the links, looks like interesting stuff.
>
> Sadly this project is stuck in 3.x for some very thorny reasons...
>
> Googling around, looks like this might be strictly 4.x...
>
> On Mon, Feb 25, 2013 at 12:21 PM, Mikhail Khludnev <
> mkhlud...@griddynamics.com> wrote:
>
> > Mark,
> >
> > AFAIK
> > http://lucene.apache.org/core/4_0_0-ALPHA/queryparser/org/apache/lucene/queryparser/flexible/core/package-summary.html
> > is a convenient framework for such juggling.
> > Please also be aware of the good starting point
> > http://lucene.apache.org/core/4_0_0-ALPHA/queryparser/org/apache/lucene/queryparser/flexible/standard/package-summary.html
> >
> > On Sun, Feb 24, 2013 at 11:33 AM, Mark Bennett <mbenn...@ideaeng.com>
> > wrote:
> >
> > > Scenario:
> > >
> > > You're submitting a block of text as a query.
> > >
> > > You're content to let Solr / Lucene handle query parsing,
> > > tokenization, etc.
> > >
> > > But you'd like ALL eventually produced leaf nodes in the parse tree
> > > to have:
> > > * Boolean .MUST (effectively a + prefix)
> > > * Fuzzy match of ~1 or ~2
> > >
> > > In a simple application, and if there were no punctuation, you could
> > > preprocess the query, effectively:
> > > * split on whitespace
> > > * for t in tokens: t = "+" + t + "~2"
> > >
> > > But this is ugly, and even then I think things like stop words would
> > > be messed up:
> > > * OK in Solr: the chair (it can properly remove "the")
> > > * But if this: +the~2 +chair~2 (I'm not sure this would work)
> > >
> > > Sure, at the application level you could also remove the stop words
> > > in the "for t in tokens" loop, but then some other weird case would
> > > come up. Maybe one of the field's analyzers has some other token
> > > filter you forgot about, so you'd have to bring that logic forward
> > > as well.
> > >
> > > (Long story of why I'd want to do all this... and I know people think
> > > adding ~2 to all tokens will give bad results anyway, trying to fix
> > > inherited code that can't be scrapped, etc)
> > >
> > > --
> > > Mark Bennett / New Idea Engineering, Inc. / mbenn...@ideaeng.com
> > > Direct: 408-733-0387 / Main: 866-IDEA-ENG / Cell: 408-829-6513
> > >
> >
> > --
> > Sincerely yours
> > Mikhail Khludnev
> > Principal Engineer,
> > Grid Dynamics
> >
> > <http://www.griddynamics.com>
> > <mkhlud...@griddynamics.com>
>

--
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
<mkhlud...@griddynamics.com>