RE: SpanQuery - How to wrap a NOT subquery

Allison, Timothy B. Tue, 21 Jun 2016 05:57:08 -0700

In the syntax for <self_promotion>LUCENE-5205’s SpanQueryParser 
[0]</self_promotion>, that’d be


["one thousand one hundred thirty" (six seven)]!~0,1

In English: find “one thousand one hundred thirty”, but not if six or seven 
comes immediately after it.

[0] https://github.com/tballison/lucene-addons/tree/master/lucene-5205
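
To see why a window of 0 before and 1 after gives the result Brandon expects, here is a small plain-Java model of the "phrase, but not if an excluded term is nearby" semantics. It does not use Lucene at all. The class and method names (SpanNotSketch, phraseStarts, matches, survivingIds) are made up for illustration, and the window rule used here (no excluded term from `pre` positions before the phrase through `post` positions after it) is an assumption about how SpanNotQuery(include, exclude, pre, post) behaves, so treat it as a sketch and check the boundaries against the Lucene javadoc.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SpanNotSketch {

    /** Start positions at which the phrase occurs contiguously in the tokens. */
    static List<Integer> phraseStarts(List<String> tokens, List<String> phrase) {
        List<Integer> starts = new ArrayList<>();
        for (int i = 0; i + phrase.size() <= tokens.size(); i++) {
            if (tokens.subList(i, i + phrase.size()).equals(phrase)) {
                starts.add(i);
            }
        }
        return starts;
    }

    /**
     * True if some occurrence of the phrase has no excluded term in the
     * window [start - pre, end + post), where end is exclusive.
     */
    static boolean matches(List<String> tokens, List<String> phrase,
                           List<String> excluded, int pre, int post) {
        for (int start : phraseStarts(tokens, phrase)) {
            int from = Math.max(0, start - pre);
            int to = Math.min(tokens.size(), start + phrase.size() + post);
            boolean blocked = false;
            for (int p = from; p < to; p++) {
                if (excluded.contains(tokens.get(p))) {
                    blocked = true;
                }
            }
            if (!blocked) {
                return true;
            }
        }
        return false;
    }

    /** Runs the model over the IDs 1130..1139 spelled out as English words. */
    static List<Integer> survivingIds(int pre, int post) {
        String[] units = {"", "one", "two", "three", "four", "five",
                          "six", "seven", "eight", "nine"};
        List<String> phrase = Arrays.asList("one", "thousand", "one", "hundred", "thirty");
        List<String> excluded = Arrays.asList("six", "seven");
        List<Integer> ids = new ArrayList<>();
        for (int u = 0; u < units.length; u++) {
            List<String> tokens = new ArrayList<>(phrase);
            if (!units[u].isEmpty()) {
                tokens.add(units[u]);
            }
            if (matches(tokens, phrase, excluded, pre, post)) {
                ids.add(1130 + u);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // pre = 0, post = 1: 1136 and 1137 are dropped.
        System.out.println(survivingIds(0, 1));
        // pre = 0, post = 0: the window stops at the end of the phrase,
        // so nothing is excluded and all ten IDs come back.
        System.out.println(survivingIds(0, 0));
    }
}
```

With post = 0 the model returns 1130..1139 unfiltered, which is why the query above asks for one position after the phrase.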

From: Brandon Miller [mailto:computerengineer.bran...@gmail.com]
Sent: Monday, June 20, 2016 4:12 PM
To: Allison, Timothy B. <talli...@mitre.org>; solr-user@lucene.apache.org
Subject: Re: SpanQuery - How to wrap a NOT subquery

Thank you, Timothy.

I have support for and am using SpanNotQuery elsewhere.  Maybe there is another 
use for it that I'm not considering.  I'm wondering if there's a clever way of 
reusing it in order to satisfy the requirements of proximity NOTs, too.

dtSearch allows a user to have NOTs embedded in proximity searches.
I.e.
Let's say you have an index whose ID has been converted to English phrases, 
like 1001 would be "One thousand one"

"one thousand one hundred" pre/0 (thirty and not (six or seven))
Returns: 1130, 1131, 1132, 1133, 1134, 1135,            1138, 1139

Perhaps I've been staring at the screen too long and the obvious answer is 
hiding from me.

Here's how I'm trying to implement it, but it's incorrect...  It's giving me 
1130..1139 without excluding anything.



    public Query visitNot_expr(Not_exprContext ctx) {
        //ProximityNotSupportedFor("NOT");
        Query subquery = visit(ctx.expr());
        BooleanQuery.Builder query = new BooleanQuery.Builder();
        query.add(subquery, BooleanClause.Occur.MUST_NOT);
        // TODO: Consolidate this so that we don't use MatchAllDocsQuery,
        // but using the other query, to increase performance
        query.add(new MatchAllDocsQuery(), BooleanClause.Occur.SHOULD);

        if (currentlyInASpanQuery) {
            SpanQuery matchAllDocs = getSpanWildcardQuery(new Term(defaultFieldName, "*"));
            SpanNotQuery snq = new SpanNotQuery(matchAllDocs, (SpanQuery) subquery,
                    Integer.MAX_VALUE, Integer.MAX_VALUE);
            return snq;
        } else {
            return query.build();
        }
    }

    protected SpanQuery getSpanWildcardQuery(Term term) {
        WildcardQuery wq = new WildcardQuery(term);
        SpanQuery swq = new SpanMultiTermQueryWrapper<>(wq);
        return swq;
    }


On Mon, Jun 20, 2016 at 2:53 PM, Allison, Timothy B. <talli...@mitre.org> wrote:
Bouncing over to user’s list.

As you’ve found, spans are different from regular queries.  MUST_NOT at the 
BooleanQuery level means that the term must not appear anywhere in the 
document; whereas spans focus on terms near each other.

Have you tried SpanNotQuery?  This would allow you at least to do something 
like:

termA but not if zyx or yyy appears X words before or Y words after
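
The difference between the two kinds of NOT can be shown with a toy model in plain Java. This is only an illustration and does not call Lucene; NotScopeSketch, documentLevelNot and positionalNot are hypothetical names, and the inclusive window of X positions before and Y positions after termA is an assumption about what a span-style NOT checks.

```java
import java.util.Arrays;
import java.util.List;

public class NotScopeSketch {

    /** Document-level NOT: reject the document if an excluded term appears anywhere. */
    static boolean documentLevelNot(List<String> tokens, String term, List<String> excluded) {
        if (!tokens.contains(term)) {
            return false;
        }
        for (String t : tokens) {
            if (excluded.contains(t)) {
                return false;
            }
        }
        return true;
    }

    /**
     * Positional NOT: accept the document if some occurrence of term has no
     * excluded term within x positions before it or y positions after it.
     */
    static boolean positionalNot(List<String> tokens, String term,
                                 List<String> excluded, int x, int y) {
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(term)) {
                continue;
            }
            int from = Math.max(0, i - x);
            int to = Math.min(tokens.size() - 1, i + y);
            boolean blocked = false;
            for (int p = from; p <= to; p++) {
                if (excluded.contains(tokens.get(p))) {
                    blocked = true;
                }
            }
            if (!blocked) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> excluded = Arrays.asList("zyx", "yyy");
        // "zyx" is in the document, but four positions away from termA.
        List<String> far = Arrays.asList("zyx", "a", "b", "c", "termA");
        System.out.println(documentLevelNot(far, "termA", excluded));    // false
        System.out.println(positionalNot(far, "termA", excluded, 2, 2)); // true
    }
}
```

The same document is rejected by the document-level check and accepted by the positional one, because the excluded term falls outside the window around termA.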



From: Brandon Miller [mailto:computerengineer.bran...@gmail.com]
Sent: Monday, June 20, 2016 2:36 PM
To: d...@lucene.apache.org
Subject: SpanQuery - How to wrap a NOT subquery

Greetings!

I'm wanting to support this:
TermA within_N_terms_of (abc and cba or xyz and not zyx or not yyy)

Focusing on the sub-query:
I have ANDs and ORs figured out (special tricks playing with slops and such).

I'm having the hardest time figuring out how to wrap a NOT.

Outside of SpanQuery, I'm using a BooleanQuery with a MUST_NOT clause.  That's 
fine (if you know another way, I'd like to hear that, too, but this appears to 
work dandy).

However, SpanQuery requires subqueries that are also of type SpanQuery; 
SpanMultiTermQueryWrapper will let you wrap anything derived from 
MultiTermQuery (which includes AutomatonQuery).

Right now, I'm at a loss.  We have huge, complex, nested boolean queries inside 
proximity operators with our current solution.

If I need to write a custom solution, then that's what I need to hear and 
perhaps a couple of pointers.

Thanks a bunch and God bless!

Brandon
